of preexisting materials or of data that are selected, coordinated, or arranged in such a way that the resulting work as a whole constitutes an original work of authorship" (Section 101, (Definitions), 1976 General Revision). Compilations are copyrightable under Section 103 of the 1976 General Revision, but the copyright is in the organization of the materials and not in any used materials that are in the public domain or are already copyrighted. Copyright in the compilation does not imply any exclusive right in the preexisting used materials. As examples, a telephone book, a gazetteer, and an almanac are all compilations in which copyright subsists primarily in the organization of the materials and not in the individual materials contained therein.

This type of work has been given copyright protection in human-readable form as a type of literary work, one of the categories of protectable subject matter.

As the House Report 94-1476 makes clear (on page 54),

"The term 'literary works' does not connote any criterion of literary merit or qualitative value: it includes catalogs, dictionaries, and similar factual, reference, or instructional works and compilations of data ..."

The House Report goes on to state that "computer data bases" are also literary works with the implication that they are copyrightable, but for certainty about that question, the caveat "in the absence of Section 117" should be added. In the long run, however, Section 117 is certain to be excised or significantly altered, and therefore the caveat will be rendered moot. There seems to be no serious opposition to the copyrightability of compilations in computer-readable form.

Other literary works of a factual nature, for example, encyclopedias and other reference works, may be used and treated as data bases even though copyright may subsist in the literary expressions in the entire works. A work of this type may be either a "collective work" like an encyclopedia, or a reference work on a specialized subject by a single author, e.g., Nimmer on Copyright. Copyrightability in the computer-readable form of the work is just as clear for these works as it is for compilations. The following discussion will concern computer-readable data bases in general without regard to their subcategory as compilations, collective works, or literary works of a single author. The important connecting element of all of them is how they are used.

5.6.1 Publication Only in Computer-Readable Form

There may be some question as to what constitutes publication of a computer-readable data base that has not been published previously in a paper edition. It is assumed that the date of publication of a computer-readable data base that has been published previously in a paper edition without any change in content is the same date as that for the paper edition.

Display Only, Single Licensee: The particular situation of interest here is that in which the data base is made available only through user terminals attached to a central computer. This is a typical method of permitting accessibility. It is assumed that the central computer is owned either by the copyright proprietor or by a distributor who has obtained the data base from the proprietor under an exclusive license.

Now, if either the proprietor or the exclusive licensee makes the data base available by display only at the terminals and does not permit printouts to change hands, no publication has occurred. The basis of this statement is the definition of "publication" in Section 101 and the explanatory material in House Report 94-1476 at page 138 and Senate Report 94-473 at page 121. (The pertinent sentences from both reports, beginning "Under the definition in Section 101 ...", are identical):

First, the definition states that "display of a work does not of itself constitute publication." Thus the proprietor's display is not publication. However, the definition also states that "the offering to distribute copies ... to a group of persons for purposes of further distribution ... or public display, constitutes publication." Thus, distribution to a single exclusive licensee for display purposes only is not publication (since a single individual is not a group).

Suppose the proprietor distributed the data base to two or more licensees for display only. Whether this constitutes publication depends on how many licensees constitute "a group." The answer to this question had best be left to the Judiciary or to further Congressional interpretation.

Printouts at Terminals: If users at terminals are permitted to make printouts of retrieved material, without any "explicit or implicit restrictions with respect to disclosure of the contents," then publication has occurred. The argument could be made that if restrictions are placed on disclosure or distribution of the printouts, then no publication has occurred. However, since the concept of "publication" is no longer central to copyright, extended analysis of particular situations is unwarranted at this point. In any event, it would be expected, if there is a likelihood that a printout would be considered "published," that a proprietor or a licensee would be sure to have the computer mark each printout with a complete notice of copyright to ensure that proprietary rights were protected under Chapter 4 of the 1976 General Revision.

Identity of the Publication: The question of exactly what has been published remains to be discussed. The printouts, if provided under no restriction, are published material. The physical printout belongs to the user who paid for it. The copyright ownership of the printouts belongs to the proprietor of the data base. This is not unusual. When a book is purchased at retail, the buyer owns the book and the publisher continues to own the copyright in the content.

The argument could be made that only the printouts have been published and the data base has not been published. After all, only the printouts have changed hands; and it is assumed here that the proprietor or his exclusive licensee has retained control of the full data base. In the manner in which data base systems are operated, a user identifies a particular set of categories of information in which he is interested and queries the data base. The data base system responds with the number of items in the set, and on command, the text retrieved is shown on a CRT terminal. If the user is satisfied with the text retrieved, he requests a printout. It would seem that the printout is a "derivative work," similar to an abridgment or condensation (see Section 101 for definition), and there appears to be no requirement that a published derivative work be based on a published preexisting work. On the other hand, each printout may be different, depending on the specific query which the user has entered into the computer. Thus, the published "derivative works" may be one of a kind.

Needed Clarification: It seems reasonable to suggest that a clarification of what constitutes publication of a computer-readable data base is in order. For example, a reasonable understanding is that a computer-readable data base is to be considered "published" in its entirety if it is offered to the public on a query basis such that any item in the data base is capable of being retrieved and printed out and the printouts become the physical property of the users on the basis of restricted disclosure. Furthermore, "publication" occurs in this situation whether the offering to users is made by the proprietor or his licensee.

Additional clarification appears to be needed, also, in the definition of how many persons constitute "a group of persons" as the number of distributors to whom a work has to be offered in order to be published. Furthermore, it does not seem to be clear if a work is "published" if it is offered to a group of persons on a restricted-disclosure basis for further distribution on a restricted-disclosure basis.

5.6.2 Statutory Deposit to the Library of Congress

As was indicated in Section 5.5 above, there are valid public policy considerations that suggest the maximum disclosure of copyrighted works in return for copyright protection. There is no reason to exempt computer-readable data bases from these considerations.

The Library of Congress could be viewed in this connection as an archival location where anyone could view and peruse nearly any computer-readable work published with copyright notice. This would be an immense aid to scholarship, to historical review, and to the generation of new ideas for the future, as it has been with works in the older technological media.

The issue, then, is the form in which computer-readable data bases should be deposited under Section 407 in order to maximize their availability, minimize storage and handling problems for the Library, avoid imposing a hardship of supply on the proprietors, and avoid straining fair use.

It is not immediately clear, on these criteria, whether the initial deposit should be a printout or a magnetic tape, but it seems reasonable to suggest that it should be the complete data base, not just identifying descriptions, regardless of which medium is chosen. The advantage of the printout is that any reader could peruse it without straining fair use. Microfilm could be used to reduce size and bulkiness. The advantage of the magnetic tape is that the data base is published in that medium; and it is a medium in which it is available for a scholar's manipulation and use, assuming it were an outdated tape that the proprietor no longer saw as an immediately marketable product that the scholar ought to buy by signing on the proprietor's computer system.

Many data bases are updated frequently, and it seems reasonable to suggest that a yearly update, containing only the new material added during the preceding year and the old material dropped, is not a burdensome requirement. The deposit of a complete data base, under the circumstances of continuous updating, could conceivably be required at least once in a period of several years, for example, ten.

In Section 4.7 of this report, the question of monopoly was discussed, and it was noted that the existence of an economic monopoly depends on the availability of substitutable works. In works produced for the general consumer, there may be high substitutability among individual works.

However, an important distinction must be noted between the respective market behaviors of the general consumer and the researcher-consumer of copyrighted works. The general consumer typically selects competitively for purchase or use one (or a few) of a class of relatively substitutable works while rejecting all others. The researcher in any professional field desires to be comprehensive in the full-text as well as in the data base literature of his field. Thus, the researcher (or his library surrogate) cannot reject totally anything pertinent, and his marketplace behavior with respect to competitive producers cannot be analogous to that of the general consumer. The question may be asked whether there is a greater potential for a market monopoly in this situation. If so, a further question is what form of intervention should be pursued by consumers collectively or by the Government.

With respect to scientific journal articles, the situation is ameliorated through the formation of professional societies which serve as the collective good to circumvent the implicit market failure. Furthermore, the social ethic of research is that all those involved, even in different organizations, benefit from the unimpeded flow of information.

This ethic may tend to lower the prices of journals produced by scientific societies rather than raise them. Therefore, any independent entrepreneur of a proprietary journal may find that the subscription prices that can be charged are limited by competition from journals of non-profit societies. The fact that the primary producer community and the final user community of scientific journal articles are essentially the same population may be a key factor in preventing monopoly pricing.

With respect to bibliographic and other specialized data bases, a different situation exists. In contrast to the situation with scientific journal articles, there is very little in the publication of continual updates of a data base that can be translated by a professional researcher into either financial or symbolic remuneration unless the work is a full-time business. Thus the producer and consumer communities need not be the same population and this particular negative feedback restraint on the subscription price of journals need not hold for data bases. It is not surprising, therefore, to find that (excluding Government production) a significant fraction of data bases used for research purposes are produced and distributed for profit as proprietary products.

The development of computer-based information retrieval systems based on machine-readable data bases has added an additional complicating factor. First, the development of a computer-readable data base (with continual updating to ensure an indefinite life) requires a certain investment in data collection, organization, manipulation, and digital conversion. Clearly, those organizations that already have computer-aided publishing systems to help produce hard-copy informational products may be able to generate computer-readable data bases as relatively inexpensive by-products. Secondly, a parameter of usefulness of a data base is the comprehensiveness of its coverage of a specific field; and conceivably, only the largest organization with well-established lines of data supply and customer acceptance may be able to satisfy this need.

Thus, the possibility exists that in some field of research, by virtue of economy of scale, an established system of suppliers and customers and already amortized costs of entry in the market, a single organization may achieve a virtual market monopoly over a class of nonsubstitutable computer-readable data bases. An anti-trust suit concerning this very problem is now under litigation in the field of computer-based legal information retrieval.

Additional sources of monopoly control and a potential solution are described in Appendix A, Section A.4.4.5 of this report. The following is excerpted from that Section:

"In some instances, publishers of data bases have leased them
exclusively for use in one computerized information service
system... Exclusive licensing of data bases may tend to
foster the monopolization of data base search services by one
or two giant systems. Whether the prevention of such a monopoly

