
The associations that I am representing here today consider the problems of access to government information so pressing that in January we formed an Inter-Association Working Group on Government Information Policy. This group has begun identifying key issues that need to be addressed by legislation. Our draft working document, Goals for Revising U.S.C. Title 44 to Enhance Public Access to Federal Government Information, is attached. Also attached for your information is a document that identifies the essential components for enhanced public access to government information, and the responsibilities of all partners in the life cycle of government information (Attachments 5 and 6).

In closing, we believe that Congress, the Administration, and the courts should use electronic technologies to enhance the public's access to government information, not to diminish it. The channels of public access to government information must remain open, efficient, and technologically relevant. Libraries and your constituents are doing their part by investing in technologies to assist them in accessing electronic information. The federal government must fulfill its part of the partnership by investing in systems and services that provide the public with government information, and by assuring that valuable information created today will be preserved for future generations.

1) Organizational biographies.

2) Scientific American article regarding archiving electronic files, "Ensuring the Longevity of Digital Documents," (January 1995).

3) AALL Resolution on the U.S. Congressional Serial Set and the Bound Congressional Record.

4) Joint library association letter to the Public Printer on draft Study to Identify Measures Necessary for a Successful Transition to a More Electronic Federal Depository Library Program (April 26, 1996).

5) Goals for Revising U.S.C. Title 44 to Enhance Public Access to Federal Government Information, Draft Working Document prepared by the Inter-Association Working Group on Government Information Policy (April 1997).

6) Enhanced Library Access and Dissemination of Federal Government Information: A Framework for Future Discussion, Working Document of the American Association of Law Libraries, the American Library Association, the Association of Research Libraries, and the Special Libraries Association (June 26, 1995).

ORGANIZATIONAL BIOGRAPHIES

Attachment 1

AMERICAN ASSOCIATION OF LAW LIBRARIES (AALL)

The American Association of Law Libraries is a nonprofit educational organization with over 5,000 members nationwide. Our members respond to the legal and governmental information needs of legislators, judges, and other public officials at all levels of government, corporations and small businesses, law professors and students, attorneys, and members of the general public.

AMERICAN LIBRARY ASSOCIATION (ALA)

The American Library Association is a nonprofit educational organization of 58,000 librarians, library educators, information specialists, library trustees, and friends of libraries representing public, school, academic, state, and specialized libraries dedicated to the improvement of library and information services. A new five-year initiative, ALA Goal 2000, aims to have ALA and librarianship be as closely associated with the public's right to a free and open information society--intellectual participation--as it is with the idea of intellectual freedom.

ASSOCIATION OF RESEARCH LIBRARIES (ARL)

The Association of Research Libraries is a not-for-profit organization representing 120 research libraries in the United States and Canada. Its mission is to identify and influence forces affecting the future of research libraries in the process of scholarly communication. ARL programs and services promote equitable access to, and effective use of, recorded knowledge in support of teaching, research, scholarship, and community service.

CHIEF OFFICERS OF STATE LIBRARY AGENCIES (COSLA)

The Chief Officers of State Library Agencies is an independent organization of the chief officers of state and territorial agencies designated as the state library administrative agency and responsible for statewide library development. Its purpose is to identify and address issues of common concern and national interest; to further state library agency relationships with federal government and national organizations; and to initiate cooperative action for the improvement of library services to the people of the United States.

SPECIAL LIBRARIES ASSOCIATION (SLA)

The Special Libraries Association is an international professional association serving more than 14,000 members of the information profession, including special librarians, information managers, brokers, and consultants. The Association has 56 regional/state chapters in the U.S., Canada, Europe, and the Arabian Gulf States and 28 divisions representing subject interests or specializations. Special libraries/information centers can be found in organizations with specialized or focused information needs, such as corporations, law firms, news organizations, government agencies, associations, colleges, museums, and hospitals.

URBAN LIBRARIES COUNCIL

The Urban Libraries Council (ULC) is an association of large public libraries and corporations which serve them, organized to solve common problems, better understand new opportunities, and conduct applied research which improves professional practice. Full membership in ULC is open to public libraries in metropolitan areas, and to the corporations which serve them. Current library members (115+) provide public library services to over half the population of the United States.

Attachment 2

Reprinted with permission. Copyright 1995 by Scientific American, Inc. All rights reserved.

SCIENTIFIC AMERICAN

Ensuring the Longevity of Digital Documents

The digital medium is replacing paper in a dramatic record-keeping revolution. But such documents may be lost unless we act now.

by Jeff Rothenberg

The year is 2045, and my grandchildren (as yet unborn) are exploring the attic of my house (as yet unbought). They find a letter dated 1995 and a CD-ROM. The letter says the disk contains a document that provides the key to obtaining my fortune (as yet unearned). My grandchildren are understandably excited, but they have never before seen a CD, except in old movies. Even if they can find a suitable disk drive, how will they run the software necessary to interpret what is on the disk? How can they read my obsolete digital document?

This imaginary scenario reveals some fundamental problems with digital documents. Without the explanatory letter, my grandchildren would have no reason to think the disk in my attic was worth deciphering. The letter possesses the enviable quality of being readable with no machinery, tools or special knowledge beyond that of English. Because digital information can be copied and recopied perfectly, it is often extolled for its supposed longevity. The truth, however, is that because of changing hardware and software, only the letter will be immediately intelligible 50 years from now.

Information technology is revolutionizing our concept of record keeping in an upheaval as great as the introduction of printing, if not of writing itself. The current generation of digital records has unique historical significance. Yet these documents are far more fragile than paper, placing the chronicle of our entire period in jeopardy.

My concern is not unjustified. There have already been several potential disasters. A 1990 House of Representatives report describes the narrow escape of the 1960 U.S. Census data. The tabulations were originally stored on tapes that became obsolete faster than expected as revised recording formats supplanted existing ones (although most of the information was successfully transferred to newer media). The report notes other close calls as well, involving tapes of the Department of Health and Human Services; files from the National Commission on Marijuana and Drug Abuse, the Public Land Law Review Commission and other agencies; the Combat Area Casualty file containing P.O.W. and M.I.A. records for the Vietnam War; and herbicide information needed to analyze the impact of Agent Orange. Scientific data are in similar jeopardy, as irreplaceable records of numerous experiments conducted by the National Aeronautics and Space Administration and other organizations age into oblivion.

So far the undisputed losses are few. But the significance of many digital documents (those we consider too unimportant to archive) may become apparent only long after they become unreadable. Unfortunately, many of the traditional methods developed for archiving printed matter are not applicable to electronic files. The content and historical value of thousands of records, databases and personal documents may be irretrievably lost to future generations if we do not take steps to preserve them now.

From Here to Eternity

Although digital information is theoretically invulnerable to the ravages of time, the physical media on which it is stored are far from eternal. If the optical CD in my attic were a magnetic disk, attempting to read it would probably be futile. Stray magnetic fields, oxidation and material decay can easily erase such disks. The contents of most digital media evaporate long before words written on high-quality paper. They often become unusably obsolete even sooner, as media are superseded by new, incompatible formats. How many readers remember eight-inch floppy disks? It is only slightly facetious to say that digital information lasts forever, or five years, whichever comes first.

Yet neither the physical fragility of digital media nor their lemminglike tendency toward obsolescence constitutes the worst of my grandchildren's problems. My progeny must not only extract the content of the disk but must also interpret it correctly. To understand their predicament, we need to examine the nature of digital storage. Digital information can be saved on any medium that is able to represent the binary digits ("bits") 0 and 1. We will call an intended, meaningful sequence of bits, with no intervening spaces, punctuation or formatting, a bit stream.

Retrieving a bit stream requires a hardware device, such as a disk drive, and special circuitry for reading the physical representation of the bits from the medium. Accessing the device from a given computer also requires a "driver" program. After the bit stream is retrieved, it must still be interpreted. This task is not straightforward, because a given bit stream can represent almost anything, from a sequence of integers


EXPECTED LIFETIMES of common digital storage media are estimated conservatively to guarantee that none of the data are lost. (Analog tapes, such as those used for audio recordings, remain playable for many years because they record more robust signals that degrade more gradually.) The estimated time to obsolescence for each medium refers to a particular recording format.

components, we must know the length of a byte.

One way to convey the length is to encode a "key" at the beginning of the bit stream. But this key must itself be represented by a byte of some length. A reader therefore needs another key to understand the first one. Computer scientists call the solution to such a recursive problem a "bootstrap" (from the fanciful image of pulling oneself up by the bootstraps). In this case, a bootstrap must provide some context, which humans can read, that explains how to interpret the digital storage medium. For my grandchildren, the letter accompanying the disk serves this role.

After a bit stream is correctly parsed, we face another recursive problem. A byte can represent a number or an alphabetic character according to a code. To interpret such bytes, therefore, we need to know their coding scheme. But if we try to identify this scheme by inserting a code identifier in the bit stream itself, we will need another code identifier to interpret the first one. Again, human-readable context must serve as a bootstrap.

Even more problematic, bit streams may also contain complex cross-referencing information. The stream is often stored as a collection, or file, of bits that contains logically related but physically separate elements. These elements are linked to one another by internal references, which consist of pointers to other elements or of patterns to be matched. Printed documents exhibit similar schemes, in which page numbers serve as pointers.

Interpreting a Bit Stream

Suppose my grandchildren manage to read the bit stream from the CD-ROM. Only then will they face their real challenge: interpreting the information embedded in the bit stream. Most files contain information that is meaningful solely to the software that created them. Word-processing files embed format instructions describing typography, layout and structure (titles, chapters and so on). Spreadsheet files embed formulas relating their cells. So-called hypermedia files contain information identifying and linking text, graphics, sound and temporal data.

For convenience, we call such embedded information, and all other aspects of a bit stream's representation (including byte length, character code and structure), the encoding of a document file. These files are essentially programs: instructions and data that can be interpreted only by appropriate software. A file is not a document in its own right; it merely describes a document that comes into existence when the file is interpreted by the program that produced it. Without this program (or equivalent software), the document is a cryptic hostage of its own encoding.

Trial and error might decode the intended text if the document is a simple sequence of characters. But if it is complex, such a brute-force approach is unlikely to succeed. The meaning of a file is not inherent in the bits themselves, any more than the meaning of this sentence is inherent in its words. To understand any document, we must know what its content signifies in the language of its intended reader. Unfortunately, the intended reader of a document file is a program. Documents such as multimedia presentations are impossible to read without appropriate software: unlike printed words, they cannot just be held up to the light.
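As an illustration only (it is not part of the reprinted article), the following Python sketch shows how the same four bytes yield different "documents" depending on what the reader assumes about them. The byte values and function names are hypothetical, chosen for the example.

```python
import struct

# Four arbitrary bytes standing in for a retrieved bit stream (hypothetical data).
bit_stream = bytes([0x48, 0x69, 0x21, 0x00])


def as_ascii_text(data: bytes) -> str:
    """Treat each byte as one ASCII character, ignoring trailing zero bytes."""
    return data.rstrip(b"\x00").decode("ascii")


def as_big_endian_int(data: bytes) -> int:
    """Treat the four bytes as one unsigned 32-bit integer, high byte first."""
    return struct.unpack(">I", data)[0]


def as_little_endian_int(data: bytes) -> int:
    """Treat the same four bytes as an unsigned 32-bit integer, low byte first."""
    return struct.unpack("<I", data)[0]


if __name__ == "__main__":
    # Nothing in the bytes themselves says which reading was intended.
    print(repr(as_ascii_text(bit_stream)))
    print(hex(as_big_endian_int(bit_stream)))
    print(hex(as_little_endian_int(bit_stream)))
```

All three readings succeed without error, which is the difficulty: a reader with no outside context cannot tell which interpretation, if any, the author intended.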

Is it necessary to run the specific program that created a document? In some cases, similar software may at least partially be able to interpret the file. Still, it is naive to think that the encoding of any document, however natural it seems to us, will remain readable by future software for very long. Information technology continually creates new schemes, which often abandon their predecessors instead of subsuming them.

A good example of this phenomenon occurs in word processing. Most such programs allow writers to save their work as simple text, using the current seven-bit American Standard Code for Information Interchange (or ASCII). Such text would be relatively easy to decode in the future if seven-bit ASCII remains the text standard of choice. Yet ASCII is by no means the only popular text standard, and there are proposals to extend it to a 16-bit code (to encompass non-English alphabets). Future readers may therefore not be able to guess the correct text standard. To complicate matters, authors rarely save their work as pure text. As Avra Michelson, then at the National Archives, and I pointed out in 1992, authors often format digital documents quite early in the writing process and add figures and footnotes to provide more readable and complete drafts.
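The risk of guessing the wrong text standard can be sketched in a few lines of Python. This is an illustration added here, not the article's own material. It assumes Python's "utf-16-be" codec as a stand-in for a 16-bit code; the article does not name a specific 16-bit standard, and the sample word and the helper name `read_as` are hypothetical.

```python
# A short sample text (hypothetical) saved under two different text standards.
text = "Census"

ascii_bytes = text.encode("ascii")     # one byte per character, all values below 128
wide_bytes = text.encode("utf-16-be")  # two bytes per character (assumed 16-bit code)


def read_as(data: bytes, standard: str) -> str:
    """Decode the bytes under the text standard the reader has guessed."""
    return data.decode(standard)


if __name__ == "__main__":
    # A correct guess recovers the text.
    print(repr(read_as(ascii_bytes, "ascii")))
    print(repr(read_as(wide_bytes, "utf-16-be")))
    # A wrong guess raises no error here; it silently yields different characters.
    print(repr(read_as(ascii_bytes, "utf-16-be")))
    print(repr(read_as(wide_bytes, "ascii")))
```

In this case neither wrong guess fails outright: the decoder returns a plausible-looking string of the wrong characters, so the reader gets no signal that the standard was misidentified.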

If "reading" a document means simply extracting its content, without its original form, then we may not need to run the original software. But content can be lost in subtle ways. Translating word-processing formats, for instance, often displaces or eliminates headings, captions or footnotes. Is this merely a loss of structure, or does it impinge on content? If we transform a spreadsheet into a table, deleting the formulas that
