INTRODUCTION

Communication among researchers is crucial in any science. It enables colleagues in the field to replicate and criticize findings, as well as to incorporate them into new work. At least since the days of Robert Boyle (1627-1691), a founder of the Royal Society and in his lifetime its most notable and influential fellow, chemical scientists have been at the forefront of efforts to facilitate and expedite such communication. As science has become ever larger and more complex and the technologies of information storage and retrieval more advanced, these scientists have had to develop and refine specialized information systems to meet their research needs.

 

SOCIETIES, CONFERENCES, AND JOURNALS

Gathering together for talks and demonstrations is one of the oldest means of communication, and it remains important to this day. The Royal Society and other academies formalized this tradition in the seventeenth century by holding such meetings in the same place at regular intervals.

Many scientists regularly communicated their results to one another by private letters or through mutually trusted scientific correspondents or “intelligencers” like the Parisian cleric Marin Mersenne (1588-1648). The Royal Society’s secretary, Henry Oldenburg (ca. 1618-77), served a similar function, gathering scientific intelligence from all over Britain, its colonies, and the Continent. Excerpts from these letters, along with accounts of demonstrations before the Royal Society, were printed regularly as the Philosophical Transactions, thus giving birth to the scientific paper.

This chronology begins in the eighteenth century, when scientists became sufficiently specialized and numerous to support journals and found societies limited to a single field, such as chemistry. Even more specialized chemical journals and societies continue to be founded, but space limitations require us to include only those from countries most active in chemistry and only the first chemical organization established in each country.

In the latter part of the nineteenth century, railroads and transoceanic steamships made feasible the international conferences that have proved so critical to an international system of chemical information. Another organizational development has been the founding in the last fifty years or so of societies to fill the needs of chemical information professionals themselves. Preeminent among such societies is the American Chemical Society’s (ACS) Division of Chemical Information, founded in 1948 as the Division of Chemical Literature.

 

NOMENCLATURE, SYMBOLS, AND STRUCTURAL DIAGRAMS

Chemical words, symbols, and structures, as published in journal articles and reference works, have remained the major focus of science information systems. But by the eighteenth century, the disorganized state of chemical nomenclature was recognized as a major obstacle to the field’s progress. Chemistry’s diverse roots in alchemy, pharmacy, and metallurgy had left it with a confusing and inconsistent collection of names and symbols for substances and with several conflicting ways of describing their composition. Chemists like Antoine Lavoisier made groundbreaking efforts to reform this situation. Investigations into atomic and molecular structure in the nineteenth and twentieth centuries presented chemical scientists with further problems of representation.

Lavoisier and his followers well recognized that language and symbol help shape our views of reality. Continuing changes in our understanding of chemical composition and structure have ensured that nomenclature and symbolic conventions remain an issue to the present day.

 

ABSTRACTS, REVIEWS, COMPILATIONS, AND INDEXES STORED AND RETRIEVED MANUALLY

This proliferation of discoveries, societies, and journals devoted to them posed both opportunities and challenges. By the nineteenth century, the amount of material published each year had grown too large for a single person to read. A variety of aids digested and presented the new knowledge in accessible forms, allowing chemical scientists to limit their reading to topics that they found useful and to obtain shorter versions of articles that might warrant a full reading later. The very first chemistry journal, Lorenz von Crell’s Chemisches Journal (1778), carried abstracts of foreign literature, and Thomas Thomson’s Annals of Philosophy (1814) included an annual review of chemical progress. The first chemistry-related journal just for abstracts was published as Pharmaceutisches Centralblatt in 1830 and soon began to cover all chemistry. By 1859, a weekly chemical periodical had appeared.

Also during the nineteenth century, compilations like Beilstein, Gmelin, and the Pharmacopoeia of the United States supplied chemists with easily accessible basic information and citations for nearly all known compounds. Beilstein published the first formula index in 1899 and the first index to permit substructure searching in 1918. All these aids to research required armies of workers writing notes on thousands of pieces of paper or index cards and coding and sorting them appropriately.

Scientists’ need for access to worldwide and interdisciplinary sources of knowledge has resulted in numerous cooperative efforts. In 1911 Wilhelm Ostwald funded a scheme to organize and make accessible all knowledge and actively promoted it to such intellectuals (and chemists) as Svante Arrhenius and Ernst Solvay. The “Bridge movement” and a special institute that Ostwald planned to document the field of chemistry did not progress very far. But two institutions with nearly contemporaneous origins, the Chemical Abstracts Service growing out of the ACS’s Chemical Abstracts (1907) and the International Union of Pure and Applied Chemistry (1919), were quite successful in the long run in standardizing and organizing the dissemination of chemical information worldwide. Ultimately they were aided by the growth of English as the standard textual language of international scientific publication and by new technologies that made possible automated means of storing, retrieving, and exchanging information.

 

ABSTRACTS, REVIEWS, COMPILATIONS, AND INDEXES STORED AND RETRIEVED USING MECHANICAL OR ELECTROMECHANICAL SORTERS

Mechanical and electromechanical sorting of punched and edge-notched cards constituted a revolution in information handling comparable to the introduction of card catalogs in the 1870s and 1880s. In the 1930s Watson Davis successfully used microfilm to distribute scientific literature, but efforts to combine it with mechanized retrieval were not successful. After World War II, however, new machines allowed sorters to construct both traditional and new reference materials with greater speed and accuracy. Jobs that were formerly prohibitively time-consuming now became practical.

Associated with the punched-card revolution in information technology were a number of machine-compatible techniques for dealing with chemical notation. Among these were linear means of indicating chemical structures, such as G. Malcolm Dyson’s (1946) and William J. Wiswesser’s (1949) notation systems, and ways of breaking up and tabulating structures that could be coded for use in a machine, such as Hoechst’s fragment code GREMAS (1957) and the work of Jacques-Emile DuBois (1954), Donald J. Gluck and Harry L. Morgan (1962), and others on tabular reproduction of structural formulas. The task of converting chemical names into molecular formulas or structures, previously reserved for chemical cognoscenti, was broken up into multiple small steps that a machine could perform.

The field rapidly progressed from chemical literature to chemical documentation to chemical information. Where once the practitioners were literature chemists or chemical librarians, now a new generation of information scientists with both chemical and machine expertise was required to deal with the large volume of chemical information. They began to develop special devices and indexing approaches to sort and search multiple access points to the chemical literature. New concepts like Calvin Mooers’s descriptors and superimposed coding (1948), Mortimer Taube’s Uniterms (1950) and coordinate indexing (1951), and Hans Peter Luhn and Herbert Ohlman’s KWIC system (1958) aided in this endeavor. In 1951, at the Sharp and Dohme library, Claire Schultz employed Mooers’s superimposed coding and the Remington Rand card sorter to perform chemistry searches, and at Johns Hopkins University’s Welch Medical Library, Eugene Garfield successfully used the IBM punched-card sorter to search the Current List of Medical Literature.

 

ABSTRACTS, REVIEWS, COMPILATIONS, AND INDEXES STORED AND RETRIEVED USING ELECTRONIC COMPUTERS

As science information experts began to turn to electronic computers with stored programs in the late 1950s, they drew upon a wealth of machine processing experience and exciting new ideas. For more than a decade, technologies overlapped. A wide variety of information retrieval systems were introduced and refined using electromechanical devices and later implemented on electronic computers as they became more readily available. The first computer-produced periodical, Chemical Titles (1960), which used punched cards and the KWIC indexing system, is an example of the changes beginning to affect chemical information science. Meanwhile, specialized devices for inputting chemical structures into computers were patented, and topological coding of structures and transcoding algorithms were developed.

Second-generation transistorized electronic computers caused a revolution in chemical information processing and retrieval. Their magnetic storage systems—first tape and later disk drives—allowed large-scale databases that could be shared with other sites or searched in batch (off-line) mode and the results provided to users. As chemical literature grew exponentially, chemical scientists began to talk seriously about a worldwide chemical information system. Database developers like Chemical Abstracts Service (CAS) began to explore cooperative ventures, within the United States and overseas, both to create information products and to provide access to them. Public agencies, like the U.S. National Science Foundation (NSF), began to support ventures such as CAS’s Chemical Registry System, both funding them and using them to meet their own scientific needs. An example of international cooperation was the transfer of CAS connectivity tables into the French DARC topological database.

Developments in telecommunications in the late 1960s and early 1970s aided the shift to online searching and retrieval from these databases. Commercial online database vendors, such as DIALOG and ORBIT (1965), eagerly sought access to these databases and marketed them to libraries, corporate information centers, and individual researchers. These sophisticated chemical information databases included not just bibliographic citations but also patents, chemical properties, structural details, crystallographic and spectroscopic data, and much more. Supplementing online searching were floppy disks, magnetic tapes, CD-ROMs, and specialized files distributed by File Transfer Protocol (FTP).

Expertise in searching became essential for the chemical information specialists, but they still had to contend with terminology and language problems, database coverage, and the changing nature of chemistry. Chemical information scientists saw the need to systematically readdress notational and structural conventions and their implications for machine processing. They also established a whole new way of utilizing databases in terms of chemical reactions. Subsequent developments in computer hardware and software made possible other innovations, including the use of computer graphics to create three-dimensional models of molecules. Research programs to address these areas were developed at academic institutions (such as Sheffield University’s Postgraduate School of Librarianship and the master’s degree program in chemical information at the University of Paris) and in the private sector (such as Molecular Design Ltd.).

 

INTERNET

While the revolutions in computer processing speeds and online searching significantly aided scientific communication, they did not address the scientists’ need to share tentative ideas and results while at work on a particular problem. Letters, meetings, informal technical reports, and telephone calls (the conventional methods of communicating about work in progress) were slow, expensive, or both. “Snail mail” was not good enough.

In the late 1960s the U.S. government began funding an effort to ease communication among scientists engaged in defense work. ARPANET (Advanced Research Projects Agency Network) was established in 1969 to demonstrate how communications between computers could promote cooperative research among scientists. This precursor to the Internet was very successful, but access was quite limited until 1983, when the NSF began funding it as the Internet. In 1986 the NSF established its own network, NSFNET, to connect supercomputing centers. Academic institutions soon followed, establishing their own networks for communicating with each other via electronic mail and file transfer systems.

Rapid improvements in the 1990s in software, computer processing speeds and storage capacity, and telecommunications links made the Internet a worldwide system. International cooperative efforts for the sharing of scientific data via the Internet, such as initiatives undertaken by the Committee on Data for Science and Technology (founded 1966) of the International Council of Scientific Unions, have greatly enhanced the value and importance of the Internet to scientific communication. Now scientists almost anywhere in the world can communicate rapidly and completely about their research and can search and retrieve scientific literature from large and complex international databases, all from their desktops.

Are the Internet and World Wide Web and the accessibility of sophisticated chemical information databases the ultimate fulfillment of the needs and dreams of chemical investigators and information experts? Probably not. The vast quantity of material, inadequate indexing and retrieval mechanisms, and the lack of content refereeing continue to challenge chemical information scientists. Nevertheless, we eagerly anticipate the future.