The Protein Data Bank (PDB) was established at Brookhaven National Laboratory in 1971 as an archive for biological macromolecular crystal structures. Originally, it contained only seven structures; today it holds over 129,000 structures of large biological molecules, including proteins and nucleic acids. In 1998, management of the PDB became the responsibility of the Research Collaboratory for Structural Bioinformatics (RCSB), a consortium composed of Rutgers, the State University of New Jersey; the University of California at San Diego; and the National Institute of Standards and Technology (NIST). In 2000, a NIST researcher co-authored the article "The Protein Data Bank" in Nucleic Acids Research (volume 28, issue 1, pages 235-242), which has since become NIST's most highly cited journal article. In collaboration with that NIST co-author, library staff in the Information Services Office (ISO) at NIST analyzed the article, studying the authors, institutions, journals, research areas, and countries that have cited it. ISO used library resources and tools to analyze the paper and visualize its impact. With over 15,000 citations, "The Protein Data Bank" is considered a classic in its field. It has been cited across 151 different research areas in over 2,100 journals by authors from over 5,000 institutions in 102 countries. Too often, the impact of an article is measured simply by its number of citations, when in fact a much richer story can be told through a close look at the citation data. This paper describes the methodologies used to analyze "The Protein Data Bank" and illustrates its impact through a range of data visualizations.
impact assessment, publications, data visualization