NIST is building an infrastructure for advanced data integration and management that promotes scientific innovation in the biological, chemical, and materials sciences.
Big data, data sharing, and high-speed computing are revolutionizing scientific discovery and innovation. When researchers need fundamental, well-characterized data in biology, chemistry, and materials science, they turn to NIST’s Material Measurement Lab as a resource. But as research data sets become both larger and more complex, analysis and knowledge extraction tools must be developed and adapted to deal with scale, heterogeneity, and traceability. Researchers need fundamental advances in research tools, developed in collaboration with computer scientists, statisticians, and big data analytics specialists.
To address this need, we established the Office of Data and Informatics in 2014 to help maximize the value of the incredible amounts of data produced by our increasingly sophisticated analyses. The Office of Data and Informatics is dedicated to reinforcing NIST’s role as a leader in scientific reference data and data management.
NIST has produced and published Standard Reference Data for nearly 50 years. The NIST/EPA/NIH Mass Spectral Library Database, Reference Fluid Thermodynamic and Transport Properties Database, and Inorganic Crystal Structure Database are three examples of highly visible and active standard reference data products. As part of its mission, the Office of Data and Informatics is modernizing NIST’s reference data program to improve discoverability and ease of use, while deploying new technologies such as application programming interfaces. The Office of Data and Informatics strives for discoverability and interoperability with key partners, such as our centers of excellence, to advance data-enabled research and development. It is also establishing data sharing environments and tools that demonstrate the benefits of data-based collaboration.
70 terabytes of storage for data generated by the Materials Genome Initiative
9,000+ standard reference data customer transactions