Materials science is characterized by the diversity of the data types and formats it uses. Much of this data is locked in the technical literature or scattered across hard drives, research databases, and websites. The promise of the Materials Genome vision will be realized when materials data can be easily accessed and manipulated to form large datasets capable of supporting computational materials discovery and design.
NIST has engaged in a variety of internal and external collaborations, including collaborations among the Software and Systems Division (ITL), several groups in the Material Measurement Laboratory (MML), the Center for Hierarchical Materials Design (CHiMaD), Northwestern University, the University of Chicago, Argonne National Laboratory, ASM International, Texas A&M University, the Research Data Alliance (RDA), the Corporation for National Research Initiatives (CNRI), and Kent State University.
A two-fold approach is being used: 1) we are developing tools and methods to handle the curation, storage, and transformation of materials data from a variety of sources, including scientific literature and experimentation; 2) we are developing an informatics infrastructure to facilitate the search, discovery, access, and use of materials data made available by the community.
We are investigating the use of machine learning and natural language processing techniques to support semi-automated materials data curation, and the use of XML- and NoSQL-based technologies as the basis of materials data repositories. We are also investigating the use of digital-object-based registries to make materials resources available throughout the community.
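As a rough illustration of how XML-encoded materials data might be transformed into a document suitable for a NoSQL-style repository, the following minimal Python sketch parses a small XML record and converts it to a JSON document. The element names, attribute names, identifier, and property values are all hypothetical and invented for this example; they do not represent an actual NIST schema, dataset, or tool.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record: the schema and the values below are made up
# purely for illustration and are not taken from any real repository.
SAMPLE_XML = """
<material id="sample-001">
  <name>Example Alloy</name>
  <property name="density" unit="g/cm^3">7.85</property>
  <property name="youngs_modulus" unit="GPa">200</property>
</material>
"""


def xml_to_document(xml_text):
    """Convert one XML material record into a JSON-serializable dict."""
    root = ET.fromstring(xml_text.strip())
    doc = {
        "_id": root.get("id"),          # identifier attribute on the root element
        "name": root.findtext("name"),  # text of the <name> child
        "properties": {},
    }
    # Each <property> element becomes one entry keyed by its name attribute.
    for prop in root.findall("property"):
        doc["properties"][prop.get("name")] = {
            "value": float(prop.text),
            "unit": prop.get("unit"),
        }
    return doc


document = xml_to_document(SAMPLE_XML)
as_json = json.dumps(document, indent=2)  # form a document store could ingest
```

Keeping the unit alongside each value is one possible design choice; it lets records from different sources be merged without assuming a common unit system.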
The ability to use advanced informatics tools and techniques to discover, create, disseminate, and use materials property datasets in computational workflows will help ensure that the Materials Genome fulfills its promise of accelerated materials discovery and design.
For more information, visit the NIST Materials Genome Initiative page.