Facilitating the Adoption of the FAIR Digital Object Framework in Material Science

Summary

With the increasing use of data-driven methodologies, concerns around data discovery, data access, and data interoperability have come to the forefront. Communities have convened to work towards the convergence of three complementary visions: (1) FAIR Data Principles, (2) Linked Data and Semantic Web, and (3) Digital Object Architecture. This convergence has established the FAIR Digital Object Framework. This project seeks to enable the materials community to leverage these developments.

Description

The FAIR Data Principles define a set of aspirational guidelines for those who want to improve the reusability of their data. However, the FAIR Data Principles do not define an explicit set of requirements to achieve FAIRness, and those requirements may vary within and across domains. Various projects have emerged to support community consensus approaches and evaluate maturity. This project seeks to survey this landscape and support the materials science and engineering community. It is centered around evaluating and leveraging the FAIR Digital Object Framework, which unites three complementary visions:

  1. FAIR Data Principles
  2. Linked Data and Semantic Web
  3. Digital Object Architecture

Linked Data and Semantic Web

Linked Data and Semantic Web support the FAIR Data Principles by providing a method to represent data and metadata in a formal, accessible, shared, and broadly applicable language for knowledge representation. Structured data (i.e., machine-actionable data) on the web continues to grow rapidly. One widely adopted format is schema.org, which can be used to provide machine-actionable metadata in varied arenas, such as a Restaurant or a Coronavirus Testing Facility. Furthermore, schema.org can be used to provide machine-actionable metadata for a Dataset. The schema.org Dataset schema has enjoyed rapid adoption by the scientific and scholarly community. Projects that have adopted existing schema.org specifications include, but are not limited to (alphabetical):

It is also possible to configure existing data publication tools to leverage schema.org. Tools that appear to support schema.org include, but are not limited to (alphabetical):

Furthermore, projects have also emerged to leverage and extend schema.org within a subdomain, such as BioSchemas in life sciences and CodeMeta in scientific software. The widespread adoption of schema.org informs the following mission goal:

  • Goal #1: Leverage and extend schema.org within materials science and engineering and demonstrate its use in supporting important concepts such as process-structure-property relationships and multi-scale material phenomena
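
As an illustration of machine-actionable metadata for a Dataset, the following minimal sketch builds a schema.org Dataset description as JSON-LD. The function name, the example values, and the example.org identifier are hypothetical and are not taken from this project; only the schema.org terms (Dataset, name, description, identifier, keywords) come from the schema.org vocabulary.

```python
import json


def build_dataset_jsonld(name, description, identifier, keywords):
    """Return a schema.org Dataset description as a JSON-LD dictionary."""
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "identifier": identifier,
        "keywords": list(keywords),
    }


# Hypothetical example values, for illustration only.
record = build_dataset_jsonld(
    name="Example tensile test results",
    description="Illustrative stress-strain measurements for a hypothetical alloy.",
    identifier="https://example.org/datasets/0001",
    keywords=["materials science", "tensile test"],
)

# Serialize to the JSON-LD text that a landing page could embed.
print(json.dumps(record, indent=2))
```

A materials-specific extension would presumably add further terms (for example, for processing history or measured properties), but which terms to add is exactly what Goal #1 leaves to community consensus.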

Digital Object Architecture

A digital object is composed of Data, Metadata, Protocols & Operations, and Persistent Identifiers.

The Digital Object Architecture defines the concept of a Digital Object, specifies two associated protocols, and defines three core components. The goal of Digital Object Architecture is to extend the current internet architecture to support information management more broadly. The Digital Object Architecture supports FAIR Data Principles by providing identifier creation and resolution, such that data and metadata are assigned globally unique and persistent identifiers. The Digital Object Architecture also supports FAIR Data Principles by providing a registry and repository system such that data and metadata are retrievable by their identifier using a standardized communication protocol. The Digital Object Architecture can be leveraged to support a number of objectives identified by organizations such as the Research Data Alliance including:

  • Persistent identifiers and metadata for Data
  • Persistent identifiers and metadata for Physical Samples and Processing History
  • Persistent identifiers and metadata for Instruments, Instrument Components, and Service/Use History
  • Data Type Registry
  • FAIR Vocabulary and Semantic Services

The applicability of the Digital Object Architecture to the FAIR Data Principles informs the following mission goal:

  • Goal #2: Leverage the Digital Object Architecture within materials science and engineering and demonstrate its use in supporting the complete research data lifecycle
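
The composition of a digital object and the role of identifier resolution can be sketched in a few lines. This is a toy, in-memory model under stated assumptions: the class names, the "example.prefix" identifier prefix, and the "size" operation are all hypothetical, and nothing here is persistent or reflects the actual Digital Object Architecture protocols.

```python
import uuid
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class DigitalObject:
    """Toy digital object: identifier, metadata, data, and operations."""
    identifier: str
    metadata: Dict[str, Any]
    data: bytes
    operations: Dict[str, Callable[["DigitalObject"], Any]] = field(default_factory=dict)


class InMemoryRegistry:
    """Hypothetical stand-in for a registry and repository system."""

    def __init__(self, prefix: str = "example.prefix"):
        self._prefix = prefix
        self._objects: Dict[str, DigitalObject] = {}

    def create(self, metadata: Dict[str, Any], data: bytes) -> DigitalObject:
        # Assign a unique identifier of the form "<prefix>/<suffix>".
        identifier = f"{self._prefix}/{uuid.uuid4().hex}"
        obj = DigitalObject(
            identifier=identifier,
            metadata=dict(metadata),
            data=data,
            operations={"size": lambda o: len(o.data)},
        )
        self._objects[identifier] = obj
        return obj

    def resolve(self, identifier: str) -> DigitalObject:
        # Retrieve an object by its identifier; raises KeyError if unknown.
        return self._objects[identifier]

    def invoke(self, identifier: str, operation: str) -> Any:
        # Look up the object, then run one of its named operations on it.
        obj = self.resolve(identifier)
        return obj.operations[operation](obj)


registry = InMemoryRegistry()
sample = registry.create(
    {"type": "PhysicalSample", "name": "Hypothetical sample A"},
    b"raw measurement bytes",
)
resolved = registry.resolve(sample.identifier)
```

The point of the sketch is the separation of concerns: callers hold only the identifier, and both the metadata and the data are retrieved by resolving it.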

Implementation and Coordination Plan

This project will execute an implementation and coordination plan to specifically address recommendations made within a report sponsored by the National Science Foundation (NSF) and written by The Minerals, Metals & Materials Society (TMS) entitled: Building a Materials Data Infrastructure. This report identified 36 challenges and made 8 priority recommendations. This implementation and coordination plan will address the following recommendations in two phases:

  • Recommendation 1: Strengthen the MDI core in repository, registry, and tool development
  • Recommendation 4: Develop demonstration projects and cross-disciplinary community efforts that enhance and accelerate adoption of the MDI
  • Recommendation 7: Create MDI consortia and working groups

This project has chosen the following tactical approach for a phased implementation of action steps:

  • Phase 1 (Timeframe: 2018+): Develop and deploy robust repositories for use within demonstration projects and cross-disciplinary community efforts. This action step is informed by Recommendations #1 and #4.
  • Phase 2 (Timeframe: 2021+): Convene and coordinate working groups to support community consensus on the use of the FAIR Digital Object Framework within Materials Science and Engineering. This action step is informed by Recommendation #7.

Phase 1 Actions

This phase focuses on working with specific communities to support practical adoption of the FAIR Digital Object Framework. We welcome new collaborations and partnerships centered around materials science and engineering. We currently have active research and partnerships within the following applications:

  • Circular Materials Economy
  • Microstructure Repository
  • High Throughput Experimentation and Collaboratory

Practical adoption requires an implementation with specific software technology. The FAIR Data Principles provide some high-level insight by making a distinction between “data that is machine-actionable as a result of specific investment in software supporting that data-type” and “data that is machine-actionable exclusively through the utilization of general-purpose, open technologies”. Therefore, we will leverage and/or develop software that can support the FAIR Digital Object Framework across domains. The community already has many solutions in place, so it is also important to support a modular approach (e.g., microservices) that can be integrated within existing frameworks. Finally, it is important to consider existing trends within the community. For example, JavaScript Object Notation (JSON) has gained widespread adoption within the materials community as a metadata interchange format. JSON is used as a metadata interchange format in many projects, including but not limited to (alphabetical):

Furthermore, the consensus-based format proposed by OPTIMADE has stimulated further innovation via the development of a federated data discovery portal. Although JSON enjoys widespread adoption as a metadata interchange format, the data and metadata formats built on it remain highly diverse. As a result, a number of projects have arisen to register semantic assets used within materials science and engineering. The widespread use of JSON, together with the emphasis that the FAIR Data Principles place on general-purpose, open technologies, informs the following Phase 1 Actions:

  • Action #1: Develop and continuously improve JSON Schema definitions, which: (1) support Linked Data (JSON-LD) and (2) define a graph-based metadata model for materials data, physical samples, instruments, etc.
  • Action #2: Leverage or develop general-purpose, open software, which: (1) supports JSON Validation via JSON Schema, and (2) implements the vision of the Digital Object Architecture.
  • Action #3: Support the community development of robust repositories for use in demonstration projects by leveraging the outcomes of Action #1 and Action #2.
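
To make Actions #1 and #2 more concrete, the sketch below pairs a small JSON Schema with JSON-LD style records that link to each other by identifier. Everything in it is an assumption for illustration: the schema, the "measuredBy" property, the "ex:" identifiers, and the helper functions are hypothetical, and the checker handles only the "type", "required", and "properties" keywords. It is not a full JSON Schema validator; a real deployment would use a complete, general-purpose implementation.

```python
# Hypothetical schema for a physical sample record (illustration only).
SAMPLE_SCHEMA = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "object",
    "required": ["@id", "@type", "name"],
    "properties": {
        "@id": {"type": "string"},
        "@type": {"type": "string"},
        "name": {"type": "string"},
        "measuredBy": {"type": "string"},  # hypothetical link to an instrument
    },
}

# Only the JSON types this sketch needs.
_JSON_TYPES = {"string": str, "object": dict, "array": list}


def check_against_schema(instance, schema):
    """Return error messages; supports only type, required, and properties."""
    expected = schema.get("type")
    if expected in _JSON_TYPES and not isinstance(instance, _JSON_TYPES[expected]):
        return [f"expected {expected}, got {type(instance).__name__}"]
    errors = []
    if isinstance(instance, dict):
        for key in schema.get("required", []):
            if key not in instance:
                errors.append(f"missing required property: {key}")
        for key, subschema in schema.get("properties", {}).items():
            if key in instance:
                errors.extend(
                    f"{key}: {msg}"
                    for msg in check_against_schema(instance[key], subschema)
                )
    return errors


def link_targets(documents, link_property="measuredBy"):
    """Map each node's @id to the @id it links to, keeping only known targets."""
    known = {doc["@id"] for doc in documents}
    return {
        doc["@id"]: doc[link_property]
        for doc in documents
        if doc.get(link_property) in known
    }


# Two hypothetical nodes forming a tiny graph: sample -> instrument.
nodes = [
    {"@id": "ex:instrument-1", "@type": "Instrument", "name": "Hypothetical SEM"},
    {
        "@id": "ex:sample-1",
        "@type": "PhysicalSample",
        "name": "Hypothetical sample",
        "measuredBy": "ex:instrument-1",
    },
]

problems = [check_against_schema(node, SAMPLE_SCHEMA) for node in nodes]
edges = link_targets(nodes)
```

Validating each record separately and linking records by identifier keeps the individual documents small while still allowing a graph to be assembled across data, physical samples, and instruments.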

Phase 2 Actions

This phase focuses on convening and coordinating working groups to support community consensus on the use of the FAIR Digital Object Framework within Materials Science and Engineering. Many action steps will be informed by lessons learned from Phase 1 actions. However, the following actions are currently planned:

  • Action #4: Deploy a Material Schemas webpage modeled after BioSchemas and schema.org, which invites community contribution and participation
  • Action #5: Develop and publish a vocabulary definition file (e.g., schema.org JSON-LD), which invites community contribution and participation
  • Action #6: Launch a series of workshops, governance meetings, hackathons, and other events to support community consensus on these approaches and technologies

Major Accomplishments

  • Creation of a GitHub Repository supporting the community development of JSON Schemas (Action 1)
  • Configuration of a graph-based metadata model for materials data within Cordra, which is a partial implementation of the Digital Object Architecture (Action 2)
  • Implementation of a Python client library for interacting with the Cordra REST API (Action 2)
  • Development underway to support an Open Source Recipe and Process-Property Repository for Circular Materials Economy with partner Materiom (Action 3)
  • Development underway to support a Microstructure and Process-Structure-Property repository in response to a NIST Workshop with partner MINED Data Science and Informatics Group at Georgia Institute of Technology (Action 3)

Created August 11, 2020, Updated September 4, 2020