Summary

The NIST Information Access Division (IAD) initiated a Data Science Research Program (DSRP) that aims to advance the measurement science for big data and data analytics. The goals of the DSRP are to accelerate research on, and development of, data analytic methods that provide greater and more accurate access to, and understanding of, the information contained in multimodal heterogeneous data. The DSRP also aims to advance the measurement science of data analytics in a general setting, focusing on the use of data analytic algorithms across domains and applications.

The Multimodal Information Group (MIG) supports the DSRP by contributing to the test, evaluation, and benchmarking of emerging big data and data analytic approaches. This includes the development of necessary evaluation infrastructure such as software tools, datasets, metrics, evaluation paradigms, and analysis techniques. Part of this evaluation infrastructure is the Data Science Evaluation Series (DSE), which provides a framework to evaluate data analytic algorithms in a generalized way.

Description

Current MIG Data Science project activities

 

  • DARPA D3M. NIST is supporting the DARPA D3M (Data-Driven Discoveries of Models) program as part of the Test and Evaluation (Government) Team. The D3M program aims to enable automated systems, working with subject matter experts, to model and solve complex machine learning problems.
  • Data Science Evaluation (DSE) Series. The Data Science Evaluation Series (DSE) is a new evaluation series that aims to provide a cross-disciplinary framework for evaluating data analytic algorithms from all components of the data analytic pipeline. The DSE aims to evaluate algorithms in a generalized way, providing a forum in which domain-specific algorithms can be adopted and evaluated in a generalized setting.
  • Evaluation Management System (EMS). The Evaluation Management System (EMS) is a controlled private cloud providing general-purpose infrastructure for controlled evaluations. For the DSE, the EMS provides isolated environments in which to evaluate submissions, advanced benchmarking analytics, and a platform to run systems that use a variety of architectures and distributed programming frameworks (such as Hadoop and Spark). The EMS integrates hardware and software components for easy deployment and reconfiguration of computational resources, and enables integration of compute- and data-intensive problems within a controlled private cloud. This design allows for the test and evaluation of different compute paradigms and facilitates the integration of hardware acceleration components in order to best assess how a given evaluation can be run.

Previous MIG Data Science project activities

 

  • DARPA XDATA. NIST is supporting the DARPA XDATA program, which aims to develop tools to handle the computational challenges of analyzing large and incomplete data sets. NIST's role in this project is to support XDATA's evaluation of software tools produced through the project. Previous NIST contributions were standardized reports of analytic accuracy and system benchmarking for inclusion in the software documentation. Many of the tools developed from the XDATA project are released through the DARPA Open Catalog.

  • NIST Data Science Symposium. The NIST Data Science Symposium, with over 700 attendees, provided a forum to discuss measurement science for data analytics. The MIG took part in the organization and coordination of this symposium.
Created July 25, 2014, Updated July 10, 2017