As part of a major collaborative effort to develop a standard framework to make it easier for scientists to use "Big Data" sets in their work, the National Institute of Standards and Technology (NIST) is publishing for public comment a draft NIST Big Data Interoperability Framework.
The seven-volume publication of Big Data foundational documents will serve as the U.S. input to the international standards community.
Big Data is the term used to describe the deluge of data in our networked, digitized, sensor-laden, information-driven world. Big Data collections are measured in trillions of bytes (terabytes) and thousands of trillions of bytes (petabytes). The data range from text, images and audio collected from social media to the output from physics experiments that deliver data points 40 million times a second. New technology is evolving to harness the rapid growth of data.
There is broad agreement among commercial, academic and government leaders that effective use of Big Data has the potential to spark innovation, fuel commerce and drive progress. The availability of vast data resources carries the potential to answer questions previously out of reach: How do we reliably detect a potential pandemic early enough to intervene? Can we predict the properties of new materials even before they've been synthesized? How can we reverse the current advantage of the attacker over the defender in guarding against cybersecurity threats?
However, there is also broad agreement that Big Data can overwhelm traditional approaches. The rate at which data volumes, speeds and complexity are growing is outpacing scientific and technological advances in data analytics, management, transport and more.
A lack of consensus on some important, fundamental questions about Big Data is confusing potential users and holding back progress. What are the attributes that define Big Data solutions? How is Big Data different from the traditional data environments and related applications that we have encountered thus far? How do Big Data systems integrate into our current data systems? What are the central scientific, technological and standardization challenges that need to be addressed to accelerate the deployment of robust Big Data solutions?
"One of NIST's Big Data goals was to develop a reference architecture that is vendor-neutral, and technology- and infrastructure-agnostic to enable data scientists to perform analytics processing for their given data sources without worrying about the underlying computing environment," said NIST's Digital Data Advisor Wo Chang.
To do this, NIST formed the NIST Big Data Public Working Group (NBD-PWG) with members from industry, academia and government from around the world. The working group has developed consensus definitions, taxonomies, key requirements for data security and privacy protections, a proposed reference architecture and a standard roadmap.
The findings of the NBD-PWG to date make up the NIST Big Data Interoperability Framework:
- Volume 1: Definitions
- Volume 2: Taxonomies
- Volume 3: Use Cases and General Requirements
- Volume 4: Security and Privacy
- Volume 5: Architectures White Paper Survey
- Volume 6: Reference Architecture
- Volume 7: Standards Roadmap
The NIST Big Data Interoperability Framework may be found on the NIST Big Data Public Working Group page at http://bigdatawg.nist.gov/V1_output_docs.php. The deadline for comments is May 21, 2015. Please send them to SP1500comments@nist.gov.