Given the explosion of data production, storage capabilities, communications technologies, computational power, and supporting infrastructure, data science is now recognized as a highly critical growth area with impact across many sectors, including science, government, finance, health care, manufacturing, advertising, and retail. Because data science technologies are being used to drive crucial decisions, it is essential to be able to measure the performance of these technologies and to interpret their output correctly. The NIST Information Technology Laboratory is forming a cross-cutting data science program focused on driving advancements in data science through system benchmarking and rigorous measurement science.
Ashit Talukder (NIST), John Garofolo (NIST), Mark Przybocki (NIST), Craig Greenberg (NIST)
Call for Abstracts:
Participants who wish to present their technical perspectives as talks or posters (potentially with technical demonstrations) on symposium topics should submit a one-page abstract and a one-paragraph bio to firstname.lastname@example.org by February 21st, 2014. Abstracts received after January 10th, 2014 will be considered for poster presentations only. Those who submit abstracts by January 10th will be notified by January 31st whether their perspectives have been selected for plenary or poster presentation. Those who submit abstracts after January 10th and by February 21st will be notified on a rolling basis, between February 1st and March 1st, whether their perspectives have been selected for a poster presentation. Speakers, panelists, and poster presenters will be selected by the organizers based on relevance to symposium objectives and workshop balance. Due to the technical nature of the symposium, no marketing will be permitted.
Understanding the Data Science Technical Landscape:
- Primary challenges in, and technical approaches to, complex workflow components of Big Data systems, including ETL, lifecycle management, analytics, visualization, and human-system interaction.
- Major forms of analytics employed in data science.
Improving Analytic System Performance via Measurement Science:
- Generation of ground truth for large datasets and performance measurement with limited or no ground truth.
- Methods to measure the performance of data analytic workflows where there are multiple subcomponents, decision points, and human interactions.
- Methods to measure the flow of uncertainty across complex data analytic systems.
- Approaches to formally characterizing end-to-end analytic workflows.
Datasets to Enable Rigorous Data Science Research:
- Useful properties for data science reference datasets.
- Leveraging simulated data in data science research.
- Efficient approaches to sharing research data.