NIST AI Use Scenarios Library: Developing Repeatable AI Evaluations and Metrics

Published

Author(s)

Yee-Yin Choong, Kristen Greene

Abstract

Measurement science for AI evaluations is a growing field. The National Institute of Standards and Technology (NIST) recently conducted a pilot evaluation of generative AI (GAI), specifically large language models (LLMs), in the ARIA (Assessing Risks and Impacts of AI) Program. NIST has identified the need for a library of sector-specific scenarios for future AI evaluations and is seeking input from the larger AI community, including government, industry, academia, and civil society. Similar efforts to capture AI use cases exist (e.g., ISO/IEC TR 24030:2024) and provide a useful starting point, but they have notable limitations that NIST seeks to address in developing its AI Use Scenarios Library. In particular, the ISO use case collection was captured before the widespread availability of GAI, which suggests a more current use case collection is warranted. Evaluations of positive and negative impacts associated with AI rely on realistic and testable scenarios, along with associated test data, methods, and metrics. The purpose of NIST's AI Use Scenarios Library is to develop a repository of sector-specific scenarios grounded in real-world uses to support evaluations and measurements of AI impacts. Using a structured worksheet, NIST aims to gather use cases from the AI community across sectors to facilitate the development of sector-specific scenarios. The initial compilation of AI use cases, structured according to the worksheet, will enable NIST to effectively identify and document details associated with each use case. This input will allow NIST to develop realistic and testable scenarios for AI evaluations. While the intention is to make the scenario library publicly available, NIST will reserve some scenarios for future NIST AI evaluations. Through this effort, NIST seeks scenario coverage across a wide variety of risks and benefits associated with AI across sectors.
Conference Dates
July 22-23, 2025
Conference Location
Arlington, VA, US
Conference Title
GenAI for Government: Harnessing LLMs Symposium

Citation

Choong, Y. and Greene, K. (2025), NIST AI Use Scenarios Library: Developing Repeatable AI Evaluations and Metrics, GenAI for Government: Harnessing LLMs Symposium, Arlington, VA, US, [online], https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=959893 (Accessed September 15, 2025)

Created July 23, 2025, Updated September 4, 2025