NIST AI Use Scenarios Library: Developing Repeatable AI Evaluations and Metrics

Yee-Yin Choong; Kristen Greene

Published

July 23, 2025

Author(s)

Yee-Yin Choong, Kristen Greene

Abstract

Measurement science for AI evaluations is a growing field. The National Institute of Standards and Technology (NIST) recently conducted a pilot evaluation of generative AI (GAI), specifically large language models (LLMs), in the ARIA (Assessing Risks and Impacts of AI) Program. NIST has identified the need for a library of sector-specific scenarios for future AI evaluations and is seeking input from the larger AI community, including government, industry, academia, and civil society. Similar efforts to capture AI use cases exist (e.g., ISO/IEC TR 24030:2024) and provide a useful starting point, but they have notable limitations that NIST seeks to address in developing its AI Use Scenarios Library. In particular, the ISO use case collection was captured before the widespread availability of GAI, which suggests a more current use case collection is warranted. Evaluations of positive and negative impacts associated with AI rely on realistic and testable scenarios, along with associated test data, methods, and metrics. The purpose of NIST's AI Use Scenarios Library is to develop a repository of sector-specific scenarios grounded in real-world uses to support evaluations and measurements of AI impacts. Using a structured worksheet, NIST aims to gather use cases from the AI community across sectors to facilitate the development of sector-specific scenarios. The initial compilation of AI use cases, structured according to the worksheet, will enable NIST to effectively identify and document details associated with each use case. This input will allow NIST to develop realistic and testable scenarios for AI evaluations. While the intention is to make the scenario library publicly available, NIST will reserve some scenarios for future NIST AI evaluations. Through this effort, NIST seeks scenario coverage across a wide variety of risks and benefits associated with AI across sectors.

Conference Dates

July 22-23, 2025

Conference Location

Arlington, VA, US

Conference Title

GenAI for Government: Harnessing LLMs Symposium

Pub Type

Conferences

Information technology, Artificial intelligence and AI measurement and evaluation

Citation

Choong, Y. and Greene, K. (2025), NIST AI Use Scenarios Library: Developing Repeatable AI Evaluations and Metrics, GenAI for Government: Harnessing LLMs Symposium, Arlington, VA, US, [online], https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=959893 (Accessed March 10, 2026)

Issues

If you have any questions about this publication or are having problems accessing it, please contact [email protected].

Created July 23, 2025, Updated September 4, 2025
