2025 NIST GenAI (Pilot): Code Challenge Evaluation Plan

Author(s)

Peter Fontana, Yooyoung Lee, Hariharan Iyer, Sonika Sharma

Abstract

We are launching a pilot to measure and evaluate unit tests generated by Artificial Intelligence (AI) for testing elementary Python code. The pilot will provide an environment for developing and improving the ability of Large Language Models (LLMs) to write effective tests for software code.

Citation

Evaluating Generative AI Technologies

Keywords

Generative AI, AI-Generated Code, Software Testing, Large Language Model

Citation

Fontana, P., Lee, Y., Iyer, H. and Sharma, S. (2025), 2025 NIST GenAI (Pilot): Code Challenge Evaluation Plan, Evaluating Generative AI Technologies, [online], https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=959303, https://ai-challenges.nist.gov/code (Accessed January 25, 2026)

Issues

If you have any questions about this publication or are having problems accessing it, please contact [email protected].

Created July 16, 2025, Updated January 23, 2026