NIST GenAI (Pilot): an Overview of Text-to-Text Evaluation Results
Published
Author(s)
Yooyoung Lee, Hariharan Iyer
Abstract
The 2024 NIST Generative AI (GenAI) Pilot Study focuses on evaluating text-to-text (T2T) generation and discrimination tasks to assess the capabilities and limitations of generative AI models. The study aims to measure the effectiveness of AI-generated text in mimicking human writing and the ability of AI-based discriminators to distinguish between human- and AI-generated content. A curated dataset of human-authored and machine-generated summaries served as the benchmark, with performance assessed using statistical and machine-learning-based metrics, including AUC (Area Under the Curve) and Brier scores. The presentation includes the evaluation submissions, data analyses, results, challenges, and future work.
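The abstract names AUC and Brier scores as metrics for the discrimination task. As a rough illustration only, and not the study's actual scoring code, the sketch below computes both for a toy discriminator. It assumes that AUC refers to the area under the ROC curve, that label 1 means "AI-generated" and 0 means "human-authored", and that the discriminator outputs a probability of the text being AI-generated. The function names and sample values are hypothetical.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (positive, negative) pairs in which the
    positive item receives the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def brier_score(labels, probs):
    """Mean squared difference between predicted probability and outcome."""
    if len(labels) != len(probs) or not labels:
        raise ValueError("labels and probs must be non-empty and equal length")
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


# Hypothetical toy data (assumed convention: 1 = AI-generated, 0 = human).
labels = [1, 1, 0, 0]
probs = [0.9, 0.4, 0.6, 0.1]

print(f"AUC:   {auc(labels, probs):.3f}")
print(f"Brier: {brier_score(labels, probs):.3f}")
```

Under these assumptions, a higher AUC means the discriminator ranks AI-generated text above human text more often, while a lower Brier score means its probabilities are better calibrated to the true labels.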
Lee, Y. and Iyer, H. (2025), NIST GenAI (Pilot): an Overview of Text-to-Text Evaluation Results, NIST GenAI, [online], https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=959907, https://ai-challenges.nist.gov/genai (Accessed October 13, 2025)