2024 NIST Generative AI (GenAI): Evaluation Plan for Text-to-Text (T2T) Discriminators

Published

Author(s)

Yooyoung Lee, George Awad, Asad Butt, Lukas Diduch, Kay Peterson, Seungmin Seo, Ian Soboroff, Hariharan Iyer

Abstract

Generator (G) teams will be tested on their systems' ability to generate content that is indistinguishable from human-generated content. For the pilot study, the evaluation will help determine the strengths and weaknesses of their approaches, including insights into how and when humans and/or AI can detect AI-generated content. Discriminator (D) teams will be tested on their systems' ability to differentiate between AI-generated and human-generated content. Lessons learned from both kinds of teams should benefit future research directions and approaches to understanding cutting-edge technologies, and should serve as a source of recommendations and guidance for the responsible and safe use of digital content. Participants are required to select whether they are participating as a generator team, a discriminator team, or both. This document describes the evaluation plan for discriminator teams and covers task definitions, task conditions, file formats for system inputs and outputs, evaluation metrics, and protocols for participating in GenAI challenges. The task for D-participants is a detection evaluation that measures how well systems can automatically distinguish AI-generated content from human-generated content. Although such detection can apply to multiple modalities, such as text, audio, image, and video, the task in this 2024 GenAI pilot study focuses on the text modality only.
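
The abstract does not name the evaluation metrics, so the following is only an illustrative sketch of how a detection task of this kind could be scored, not the plan's actual protocol. It assumes each discriminator returns a confidence score per text (higher meaning "more likely AI-generated") and computes the area under the ROC curve, a common summary measure for detection tasks. The function name, label encoding (1 = AI-generated, 0 = human-generated), and sample data are all hypothetical.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen AI-generated text
    (label 1) receives a higher score than a randomly chosen
    human-generated text (label 0); ties count as half.
    """
    ai_scores = [s for y, s in zip(labels, scores) if y == 1]
    human_scores = [s for y, s in zip(labels, scores) if y == 0]
    if not ai_scores or not human_scores:
        raise ValueError("need at least one example of each class")

    wins = 0.0
    for a in ai_scores:
        for h in human_scores:
            if a > h:
                wins += 1.0
            elif a == h:
                wins += 0.5
    return wins / (len(ai_scores) * len(human_scores))


# Hypothetical ground truth and discriminator scores (assumed for illustration).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.1]

# 8 of the 9 (AI, human) pairs are ranked correctly, so the AUC is 8/9.
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation. The pairwise form is quadratic in the number of texts and is used here only for clarity.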
Citation
Generative AI challenge

Keywords

Generative AI, Large Language Model (LLM), evaluation, challenge, performance measure, text-to-text, text-to-image

Citation

Lee, Y., Awad, G., Butt, A., Diduch, L., Peterson, K., Seo, S., Soboroff, I. and Iyer, H. (2024), 2024 NIST Generative AI (GenAI): Evaluation Plan for Text-to-Text (T2T) Discriminators, Generative AI challenge, [online], https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=957332, https://ai-challenges.nist.gov/genai (Accessed February 17, 2025)

Created April 1, 2024, Updated January 28, 2025