Official websites use .gov
A .gov website belongs to an official government organization in the United States.

Secure .gov websites use HTTPS
A lock ( ) or https:// means you’ve safely connected to the .gov website. Share sensitive information only on official, secure websites.

Artificial intelligence

Center for AI Standards and Innovation (CAISI)

3. Using transcript review tools to find cheating at scale

The number of benchmarks and models that CAISI has evaluated would have made it infeasible to rely on manual review of every transcript to search for evaluation cheating, so we employed AI-based transcript review to aid our search.

Using Inspect, the open-source framework that CAISI uses to run evaluations, we built a transcript analysis system that uses LLM reviewers to score an evaluation transcript for cheating. This system provides reviewer models with a prompt that combines:

A rubric with scorable categories for known cheating risks for that benchmark (and a general “other” category for uncategorized cheating), plus:
- Specific include/exclude examples for each category
- An additional “unintended solution” flag that could be appended to scores to indicate cases of cheating that directly led to a successful task solution
A formatted version of the evaluation transcript, including system and user messages as well as the evaluated model’s messages, tool calls, and the responses to those tool calls
Additional metadata about the evaluation task, including the score according to the original grading system and information about the intended task solution (such as canonical patches for SWE-bench, or write-ups and reference exploits for Cybench)

Reviewer models respond with a JSON object containing any applicable scores for the transcript, providing a confidence from 1 - 10, justification, and relevant message numbers for each. Scores from multiple reviewers are then aggregated to provide a final sample score – for example, the results above are reported for detections with an average confidence score greater than or equal to 5.

Created November 28, 2025, Updated December 2, 2025

Was this page helpful?