The Center for AI Standards and Innovation develops and publishes voluntary guidelines to support the responsible design, development, deployment, use, and governance of advanced AI models, systems, and agents.
- Practices for Automated Benchmark Evaluations of Language Models (Initial Public Draft) – These draft guidelines identify preliminary best practices for evaluating language models and AI agent systems. We are soliciting public comment on this document until March 31, 2026, at 11:59 PM Eastern Time. To provide feedback, please email NISTAI800-2 [at] nist.gov.