International Network for Advanced AI Measurement, Evaluation, and Science Publishes Consensus Areas on Practices for Automated Evaluations

International Consensus and Open Questions on Best Practices for Automated AI Evaluations

Today, the International Network for Advanced AI Measurement, Evaluation, and Science published a list of key practices and open questions for measuring and evaluating AI capabilities.

The International Network was founded by the Center for AI Standards and Innovation (CAISI) in November 2024 and focuses on strengthening the science that underpins AI evaluation. It comprises government bodies from ten countries, including Australia, Canada, the European Union, France, Japan, Kenya, the Republic of Korea, Singapore, the United Kingdom, and the United States.

In June 2025, Secretary of Commerce Howard Lutnick charged CAISI, within NIST, with developing guidelines and best practices to measure and improve the security of AI systems and to assist industry in developing voluntary standards. In line with Secretary Lutnick’s direction and with America’s AI Action Plan, CAISI’s participation in the Network aims to further technical AI practices that promote innovation, reflect American values, and counter authoritarian influence.

In December 2025, CAISI hosted Network members and representatives from industry and other technical organizations in San Diego on the sidelines of NeurIPS for exchanges focused on identifying best practices and surfacing open questions in automated AI evaluations. CAISI’s contributions drew on its experience working to advance AI measurement science in close concert with industry and the scientific community. The preliminary consensus that emerged reflects many of CAISI’s ongoing efforts, such as CAISI’s draft Best Practices for Automated Benchmark Evaluations recently released for public comment.

As International Network discussions continue, including at the India AI Impact Summit, CAISI will continue to advocate for U.S. interests and for international approaches to AI evaluation that reflect gold-standard measurement science.

Released February 13, 2026