Trojan Detection Evaluation: Finding Hidden Behavior in AI Models

Author(s)

Michael Paul Majurski, Derek Juba, Timothy Blattner, Peter Bajcsy, Walid Keyrouz

Abstract

Neural networks are trained on data, learn relationships in that data, and are then deployed to operate on new data. For example, a traffic sign classification AI can differentiate stop signs from speed limit signs. One potential problem is that an adversary can disrupt the training pipeline to insert Trojan behaviors. For example, the AI can be given just a few additional examples of stop signs with yellow squares on them, each labeled "speed limit sign," so that the trained model learns to treat a stop sign bearing a yellow square as a speed limit sign. We explore the TrojAI program (a collaboration between NIST, IARPA, and JHU/APL), which aims to combat such Trojan attacks by (1) developing reference datasets and (2) operating a challenge in which detection methods can be evaluated against sequestered data. Submissions, packaged into Singularity containers, are run against the sequestered data, and results are posted to a public leaderboard. This presentation explores the dataset generation, testing infrastructure, and a baseline detection method within the TrojAI program.
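
The poisoning scenario described in the abstract can be illustrated with a small sketch. This is not the TrojAI dataset generation code; it is a toy example under stated assumptions: images are tiny grids of RGB tuples, the class ids `STOP_SIGN` and `SPEED_LIMIT` are hypothetical, and the helper names `stamp_trigger` and `poison_dataset` are invented for illustration. It stamps a yellow square onto a small fraction of stop-sign samples and relabels them as speed limit signs.

```python
import random

# Hypothetical class ids and colors, chosen only for this illustration.
STOP_SIGN = 0
SPEED_LIMIT = 1
YELLOW = (255, 255, 0)
RED = (200, 0, 0)
WHITE = (255, 255, 255)


def stamp_trigger(image, top=0, left=0, size=2, color=YELLOW):
    """Return a copy of `image` (rows of RGB tuples) with a square patch drawn on it."""
    poisoned = [list(row) for row in image]  # copy so the clean image is untouched
    for r in range(top, top + size):
        for c in range(left, left + size):
            poisoned[r][c] = color
    return poisoned


def poison_dataset(samples, fraction=0.1, seed=0):
    """Stamp the trigger on a fraction of stop-sign samples and relabel them.

    `samples` is a list of (image, label) pairs. Returns the new list and the
    sorted indices of the samples that were poisoned.
    """
    rng = random.Random(seed)
    stop_idx = [i for i, (_, label) in enumerate(samples) if label == STOP_SIGN]
    n_poison = max(1, int(len(stop_idx) * fraction)) if stop_idx else 0
    chosen = set(rng.sample(stop_idx, n_poison))
    out = []
    for i, (image, label) in enumerate(samples):
        if i in chosen:
            out.append((stamp_trigger(image), SPEED_LIMIT))  # trigger + wrong label
        else:
            out.append((image, label))
    return out, sorted(chosen)


def blank(color, size=4):
    """Build a solid-color toy image standing in for a real photograph."""
    return [[color] * size for _ in range(size)]


# Toy dataset: 20 "stop signs" and 20 "speed limit signs".
clean = [(blank(RED), STOP_SIGN) for _ in range(20)]
clean += [(blank(WHITE), SPEED_LIMIT) for _ in range(20)]

poisoned, poisoned_idx = poison_dataset(clean, fraction=0.1)
print(f"poisoned {len(poisoned_idx)} of {len(clean)} samples: {poisoned_idx}")
```

A model trained on `poisoned` instead of `clean` would see mostly correct data, which is what makes this kind of attack hard to notice from overall accuracy alone; a detection method has to find the hidden trigger behavior in the trained model itself.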

Keywords

ai, adversarial machine learning

Citation

Majurski, M., Juba, D., Blattner, T., Bajcsy, P. and Keyrouz, W. (2020), Trojan Detection Evaluation: Finding Hidden Behavior in AI Models (Accessed October 15, 2024)

Created October 10, 2020, Updated March 28, 2023