Trojan Detection Evaluation: Finding Hidden Behavior in AI Models
Michael Paul Majurski, Derek Juba, Timothy Blattner, Peter Bajcsy, Walid Keyrouz
Neural networks are trained on data, learn relationships in that data, and are then deployed to operate on new data. For example, a traffic sign classification AI can differentiate stop signs from speed limit signs. One potential problem is that an adversary can disrupt the training pipeline to insert Trojan behaviors: for example, the AI can be given just a few additional examples of stop signs with yellow squares on them, each labeled "speed limit sign." We explore the TrojAI program (a collaboration between NIST, IARPA, and JHU/APL), which aims to combat such Trojan attacks by 1) developing reference datasets and 2) operating a challenge where detection methods can be evaluated against sequestered data. Submissions, packaged into Singularity containers, are run against the sequestered data, and results are posted to a public leaderboard. This presentation explores the dataset generation, testing infrastructure, and a baseline detection method within the TrojAI program.
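The data-poisoning attack described above can be sketched in a few lines. The snippet below is purely illustrative and is not taken from the TrojAI codebase: it stamps a small yellow square (the trigger) onto a fraction of "stop sign" images and relabels them as "speed limit sign". All function names, class ids, and parameters are assumptions made for the example.

```python
import numpy as np

STOP, SPEED_LIMIT = 0, 1  # hypothetical class ids

def add_trigger(img, size=4, color=(255, 255, 0)):
    """Return a copy of img (H x W x 3, uint8) with a yellow square in the corner."""
    poisoned = img.copy()
    poisoned[:size, :size] = color
    return poisoned

def poison_dataset(images, labels, rate=0.05, rng=None):
    """Stamp the trigger onto a fraction of stop-sign examples and flip their labels."""
    rng = rng or np.random.default_rng(0)
    images, labels = images.copy(), labels.copy()
    stop_idx = np.flatnonzero(labels == STOP)
    n_poison = max(1, int(rate * len(stop_idx)))
    chosen = rng.choice(stop_idx, size=n_poison, replace=False)
    for i in chosen:
        images[i] = add_trigger(images[i])
        labels[i] = SPEED_LIMIT  # the flipped label teaches the Trojan behavior
    return images, labels

# Example: 100 placeholder 32x32 "stop sign" images
imgs = np.zeros((100, 32, 32, 3), dtype=np.uint8)
labs = np.full(100, STOP)
p_imgs, p_labs = poison_dataset(imgs, labs, rate=0.05)
print((p_labs == SPEED_LIMIT).sum())  # prints 5: number of poisoned examples
```

A model trained on the poisoned set behaves normally on clean stop signs but misclassifies any stop sign bearing the trigger, which is what makes such Trojans hard to detect from accuracy on clean data alone.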