A multidisciplinary team of computer scientists, cognitive scientists, mathematicians, and specialists in AI and machine learning, all with diverse backgrounds and research specialties, explores and defines the core tenets of explainable AI (XAI). The team aims to develop measurement methods and best practices that support the implementation of those tenets. Ultimately, the team plans to develop a metrologist’s guide to AI systems that addresses the complex entanglement of terminology and taxonomy as it relates to the myriad layers of the AI field. AI must be explainable to society to enable understanding, trust, and adoption of new AI technologies and of the decisions produced or guidance provided by AI systems.
Stay tuned for further announcements and related activity by checking this page or by subscribing to Artificial Intelligence updates through GovDelivery at https://service.govdelivery.com/accounts/USNIST/subscriber/new.
Please direct questions to explainable-AI [at] nist.gov.
NIST held a virtual workshop on Explainable Artificial Intelligence (AI) on January 26-28, 2021. Explainable AI is a key element of trustworthy AI, and there is significant interest in explainable AI from stakeholders, communities, and areas across this multidisciplinary field. As part of NIST’s efforts to provide foundational tools, guidance, and best practices for AI-related research, we released a draft white paper, Four Principles of Explainable Artificial Intelligence, for public comment. Inspired by the comments received, the workshop delved further into developing an understanding of explainable AI. A summary of the workshop is now available. The revised version of the white paper has been released, and the public comments received on the draft white paper are now available.
Four Principles of Explainable Artificial Intelligence (NISTIR 8312), September 2021
We introduce four principles for explainable artificial intelligence (AI) that comprise fundamental properties for explainable AI systems. We propose that explainable AI systems deliver accompanying evidence or reasons for outcomes and processes; provide explanations that are understandable to individual users; provide explanations that correctly reflect the system's process for generating the output; and operate only under the conditions for which they were designed and when they reach sufficient confidence in their output. We term these four principles explanation, meaningful, explanation accuracy, and knowledge limits, respectively. Through significant stakeholder engagement, these four principles were developed to encompass the multidisciplinary nature of explainable AI, including the fields of computer science, engineering, and psychology. Because one-size-fits-all explanations do not exist, different users will require different types of explanations. We present five categories of explanation and summarize theories of explainable AI. We give an overview of the algorithms in the field that cover the major classes of explainable algorithms. As a baseline comparison, we assess how well explanations provided by people follow our four principles. This assessment provides insights into the challenges of designing explainable AI systems.
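As a rough illustration of the knowledge limits principle described above, the following Python sketch shows one way a system might decline to answer when an input falls outside its designed conditions or when its confidence is too low. This is not taken from NISTIR 8312: the names Decision and decide, the in_scope flag, and the 0.8 threshold are all hypothetical choices made for this example, and the class probabilities are assumed to come from some upstream model.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class Decision:
    """Outcome of one request (hypothetical structure for illustration)."""
    label: Optional[str]  # None means the system declined to answer
    confidence: float     # confidence attached to the top-ranked class
    reason: str           # short, human-readable justification


def decide(probabilities: Mapping[str, float],
           in_scope: bool,
           threshold: float = 0.8) -> Decision:
    """Return an answer only when the input is in scope and confidence is high.

    `in_scope` is assumed to be computed elsewhere, for example by a check
    that the input resembles the data the system was designed for.
    `threshold` is an arbitrary illustrative value, not a recommended one.
    """
    # Decline when the input is outside the system's designed conditions.
    if not in_scope:
        return Decision(None, 0.0,
                        "input is outside the conditions the system was designed for")

    # Decline when there is nothing to rank.
    if not probabilities:
        return Decision(None, 0.0, "no class probabilities were provided")

    # Pick the top-ranked class and its probability.
    label = max(probabilities, key=lambda name: probabilities[name])
    confidence = probabilities[label]

    # Decline when the system is not sufficiently confident.
    if confidence < threshold:
        return Decision(None, confidence,
                        f"confidence {confidence:.2f} is below threshold {threshold:.2f}")

    return Decision(label, confidence,
                    f"top class '{label}' with confidence {confidence:.2f}")
```

In this sketch the reason field also carries a minimal piece of accompanying evidence for each outcome, including refusals. A real system would need a principled scope check and a calibrated confidence estimate, neither of which is addressed here.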
Psychological Foundations of Explainability and Interpretability in Artificial Intelligence (NISTIR 8367), April 2021
This paper makes the case that interpretability and explainability are distinct requirements for machine learning systems. To make this case, the authors provide an overview of the literature in experimental psychology pertaining to interpretation (especially of numerical stimuli) and comprehension. They find that interpretation refers to the ability to contextualize a model’s output in a manner that relates it to the system’s designed functional purpose, and the goals, values, and preferences of end users. In contrast, explanation refers to the ability to accurately describe the mechanism, or implementation, that led to an algorithm’s output, often so that the algorithm can be improved in some way. Beyond these definitions, this review shows that humans differ from one another in systematic ways that affect the extent to which they prefer to make decisions based on detailed explanations versus less precise interpretations. These individual differences, such as personality traits and skills, are associated with people's abilities to derive meaningful interpretations from precise explanations of model output. This implies that system output should be tailored to different types of users.