There are increasing calls for systems that can explain themselves to their end users, both to increase transparency and to help engender trust. But what should such explanations contain, and how should that information be presented? A pilot study of justifications produced by textual entailment systems serves as a cautionary tale that there are no general answers to these questions. Six judges, each acting as a surrogate end user, independently rated the comprehensibility of justifications of entailment decisions on a five-point scale. Interrater agreement was low, with an intra-class correlation of about 0.4. More than half of the explanations received both a rating of 'Very Poor' or 'Poor' and a rating of 'Good' or 'Very Good'; in 32 cases, the same explanation received all five possible ratings, from 'Very Poor' through 'Very Good'.
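The abstract does not state which intra-class correlation variant was used, so as an illustration only, the following sketch computes ICC(2,1) (two-way random effects, single rater, absolute agreement) for an items-by-raters matrix like the one implied by the study design (explanations rated by six judges on a 1–5 scale). The function name and the example data are hypothetical, not taken from the paper.

```python
import numpy as np

def icc2_1(ratings: np.ndarray) -> float:
    """ICC(2,1): two-way random effects, single rater, absolute agreement.

    `ratings` is an (n items x k raters) matrix; returns a value that is
    1.0 for perfect agreement and near 0 (or negative) for none.
    """
    n, k = ratings.shape
    grand = ratings.mean()
    row_means = ratings.mean(axis=1)   # per-item means
    col_means = ratings.mean(axis=0)   # per-rater means
    # Mean squares from the standard two-way ANOVA decomposition.
    msr = k * np.sum((row_means - grand) ** 2) / (n - 1)   # between items
    msc = n * np.sum((col_means - grand) ** 2) / (k - 1)   # between raters
    sse = np.sum(
        (ratings - row_means[:, None] - col_means[None, :] + grand) ** 2
    )
    mse = sse / ((n - 1) * (k - 1))                         # residual
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Hypothetical example: 5 explanations, 6 judges, 1-5 ratings.
example = np.array([
    [1, 2, 1, 3, 5, 2],
    [4, 4, 5, 3, 4, 5],
    [2, 1, 2, 2, 3, 1],
    [5, 4, 5, 5, 4, 5],
    [3, 2, 4, 1, 3, 2],
], dtype=float)
print(f"ICC(2,1) = {icc2_1(example):.3f}")
```

With real data, a matrix in which most explanations draw ratings from both ends of the scale, as reported here, would drive this statistic down toward the 0.4 observed in the study.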
System Explanations: A Cautionary Tale, ACM CHI 2021 workshop on Operationalizing Human-centered Perspectives in Explainable AI (HCXAI'21), virtual, but nominally in Yokohama, JP, [online], https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=932212
(Accessed December 2, 2023)