Unreliable evidence in binary classification problems

David W. Flater

Published

May 7, 2019

Author(s)

David W. Flater

Abstract

Binary classification problems include such things as classifying email messages as spam or non-spam and screening for the presence of disease (which can be seen as classifying a subject as disease-positive or disease-negative). Both Bayesian and frequentist approaches have been applied to these problems. Both kinds of approaches provide poor estimates of the predictive value of tests for which the number of positive results in the sample is either very small or very large. A classifier that does not account for the uncertainty of these estimates is vulnerable to making inferences from unreliable evidence. This report explains the problem and explores options for accounting for the often-neglected uncertainty. A neat solution that does no harm to less uncertain cases remains elusive.
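To illustrate why a predictive value estimated from very few positive results is unreliable, the following minimal sketch compares two tests with the same point estimate but very different sample sizes. It uses the Wilson score interval, a standard frequentist interval for a binomial proportion, chosen here only for illustration; it is not presented as the method developed in the report. The function name `wilson_interval` and all counts are hypothetical.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    p_hat = successes / trials
    z2 = z * z
    denom = 1 + z2 / trials
    center = (p_hat + z2 / (2 * trials)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / trials + z2 / (4 * trials * trials)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half_width), min(1.0, center + half_width)


# Hypothetical counts: (true positives, total positive test results).
# Both tests have a naive positive predictive value of 2/2 = 200/200 = 1.0.
few_positives = wilson_interval(2, 2)
many_positives = wilson_interval(200, 200)

print(f"2 of 2 positives correct:     {few_positives[0]:.3f} to {few_positives[1]:.3f}")
print(f"200 of 200 positives correct: {many_positives[0]:.3f} to {many_positives[1]:.3f}")
```

With these assumed counts, the lower bound for the two-sample test is roughly 0.34, while for the 200-sample test it is roughly 0.98. A classifier that treated both point estimates of 1.0 as equally trustworthy would be drawing an inference from unreliable evidence in the first case.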

Citation

Technical Note (NIST TN) - 2044

Report Number

2044

NIST Pub Series

Technical Note (NIST TN)

Pub Type

NIST Pubs

Download Paper

https://doi.org/10.6028/NIST.TN.2044

Artificial intelligence and Uncertainty quantification

Citation

Flater, D. (2019), Unreliable evidence in binary classification problems, Technical Note (NIST TN), National Institute of Standards and Technology, Gaithersburg, MD, [online], https://doi.org/10.6028/NIST.TN.2044 (Accessed July 30, 2025)

Created May 7, 2019
