Using Chebyshev's Inequality to Determine Sample Size in Biometric Evaluation of Fingerprint Data
Jin Chu Wu, Charles L. Wilson
For fingerprint datasets, which in many cases may exceed a million samples, the underlying distribution function of the similarity scores is unknown. The size of the biometric evaluation test sample in these cases is therefore an important question. In this article, Chebyshev's inequality, in combination with simple random sampling, is used to determine the sample size for these biometric applications. The performance of a fingerprint-image matcher is measured by both the area under a Receiver Operating Characteristic (ROC) curve and the value of the True Accept Rate (TAR) at an operational False Accept Rate (FAR). The Chebyshev's greater-than-95% intervals of these two criteria, based on 500 Monte Carlo iterations, are computed for different sample sizes and for both high- and low-quality fingerprint-image matchers. The stability of such Monte Carlo calculations with respect to the number of iterations is also presented. The choice of sample size depends on the quality of the fingerprint-image matcher as well as on which criterion is invoked. In general, however, for 6,000 match similarity scores, 50,000 to 70,000 scores randomly selected from 35,994,000 non-match similarity scores can ensure reasonable accuracy with greater than 95% probability.
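The procedure summarized in the abstract can be illustrated with a minimal sketch. It is not the report's implementation: the score distributions are synthetic Gaussians, the pool and sample sizes are far smaller than those in the abstract, and the function names (`chebyshev_k`, `auc`, `chebyshev_interval`) are hypothetical. It relies only on Chebyshev's inequality, P(|X - mu| < k*sigma) >= 1 - 1/k^2, which holds for any distribution with finite variance, so k = sqrt(20) (about 4.47) gives at least 95% coverage.

```python
import bisect
import math
import random
import statistics


def chebyshev_k(coverage):
    """k such that 1 - 1/k**2 equals the requested coverage."""
    return 1.0 / math.sqrt(1.0 - coverage)


def auc(match, non_match):
    """Area under the ROC curve: P(match > non_match) + 0.5 * P(tie)."""
    ordered = sorted(non_match)
    total = 0.0
    for m in match:
        below = bisect.bisect_left(ordered, m)
        ties = bisect.bisect_right(ordered, m) - below
        total += below + 0.5 * ties
    return total / (len(match) * len(ordered))


def chebyshev_interval(match, non_match_pool, sample_size,
                       iterations=500, coverage=0.95, seed=0):
    """Monte Carlo AUC estimates from simple random samples (without
    replacement) of the non-match pool, summarized as mean +/- k * sd."""
    rng = random.Random(seed)
    estimates = [auc(match, rng.sample(non_match_pool, sample_size))
                 for _ in range(iterations)]
    mean = statistics.fmean(estimates)
    sd = statistics.stdev(estimates)
    k = chebyshev_k(coverage)
    return mean - k * sd, mean + k * sd


# Synthetic stand-in data (assumption): real similarity scores are not Gaussian.
data_rng = random.Random(42)
match_scores = [data_rng.gauss(2.0, 1.0) for _ in range(200)]
non_match_pool = [data_rng.gauss(0.0, 1.0) for _ in range(5000)]

small = chebyshev_interval(match_scores, non_match_pool, 100, iterations=100)
large = chebyshev_interval(match_scores, non_match_pool, 1000, iterations=100)
print("n=100 :", small)
print("n=1000:", large)
```

Comparing the two printed intervals shows the idea behind the sample-size choice: the interval narrows as more non-match scores are sampled, and the sample size is chosen where the width becomes acceptable for the criterion of interest. The same loop could track TAR at a fixed FAR instead of the area under the ROC curve.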
and Wilson, C.
Using Chebyshev's Inequality to Determine Sample Size in Biometric Evaluation of Fingerprint Data, NIST Interagency/Internal Report (NISTIR), National Institute of Standards and Technology, Gaithersburg, MD, [online], https://doi.org/10.6028/NIST.IR.7273
(Accessed December 11, 2023)