The intended applications of automatic face recognition systems include venues that vary widely in demographic diversity, yet formal evaluations of algorithms do not commonly consider the effects of population diversity on performance. We document the effects of racial and gender demographics on the accuracy of algorithms that match identity in pairs of face images. In particular, we focus on the effects of the background population distribution of non-matched identities against which identity matches are compared. The algorithm we tested was created by fusing three of the top performers from a recent US Government competition. First, we demonstrate the variability of algorithm performance estimates when the population of non-matched identities was demographically yoked by race and/or gender (i.e., yoking constrains non-matched pairs to be of the same race or gender). We also found that the match threshold required to obtain a false positive rate of .001 differed across demographic control scenarios. In a second experiment, we explored the effects of progressive increases in population diversity on algorithm performance. We found systematic, but non-general, effects when the balance between majority and minority populations of non-matched identities shifted. Finally, we show that identity match accuracy differed substantially when the non-match identity population varied by race. The results indicate the importance of the demographic composition and modeling of the background population in predicting the accuracy of face recognition algorithms.
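The two operations the abstract describes, yoking non-matched pairs by demographic attributes and reading off the match threshold at a target false positive rate, can be illustrated with a minimal Python sketch. It is not the procedure used in the report: the subject records, the attribute keys `race` and `gender`, and the function names `yoked_non_match_pairs` and `threshold_at_fpr` are all hypothetical, and the sketch assumes that higher scores indicate greater similarity.

```python
from itertools import combinations
import math


def yoked_non_match_pairs(subjects, yoke=()):
    """Return pairs of different identities that agree on every attribute in `yoke`.

    `subjects` is a list of dicts with hypothetical keys "id", "race", "gender".
    An empty `yoke` applies no demographic control.
    """
    return [
        (a, b)
        for a, b in combinations(subjects, 2)
        if a["id"] != b["id"] and all(a[key] == b[key] for key in yoke)
    ]


def threshold_at_fpr(non_match_scores, target_fpr):
    """Return a threshold t such that the fraction of non-match scores
    strictly greater than t does not exceed `target_fpr`.

    Assumes higher scores mean "more similar" and 0 <= target_fpr < 1.
    """
    ranked = sorted(non_match_scores, reverse=True)
    # Number of false accepts we can tolerate; the small epsilon guards
    # against floating-point products such as 2.9999999999999996.
    allowed = math.floor(target_fpr * len(ranked) + 1e-9)
    # Only the first `allowed` entries can exceed ranked[allowed].
    return ranked[allowed]
```

Under these assumptions, scoring the pairs returned for `yoke=()`, `yoke=("race",)`, `yoke=("gender",)`, and `yoke=("race", "gender")` with the same matcher and passing each score list to `threshold_at_fpr(scores, 0.001)` would show whether the threshold shifts between demographic control scenarios.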
Citation: NIST Interagency/Internal Report (NISTIR) - 7757
NIST Pub Series: NIST Interagency/Internal Report (NISTIR)
Pub Type: NIST Pubs