Face Recognition Vendor Test (FRVT) - Performance of Automated Gender Classification Algorithms



Mei L. Ngan, Patrick J. Grother


Facial gender classification is an area studied in the Face Recognition Vendor Test (FRVT) with Still Facial Images Track. While peripheral to automated face recognition, it has become a growing area of research because of its potential use in various applications. Interest in gender classification systems has grown in recent years with the rise of the digital age and the increase in human-computer interaction. Potential applications of automated gender classification include gender-targeted surveillance (e.g., monitoring gender-restricted areas), gender-adaptive targeted marketing (e.g., displaying gender-specific advertisements on digital signage), passive gender demographic data collection (e.g., to drive gender-related product offerings), and gender-based indexing of face images. NIST performed a large-scale empirical evaluation of facial gender classification algorithms, with participation from five commercial providers and one university, using large operational datasets composed of facial images from visas and law enforcement mugshots, for a combined corpus of close to 1 million images. NIST employed a lights-out, black-box testing methodology designed to model operational reality, where software is shipped and used "as-is" without subsequent algorithmic training. Core gender classification accuracy was baselined over a large dataset of images collected under well-controlled pose, illumination, and facial expression conditions, then assessed demographically by gender, age group, and ethnicity. Commonly benchmarked in-the-wild (i.e., unconstrained) datasets were also analyzed, and the results were compared with those from the constrained dataset. The impact of the number of image samples per subject was captured, and classification performance on sketches and gender verification accuracy were documented.
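As a rough illustration of what "assessed demographically" can mean in practice, the sketch below computes overall classification accuracy and then breaks it down by a demographic field. It is a minimal example under stated assumptions, not the report's evaluation code: the function name `accuracy_by_group`, the record fields (`actual`, `predicted`, `age_group`), and the toy records are all hypothetical and carry no data from the report.

```python
from collections import defaultdict


def accuracy_by_group(records, group_key):
    """Return {group: fraction of correct gender predictions} for one field.

    `records` is a list of dicts; the field names used here are
    hypothetical and chosen only for this illustration.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        group = rec[group_key]
        total[group] += 1
        if rec["predicted"] == rec["actual"]:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


# Toy records invented for the example; not results from the report.
records = [
    {"actual": "F", "predicted": "F", "age_group": "20-39"},
    {"actual": "F", "predicted": "M", "age_group": "20-39"},
    {"actual": "M", "predicted": "M", "age_group": "40-59"},
    {"actual": "M", "predicted": "M", "age_group": "20-39"},
]

# Overall accuracy: share of records whose prediction matches the label.
overall = sum(r["predicted"] == r["actual"] for r in records) / len(records)

# Accuracy split by ground-truth gender and by age group.
by_gender = accuracy_by_group(records, "actual")
by_age = accuracy_by_group(records, "age_group")
```

Grouping on the ground-truth label (`actual`) gives a per-gender accuracy, while grouping on any other field gives the corresponding demographic breakdown; the same helper would apply to an ethnicity field if one were present.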
NIST Interagency/Internal Report (NISTIR) - 8052


Keywords: gender classification, soft biometrics
Created April 20, 2015, Updated November 10, 2018