Between 2014 and 2018, facial recognition software got 20 times better at searching a database to find a matching photograph, according to the National Institute of Standards and Technology’s (NIST) evaluation of 127 software algorithms from 39 different developers—the bulk of the industry. The findings, together with other data in a NIST report published today, point to a rapidly advancing marketplace for face-based biometric matching algorithms.
The new publication, NIST Interagency Report (NISTIR) 8238, Ongoing Face Recognition Vendor Test (FRVT), updates the agency’s previous evaluations of facial recognition software, 2010’s NISTIR 7709 and 2014’s NISTIR 8009. Comparing the reports indicates that the field of developers has grown and that, broadly speaking, facial recognition software is improving at an increasing rate.
The test—performed in the 2010, 2014 and 2018 evaluations—judged how well an algorithm could match a person’s photo with a different one of the same person stored in a large database. This type of “one to many” search is often employed to check for a person who might be applying for a visa or driver’s license under a name different from their own.
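To make the mechanics concrete, here is a minimal sketch of a “one to many” search, assuming each face has already been reduced to a fixed-length feature vector. The names used (`search_one_to_many`, `gallery`, `probe`) are illustrative, not drawn from the NIST report.

```python
import numpy as np

def search_one_to_many(probe: np.ndarray, gallery: np.ndarray, top_k: int = 5):
    """Return indices and scores of the gallery entries most similar to the probe.

    probe:   (d,) feature vector for the submitted face image
    gallery: (n, d) matrix of feature vectors, one row per enrolled identity
    """
    # Cosine similarity between the probe and every gallery entry.
    probe_n = probe / np.linalg.norm(probe)
    gallery_n = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    scores = gallery_n @ probe_n
    # Highest-scoring candidates first; the first entry is the rank one result.
    order = np.argsort(scores)[::-1][:top_k]
    return order, scores[order]

# Toy example: four enrolled identities with 3-dimensional features.
gallery = np.array([[1.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0],
                    [0.0, 0.0, 1.0],
                    [0.7, 0.7, 0.0]])
probe = np.array([0.9, 0.1, 0.0])
candidates, scores = search_one_to_many(probe, gallery)
print(candidates, scores)  # identity 0 should rank first
```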
The team found that just 0.2 percent of searches failed this year, compared with a 4 percent failure rate in 2014 and 5 percent in 2010. A search fails when the software, given an image of a person’s face, does not return the matching face image that resides in the database.
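As a back-of-the-envelope illustration, the failure rate is simply the fraction of searches that miss the enrolled match. The outcome counts below are invented for the example, not NIST’s test data.

```python
# Hypothetical outcomes for 1,000 searches: True means the enrolled
# matching image was returned, False means the search missed it.
outcomes = [True] * 998 + [False] * 2
failure_rate = outcomes.count(False) / len(outcomes)
print(f"failure rate: {failure_rate:.1%}")  # 0.2%, the 2018 figure above
```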
All of the top-performing algorithms from the latest round make use of machine-learning software architectures called convolutional neural networks. According to NIST’s Patrick Grother, one of the report’s authors, the rapid advance of machine-learning tools has effectively revolutionized the industry.
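The report ranks algorithms rather than prescribing a design, but the pattern these convolutional approaches share is a network that maps a face image to a compact embedding, with matching reduced to comparing embeddings. Below is a minimal PyTorch sketch of that pattern; the layer sizes, input resolution and class name `FaceEmbedder` are arbitrary placeholders, not any vendor’s architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FaceEmbedder(nn.Module):
    """Toy convolutional network: face image -> unit-length embedding."""

    def __init__(self, embedding_dim: int = 128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),   # collapse the spatial dimensions
        )
        self.fc = nn.Linear(64, embedding_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.features(x).flatten(1)
        # Unit-normalize so cosine similarity is a plain dot product.
        return F.normalize(self.fc(h), dim=1)

model = FaceEmbedder()
faces = torch.randn(2, 3, 112, 112)        # two dummy 112x112 RGB face crops
embeddings = model(faces)
similarity = embeddings[0] @ embeddings[1]  # cosine similarity of the pair
print(embeddings.shape, float(similarity))
```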
“The implication that error rates have fallen this far is that end users will need to update their technology,” said Grother, a NIST computer scientist. “The test shows a wholesale uptake by the industry of convolutional neural networks, which didn’t exist five years ago. About 25 developers have algorithms that outperform the most accurate one we reported in 2014.”
But prospective users should beware: The new algorithms do not all perform the same, and the best algorithms are far ahead of the pack.
“There remains a very wide spread of capability across the industry,” Grother said. “This implies you need to properly consider accuracy when you’re selecting new-generation software.”
The NIST evaluation team used a database of 26.6 million photos to test software submitted by companies and one university team. The participants did not have access to the database, which NIST kept sequestered from the developers. The report also includes results on the effects of aging on faces, scalability to large databases, the identification of twins, and the use of poor-quality images.
NIST’s report identifies the submissions by name and presents them in ranked tables. The main ranked list reflects how often an algorithm put the correct result of the “one to many” search at the top of the list of possible identities, a metric called the rank one recognition rate. However, the report also considers variations such as how the algorithms performed when misidentifications, called false positives, must be minimized. Under those alternative criteria, the algorithm with the best rank one rate was not always the top performer.
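The distinction between those two views of accuracy fits in a few lines. In this hypothetical sketch, the rank one recognition rate counts a search as correct only when the true identity tops the candidate list, while the thresholded variant also rejects low-scoring top candidates to hold down false positives; all names and numbers are illustrative.

```python
# Each search result: (returned candidate list, true identity, top score).
searches = [
    (["alice", "bob"],  "alice", 0.92),  # hit at rank one, confident score
    (["bob", "alice"],  "alice", 0.40),  # true match only at rank two
    (["carol", "dave"], "erin",  0.35),  # true identity not returned at all
]

# Rank one recognition rate: fraction of searches whose first candidate
# is the correct identity.
rank_one = sum(c[0] == truth for c, truth, _ in searches) / len(searches)

# Threshold variant: accept a rank one candidate only if it scores above
# tau, trading some hits for fewer false positives.
tau = 0.5
accepted_hits = sum(c[0] == truth and s >= tau for c, truth, s in searches)
print(f"rank one rate: {rank_one:.2f}, hits above threshold: {accepted_hits}")
```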
The work was partly supported by the Department of Homeland Security Science and Technology Directorate.