2017-DEC-15 – TRAIT Final Report now available
The outcomes of the TRAIT activity have been published as NISTIR 8199 - The Text Recognition Algorithm Independent Evaluation (TRAIT).
2016-SEPT-08 – Phase 3 API now available
2015-DEC-02 – Validation Package now available
The TRAIT2016 validation package is now available at http://nigos.nist.gov:8080/trait2016/trait2016Validation.tar.gz. The purpose of validation is to ensure that NIST's execution of your library submission produces the expected output.
The process requires you to
- compile and link the provided validation code against your libraries,
- run the executables, and
- email the results to email@example.com.
Please see the included README file for more details. NIST will run the same validation code on your submitted libraries and check that the results come out identical. Comments regarding the package should be directed to firstname.lastname@example.org.
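As an illustration of the "results come out identical" check described above, here is a minimal sketch. It assumes, hypothetically, that a validation run leaves its results as files in a directory. The names `file_digests` and `outputs_identical` and the directory layout are invented for this example and are not part of the TRAIT validation package; the actual procedure is the one given in the package README.

```python
import hashlib
from pathlib import Path


def file_digests(root):
    """Map each file's path (relative to root) to the SHA-256 digest of its bytes."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def outputs_identical(dir_a, dir_b):
    """Return True when both directories hold the same files with the same contents."""
    # Hypothetical helper: compares file names and content digests only.
    return file_digests(dir_a) == file_digests(dir_b)
```

Running such a comparison between two local runs before emailing results can help catch non-deterministic output early, since any byte-level difference would cause the check to fail.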
2015-NOV-17 – Final API
2015-OCT-05 – Draft API
2015-OCT-05 – Program Announcement
What: The Text Recognition Algorithm Independent Evaluation (TRAIT) is being conducted to assess the capability of text detection and recognition algorithms to correctly detect and recognize text appearing in unconstrained imagery.
Who: NIST invites all organizations, particularly universities and corporations, to submit their technologies to TRAIT-2016. The evaluation is open worldwide. Participation is free. NIST does not provide funds to participants.
How: TRAIT-2016 is a sequestered evaluation of text detection and recognition algorithms. Algorithms are submitted to NIST and executed on large-scale corpora available to NIST. The algorithms are submitted as compiled libraries implementing a C++ API. Developers do not submit source code or IP to NIST.
Why: The primary driver of the evaluation is to support forensic investigations of digital media. The images in the primary dataset are of interest to NIST's partner law enforcement agencies, which seek to employ text recognition in investigating child exploitation, an area of serious crime. The primary applications are identification of previously known victims and suspects, as well as detection of new victims and suspects. The presence of text may allow a location to be identified or may generate leads.
The primary dataset is an operational child exploitation collection containing illicit pornographic images and video. The images are present on digital media seized in criminal investigations. The files include children who range in age from infant through adolescent. Many of the images contain geometrically unconstrained text. This text is human-legible and sometimes has investigational value. Such text is visible on certificates, posters, logos, uniforms, sports apparel, computer screens, business cards, newspapers, books lying on tables, cigarette packets and a long list of rarer objects.
Please see the data on page 2 of the API document posted above.
Sponsorship: This work is sponsored by the Department of Homeland Security's Science and Technology Directorate. The TRAIT activity falls under their Child Exploitation Image Analytics Program (CHEXIA). NIST is also running a separate evaluation of face recognition technology under the CHEXIA program.