The data used in IREX comes from publicly available datasets and from operational data provided by various branches of the US Government and other government entities. The publicly available datasets may be procured from their original sources; in general, NIST does not have a license to redistribute any of the public datasets used in IREX. The operational data has been provided by the data owners under the condition that NIST will not redistribute it. Hence, the IREX program cannot provide datasets to the research community. Mention of any dataset in this list does not represent an endorsement by NIST or any US Government entity of the dataset for any particular use. Additional information on public datasets may be found at https://tsapps.nist.gov/BDBC/.
The following lists give the descriptor used for each dataset in the IREX reports, followed by information on where the dataset may be obtained. Since NIST has no control over the public sources, the availability of the public datasets and their terms and conditions may change without warning. The datasets used in IREX include:
Validation datasets are small datasets distributed to evaluation participants. Participants run their to-be-submitted algorithms on these datasets in their own environment and submit the results to NIST. NIST then runs the submitted algorithms on the same datasets in the NIST evaluation environment and compares its results with those submitted by the participants, to confirm that the algorithms generate the same results in the NIST evaluation environment as they did in the participants’ environment.
The current IREX validation dataset is described at https://doi.org/10.6028/NIST.TN.2058 and is made available to participants in IREX evaluations.
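The validation comparison described above can be illustrated with a short sketch. This is not NIST's actual procedure or file format: it assumes, purely for illustration, that each run produces one numeric value per sample identifier, and the function name `compare_results`, the identifiers such as `pair_001`, and the `abs_tol` parameter are all hypothetical.

```python
import math


def compare_results(participant, nist, abs_tol=0.0):
    """Return a list of (sample_id, reason) for every discrepancy.

    Both arguments are hypothetical mappings of sample id -> numeric output.
    With abs_tol=0.0 the values must match exactly.
    """
    mismatches = []
    for sample_id in sorted(set(participant) | set(nist)):
        if sample_id not in participant:
            mismatches.append((sample_id, "missing from participant results"))
        elif sample_id not in nist:
            mismatches.append((sample_id, "missing from NIST results"))
        elif not math.isclose(
            participant[sample_id], nist[sample_id], rel_tol=0.0, abs_tol=abs_tol
        ):
            mismatches.append((sample_id, "values differ"))
    return mismatches


# Illustrative values only; these are not real IREX outputs.
participant_run = {"pair_001": 0.8125, "pair_002": 0.25, "pair_003": 0.5}
nist_run = {"pair_001": 0.8125, "pair_002": 0.375, "pair_004": 0.5}

for sample_id, reason in compare_results(participant_run, nist_run):
    print(sample_id, reason)
```

An empty list would indicate that the two environments produced the same output for every sample. Whether any nonzero tolerance is acceptable is a policy decision that this sketch does not settle.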