On the Quality of the TREC-COVID IR Test Collections



Ellen M. Voorhees, Kirk Roberts


Shared text collections continue to be vital infrastructure for IR research. The COVID-19 pandemic offered an opportunity to create a test collection that captured the rapidly changing information space during a pandemic, and the TREC-COVID effort was created to build such a collection using the TREC framework. This paper examines the quality of the resulting TREC-COVID test collections and, in doing so, offers a critique of the state of the art in building reusable IR test collections. The largest of the collections, called 'TREC-COVID Complete', is found to be on par with previous TREC ad hoc collections, with existing quality tests uncovering no apparent problems. Yet the lack of any way to definitively demonstrate the collection's quality, together with its violation of previously used quality heuristics, suggests that much work remains to be done to understand the factors affecting collection quality.
Proceedings Title
Proceedings of ACM Special Interest Group on Information Retrieval (ACM SIGIR 2021)
Conference Dates
July 11-15, 2021
Conference Location
virtual, originally Montreal, CA


Keywords
covid-19, datasets, test collections, TREC


Voorhees, E. and Roberts, K. (2021), On the Quality of the TREC-COVID IR Test Collections, Proceedings of ACM Special Interest Group on Information Retrieval (ACM SIGIR 2021), virtual, originally Montreal, CA, [online], (Accessed July 21, 2024)


Created July 11, 2021, Updated February 14, 2023