Too many Relevants: Whither Cranfield Test Collections?

Published

Author(s)

Ellen M. Voorhees, Nick Craswell, Jimmy Lin

Abstract

This paper presents the lessons regarding the construction and use of large Cranfield-style test collections learned from the TREC 2021 Deep Learning track. The corpus used in the 2021 edition of the track was much bigger than the corpus used in previous years and contains many more relevant documents. The process used to select documents to judge that had been used in earlier years of the track failed to produce a reliable collection because most topics have too many relevant documents. Judgment budgets were exceeded before an adequate sample of the relevant set could be found, so there are likely many unknown relevant documents in the unjudged portion of the corpus. As a result, the collection is not reusable, and furthermore, recall-based measures are unreliable, even for the retrieval system results used in building it. Yet, early-precision measures cannot distinguish among system results because the maximum score is easily obtained for many topics. And since the existing tools for appraising the quality of test collections depend on systems' scores, they also fail when there are too many relevant documents. Collection builders will need new strategies and tools for building reliable test collections for continued use of the Cranfield paradigm on ever-larger corpora. Ensuring that the definition of 'relevant' truly reflects the desired systems' rankings is a provisional strategy for continued collection building.
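
The effects described in the abstract can be illustrated with a small sketch. The topic, document IDs, set sizes, and the three systems below are all hypothetical, invented only for illustration; they are not data from the track or the paper. The sketch assumes a topic with 1000 truly relevant documents of which only 200 were found during judging, and treats unjudged documents as not relevant.

```python
def precision_at_k(ranking, relevant, k):
    """Fraction of the top-k ranked documents that are in the relevant set."""
    return sum(1 for doc in ranking[:k] if doc in relevant) / k


def recall_at_k(ranking, relevant, k):
    """Fraction of the relevant set that appears in the top-k ranked documents."""
    return sum(1 for doc in ranking[:k] if doc in relevant) / len(relevant)


# Hypothetical topic: 1000 truly relevant documents, only 200 found by judging.
true_relevant = {f"d{i}" for i in range(1000)}
judged_relevant = {f"d{i}" for i in range(200)}

# Hypothetical systems. A and B contributed to the judgment pool; C did not.
system_a = [f"d{i}" for i in range(100)]       # d0 .. d99
system_b = [f"d{i}" for i in range(50, 150)]   # d50 .. d149
system_c = [f"d{i}" for i in range(500, 600)]  # relevant but never judged

# Score saturation: both pooled systems reach the maximum early-precision score,
# so the measure cannot tell them apart.
p10_a = precision_at_k(system_a, judged_relevant, 10)
p10_b = precision_at_k(system_b, judged_relevant, 10)

# Unreliable recall: the denominator is the judged set, not the true set.
recall_judged_a = recall_at_k(system_a, judged_relevant, 100)
recall_true_a = recall_at_k(system_a, true_relevant, 100)

# Lack of reusability: a new system is penalized for retrieving unjudged relevants.
p10_c_judged = precision_at_k(system_c, judged_relevant, 10)
p10_c_true = precision_at_k(system_c, true_relevant, 10)

print(p10_a, p10_b)                    # 1.0 1.0
print(recall_judged_a, recall_true_a)  # 0.5 0.1
print(p10_c_judged, p10_c_true)        # 0.0 1.0
```

Under these assumptions, both pooled systems score the maximum precision at 10, recall measured against the judged set is five times the true value, and a system outside the pool scores zero despite returning only relevant documents.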
Proceedings Title
Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval
Conference Dates
July 11-15, 2022
Conference Location
Madrid, ES
Conference Title
45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2022)

Keywords

Cranfield, information retrieval, reusability, score saturation, test collections, TREC

Citation

Voorhees, E., Craswell, N. and Lin, J. (2022), Too many Relevants: Whither Cranfield Test Collections?, Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, ES, [online], https://doi.org/10.1145/3477495.3531728, https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=934359 (Accessed October 10, 2024)

Created July 11, 2022, Updated February 14, 2023