Test collections are a mainstay of information retrieval research. Since the 1990s, large reusable test collections have been developed in the context of community evaluations such as TREC, NTCIR, CLEF, and INEX. Recently, advances in pooling practice and in crowdsourcing technologies have placed test collection building back into the hands of small research groups and companies. In all of these cases, practitioners should be aware of, and concerned about, the quality of test collections. This paper surveys work on test collection quality measures, references case studies to illustrate their use, and provides guidelines on assessing the quality of test collections in practice.
Proceedings of the 2010 Workshop on Evaluating Information Access (EVIA 2010)
Test Collection Diagnosis and Treatment, Proceedings of the 2010 Workshop on Evaluating Information Access (EVIA 2010), Tokyo, [online], https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=905766
(Accessed June 3, 2023)