Using Replicates in Information Retrieval Evaluation

Ellen M. Voorhees; Daniel V. Samarov; Ian M. Soboroff



Published

August 2, 2017

Author(s)

Ellen M. Voorhees, Daniel V. Samarov, Ian M. Soboroff

Abstract

This paper explores a method for more accurately estimating the main effect of the system in a typical test-collection-based evaluation of information retrieval systems, thereby increasing the sensitivity of system comparisons. Randomly partitioning the test document collection allows for multiple tests of a given system and topic (replicates). Bootstrap ANOVA can use these replicates to extract system-topic interactions, something not possible without replicates, yielding a more precise value for the system effect and a narrower confidence interval around that value. Experiments using multiple TREC collections demonstrate that removing the system-topic interactions substantially narrows the confidence intervals around the system effect and increases the number of significant pairwise differences found. Further, the method is robust against small changes in the number of partitions used, against variability in the documents that constitute the partitions, and against the choice of measure used to quantify system effectiveness.
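
To illustrate why replicates matter, the following is a minimal sketch of a balanced two-way ANOVA sum-of-squares decomposition on synthetic scores. It is not the paper's bootstrap ANOVA procedure: the function name `two_way_anova_ss`, the array layout (systems x topics x replicates), and the simulated effect sizes are all assumptions made for illustration. It shows only that the system-topic interaction can be separated from residual error when each system-topic cell holds more than one score.

```python
import numpy as np


def two_way_anova_ss(scores):
    """Sum-of-squares decomposition for a balanced array of shape
    (n_systems, n_topics, n_replicates). Hypothetical helper for illustration."""
    scores = np.asarray(scores, dtype=float)
    n_sys, n_top, n_rep = scores.shape

    grand = scores.mean()
    sys_mean = scores.mean(axis=(1, 2))   # one mean per system
    top_mean = scores.mean(axis=(0, 2))   # one mean per topic
    cell_mean = scores.mean(axis=2)       # one mean per (system, topic) cell

    ss_system = n_top * n_rep * np.sum((sys_mean - grand) ** 2)
    ss_topic = n_sys * n_rep * np.sum((top_mean - grand) ** 2)
    interaction = cell_mean - sys_mean[:, None] - top_mean[None, :] + grand
    ss_interaction = n_rep * np.sum(interaction ** 2)
    # Within-cell variation: only measurable when n_rep > 1.
    ss_residual = np.sum((scores - cell_mean[:, :, None]) ** 2)
    ss_total = np.sum((scores - grand) ** 2)

    return {
        "system": ss_system,
        "topic": ss_topic,
        "interaction": ss_interaction,
        "residual": ss_residual,
        "total": ss_total,
    }


# Synthetic data (assumed effect sizes, not taken from any TREC collection).
rng = np.random.default_rng(0)
n_sys, n_top, n_rep = 4, 10, 5
system_effect = rng.normal(0.0, 0.05, size=(n_sys, 1, 1))
topic_effect = rng.normal(0.0, 0.15, size=(1, n_top, 1))
interaction_effect = rng.normal(0.0, 0.05, size=(n_sys, n_top, 1))
noise = rng.normal(0.0, 0.03, size=(n_sys, n_top, n_rep))
scores = 0.3 + system_effect + topic_effect + interaction_effect + noise

ss = two_way_anova_ss(scores)
for name, value in ss.items():
    print(f"{name:12s} {value:.4f}")

# With a single score per cell, the residual term vanishes, so interaction
# and error cannot be told apart.
ss_single = two_way_anova_ss(scores[:, :, :1])
print("residual with one replicate:", ss_single["residual"])
```

With one score per system-topic cell, the within-cell term is identically zero, so any interaction is indistinguishable from noise. With several replicates per cell, the two are estimated separately.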

Citation

ACM Transactions on Information Systems

Pub Type

Journals


Keywords

information retrieval, statistical analysis, test collections, topic variance

Information technology and Data and informatics

Citation

Voorhees, E., Samarov, D. and Soboroff, I. (2017), Using Replicates in Information Retrieval Evaluation, ACM Transactions on Information Systems, [online], https://doi.org/10.1145/3086701 (Accessed July 12, 2025)


Created August 2, 2017, Updated November 10, 2018
