The Effect of Sampling Strategy on Inferred Measures

Published

Author(s)

Ellen M. Voorhees

Abstract

Using the inferred measures framework is a popular choice for constructing test collections when the target document set is too large for pooling to be a viable option. Within the framework, different amounts of assessing effort are placed on different regions of the ranked lists as defined by a sampling strategy. The sampling strategy is critically important to the quality of the resultant collection, but there is little published guidance as to the important factors. This paper addresses this gap by examining the effect on collection quality of different sampling strategies within the inferred measures framework. The quality of a collection is measured by how accurately it distinguishes the set of significantly different system pairs. Top-K pooling is competitive, though not the best strategy, because it cannot distinguish topics with large relevant set sizes. Incorporating a deep, very sparsely sampled stratum is a poor choice. Strategies that include a top-10 pool create better collections than those that do not, and they also allow Precision(10) scores to be computed directly.
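
As a rough illustration of what a sampling strategy looks like in practice, the sketch below selects documents for assessment from a single ranked list using two strata: a fully judged top-10 region and a sparsely sampled deeper region. The function name, the stratum boundaries, and the 20% sampling rate are hypothetical choices made for this example and are not taken from the paper. It also simplifies by sampling from one ranked list, whereas a real collection would sample from the pooled output of many runs.

```python
import random


def sample_for_judging(ranked_docs, strata, seed=0):
    """Pick documents to assess from one ranked list.

    strata is a list of (first_rank, last_rank, sampling_rate) tuples,
    with 1-based inclusive ranks. All values are illustrative assumptions.
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    selected = []
    for first_rank, last_rank, rate in strata:
        region = ranked_docs[first_rank - 1:last_rank]
        if rate >= 1.0:
            # Fully judged stratum, e.g. a top-10 pool
            selected.extend(region)
        else:
            # Uniform random sample without replacement inside the stratum
            k = round(len(region) * rate)
            selected.extend(rng.sample(region, k))
    return selected


# Hypothetical ranked list of 100 document identifiers
ranked = [f"doc{i}" for i in range(1, 101)]

# Assumed strategy: judge ranks 1-10 exhaustively, sample 20% of ranks 11-100
strata = [(1, 10, 1.0), (11, 100, 0.2)]
to_judge = sample_for_judging(ranked, strata)
```

Because the top-10 stratum is judged in full, a measure such as Precision(10) can be computed directly from the judgments, while scores that depend on deeper ranks have to be estimated from the sampled stratum.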
Proceedings Title
Proceedings of SIGIR 2014
Conference Dates
July 6-11, 2014
Conference Location
Gold Coast
Conference Title
SIGIR 2014

Keywords

information retrieval evaluation, test collections

Citation

Voorhees, E. (2014), The Effect of Sampling Strategy on Inferred Measures, Proceedings of SIGIR 2014, Gold Coast, [online], https://doi.org/10.1145/2600428.2609524 (Accessed April 25, 2024)
Created July 6, 2014, Updated November 10, 2018