An official website of the United States government
Here’s how you know
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
Secure .gov websites use HTTPS
A lock (
) or https:// means you’ve safely connected to the .gov website. Share sensitive information only on official, secure websites.
The Effect of Sampling Strategy on Inferred Measures
Published
Author(s)
Ellen M. Voorhees
Abstract
Using the inferred measures framework is a popular choice for constructing test collections when the target document set is too large for pooling to be a viable option. Within the framework, different amounts of assessing effort is placed on different regions of the ranked lists as defined by a sampling strategy. The sampling strategy is critically important to the quality of the resultant collection, but there is little published guidance as to the important factors. This paper addresses this gap by examining the effect on collection quality of different sampling strategies within the inferred measures framework. The quality of a collection is measured by how accurately it distinguishes the set of significantly different system pairs. Top-K pooling is competitive, though not the best strategy because it cannot distinguish topics with large relevant set sizes. Incorporating a deep, very sparsely sampled stratum is a poor choice. Strategies that include a top-10 pool create better collections than those that do not, as well as allow Precision(10) scores to be directly computed.
Voorhees, E.
(2014),
The Effect of Sampling Strategy on Inferred Measures, Proceedings of SIGIR 2014, Gold Coast, -1, [online], https://doi.org/10.1145/2600428.2609524
(Accessed December 13, 2024)