On Run Diversity in "Evaluation as a Service"

Ellen M. Voorhees

Published

July 6, 2014

Author(s)

Ellen M. Voorhees

Abstract

"Evaluation as a service" (EaaS) is a new methodology that enables community-wide evaluations and the construction of test collections on documents that cannot be distributed. The basic idea is that evaluation organizers provide a service API through which the evaluation task can be completed. This concept, however, violates some of the premises of traditional pool-based collection building, and, as a result, the quality of the resulting test collection may be compromised. In particular, the service API might restrict the diversity of runs that contribute to the pool: not only may this hamper innovation by researchers, but the lack of diversity might lead to incomplete judgment pools that affect the reusability of the collection. This paper shows that the distinctiveness of the retrieval runs used to construct the first test collection built using EaaS, the TREC 2013 Microblog collection, is not substantially different from that of the TREC-8 ad hoc collection, a high-quality collection built using traditional pooling. An additional test of collection reusability, the "leave out uniques" test, suggests the Microblog 2013 collection's pools are less complete than the TREC-8 collection, though both collections strongly benefit from the presence of a set of distinctive and effective manual runs. Although we cannot yet generalize to all EaaS evaluations, our analyses reveal no obvious flaws in the test collection built using the methodology in the TREC 2013 Microblog track.
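
The abstract names pooling and the "leave out uniques" test without spelling them out. The following is a minimal sketch of the general idea as it is commonly described in the test-collection literature, not the paper's exact procedure. The toy runs, the group assignment, the pool depth, and the use of precision at k as the effectiveness measure are all assumptions made for illustration; the function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy data for one topic: run name -> ranked document IDs.
Runs = Dict[str, List[str]]


def build_pool(runs: Runs, depth: int) -> Set[str]:
    """Union of the top-`depth` documents from every run."""
    pool: Set[str] = set()
    for ranking in runs.values():
        pool.update(ranking[:depth])
    return pool


def unique_contributions(runs: Runs, depth: int, group: Set[str]) -> Set[str]:
    """Pooled documents that only the runs in `group` contributed."""
    own = build_pool({r: d for r, d in runs.items() if r in group}, depth)
    others = build_pool({r: d for r, d in runs.items() if r not in group}, depth)
    return own - others


def precision_at_k(ranking: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k documents that are judged relevant."""
    return sum(1 for doc in ranking[:k] if doc in relevant) / k


def leave_out_uniques(
    runs: Runs, relevant: Set[str], depth: int, group: Set[str], k: int
) -> Dict[str, Tuple[float, float]]:
    """Score each run in `group` twice: with the full relevance judgments,
    and with the group's uniquely contributed relevant documents removed."""
    reduced = relevant - unique_contributions(runs, depth, group)
    return {
        run: (
            precision_at_k(runs[run], relevant, k),
            precision_at_k(runs[run], reduced, k),
        )
        for run in group
    }


# Assumed example: one run from group A, two runs from group B.
runs = {
    "a1": ["d1", "d2", "d3"],
    "b1": ["d2", "d4", "d5"],
    "b2": ["d4", "d6", "d1"],
}
relevant = {"d1", "d4"}  # assumed judgments over the depth-2 pool

scores = leave_out_uniques(runs, relevant, depth=2, group={"a1"}, k=2)
print(scores)
```

In this toy case run `a1` is the only run that places `d1` in the depth-2 pool, so removing its unique contribution lowers its score. A large drop of this kind across many groups would suggest that runs which did not contribute to the pool are evaluated unfairly, which is the reusability concern the abstract raises.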

Proceedings Title

Proceedings of SIGIR 2014

Conference Dates

July 6-11, 2014

Conference Location

Gold Coast

Conference Title

SIGIR 2014

Pub Type

Conferences

Keywords

information retrieval, test collection, TREC

Data and informatics

Citation

Voorhees, E. (2014), On Run Diversity in "Evaluation as a Service", Proceedings of SIGIR 2014, Gold Coast, [online], https://doi.org/10.1145/2600428.2609484 (Accessed July 16, 2025)

Issues

If you have any questions about this publication or are having problems accessing it, please contact [email protected].

Created July 6, 2014, Updated November 10, 2018
