The TREC-8 Question Answering (QA) Track was the first large-scale evaluation of domain-independent question answering systems. In addition to fostering research on the QA task, the track was used to investigate whether the evaluation methodology used for document retrieval is appropriate for a different natural language processing task. As with document relevance judging, assessors had legitimate differences of opinion as to whether a response actually answers a question, but comparative evaluation of QA systems was stable despite these differences. Creating a reusable QA test collection is fundamentally more difficult than creating a document retrieval test collection since the QA task has no equivalent to document identifiers.
Keywords: evaluation, human assessors, natural language processing, question answering, task-based training, think-aloud observations, TREC

Voorhees, E. and Tice, D. Building a Question Answering Test Collection. ACM Special Interest Group in Information Retrieval (SIGIR).
(Accessed December 10, 2023)