
Building a Question Answering Test Collection



Ellen M. Voorhees, D. M. Tice


The TREC-8 Question Answering (QA) Track was the first large-scale evaluation of domain-independent question answering systems. In addition to fostering research on the QA task, the track was used to investigate whether the evaluation methodology used for document retrieval is appropriate for a different natural language processing task. As with document relevance judging, assessors had legitimate differences of opinion as to whether a response actually answers a question, but comparative evaluation of QA systems was stable despite these differences. Creating a reusable QA test collection is fundamentally more difficult than creating a document retrieval test collection, since the QA task has no equivalent to document identifiers.
ACM Special Interest Group in Information Retrieval (SIGIR)


evaluation, human assessors, natural language processing, question answering, task-based training, think-aloud observations, TREC


Voorhees, E. and Tice, D. (2000), Building a Question Answering Test Collection, ACM Special Interest Group in Information Retrieval (SIGIR) (Accessed June 14, 2024)



Created July 1, 2000, Updated February 17, 2017