Building a Question Answering Test Collection

Author(s)

Ellen M. Voorhees, D M. Tice

Abstract

The TREC-8 Question Answering (QA) Track was the first large-scale evaluation of domain-independent question answering systems. In addition to fostering research on the QA task, the track was used to investigate whether the evaluation methodology used for document retrieval is appropriate for a different natural language processing task. As with document relevance judging, assessors had legitimate differences of opinion as to whether a response actually answers a question, but comparative evaluation of QA systems was stable despite these differences. Creating a reusable QA test collection is fundamentally more difficult than creating a document retrieval test collection, since the QA task has no equivalent to document identifiers.
Citation
ACM Special Interest Group in Information Retrieval (SIGIR)
Volume
34

Keywords

evaluation, human assessors, natural language processing, question answering, task-based training, think-aloud observations, TREC

Citation

Voorhees, E. and Tice, D. (2000), Building a Question Answering Test Collection, ACM Special Interest Group in Information Retrieval (SIGIR) (Accessed June 14, 2024)

Created July 1, 2000, Updated February 17, 2017