The TREC-8 Question Answering Track Evaluation

Ellen M. Voorhees; D M. Tice

Published

May 1, 2000

Abstract

The TREC-8 Question Answering track was the first large-scale evaluation of systems that return answers, as opposed to lists of documents, in response to a question. Because this was a first evaluation, it is important to examine the evaluation methodology itself to understand any limits on the conclusions that can be drawn from the evaluation and possibly to find ways to improve subsequent evaluations. This paper has two main goals: to describe in detail how the evaluation was implemented, and to examine the consequences of the methodology on the comparative performance of the systems participating in the evaluation. The examination uncovered no serious flaws in the methodology, supporting its continued use for question answering evaluation. Nonetheless, redefining the specific task to be performed so that it more closely matches an actual user task does appear warranted.

Pub Type

Others

Keywords

evaluation, human assessors, natural language processing, question answering, task-based training, think-aloud observations, TREC

Citation

Voorhees, E. and Tice, D. (2000), The TREC-8 Question Answering Track Evaluation, [online], https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=151446 (Accessed July 17, 2026)

Created May 1, 2000, Updated February 17, 2017
