The TREC-8 Question Answering track was the first large-scale evaluation of systems that return answers, as opposed to lists of documents, in response to a question. Because this was a first evaluation, it is important to examine the evaluation methodology itself, both to understand any limits on the conclusions that can be drawn from the evaluation and to find possible ways to improve subsequent evaluations. This paper has two main goals: to describe in detail how the evaluation was implemented, and to examine the effect of the methodology on the comparative performance of the systems participating in the evaluation. The examination uncovered no serious flaws in the methodology, supporting its continued use for question answering evaluation. Nonetheless, redefining the specific task to be performed so that it more closely matches an actual user task does appear warranted.
Keywords
evaluation, human assessors, natural language processing, question answering, task-based training, think-aloud observations, TREC
Citation
Voorhees, E. and Tice, D. (2000), The TREC-8 Question Answering Track Evaluation, [online], https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=151446 (Accessed October 12, 2024)