The Importance of Focused Evaluations: A Case Study of TREC and DUC
Donna K. Harman
Evaluation has always been an important part of scientific research, and in information retrieval this evaluation has mostly been done using test collections. In 1992, a new test collection was built at the National Institute of Standards and Technology (NIST), and a focused evaluation (the Text REtrieval Conference, or TREC) was started to use the collection. Results from nearly 12 years of this focused evaluation show significant technology transfer across systems, leading to major improvements in system performance. Focused evaluations also make it possible to target specific problems in language technology, such as retrieval across languages, and to design evaluation tasks so that issues can be studied concurrently by multiple groups. This chapter discusses some of the tasks that have been examined in TREC, including critical factors in the design of those evaluations. It also discusses a second focused evaluation, the Document Understanding Conference (DUC), which evaluates text summarization.
Progress in Natural Language Processing & Information Retrieval: A Festschrift for Karen Sparck Jones