
Publications

Search Publications by

Donna K. Harman (Assoc)

Displaying 1 - 25 of 32

TREC 2020 News Track Overview

May 21, 2021
Author(s)
Ian Soboroff, Shudong Huang, Donna Harman
The News track focuses on information retrieval in the service of helping people read the news. In 2018, in cooperation with the Washington Post, we released a new collection of nearly 600,000 news articles, and crafted two tasks related to how news is

DUC in Context

November 21, 2007
Author(s)
Paul D. Over, Hoa T. Dang, Donna K. Harman
Recent years have seen increased interest in text summarization with emphasis on evaluation of prototype systems. Many factors can affect the design of such evaluations, requiring choices among competing alternatives. This paper examines several major

The Importance of Focused Evaluations: A Case Study of TREC and DUC

January 10, 2007
Author(s)
Donna K. Harman
Evaluation has always been an important part of scientific research, and in information retrieval, this evaluation has mostly been done using test collections. In 1992 a new test collection was built at the National Institute of Standards and Technology

The Fifth Text Retrieval Conference (TREC-5)

October 30, 2006
Author(s)
Ellen M. Voorhees, Donna K. Harman
This paper is the track report for the TREC-5 confusion track. For TREC-5, retrieval from corrupted data was studied through retrieval of specific target documents from a corpus that was corrupted by applying OCR techniques to page images of varying

TREC: An Overview

February 17, 2006
Author(s)
Donna K. Harman, Ellen M. Voorhees
The Text REtrieval Conference (TREC) is a workshop series designed to build the infrastructure necessary for large-scale evaluation of text retrieval technology. Participants in the workshops (over 100 groups in the latest TREC) have been drawn from the

The History of IDF and its Influences on IR and Other Fields

December 21, 2005
Author(s)
Donna K. Harman
The surprisingly simple IDF measure developed in 1972 by Karen Sparck Jones has continued to dominate the term weighting metrics used in information retrieval, despite several efforts to develop more complex measures of term distribution. It has been

Novelty Detection: The TREC Experience

October 1, 2005
Author(s)
Ian M. Soboroff, Donna K. Harman
A challenge for search systems is to detect not only when an item is relevant to the user's information need, but also when it contains something new which the user has not seen before. In the TREC novelty track, the task was to highlight sentences

Beyond English

April 1, 2005
Author(s)
Donna K. Harman
This chapter summarizes TREC work on retrieval for languages other than English. TREC has explored a variety of tasks including both single-language tasks (for example, retrieving Chinese documents using Chinese queries) and cross-language tasks (such as

The TREC Ad Hoc Experiments

April 1, 2005
Author(s)
Donna K. Harman
Ad hoc retrieval is the prototypical search engine task: searching a static set of documents with a previously unseen query. The ad hoc task was one of the first two tasks tackled in TREC and was run for eight years, representing hundreds of experiments

The TREC Test Collections

April 1, 2005
Author(s)
Donna K. Harman
The creation of a set of large, unbiased test collections has been critical to the success of TREC. This chapter is the documentation for the TREC collections. It reviews the motivation for building the collections, describes the methods used to create them

The Twelfth Text Retrieval Conference, TREC 2003

October 25, 2004
Author(s)
Ellen M. Voorhees, Donna K. Harman
This chapter provides an executive summary of the TREC workshop series and the remainder of the volume. It explains the motivation for TREC and highlights TREC's accomplishments in improving retrieval effectiveness and fostering technology transfer.

The Effects of Human Variation in DUC Summarization Evaluation

July 1, 2004
Author(s)
Donna K. Harman, Paul D. Over
There is a long history of research in automatic text summarization systems by both the text retrieval and the natural language processing communities, but evaluation of such systems' output has always presented problems. One critical problem remains how

The Eleventh Text REtrieval Conference (TREC-11)

May 1, 2003
Author(s)
Ellen M. Voorhees, Donna K. Harman
The Eleventh Text Retrieval Conference was held in Gaithersburg, Maryland, November 19-22, 2002. TREC 2002 is the latest in a series of workshops designed to foster research in information retrieval and related tasks. This year's conference consisted of

Overview of the TREC 2002 Novelty Track

April 1, 2003
Author(s)
Donna K. Harman
The novelty track was a new track in TREC-11. The basic task was as follows: given a TREC topic and an ordered list of relevant documents (ordered by relevance ranking), find the relevant and novel sentences that should be returned to the user from this

The Development and Evolution of TREC and DUC

October 1, 2002
Author(s)
Donna K. Harman
The Text REtrieval Conference (TREC) has been running for 11 years now, with 93 participants in the last round of evaluation. This paper chronicles the changes in TREC over that time, emphasizing the evolution in the tasks that were evaluated rather than

The Tenth Text Retrieval Conference, TREC 2001

April 1, 2002
Author(s)
Ellen M. Voorhees, Donna K. Harman
TREC 2001 is the latest in a series of workshops designed to foster research in information retrieval and related tasks. This year's conference consisted of six different tasks, including a new task on content-based retrieval of digital video. The overview

The DUC Summarization Evaluations

March 1, 2002
Author(s)
Donna K. Harman, Paul D. Over
There has been a long history of research in text summarization by both the text retrieval and the natural language processing communities, but evaluation of this research has always presented problems. In 2001 NIST launched a new text summarization

The Ninth Text REtrieval Conference (TREC-9)

October 1, 2001
Author(s)
Ellen M. Voorhees, Donna K. Harman
This paper provides an overview of the ninth Text REtrieval Conference (TREC-9) held in Gaithersburg, Maryland, November 13-16, 2000. TREC-9 is the latest in a series of workshops designed to foster research in text retrieval. This year's conference

The Importance of Focused Evaluations: A Case Study of TREC and DUC

March 1, 2001
Author(s)
Donna K. Harman
Evaluation has always been an important part of scientific research, and in information retrieval, this evaluation has mostly been done using test collections. In 1992, a new test collection was built at the National Institute of Standards and Technology

CLIR Evaluation at TREC

January 1, 2001
Author(s)
Donna K. Harman, M. Braschler, M. Hess, M. Kluck, C. Peters, P. Schäuble, P. Sheridan
Starting in 1997, the National Institute of Standards and Technology conducted 3 years of evaluation of cross-language information retrieval systems in the Text REtrieval Conference (TREC). Twenty-two participating systems used topics (test questions) in

The Eighth Text REtrieval Conference (TREC-8)

November 1, 2000
Author(s)
Ellen M. Voorhees, Donna K. Harman
This report constitutes the proceedings of the eighth Text REtrieval Conference (TREC-8) held in Gaithersburg, Maryland, November 16, 1999. The conference was co-sponsored by the National Institute of Standards and Technology (NIST) and the Defense

Overview of the Sixth Text REtrieval Conference (TREC-6)

October 25, 1999
Author(s)
Ellen M. Voorhees, Donna K. Harman
The Text REtrieval Conference is a workshop series designed to encourage research on text retrieval for realistic applications by providing large test collections, uniform scoring procedures, and a forum for organizations interested in comparing results

The Seventh Text REtrieval Conference (TREC-7)

July 1, 1999
Author(s)
Ellen M. Voorhees, Donna K. Harman
This report constitutes the proceedings of the seventh Text REtrieval Conference (TREC-7) held in Gaithersburg, Maryland, November 9-11, 1998. The conference was co-sponsored by the National Institute of Standards and Technology (NIST) and the Defense

Results and Challenges in Web Search Evaluation

March 1, 1999
Author(s)
D Hawking, Nick Craswell, P Thistlewaite, Donna Harman
A frozen 18.5 million page snapshot of part of the Web has been created to enable and encourage meaningful and reproducible evaluation of Web search systems and techniques. This collection is being used in an evaluation framework within the Text Retrieval