This project uses community evaluations to build the infrastructure required to evaluate system effectiveness on information access tasks. In particular, the project includes the annual TREC, TRECVID, and TAC evaluations, which foster research on search technologies for different media types (e.g., text, video) and genres (e.g., broadcast news, blogs, corporate repositories), as well as on language processing technologies such as summarization and information extraction.
The goal of the project is to support the development of automatic systems that can provide content-based access to information that has not been explicitly structured for machine consumption. This support takes the form of evaluation infrastructure since, to quote Lord Kelvin, "If you cannot measure it, you cannot improve it."
The infrastructure is created through community participation in three evaluation conferences, the Text REtrieval Conference (TREC), the TREC Video Retrieval Evaluation (TRECVID), and the Text Analysis Conference (TAC). Within each conference, several focus areas are selected in collaboration with participants, outside sponsors, and other stakeholders. For each focus area, the conference provides guidelines defining the evaluation task and a corresponding data set. Participants perform the task and submit their results to NIST. For most tasks, the union of the submitted results is annotated for correctness, and these annotations form the basis for scoring individual submissions. Finally, participants gather at a workshop to discuss their results.
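The pooled-judging workflow described above can be sketched in a few lines. This is an illustrative simplification with made-up system names, documents, and judgments, not NIST's actual scoring software: the top-ranked results from every submitted run are unioned into a pool, only the pool is judged for correctness, and each run is then scored against those shared judgments.

```python
# Sketch of pooled relevance assessment: pool the submitted runs,
# judge the pool once, score every submission against the judgments.
# All data here is illustrative; real evaluation pools are far larger.

def build_pool(runs, depth):
    """Union of the top-`depth` documents from every submitted run."""
    pool = set()
    for ranking in runs.values():
        pool.update(ranking[:depth])
    return pool

def precision_at_k(ranking, qrels, k):
    """Fraction of a run's top-k documents judged relevant."""
    return sum(1 for doc in ranking[:k] if qrels.get(doc, 0) > 0) / k

# Hypothetical ranked submissions from three participating systems.
runs = {
    "sysA": ["d1", "d2", "d3", "d4"],
    "sysB": ["d2", "d5", "d1", "d6"],
    "sysC": ["d7", "d2", "d8", "d1"],
}

pool = build_pool(runs, depth=2)               # only pooled docs are judged
qrels = {"d1": 1, "d2": 1, "d5": 0, "d7": 0}   # assessor judgments on the pool

scores = {name: precision_at_k(r, qrels, 2) for name, r in runs.items()}
```

Because every system contributes to the pool, the shared judgments cover each submission's top results, which is what makes the resulting scores comparable across participants.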
Open evaluations with carefully selected test data have proved to be a powerful technique for advancing technology and measuring progress in a field. In addition to improving the state of the art, a common focus on an evaluation task forms or solidifies a research community, establishes the research methodology for the field, facilitates technology transfer, and amortizes the costs associated with building the necessary infrastructure.
The evaluation conferences have created publicly available data sets and appropriate evaluation methodologies that enable individual research groups to measure the quality of their own access methods. These resources have been acknowledged as instrumental in the development of information access technologies, and are routinely used and cited in work reported in the open literature.