IARPA MATERIAL Program

 

The goal of the IARPA MATERIAL (Machine Translation for English Retrieval of Information in Any Language) program is to develop methods to locate content in speech or text “documents” in low-resource languages using domain-contextualized English queries, and to display an English summary of the information of interest in the relevant documents. This capability is expected to enable effective triage and analysis of large volumes of data, and to do so in a way that takes into account an analyst’s domains of interest in a variety of less-studied languages. The program will require that the capability be constructed using limited amounts of ground-truth bitext data and no domain adaptation data. Successful systems will be able to adapt to new domains and new genres.

The queries will be in English, the material to be searched will be in different languages, and the summaries must be displayed in English.  It should be noted that in real-world use, the output from the system would represent documents from multiple languages, mingled in one output “queue.” 

A summary could be a word-cloud, an extractive summary, or an abstractive summary.  The summary will be required to be formatted as static text, possibly with multiple colors, sizes, and spatial alignments and orientations, but with no animations, and no lines or arrows or other graphic elements.  The central requirement is that the summary must suffice for the user to judge the relevance of the retrieved items to the domain-contextualized query.  Research done under MATERIAL will need to include work on effective summarization. 

A central aspect of a MATERIAL system is that an actual information need will be within a context characterized by domains of interest.  An example is seeking information about Ebola, but only in the context of epidemiology.  Another example is wheat, in the context of agriculture, vs. wheat in the context of nutrition and food availability, vs. wheat in the context of cultural norms of what the population in some location normally chooses to eat.  MATERIAL will systematically address this association of context with an information need (which we will call a query in a domain).
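As a rough illustration of what "a query in a domain" means, the sketch below pairs an English query with a domain and filters a toy corpus on both. It is not the MATERIAL system or its evaluation format: the class names, the `english_gloss` field (a stand-in for machine-translated content), the domain labels, the placeholder language codes, and the substring match are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainQuery:
    query: str   # English query term, e.g. "wheat"
    domain: str  # domain of interest, e.g. "agriculture"


@dataclass(frozen=True)
class Document:
    doc_id: str
    language: str        # placeholder source-language code (illustrative only)
    english_gloss: str   # hypothetical stand-in for translated content
    domains: frozenset   # hypothetical domain labels for the document


def retrieve(dq, corpus):
    """Return documents that mention the query term AND fall in the domain."""
    term = dq.query.lower()
    return [
        doc for doc in corpus
        if term in doc.english_gloss.lower() and dq.domain in doc.domains
    ]


# Toy corpus: documents from different languages share one collection.
corpus = [
    Document("d1", "lang-a", "Wheat yields fell after the drought",
             frozenset({"agriculture"})),
    Document("d2", "lang-b", "Wheat bread prices and food availability",
             frozenset({"nutrition"})),
    Document("d3", "lang-a", "Ebola case counts by district",
             frozenset({"epidemiology"})),
]

# The same query term returns different documents in different domains.
agriculture_hits = retrieve(DomainQuery("wheat", "agriculture"), corpus)
nutrition_hits = retrieve(DomainQuery("wheat", "nutrition"), corpus)
```

Here the query "wheat" matches only `d1` in the agriculture domain and only `d2` in the nutrition domain, which is the distinction the wheat example above draws.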

NIST was asked to design and conduct performance evaluations for the MATERIAL program.

Open Evaluations

In association with the evaluations for MATERIAL, NIST also works with IARPA in conducting a series of open challenges focusing on specific aspects of MATERIAL.

  • OpenASR: A component evaluation focusing on automatic speech recognition (ASR) technologies.
  • OpenCLIR: A scaled-down version of the MATERIAL evaluation focusing on cross-language information retrieval.

Option Period 1 (2019/20) Evaluation

Base Period (2018/19) Evaluation 

 

Created August 17, 2017, Updated September 8, 2020