The goal of the OpenCLIR (Open Cross-Language Information Retrieval) evaluation is to develop methods that locate relevant content in "documents" (speech or text) in low-resource languages, using English queries. This capability is one of several expected to ultimately support effective triage and analysis of large volumes of data in a variety of less-studied languages. Successful systems will be able to adapt to new languages and new genres.
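To make the task concrete, the following is a minimal, purely illustrative sketch of cross-language retrieval: an English query is mapped into the target language and documents are ranked by term overlap. The toy bilingual dictionary, example documents, and function names are all assumptions for illustration; actual OpenCLIR systems use machine translation and, for the audio condition, speech recognition.

```python
# Toy sketch of cross-language information retrieval (CLIR):
# find target-language documents relevant to an English query.
# The dictionary and documents below are invented examples,
# not data from the OpenCLIR evaluation.

# Hypothetical English -> target-language term dictionary.
EN_TO_TGT = {
    "water": ["maji"],
    "school": ["shule"],
    "teacher": ["mwalimu"],
}

def translate_query(english_query):
    """Map each English query word to target-language candidates."""
    terms = []
    for word in english_query.lower().split():
        terms.extend(EN_TO_TGT.get(word, []))
    return terms

def score(document, target_terms):
    """Count how many translated query terms appear in the document."""
    tokens = set(document.lower().split())
    return sum(1 for term in target_terms if term in tokens)

def search(english_query, documents):
    """Rank documents by overlap with the translated query,
    keeping only those with at least one matching term."""
    terms = translate_query(english_query)
    ranked = sorted(documents, key=lambda d: score(d, terms), reverse=True)
    return [d for d in ranked if score(d, terms) > 0]

docs = [
    "mwalimu anafundisha shule",
    "maji ni muhimu",
    "habari za leo",
]
print(search("school teacher", docs))
```

Real systems replace the dictionary lookup with full translation or cross-lingual embeddings and score with probabilistic retrieval models, but the query-translation-then-retrieval pipeline is the same basic shape.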
The OpenCLIR evaluation grew out of the IARPA MATERIAL program, which encompasses additional tasks (including domain classification and summarization) as well as additional languages and query types. The purpose of OpenCLIR is to provide a simplified, smaller-scale evaluation open to all. Please see IARPA's MATERIAL website and NIST's MATERIAL website for more information on the MATERIAL program, and IARPA's OpenCLIR website for OpenCLIR specifically.
The first OpenCLIR evaluation, OpenCLIR19, took place in January/February 2019. Details can be found in the evaluation plan linked under Documentation below.
The first OpenCLIR evaluation planned to declare winners in two separate categories, text and audio data, with a monetary award of USD 10,000 for the winner of the text category and USD 20,000 for the winner of the audio category. Please see the documentation below for further details and rules regarding the prizes.
The winners of the OpenCLIR19 challenge were announced by IARPA in a Tweet on November 8, 2019.
Text data track winner:
Text data track runners-up:
|Release of evaluation plan||July 2018|
|Registration period||Mid-July, 2018 - November 30, 2018|
|Release of Build Packs (training data)||August 21, 2018|
|Release of ANALYSIS, DEV, QUERY-DEV (encrypted data, decryption keys)||August 21, 2018|
|Development period||August 21, 2018 - May 31, 2019|
|Release of EVAL, QUERY-EVAL (encrypted data)||March 4, 2019|
|Release of EVAL, QUERY-EVAL (decryption keys)||March 11, 2019|
|Evaluation period||March 11 - May 31, 2019|
|System output due to NIST||May 31, 2019|
|System description due to NIST||July 12, 2019|