The goal of the OpenCLIR (Open Cross Language Information Retrieval) Challenge is to develop methods to locate text and speech content in “documents” (speech or text) in low-resource languages, using English queries. This capability is one of several expected to ultimately support effective triage and analysis of large volumes of data in a variety of less-studied languages. Successful systems will be able to adapt to new languages and new genres.
The OpenCLIR Challenge was created out of the IARPA (Intelligence Advanced Research Projects Activity) MATERIAL (Machine Translation for English Retrieval of Information in Any Language) program, which covers more tasks (including domain classification and summarization), more languages, and more query types. See also NIST's MATERIAL page. The purpose of OpenCLIR is to provide a simplified, smaller-scale evaluation open to all.
OpenCLIR19 took place in January/February 2019. Details can be found in the evaluation plan linked in the Documentation and Resources section.
| Milestone | Date | 
|---|---|
| Release of evaluation plan | July 2018 | 
| Registration period | Mid-July, 2018 - November 30, 2018 | 
| Development cycle | August 21, 2018 - May 31, 2019 | 
| Release of Build Packs (training data) | August 21, 2018 | 
| Release of ANALYSIS, DEV, QUERY-DEV (encrypted data, decryption keys) | August 21, 2018 | 
| Release of EVAL, QUERY-EVAL (encrypted data) | March 4, 2019 | 
| Evaluation period | March 11 - May 31, 2019 | 
| Release of EVAL, QUERY-EVAL (decryption keys) | March 11, 2019 | 
| System output due to NIST | May 31, 2019 | 
| System description due to NIST | July 12, 2019 | 
The main metric computed was AQWV (Actual Query Weighted Value), which is described in detail in the evaluation plan. The table below lists the best AQWV scores attained in the text and speech categories by each team that fully participated.
| Organization | Team | AQWV (text) | AQWV (speech) | 
|---|---|---|---|
| Elhuyar Foundation, Spain | Elhuyarixa | 0.3383 | 0 | 
| Dublin City University, Ireland | DCU-ADAPT | 0.3030 | 0.0303 | 
| Hunan University of Science and Technology, China | CLIR-KPNM | 0.1835 | 0 | 
| Catskills Research Company, USA | Catskills Research | -0.0277 | -0.0062 | 
| University of North Texas, USA | UNTIIA | -0.6701 | 0 | 
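To make the scores above easier to read, here is a minimal sketch of how a query-weighted value of this kind can be computed. It assumes the commonly used form, 1 minus the sum of the mean per-query miss rate and `beta` times the mean per-query false-alarm rate. The evaluation plan remains the authoritative definition. The `QueryCounts` type, the `aqwv` function name, and the `beta` value used in the example are illustrative assumptions, not taken from the OpenCLIR scoring tools.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class QueryCounts:
    """Per-query counts at the system's actual decision threshold (hypothetical type)."""
    relevant: int       # documents truly relevant to the query
    non_relevant: int   # documents not relevant to the query
    misses: int         # relevant documents the system did not return
    false_alarms: int   # non-relevant documents the system returned


def aqwv(queries: Sequence[QueryCounts], beta: float) -> float:
    """Assumed form: 1 - (mean P_miss + beta * mean P_FA).

    P_miss is averaged over queries that have at least one relevant document;
    P_FA is averaged over queries that have at least one non-relevant document.
    """
    miss_rates = [q.misses / q.relevant for q in queries if q.relevant > 0]
    fa_rates = [q.false_alarms / q.non_relevant for q in queries if q.non_relevant > 0]
    if not miss_rates or not fa_rates:
        raise ValueError("need at least one scorable query")
    mean_p_miss = sum(miss_rates) / len(miss_rates)
    mean_p_fa = sum(fa_rates) / len(fa_rates)
    return 1.0 - (mean_p_miss + beta * mean_p_fa)


# Illustrative counts only; beta=20.0 is a placeholder, not the official value.
example = [
    QueryCounts(relevant=4, non_relevant=1000, misses=1, false_alarms=10),
    QueryCounts(relevant=2, non_relevant=1000, misses=2, false_alarms=0),
]
score = aqwv(example, beta=20.0)
```

Under this assumed form, a system that returns nothing for every query misses everything and raises no false alarms, so it scores exactly 0. A system that returns many non-relevant documents can score below 0, because false alarms are weighted by `beta`.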
Based on the OpenCLIR19 test and system description results supplied by NIST to IARPA, IARPA planned to declare and award OpenCLIR19 winners in two separate categories, text and audio data, with a monetary award of USD 10,000 for the winner of the text category and USD 20,000 for the winner of the audio category. Please see the documentation below for more details and rules regarding the prizes.
The winners of the OpenCLIR19 Challenge were announced by IARPA on November 8, 2019 in this Tweet.
Text data track winner:
Text data track runners-up:
Speech track:
Please email openclir_poc [at] nist.gov with any questions or comments regarding the OpenCLIR Challenge.