OpenASR21: The Second Open Challenge for Automatic Speech Recognition of Low-Resource Languages

Published

Author(s)

Kay Peterson, Audrey N. Tong, Jennifer Yu

Abstract

In 2021, the National Institute of Standards and Technology (NIST), in cooperation with the Intelligence Advanced Research Projects Activity (IARPA), conducted OpenASR21, the second cycle of an open challenge series on automatic speech recognition (ASR) technology for low-resource languages. The OpenASR21 Challenge was offered for 15 low-resource languages, five of which were new in 2021. OpenASR21 also introduced a case-sensitive scoring track on a wider set of data genres for three of the new languages, as a proxy for assessing ASR performance on proper nouns. This paper gives an overview of the challenge setup and results. Fifteen teams from seven countries made at least one required valid submission, and 504 submissions were scored. Results show that ASR performance under a severely constrained training condition remains a challenge, with the best Word Error Rate (WER) ranging from 32% (Swahili) to 68% (Farsi). However, improvements over OpenASR20 were made by augmenting training data with perturbation and text-to-speech techniques, along with system combination.
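For readers unfamiliar with the metric, WER is conventionally computed as (substitutions + deletions + insertions) divided by the number of reference words. The following is a minimal Python sketch of that textbook definition, not the scoring tool used in the challenge. The function name `word_error_rate` and the `case_sensitive` flag are illustrative assumptions; the flag only hints at why case-sensitive scoring can act as a proxy for proper-noun accuracy.

```python
def word_error_rate(reference: str, hypothesis: str,
                    case_sensitive: bool = False) -> float:
    """Return WER = (S + D + I) / N using word-level edit distance.

    Hypothetical helper for illustration; real evaluations also apply
    text normalization rules that are not modeled here.
    """
    if not case_sensitive:
        # Case-insensitive scoring: fold both sides to lowercase first.
        reference = reference.lower()
        hypothesis = hypothesis.lower()

    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # consumed so far and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]  # deleting i reference words to match an empty hypothesis
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                       # deletion
                curr[j - 1] + 1,                   # insertion
                prev[j - 1] + substitution_cost,   # match or substitution
            ))
        prev = curr

    return prev[-1] / len(ref_words)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the"): 2 / 6.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
    # A lowercase proper noun counts as an error only when case matters.
    print(word_error_rate("Nairobi is big", "nairobi is big", case_sensitive=True))
```

Under this sketch, a hypothesis that lowercases a proper noun is penalized only in the case-sensitive setting, which is the intuition behind using case sensitivity as a proxy measure.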
Proceedings Title
Proceedings INTERSPEECH 2022
Conference Dates
September 18-22, 2022
Conference Location
Incheon, KR
Conference Title
INTERSPEECH 2022

Keywords

automatic speech recognition, evaluation, low-resource, conversational speech, news broadcast, topical broadcast, case sensitivity, IARPA MATERIAL, Amharic, Cantonese, Farsi, Georgian, Guarani, Javanese, Kazakh, Kurmanji Kurdish, Mongolian, Pashto, Somali, Swahili, Tagalog, Tamil, Vietnamese

Citation

Peterson, K., Tong, A. and Yu, J. (2022), OpenASR21: The Second Open Challenge for Automatic Speech Recognition of Low-Resource Languages, Proceedings INTERSPEECH 2022, Incheon, KR, [online], https://doi.org/10.21437/Interspeech.2022-10972, https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=934557 (Accessed October 9, 2025)

Created September 22, 2022, Updated December 6, 2022