OpenASR21: The Second Open Challenge for Automatic Speech Recognition of Low-Resource Languages

Kay Peterson; Audrey N. Tong; Jennifer Yu

Published

September 22, 2022

Author(s)

Kay Peterson, Audrey N. Tong, Jennifer Yu

Abstract

In 2021, the National Institute of Standards and Technology (NIST), in cooperation with the Intelligence Advanced Research Projects Activity (IARPA), conducted OpenASR21, the second cycle of an open challenge series of automatic speech recognition (ASR) technology for low-resource languages. The OpenASR21 Challenge was offered for 15 low-resource languages, five of which were new in 2021. OpenASR21 also introduced a case-sensitive scoring track on a wider set of data genres for three of the new languages, as a proxy for assessing ASR performance on proper nouns. The paper gives an overview of the challenge setup and results. Fifteen teams from seven countries made at least one required valid submission, and a total of 504 submissions were scored. Results show that ASR performance under a severely constrained training condition is still a challenge, with the best Word Error Rate (WER) ranging from 32% (Swahili) to 68% (Farsi). However, improvements over OpenASR20 were made by augmenting training data with perturbation and text-to-speech techniques along with system combination.
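
For readers unfamiliar with the metric, WER is conventionally the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the system output, divided by the number of reference words. The following is a minimal sketch of that standard definition and is not the scoring software used in the challenge. The function name `word_error_rate` and the `case_sensitive` flag are hypothetical, introduced here only to illustrate how case-sensitive scoring can penalize a miscapitalized proper noun that case-insensitive scoring would accept.

```python
def word_error_rate(reference: str, hypothesis: str, case_sensitive: bool = False) -> float:
    """Illustrative WER: word-level edit distance / number of reference words.

    Assumption: whitespace tokenization and no text normalization beyond
    optional lowercasing. Real evaluations apply their own normalization rules.
    """
    if not case_sensitive:
        reference = reference.lower()
        hypothesis = hypothesis.lower()

    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    ref = "Nairobi is big"
    hyp = "nairobi is big"
    print(word_error_rate(ref, hyp))                       # case-insensitive
    print(word_error_rate(ref, hyp, case_sensitive=True))  # case-sensitive
```

In this sketch the lowercased proper noun counts as a substitution only when `case_sensitive=True`, which is the sense in which case-sensitive scoring can act as a proxy for proper-noun performance.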

Proceedings Title

Proceedings INTERSPEECH 2022

Conference Dates

September 18-22, 2022

Conference Location

Incheon, KR

Conference Title

INTERSPEECH 2022

Pub Type

Conferences

Download Paper

https://doi.org/10.21437/Interspeech.2022-10972

Keywords

automatic speech recognition, evaluation, low-resource, conversational speech, news broadcast, topical broadcast, case sensitivity, IARPA MATERIAL, Amharic, Cantonese, Farsi, Georgian, Guarani, Javanese, Kazakh, Kurmanji Kurdish, Mongolian, Pashto, Somali, Swahili, Tagalog, Tamil, Vietnamese

Information technology

Citation

Peterson, K., Tong, A. and Yu, J. (2022), OpenASR21: The Second Open Challenge for Automatic Speech Recognition of Low-Resource Languages, Proceedings INTERSPEECH 2022, Incheon, KR, [online], https://doi.org/10.21437/Interspeech.2022-10972, https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=934557 (Accessed November 26, 2025)

Issues

If you have any questions about this publication or are having problems accessing it, please contact [email protected].

Created September 22, 2022, Updated December 6, 2022
