OpenASR21: The Second Open Challenge for Automatic Speech Recognition of Low-Resource Languages

Published

Author(s)

Kay Peterson, Audrey N. Tong, Jennifer Yu

Abstract

In 2021, the National Institute of Standards and Technology (NIST), in cooperation with the Intelligence Advanced Research Projects Activity (IARPA), conducted OpenASR21, the second cycle of an open challenge series on automatic speech recognition (ASR) technology for low-resource languages. The OpenASR21 Challenge was offered for 15 low-resource languages, five of which were new in 2021. OpenASR21 also introduced a case-sensitive scoring track on a wider set of data genres for three of the new languages, as a proxy for assessing ASR performance on proper nouns. The paper gives an overview of the challenge setup and results. Fifteen teams from seven countries made at least one required valid submission, and 504 submissions were scored in total. Results show that ASR performance under a severely constrained training condition is still a challenge, with the best Word Error Rate (WER) ranging from 32% (Swahili) to 68% (Farsi). However, improvements over OpenASR20 were made by augmenting training data with perturbation and text-to-speech techniques along with system combination.
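
For readers unfamiliar with the metric, the following is a minimal sketch of the standard WER definition (substitutions, deletions, and insertions divided by the number of reference words), with an optional case-sensitive mode to illustrate why casing can serve as a proxy for proper nouns. The function name `word_error_rate`, its parameters, and the example sentences are assumptions made for illustration. This is not the scoring pipeline used in the challenge, and real evaluations apply text normalization rules that are not modeled here.

```python
def word_error_rate(reference: str, hypothesis: str,
                    case_sensitive: bool = False) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference words."""
    if not case_sensitive:
        # Case-insensitive scoring: "Nairobi" and "nairobi" count as a match.
        reference, hypothesis = reference.lower(), hypothesis.lower()
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the word-level edit distance between the reference
    # words processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: the only difference is the casing of a proper noun.
reference = "Nairobi is big"
hypothesis = "nairobi is big"
insensitive = word_error_rate(reference, hypothesis)
sensitive = word_error_rate(reference, hypothesis, case_sensitive=True)
```

In this sketch the case-insensitive score treats the two strings as identical, while the case-sensitive score counts the lowercased proper noun as one substitution out of three reference words.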
Proceedings Title
Proceedings INTERSPEECH 2022
Conference Dates
September 18-22, 2022
Conference Location
Incheon, KR
Conference Title
INTERSPEECH 2022

Keywords

automatic speech recognition, evaluation, low-resource, conversational speech, news broadcast, topical broadcast, case sensitivity, IARPA MATERIAL, Amharic, Cantonese, Farsi, Georgian, Guarani, Javanese, Kazakh, Kurmanji Kurdish, Mongolian, Pashto, Somali, Swahili, Tagalog, Tamil, Vietnamese

Citation

Peterson, K., Tong, A. and Yu, J. (2022), OpenASR21: The Second Open Challenge for Automatic Speech Recognition of Low-Resource Languages, Proceedings INTERSPEECH 2022, Incheon, KR, [online], https://doi.org/10.21437/Interspeech.2022-10972, https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=934557 (Accessed April 24, 2024)
Created September 22, 2022, Updated December 6, 2022