OpenASR20: An Open Challenge for Automatic Speech Recognition of Conversational Telephone Speech in Low-Resource Languages

Published

Author(s)

Kay Peterson, Audrey N. Tong, Jennifer Yu

Abstract

In 2020, the National Institute of Standards and Technology (NIST), in cooperation with the Intelligence Advanced Research Projects Activity (IARPA), conducted an open challenge on automatic speech recognition (ASR) technology for low-resource languages on a challenging data type: conversational telephone speech. The OpenASR20 Challenge was offered for ten low-resource languages: Amharic, Cantonese, Guarani, Javanese, Kurmanji Kurdish, Mongolian, Pashto, Somali, Tamil, and Vietnamese. A total of nine teams from five countries fully participated, and 128 valid submissions were scored. This paper gives an overview of the challenge setup and procedures, as well as a summary of the results. The results show overall high word error rate (WER), with the best results on a severely constrained training data condition ranging from 0.4 to 0.65, depending on the language. ASR with such limited resources remains a challenging problem. Providing a computing platform may be a way to level the playing field and encourage wider participation in challenges like OpenASR.
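
For readers unfamiliar with the metric, the WER figures quoted above (0.4 to 0.65) are fractions of the reference words. The following is a minimal sketch of the textbook definition: word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration; this record does not describe the challenge's actual scoring tool or text normalization, which may differ.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Textbook WER: word-level edit distance / number of reference words.

    Illustrative only; tokenization here is plain whitespace splitting,
    which is an assumption and not the challenge's scoring procedure.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution (b -> x) and one deletion (d) against 5 reference words.
    print(word_error_rate("a b c d e", "a x c e"))
```

Because insertions count as errors, WER computed this way can exceed 1.0 when the hypothesis is much longer than the reference.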
Proceedings Title
Proc. Interspeech 2021
Conference Dates
August 31-September 3, 2021
Conference Location
Brno, CZ
Conference Title
INTERSPEECH 2021, Special session on OpenASR and Low Resource ASR Development

Keywords

automatic speech recognition, evaluation, low-resource language, conversational telephone speech, IARPA MATERIAL, Amharic, Cantonese, Guarani, Javanese, Kurmanji Kurdish, Mongolian, Pashto, Somali, Tamil, Vietnamese

Citation

Peterson, K., Tong, A. and Yu, J. (2021), OpenASR20: An Open Challenge for Automatic Speech Recognition of Conversational Telephone Speech in Low-Resource Languages, Proc. Interspeech 2021, Brno, CZ, [online], https://doi.org/10.21437/Interspeech.2021-1930, https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=932302 (Accessed May 27, 2024)

Created September 1, 2021, Updated November 29, 2022