Seyed Omid Sadjadi, Timothée N. Kheyrkhah, Audrey N. Tong, Craig S. Greenberg, Douglas Reynolds, Elliot Singer, Lisa Mason, Jaime Hernandez-Cordero
Abstract
In 2017, NIST conducted the most recent in an ongoing series of Language Recognition Evaluations (LRE) meant to foster research in robust text- and speaker-independent language recognition, as well as to measure the performance of current state-of-the-art systems. LRE17 was organized in a similar manner to LRE15, focusing on differentiating closely related languages (14 in total) drawn from 5 language clusters, namely Arabic, Chinese, English, Iberian, and Slavic. Similar to LRE15, LRE17 offered Fixed and Open training conditions to facilitate cross-system comparisons as well as to understand the impact of additional, unconstrained amounts of training data on system performance. There were, however, several differences between LRE17 and LRE15, most notably 1) release of a small development set which broadly matched the LRE17 test set, 2) use of audio extracted from online videos (AfV) as development and test material, 3) system outputs in the form of log-likelihood scores, rather than log-likelihood ratios, and 4) an alternative cross-entropy based performance metric. A total of 25 research organizations, forming 18 teams, participated in this four-month-long evaluation and, combined, they submitted 79 valid systems to be evaluated. This paper presents an overview of the evaluation and an analysis of system performance over all primary evaluation conditions. The evaluation results suggest that 1) language recognition on AfV data was, in general, more challenging than on telephony data, 2) top performing systems exhibit similar performance, 3) performance improvements were largely due to data augmentation and use of more complex models for data representation, and 4) effective use of the development set seemed to be essential for the top performing systems.
Proceedings Title
Speaker Odyssey 2018
Conference Dates
June 26-29, 2018
Conference Location
Les Sables d'Olonne
Conference Title
The Speaker and Language Recognition Workshop: Odyssey 2018
Sadjadi, S., Kheyrkhah, T., Tong, A., Greenberg, C., Reynolds, D., Singer, E., Mason, L. and Hernandez-Cordero, J. (2018), The 2017 NIST Language Recognition Evaluation, Speaker Odyssey 2018, Les Sables d'Olonne, [online], https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=925272 (Accessed December 4, 2024)