NIST coordinated annual evaluations of text-independent speaker recognition from 1996 to 2006. This paper discusses the last three of these, which used conversational speech data from the Mixer Corpora recently collected by the Linguistic Data Consortium. We review the evaluation procedures, the matrix of test conditions included, and the performance trends observed. While most of the data was collected over telephone channels, one multi-channel test condition uses a subset of Mixer conversations recorded simultaneously over multiple microphone channels and a telephone line. The corpus also includes some non-English conversations involving bilingual speakers, allowing an examination of the effect of language on performance results. On the various test conditions involving English-language conversational telephone data, considerable performance gains are observed over the past three years.
Citation: IEEE Trans. on Audio, Speech & Language Processing
Pub Type: Journals
Keywords: Mixer Corpora, NIST SRE's, speaker recognition