The Impact of Data Dependency on Speaker Recognition Evaluation

Jin Chu Wu; Alvin F. Martin; Craig S. Greenberg; Raghu N. Kacker

Published

February 8, 2017

Author(s)

Jin Chu Wu, Alvin F. Martin, Craig S. Greenberg, Raghu N. Kacker

Abstract

Data dependency arising from the repeated use of the same subjects affects the standard error (SE) of the detection cost function (DCF) in speaker recognition evaluation. The DCF is defined as a weighted sum of the probabilities of type I and type II errors at a given threshold. A two-layer data structure is constructed: target scores are grouped into target sets according to the dependency, and non-target scores are likewise grouped into non-target sets. Because every score must have an equal probability of being selected during resampling, all target sets must contain the same number of target scores, and all non-target sets must contain the same number of non-target scores. In addition to the bootstrap method under the i.i.d. assumption, nonparametric two-sample one-layer and two-layer bootstrap methods are carried out; they differ in whether resampling takes place only on sets (one-layer) or first on sets and subsequently on scores within the sets (two-layer). Because the bootstrap is stochastic, distributions of the SEs of the DCF estimated by the three bootstrap methods are generated and compared. Hypothesis testing shows that data dependency increases not only the SE but also the variation of the SE, and that the two-layer bootstrap is more conservative than the one-layer bootstrap. The rationale for the different impacts of the three bootstrap methods on the estimated SEs is investigated.
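The three resampling schemes named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the cost weights (`c_miss`, `c_fa`, `p_target`), the convention that a score at or above the threshold is accepted, and the synthetic scores are all assumptions made for illustration only.

```python
import numpy as np

def dcf(target, nontarget, threshold, c_miss=1.0, c_fa=1.0, p_target=0.5):
    """Weighted sum of the two error rates at a threshold (weights are assumed)."""
    p_miss = np.mean(np.asarray(target) < threshold)      # target score rejected
    p_fa = np.mean(np.asarray(nontarget) >= threshold)    # non-target score accepted
    return c_miss * p_target * p_miss + c_fa * (1.0 - p_target) * p_fa

def resample(sets, rng, mode):
    """Resample a 2-D array of scores: one row per set, equal set sizes."""
    n_sets, set_size = sets.shape
    if mode == "iid":
        # Ignore the grouping and draw individual scores with replacement.
        flat = sets.ravel()
        return rng.choice(flat, size=flat.size, replace=True)
    if mode not in ("one_layer", "two_layer"):
        raise ValueError(f"unknown mode: {mode}")
    # Layer 1: draw whole sets with replacement.
    picked = sets[rng.integers(0, n_sets, size=n_sets)]
    if mode == "two_layer":
        # Layer 2: within each drawn set, draw scores with replacement.
        cols = rng.integers(0, set_size, size=(n_sets, set_size))
        picked = np.take_along_axis(picked, cols, axis=1)
    return picked.ravel()

def bootstrap_se(target_sets, nontarget_sets, threshold, mode, n_boot=500, seed=0):
    """Two-sample bootstrap estimate of the SE of the DCF."""
    rng = np.random.default_rng(seed)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        t = resample(target_sets, rng, mode)
        n = resample(nontarget_sets, rng, mode)
        stats[b] = dcf(t, n, threshold)
    return stats.std(ddof=1)

if __name__ == "__main__":
    # Synthetic, hypothetical scores: a shared per-set offset induces dependency.
    rng = np.random.default_rng(1)
    tgt = rng.normal(1.0, 1.0, size=(50, 1)) + rng.normal(0.0, 0.5, size=(50, 10))
    non = rng.normal(-1.0, 1.0, size=(50, 1)) + rng.normal(0.0, 0.5, size=(50, 10))
    for mode in ("iid", "one_layer", "two_layer"):
        print(mode, bootstrap_se(tgt, non, threshold=0.0, mode=mode))
```

Because each bootstrap run is stochastic, a single SE per method is only one draw. Repeating `bootstrap_se` with different seeds would give a distribution of SEs for each method, which is the kind of comparison the abstract describes.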

Citation

IEEE Transactions on Audio Speech and Language Processing

Pub Type

Journals

Keywords

Data dependency, speaker recognition, standard error, bootstrap, multinomial probability, resampling.

Data and informatics, Statistical analysis and Uncertainty quantification

Citation

Wu, J., Martin, A., Greenberg, C. and Kacker, R. (2017), The Impact of Data Dependency on Speaker Recognition Evaluation, IEEE Transactions on Audio Speech and Language Processing, [online], https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=922671 (Accessed July 25, 2026)

Created February 8, 2017, Updated February 19, 2017
