The Impact of Data Dependency on Speaker Recognition Evaluation
Jin Chu Wu, Alvin F. Martin, Craig S. Greenberg, Raghu N. Kacker
Data dependency arising from repeated use of the same subjects affects the standard error (SE) of the detection cost function (DCF) in speaker recognition evaluation. The DCF is defined as a weighted sum of the probabilities of type I and type II errors at a given threshold. A two-layer data structure is constructed: target scores are grouped into target sets based on the dependency, and likewise for non-target scores. Because resampling requires every score to have an equal probability of being selected, all target sets must contain the same number of target scores, and all non-target sets must contain the same number of non-target scores. In addition to the bootstrap method under the i.i.d. assumption, nonparametric two-sample one-layer and two-layer bootstrap methods are carried out; they differ in whether resampling takes place only on sets or also, subsequently, on scores within the selected sets. Because the bootstrap is stochastic, distributions of the SEs of the DCF estimated by the three bootstrap methods are generated and compared. Hypothesis testing shows that data dependency increases not only the SE but also the variation of the SE, and that the two-layer bootstrap is more conservative than the one-layer bootstrap. The reasons why the three bootstrap methods affect the estimated SEs differently are investigated.
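The resampling schemes described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the equal default weights, the number of bootstrap replications, and the convention that a target score below the threshold and a non-target score at or above it count as the two error types are all assumptions made for illustration. The i.i.d. bootstrap is approximated here by treating every score as its own set of size one.

```python
import random
import statistics


def dcf(target_scores, nontarget_scores, threshold, w_miss=0.5, w_fa=0.5):
    """Weighted sum of the two error rates at a threshold (weights are assumed)."""
    # Assumed convention: a target score below the threshold is an error.
    p_miss = sum(s < threshold for s in target_scores) / len(target_scores)
    # Assumed convention: a non-target score at or above the threshold is an error.
    p_fa = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
    return w_miss * p_miss + w_fa * p_fa


def resample_sets(sets, rng, two_layer):
    """Resample sets with replacement; optionally resample scores within each set."""
    chosen = [rng.choice(sets) for _ in sets]  # layer 1: draw whole sets
    if two_layer:
        # layer 2: draw scores with replacement inside every selected set
        chosen = [[rng.choice(s) for _ in s] for s in chosen]
    return [score for s in chosen for score in s]


def bootstrap_se(target_sets, nontarget_sets, threshold,
                 n_boot=200, two_layer=False, seed=0):
    """Estimate the SE of the DCF as the standard deviation over replications."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        t = resample_sets(target_sets, rng, two_layer)
        n = resample_sets(nontarget_sets, rng, two_layer)
        estimates.append(dcf(t, n, threshold))
    return statistics.stdev(estimates)


def iid_bootstrap_se(target_sets, nontarget_sets, threshold, n_boot=200, seed=0):
    """Ignore the grouping: every score becomes a singleton set."""
    flat_t = [[s] for group in target_sets for s in group]
    flat_n = [[s] for group in nontarget_sets for s in group]
    return bootstrap_se(flat_t, flat_n, threshold, n_boot, False, seed)
```

Calling `bootstrap_se` with `two_layer=False` and `two_layer=True` on the same equally sized sets, and `iid_bootstrap_se` on the flattened scores, gives three SE estimates. Repeating the calls with different `seed` values yields a distribution of SEs for each method, which is the kind of comparison the abstract describes.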
IEEE Transactions on Audio Speech and Language Processing
Wu, J., Martin, A., Greenberg, C. and Kacker, R., The Impact of Data Dependency on Speaker Recognition Evaluation, IEEE Transactions on Audio Speech and Language Processing, [online], https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=922671 (Accessed October 1, 2022)