The Impact of Scenario Development on the Performance of Speech Translation Systems Prescribed by the SCORE Framework
Brian A. Weiss, Craig I. Schlenoff
The Defense Advanced Research Projects Agency's (DARPA) Spoken Language Communication and Translation for Tactical Use (TRANSTAC) program is a focused advanced technology research and development program. The intent of the TRANSTAC program is to demonstrate capabilities to quickly develop and implement free-form, two-way, speech-to-speech spoken language translation systems allowing speakers of different languages to communicate with each other in real-world tactical situations without the need for an interpreter. The National Institute of Standards and Technology (NIST), with support from the MITRE Corporation and Appen Pty Limited, has been funded by DARPA to evaluate the TRANSTAC technologies since 2006. The NIST-led Independent Evaluation Team (IET) has numerous responsibilities in this ongoing effort, including collecting and processing training data, designing and implementing performance evaluations, and analyzing the test data. In order to design and execute fair and relevant evaluations, the NIST IET has employed the System, Component and Operationally-Relevant Evaluation (SCORE) framework. The SCORE framework is a unified set of criteria and tools built around the premise that in order to gain an understanding of how a technology would perform in its intended environment, it must be evaluated at both the component and system levels and further tested in operationally-relevant environments while capturing both quantitative and qualitative performance data. Since an evaluation goal of the TRANSTAC program is to capture quantitative performance data of the translation technologies, the IET developed and implemented SCORE-inspired live evaluation scenarios. The two forms of live evaluation scenarios have unique impacts on the quantitative performance data. This paper not only presents the TRANSTAC program and SCORE methodology, but also focuses on the evaluation scenarios and their influence on system performance.
Proceedings of the Performance Metrics for Intelligent Systems (PerMIS) 2009
Weiss, B. and Schlenoff, C., The Impact of Scenario Development on the Performance of Speech Translation Systems Prescribed by the SCORE Framework, Proceedings of the Performance Metrics for Intelligent Systems (PerMIS) 2009, Gaithersburg, MD, [online], https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=903628 (Accessed June 3, 2023)