APPROACHES AND BEST PRACTICES: Data Collection of Audio Dialogues to Support the Training of Speech-to-Speech Translation Systems

Brian A. Weiss; Craig I. Schlenoff; Ann M. Virts

Published

June 25, 2010

Author(s)

Brian A. Weiss, Craig I. Schlenoff, Ann M. Virts

Abstract

This document describes the best practices that personnel from the National Institute of Standards and Technology (NIST) have developed and implemented to efficiently and effectively capture two-way, free-form speech-to-speech audio dialogues within recording studios. These dialogues, produced to support the development and evaluation of machine translation technologies, are conducted by English and foreign language speakers conversing with one another in their native languages through the mediation of an interpreter. Since becoming involved in this work in 2007, NIST personnel have collected over 500 hours of bilingual audio data, encompassing more than 1,100 dialogues across three language pairs (English/Iraqi-Arabic, English/Dari, and English/Pashto). This document presents the methods the NIST team has designed and employed to capture this audio data successfully. In addition to the data collection protocols, including personnel training and workflow, it discusses data collection scenario generation and speaker recruitment protocols.

Citation

NIST Interagency/Internal Report (NISTIR) - 7712

Report Number

7712

NIST Pub Series

NIST Interagency/Internal Report (NISTIR)

Pub Type

NIST Pubs

Keywords

Speech-to-speech translation systems, data collection

Information technology and Data and informatics

Citation

Weiss, B., Schlenoff, C. and Virts, A. (2010), APPROACHES AND BEST PRACTICES: Data Collection of Audio Dialogues to Support the Training of Speech-to-Speech Translation Systems, NIST Interagency/Internal Report (NISTIR), National Institute of Standards and Technology, Gaithersburg, MD, [online], https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=905565 (Accessed June 13, 2026)

Created June 25, 2010, Updated February 19, 2017
