The Multimodal Information Group in the Information Access Division of NIST's Information Technology Laboratory is developing an evaluation for machine translation (MT) that uses "edit-distance", as defined by the GALE (Global Autonomous Language Exploitation) program, as its evaluation metric.
Machine translation (MT) is a technology that translates language data from a source language into a target language. For the GALE program, the source language will be either Arabic or Chinese, and the target language will be English. Input data will be either audio or text; the output will always be text.
The GALE program will evaluate MT in terms of the quality of the system translations. This will be accomplished by measuring the edit distance between a system output and a gold standard reference. The term "edit-distance" refers to the number of edits (modifications) that someone needs to make to the output of a machine translation system such that the resulting text is fluent English and completely captures the meaning of the gold standard reference.
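As a rough illustration of what counting edits means, the following sketch computes a word-level edit distance (the standard Levenshtein distance over insertions, deletions, and substitutions) between a system output and a reference. It is not the GALE scoring procedure: the GALE definition above counts the edits a person needs to make to reach fluent English with the reference meaning, while this sketch only compares two fixed strings automatically. The function name `word_edit_distance` and the whitespace tokenization are assumptions made for this example.

```python
def word_edit_distance(hypothesis: str, reference: str) -> int:
    """Minimum number of word insertions, deletions, and substitutions
    needed to turn `hypothesis` into `reference` (illustrative only)."""
    # Assumption: words are separated by whitespace; no normalization is done.
    hyp = hypothesis.split()
    ref = reference.split()

    # prev[j] holds the distance between the hypothesis words processed
    # so far and the first j reference words.
    prev = list(range(len(ref) + 1))
    for i, hyp_word in enumerate(hyp, start=1):
        curr = [i]  # deleting all i hypothesis words matches an empty reference
        for j, ref_word in enumerate(ref, start=1):
            substitution_cost = 0 if hyp_word == ref_word else 1
            curr.append(min(
                prev[j] + 1,                      # delete hyp_word
                curr[j - 1] + 1,                  # insert ref_word
                prev[j - 1] + substitution_cost,  # keep or substitute
            ))
        prev = curr
    return prev[-1]


# One missing word ("the") requires a single insertion.
print(word_edit_distance("the cat sat on mat", "the cat sat on the mat"))  # prints 1
```

A lower count means the system output is closer to the reference; an output identical to the reference scores 0.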
MT for GALE presents interesting challenges because it requires both translation from text sources and combined transcription and translation from speech sources. NIST will measure MT quality in terms of edit-distance from human performance.
This web site provides information about the NIST activities related to this work. If you have any GALE-related questions for NIST, you may send e-mail to GALE_poc@nist.gov.