Findings of the 2010 Joint Workshop on Statistical Machine Translation and Metrics for Machine Translation
Published
Author(s)
Chris Callison-Burch, Philipp Koehn, Christof Monz, Kay Peterson, Mark A. Przybocki, Omar F. Zaidan
Abstract
This paper presents the results of the WMT10 and MetricsMATR10 shared tasks, which included a translation task, a system combination task, and an evaluation task. We conducted a large-scale manual evaluation of 104 machine translation systems and 41 system combination entries. We used the ranking of these systems to measure how strongly automatic metrics correlate with human judgments of translation quality for 26 metrics. This year we also investigated increasing the number of human judgments by hiring non-expert annotators through Amazon's Mechanical Turk.
Proceedings Title
ACL 2010 Joint Fifth Workshop on Statistical Machine Translation and MetricsMaTr
Callison-Burch, C., Koehn, P., Monz, C., Peterson, K., Przybocki, M. and Zaidan, O. (2010), Findings of the 2010 Joint Workshop on Statistical Machine Translation and Metrics for Machine Translation, ACL 2010 Joint Fifth Workshop on Statistical Machine Translation and MetricsMaTr, Uppsala, SE, [online], https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=906756 (Accessed October 13, 2025)