Findings of the 2010 Joint Workshop on Statistical Machine Translation and Metrics for Machine Translation
Chris Callison-Burch, Philipp Koehn, Christof Monz, Kay Peterson, Mark A. Przybocki, Omar F. Zaidan
This paper presents the results of the WMT10 and MetricsMATR10 shared tasks, which included a translation task, a system combination task, and an evaluation task. We conducted a large-scale manual evaluation of 104 machine translation systems and 41 system combination entries. We used the ranking of these systems to measure how strongly automatic metrics correlate with human judgments of translation quality for 26 metrics. This year we also investigated increasing the number of human judgments by hiring non-expert annotators through Amazon's Mechanical Turk.
ACL 2010 Joint Fifth Workshop on Statistical Machine Translation and MetricsMaTr
Callison-Burch, C., Koehn, P., Monz, C., Peterson, K., Przybocki, M. and Zaidan, O., Findings of the 2010 Joint Workshop on Statistical Machine Translation and Metrics for Machine Translation, ACL 2010 Joint Fifth Workshop on Statistical Machine Translation and MetricsMaTr, Uppsala, SE, [online], https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=906756 (Accessed September 21, 2023)