Date of Updated Release: Wednesday, November 1, 2006, version 4
The NIST 2006 Machine Translation Evaluation (MT-06) was part of an ongoing series of evaluations of human language translation technology. NIST conducts these evaluations in order to support machine translation (MT) research and help advance the state of the art in machine translation technology. These evaluations provide an important contribution to the direction of research efforts and the calibration of technical capabilities. The evaluation was administered as outlined in the official MT-06 evaluation plan.
Disclaimer
These results are not to be construed or represented as endorsements of any participant's system or commercial product, or as official findings on the part of NIST or the U.S. Government. Note that the results submitted by developers of commercial MT products were generally from research systems, not commercially available products. Since MT-06 was an evaluation of research algorithms, the MT-06 test design required local implementation by each participant. As such, participants were only required to submit their translation system output to NIST for uniform scoring and analysis. The systems themselves were not independently evaluated by NIST.
There is ongoing discussion within the MT research community regarding the most informative metrics for machine translation. The design and implementation of these metrics are themselves very much part of the research. At the present time, there is no single metric that has been deemed to be completely indicative of all aspects of system performance.
The data, protocols, and metrics employed in this evaluation were chosen to support MT research and should not be construed as indicating how well these systems would perform in applications. Changes in the data domain or in the amount of data used to build a system can greatly influence system performance, and changes in the task protocols could reveal different strengths and weaknesses for these same systems.
For these reasons, this evaluation should not be interpreted as a product-testing exercise, and the results should not be used to draw conclusions about which commercial products are best suited to a particular application.
The MT-06 evaluation consisted of two tasks. Each task required a system to perform translation from a given source language into the target language. The source languages were Arabic and Chinese, and the target language was English.
MT research and development requires language data resources. System performance is strongly affected by the type and amount of resources used. Therefore, two different resource categories were defined as conditions of evaluation. The categories differed solely by the amount of data that was available for use in system training and development. The evaluation conditions were called "Large Data Track" and "Unlimited Data Track".
Submissions that do not fall into the categories described above are not reported here.
Source Data
In an effort to reduce data creation costs, the MT-06 evaluation made use of GALE-06 evaluation data (GALE subset). NIST augmented the GALE subset with additional data of equal or greater size for most of the genres (NIST subset). This provided a larger and more diverse test set. Each set contained documents drawn from newswire text documents, web-based newsgroup documents, human transcription of broadcast news, and human transcription of broadcast conversations. The source documents were encoded in UTF-8.
The test data was selected from a pool of data collected by the LDC during February 2006. The selection process sought a variety of sources (see below), publication dates, and difficulty levels while meeting the target test-set sizes.
Genre | Arabic Sources | Arabic Target Size (number of reference words) | Chinese Sources | Chinese Target Size (number of reference words) |
---|---|---|---|---|
Newswire | Agence France Presse, Assabah, Xinhua News Agency | 30K | Agence France Presse, Xinhua News Agency | 30K |
Newsgroup | Google's groups, Yahoo's groups | 20K | Google's groups | 20K |
Broadcast News | Dubai TV, Al Jazeera, Lebanese Broadcasting Corporation | 20K | China Central TV, New Tang Dynasty TV, Phoenix TV | 20K |
Broadcast Conversation | Dubai TV, Al Jazeera, Lebanese Broadcasting Corporation | 10K | China Central TV, New Tang Dynasty TV, Phoenix TV | 10K |
Reference Data
The GALE subset had one adjudicated high-quality translation that was produced by the National Virtual Translation Center. The NIST subset had four independently generated high-quality translations that were produced by professional translation companies. In both subsets, each translation agency was required to have native speakers of both the source and target languages working on the translations.
Machine translation quality was measured automatically using an N-gram co-occurrence metric developed by IBM and referred to as BLEU. BLEU measures translation accuracy according to the N-grams, or sequences of N words, that the system output shares with one or more high-quality reference translations; the more co-occurrences, the better the score. BLEU is an accuracy metric, ranging from 0 to 1, with 1 being the best possible score. A detailed description of BLEU can be found in Papineni, Roukos, Ward, and Zhu (2001), "BLEU: a Method for Automatic Evaluation of Machine Translation" (IBM Research Report RC22176).
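As an illustration of the metric, the sketch below computes a single-segment BLEU-4 score: clipped ("modified") n-gram precisions for n = 1 through 4 are combined by a geometric mean and multiplied by a brevity penalty. This is a simplified sketch, not NIST's scoring implementation, which aggregates counts over the entire test set and applies standardized tokenization; the example sentences and function names are invented.

```python
import math
from collections import Counter

def ngram_counts(tokens, n):
    """Count all n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, references, max_n=4):
    """Illustrative single-segment BLEU-4; real evaluations aggregate
    n-gram counts over the whole test set before combining them."""
    cand = candidate.split()
    refs = [ref.split() for ref in references]
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = ngram_counts(cand, n)
        # "Modified" precision: clip each candidate n-gram count by the
        # maximum number of times it appears in any single reference.
        max_ref_counts = Counter()
        for ref in refs:
            for gram, count in ngram_counts(ref, n).items():
                max_ref_counts[gram] = max(max_ref_counts[gram], count)
        clipped = sum(min(count, max_ref_counts[gram])
                      for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if clipped == 0 or total == 0:
            return 0.0  # a zero precision drives the geometric mean to zero
        log_precision_sum += math.log(clipped / total)
    # Brevity penalty against the reference length closest to the candidate's.
    c = len(cand)
    r = min((abs(len(ref) - c), len(ref)) for ref in refs)[1]
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_precision_sum / max_n)

# Invented example: one system output scored against two references.
hypothesis = "the cat sat on the mat"
references = ["the cat sat on the mat", "a cat was sitting on the mat"]
print(round(bleu(hypothesis, references), 4))  # 1.0: exact match with a reference
```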
Although BLEU was the official metric for MT-06, measuring translation quality is an ongoing research topic in the MT community. At the present time, there is no single metric that has been deemed to be completely indicative of all aspects of system performance. Three additional automatic metrics (METEOR, TER, and a BLEU refinement), as well as human assessment, were also used to measure system performance. As stated in the evaluation specification document, this official public version of the results reports only the scores as measured by BLEU.
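For reference, TER (translation edit rate) measures the minimum number of word-level edits needed to turn a system output into a reference translation, normalized by the reference length. The minimal sketch below omits TER's block-shift operation, so it reduces to a word error rate; the example sentences are invented.

```python
def simple_ter(hypothesis: str, reference: str) -> float:
    """Word-level edit distance (insertions, deletions, substitutions)
    divided by reference length. Real TER also allows block shifts,
    which this sketch omits."""
    hyp, ref = hypothesis.split(), reference.split()
    # Standard dynamic-programming edit distance over words.
    d = [[0] * (len(ref) + 1) for _ in range(len(hyp) + 1)]
    for i in range(len(hyp) + 1):
        d[i][0] = i
    for j in range(len(ref) + 1):
        d[0][j] = j
    for i in range(1, len(hyp) + 1):
        for j in range(1, len(ref) + 1):
            cost = 0 if hyp[i - 1] == ref[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution / match
    return d[len(hyp)][len(ref)] / max(len(ref), 1)

print(simple_ter("the cat sat on mat", "the cat sat on the mat"))  # 1 edit / 6 words
```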
The tables below list the organizations and teams involved in submitting MT-06 evaluation results. Most submitted results representing their own organizations, some participated only in a collaborative effort (marked by the @ symbol), and some did both (marked by the + symbol).
Site ID | Organization | Location |
---|---|---|
apptek | Applications Technology Inc. | USA |
arl | Army Research Laboratory+ | USA |
auc | American University in Cairo | Egypt |
bbn | BBN Technologies | USA |
cu | Cambridge University@ | UK |
cmu | Carnegie Mellon University@ | USA |
casia | Institute of Automation Chinese Academy of Sciences | China |
columbia | Columbia University | USA |
dcu | Dublin City University | Ireland |
google | Google | USA |
hkust | Hong Kong University of Science and Technology | China |
ibm | IBM | USA |
ict | Institute of Computing Technology Chinese Academy of Sciences | China |
iscas | Institute of Software Chinese Academy of Sciences | China |
isi | Information Sciences Institute+ | USA |
itcirst | ITC-irst | Italy |
jhu | Johns Hopkins University@ | USA |
ksu | Kansas State University | USA |
kcsl | KCSL Inc. | Canada |
lw | Language Weaver | USA |
lcc | Language Computer | USA |
lingua | Lingua Technologies Inc. | Canada |
msr | Microsoft Research | USA |
mit | MIT@ | USA |
nict | National Institute of Information and Communications Technology | Japan |
nlmp | National Laboratory on Machine Perception Peking University | China |
ntt | NTT Communication Science Laboratories | Japan |
nrc | National Research Council Canada+ | Canada |
qmul | Queen Mary University of London | England |
rwth | RWTH Aachen University+ | Germany |
sakhr | Sakhr Software Co. | USA |
sri | SRI International | USA |
ucb | University of California Berkeley | USA |
edinburgh | University of Edinburgh+ | Scotland |
uka | University of Karlsruhe@ | Germany |
umd | University of Maryland@ | USA |
upenn | University of Pennsylvania | USA |
upc | Universitat Politecnica de Catalunya | Spain |
uw | University of Washington@ | USA |
xmu | Xiamen University | China |
Site ID | Team/Collaboration | Location |
---|---|---|
arl-cmu | Army Research Laboratory & Carnegie Mellon University | USA |
cmu-uka | Carnegie Mellon University & University of Karlsruhe | USA, Germany |
edinburgh-mit | University of Edinburgh & MIT | Scotland, USA |
isi-cu | Information Sciences Institute & Cambridge University | USA, England |
rwth-sri-nrc-uw | RWTH Aachen University, SRI International, National Research Council Canada, University of Washington | Germany, USA, Canada, USA |
umd-jhu | University of Maryland & Johns Hopkins University | USA |
Each site/team could submit one or more systems for evaluation, with one system marked as its primary system. The primary system represented the site/team's best effort. This official public version of the results reports results only for the primary systems.
The tables below list the results of the NIST 2006 Machine Translation Evaluation. The results are sorted by BLEU score and reported separately for the GALE subset and the NIST subset because the two subsets do not have the same number of reference translations. The results are also reported for each data domain. Note that scoring was case-sensitive, so these scores reflect casing errors.
Friedman's Rank Test for k correlated samples was used to test for significant differences among the systems. The initial null hypothesis was that all systems were the same. If the null hypothesis was rejected at the 95% level of confidence, the lowest-scoring system was taken out of the pool of systems to be tested, and the Friedman's Rank Test was repeated for the remaining systems until no significant difference was found. The remaining systems that were not removed from the pool were deemed to be statistically equivalent. The process was then repeated for the systems taken out of the pool. Alternating colors (white and yellow backgrounds) show the different groups.
Note: Site 'nlmp' was unable to process the entire test set. No result is listed for that site.
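As an illustration of the grouping procedure described above, the sketch below applies scipy.stats.friedmanchisquare to hypothetical per-document scores. The system names, score distributions, and the handling of two-system pools are invented for the example and do not reflect NIST's actual implementation.

```python
# A minimal sketch of the iterative Friedman grouping described above.
# All scores and system names here are hypothetical.
import numpy as np
from scipy.stats import friedmanchisquare

rng = np.random.default_rng(42)
# Each array: per-document scores for one system (40 test documents).
scores = {name: rng.normal(mu, 0.04, 40)
          for name, mu in [("sysA", 0.43), ("sysB", 0.42), ("sysC", 0.42),
                           ("sysD", 0.33), ("sysE", 0.25)]}

# Sort systems from highest to lowest mean score.
remaining = sorted(scores, key=lambda s: scores[s].mean(), reverse=True)
groups = []
while remaining:
    pool = list(remaining)
    # Drop the lowest scorer until the Friedman test no longer rejects
    # "all systems in the pool are equivalent" at the 95% level.
    # (The test needs at least three systems; a final pool of two would
    # call for a paired test, which this sketch omits.)
    while len(pool) > 2:
        _, p_value = friedmanchisquare(*(scores[s] for s in pool))
        if p_value >= 0.05:
            break
        pool.pop()
    groups.append(pool)                 # one statistically equivalent group
    remaining = remaining[len(pool):]   # repeat for the removed systems

print(groups)  # e.g. [['sysA', 'sysB', 'sysC'], ['sysD', 'sysE']]
```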
Arabic-to-English, Large Data Track, NIST Subset

Overall BLEU Scores
Site ID | BLEU-4 |
---|---|
google | 0.4281 |
ibm | 0.3954 |
isi | 0.3908 |
rwth | 0.3906 |
apptek*# | 0.3874 |
lw | 0.3741 |
bbn | 0.3690 |
ntt | 0.3680 |
itcirst | 0.3466 |
cmu-uka | 0.3369 |
umd-jhu | 0.3333 |
edinburgh*# | 0.3303 |
sakhr | 0.3296 |
nict | 0.2930 |
qmul | 0.2896 |
lcc | 0.2778 |
upc | 0.2741 |
columbia | 0.2465 |
ucb | 0.1978 |
auc | 0.1531 |
dcu | 0.0947 |
kcsl*# | 0.0522 |
Newswire BLEU Scores
Site ID | BLEU-4 |
---|---|
google | 0.4814 |
ibm | 0.4542 |
rwth | 0.4441 |
isi | 0.4426 |
lw | 0.4368 |
bbn | 0.4254 |
apptek*# | 0.4212 |
ntt | 0.4035 |
umd-jhu | 0.3997 |
edinburgh*# | 0.3945 |
cmu-uka | 0.3943 |
itcirst | 0.3798 |
qmul | 0.3737 |
sakhr | 0.3736 |
nict | 0.3568 |
lcc | 0.3089 |
upc | 0.3049 |
columbia | 0.2759 |
ucb | 0.2369 |
auc | 0.1750 |
dcu | 0.0875 |
kcsl*# | 0.0423 |
Newsgroup BLEU Scores
Site ID | BLEU-4 |
---|---|
apptek*# | 0.3311 |
google | 0.3225 |
ntt | 0.2973 |
isi | 0.2895 |
ibm | 0.2774 |
bbn | 0.2771 |
rwth | 0.2726 |
itcirst | 0.2696 |
sakhr | 0.2634 |
lw | 0.2503 |
cmu | 0.2436 |
edinburgh*# | 0.2208 |
lcc | 0.2135 |
columbia | 0.2111 |
umd-jhu | 0.2059 |
nict | 0.1875 |
upc | 0.1842 |
ucb | 0.1690 |
dcu | 0.1177 |
qmul | 0.1116 |
auc | 0.1099 |
kcsl*# | 0.0770 |
Broadcast News BLEU Scores
Site ID | BLEU-4 |
---|---|
google | 0.3781 |
apptek*# | 0.3729 |
lw | 0.3646 |
isi | 0.3630 |
ibm | 0.3612 |
rwth | 0.3511 |
ntt | 0.3324 |
bbn | 0.3302 |
umd-jhu | 0.3148 |
itcirst | 0.3128 |
edinburgh*# | 0.2925 |
cmu | 0.2874 |
sakhr | 0.2814 |
qmul | 0.2768 |
upc | 0.2463 |
nict | 0.2458 |
lcc | 0.2445 |
columbia | 0.2054 |
auc | 0.1419 |
ucb | 0.1114 |
dcu | 0.0594 |
kcsl*# | 0.0326 |
Arabic-to-English, Large Data Track, GALE Subset

Overall BLEU Scores
Site ID | BLEU-4 |
---|---|
apptek*# | 0.1918 |
google | 0.1826 |
isi | 0.1714 |
ibm | 0.1674 |
sakhr | 0.1648 |
rwth | 0.1639 |
lw | 0.1594 |
ntt | 0.1533 |
itcirst | 0.1475 |
bbn | 0.1461 |
cmu | 0.1392 |
umd-jhu | 0.1370 |
qmul | 0.1345 |
edinburgh*# | 0.1305 |
nict | 0.1192 |
upc | 0.1149 |
lcc | 0.1129 |
columbia | 0.0960 |
ucb | 0.0732 |
auc | 0.0635 |
dcu | 0.0320 |
kcsl*# | 0.0176 |
Newswire BLEU Scores
Site ID | BLEU-4 |
---|---|
google | 0.2647 |
ibm | 0.2432 |
isi | 0.2300 |
rwth | 0.2263 |
apptek*# | 0.2225 |
sakhr | 0.2196 |
lw | 0.2193 |
ntt | 0.2180 |
bbn | 0.2170 |
itcirst | 0.2104 |
umd-jhu | 0.2084 |
cmu | 0.2055 |
edinburgh*# | 0.2052 |
qmul | 0.1984 |
nict | 0.1773 |
lcc | 0.1648 |
upc | 0.1575 |
columbia | 0.1438 |
ucb | 0.1299 |
auc | 0.0937 |
dcu | 0.0466 |
kcsl*# | 0.0182 |
Newsgroup BLEU Scores
Site ID | BLEU-4 |
---|---|
apptek*# | 0.1747 |
sakhr | 0.1331 |
google | 0.1130 |
ibm | 0.1060 |
rwth | 0.1017 |
isi | 0.0918 |
ntt | 0.0906 |
lw | 0.0853 |
cmu | 0.0840 |
bbn | 0.0837 |
itcirst | 0.0821 |
qmul | 0.0818 |
umd-jhu | 0.0754 |
edinburgh*# | 0.0681 |
lcc | 0.0643 |
nict | 0.0639 |
columbia | 0.0634 |
upc | 0.0603 |
ucb | 0.0411 |
auc | 0.0326 |
dcu | 0.0254 |
kcsl*# | 0.0089 |
Broadcast News BLEU Scores
Site ID | BLEU-4 |
---|---|
apptek*# | 0.1944 |
isi | 0.1766 |
google | 0.1721 |
lw | 0.1649 |
rwth | 0.1599 |
ibm | 0.1588 |
sakhr | 0.1495 |
itcirst | 0.1471 |
ntt | 0.1469 |
bbn | 0.1391 |
cmu | 0.1362 |
umd-jhu | 0.1309 |
qmul | 0.1266 |
edinburgh*# | 0.1240 |
nict | 0.1152 |
upc | 0.1150 |
lcc | 0.1016 |
columbia | 0.0879 |
auc | 0.0619 |
ucb | 0.0412 |
dcu | 0.0252 |
kcsl*# | 0.0229 |
Broadcast Conversation BLEU Scores
Site ID | BLEU-4 |
---|---|
isi | 0.1756 |
apptek*# | 0.1747 |
google | 0.1745 |
rwth | 0.1615 |
lw | 0.1582 |
ibm | 0.1563 |
ntt | 0.1512 |
sakhr | 0.1446 |
itcirst | 0.1425 |
bbn | 0.1400 |
umd-jhu | 0.1277 |
qmul | 0.1265 |
cmu | 0.1261 |
edinburgh*# | 0.1203 |
upc | 0.1200 |
lcc | 0.1157 |
nict | 0.1156 |
columbia | 0.0866 |
ucb | 0.0783 |
auc | 0.0620 |
dcu | 0.0306 |
kcsl*# | 0.0183 |
Arabic-to-English, Unlimited Data Track, NIST Subset

Overall BLEU Scores
Site ID | BLEU-4 |
---|---|
google | 0.4535 |
lw | 0.4008 |
rwth | 0.3970 |
rwth+sri+nrc+uw* | 0.3966 |
nrc | 0.3750 |
sri | 0.3743 |
edinburgh*# | 0.3449 |
cmu | 0.3376 |
arl-cmu | 0.1424 |
Newswire BLEU Scores
Site ID | BLEU-4 |
---|---|
google | 0.5034 |
lw | 0.4589 |
rwth+sri+nrc+uw* | 0.4493 |
rwth | 0.4458 |
nrc | 0.4300 |
sri | 0.4240 |
edinburgh*# | 0.4133 |
cmu | 0.3974 |
arl-cmu | 0.1402 |
Newsgroup BLEU Scores
Site ID | BLEU-4 |
---|---|
google | 0.3652 |
lw | 0.2851 |
rwth | 0.2829 |
nrc | 0.2799 |
rwth+sri+nrc+uw* | 0.2755 |
sri | 0.2534 |
cmu | 0.2372 |
edinburgh*# | 0.2287 |
arl-cmu | 0.1485 |
Broadcast News BLEU Scores
Site ID | BLEU-4 |
---|---|
google | 0.4018 |
lw | 0.3685 |
rwth | 0.3662 |
rwth+sri+nrc+uw* | 0.3639 |
sri | 0.3326 |
nrc | 0.3312 |
edinburgh*# | 0.3049 |
cmu | 0.2988 |
arl-cmu | 0.1363 |
Arabic-to-English, Unlimited Data Track, GALE Subset

Overall BLEU Scores
Site ID | BLEU-4 |
---|---|
google | 0.1957 |
lw | 0.1721 |
rwth+sri+nrc+uw* | 0.1710 |
rwth | 0.1680 |
sri | 0.1614 |
nrc | 0.1517 |
cmu | 0.1382 |
edinburgh*# | 0.1365 |
arl-cmu | 0.0736 |
Newswire BLEU Scores
Site ID | BLEU-4 |
---|---|
google | 0.2812 |
lw | 0.2294 |
rwth+sri+nrc+uw* | 0.2289 |
rwth | 0.2258 |
nrc | 0.2172 |
sri | 0.2081 |
edinburgh*# | 0.2068 |
cmu | 0.2006 |
arl-cmu | 0.0858 |
Newsgroup BLEU Scores
Site ID | BLEU-4 |
---|---|
google | 0.1267 |
rwth | 0.1133 |
rwth+sri+nrc+uw* | 0.1078 |
lw | 0.1007 |
nrc | 0.1007 |
sri | 0.0953 |
cmu | 0.0894 |
edinburgh*# | 0.0722 |
arl-cmu | 0.0558 |
Broadcast News BLEU Scores
Site ID | BLEU-4 |
---|---|
google | 0.1868 |
rwth+sri+nrc+uw* | 0.1730 |
lw | 0.1715 |
sri | 0.1661 |
rwth | 0.1625 |
nrc | 0.1415 |
edinburgh*# | 0.1293 |
cmu | 0.1276 |
arl-cmu | 0.0855 |
Broadcast Conversation BLEU Scores
Site ID | BLEU-4 |
---|---|
google | 0.1824 |
lw | 0.1756 |
rwth+sri+nrc+uw* | 0.1676 |
sri | 0.1671 |
rwth | 0.1658 |
nrc | 0.1429 |
edinburgh*# | 0.1341 |
cmu | 0.1322 |
arl-cmu | 0.0584 |
Chinese-to-English, Large Data Track, NIST Subset

Overall BLEU Scores
Site ID | BLEU-4 |
---|---|
isi | 0.3393 |
google | 0.3316 |
lw | 0.3278 |
rwth | 0.3022 |
ict | 0.2913 |
edinburgh*# | 0.2830 |
bbn | 0.2781 |
nrc | 0.2762 |
itcirst | 0.2749 |
umd-jhu | 0.2704 |
ntt | 0.2595 |
nict | 0.2449 |
cmu | 0.2348 |
msr | 0.2314 |
qmul | 0.2276 |
hkust | 0.2080 |
upc | 0.2071 |
upenn | 0.1958 |
iscas | 0.1816 |
lcc | 0.1814 |
xmu | 0.1580 |
lingua* | 0.1341 |
kcsl*# | 0.0512 |
ksu | 0.0401 |
Newswire BLEU Scores
Site ID | BLEU-4 |
---|---|
isi | 0.3486 |
google | 0.3470 |
lw | 0.3404 |
ict | 0.3085 |
rwth | 0.3022 |
nrc | 0.2867 |
umd-jhu | 0.2863 |
edinburgh*# | 0.2776 |
bbn | 0.2774 |
itcirst | 0.2739 |
ntt | 0.2656 |
nict | 0.2509 |
cmu | 0.2496 |
msr | 0.2387 |
qmul | 0.2299 |
upenn | 0.2064 |
upc | 0.2057 |
hkust | 0.1999 |
lcc | 0.1721 |
iscas | 0.1715 |
xmu | 0.1619 |
lingua* | 0.1412 |
kcsl*# | 0.0510 |
ksu | 0.0380 |
Newsgroup BLEU Scores
Site ID | BLEU-4 |
---|---|
google | 0.2620 |
isi | 0.2571 |
lw | 0.2454 |
edinburgh*# | 0.2434 |
rwth | 0.2417 |
nrc | 0.2330 |
ict | 0.2325 |
bbn | 0.2275 |
itcirst | 0.2264 |
umd-jhu | 0.2061 |
ntt | 0.2036 |
nict | 0.2006 |
msr | 0.1878 |
cmu | 0.1865 |
hkust | 0.1851 |
qmul | 0.1840 |
iscas | 0.1681 |
upenn | 0.1665 |
lcc | 0.1634 |
upc | 0.1619 |
xmu | 0.1406 |
lingua* | 0.1207 |
kcsl*# | 0.0531 |
ksu | 0.0361 |
Broadcast News BLEU Scores
Site ID | BLEU-4 |
---|---|
rwth | 0.3501 |
google | 0.3481 |
isi | 0.3463 |
lw | 0.3327 |
bbn | 0.3197 |
edinburgh*# | 0.3172 |
itcirst | 0.3128 |
ict | 0.2977 |
ntt | 0.2928 |
umd-jhu | 0.2928 |
nrc | 0.2914 |
qmul | 0.2571 |
nict | 0.2568 |
msr | 0.2527 |
cmu | 0.2468 |
upc | 0.2403 |
hkust | 0.2376 |
iscas | 0.2090 |
lcc | 0.2046 |
upenn | 0.2008 |
xmu | 0.1652 |
lingua* | 0.1323 |
kcsl*# | 0.0475 |
ksu | 0.0464 |
Chinese-to-English, Large Data Track, GALE Subset

Overall BLEU Scores
Site ID | BLEU-4 |
---|---|
google | 0.1470 |
isi | 0.1413 |
lw | 0.1299 |
edinburgh*# | 0.1199 |
itcirst | 0.1194 |
nrc | 0.1194 |
rwth | 0.1187 |
ict | 0.1185 |
bbn | 0.1165 |
umd-jhu | 0.1140 |
cmu | 0.1135 |
ntt | 0.1116 |
nict | 0.1106 |
hkust | 0.0984 |
msr | 0.0972 |
qmul | 0.0943 |
upc | 0.0931 |
upenn | 0.0923 |
iscas | 0.0860 |
lcc | 0.0813 |
xmu | 0.0747 |
lingua* | 0.0663 |
ksu | 0.0218 |
kcsl*# | 0.0199 |
Newswire BLEU Scores
Site ID | BLEU-4 |
---|---|
google | 0.1905 |
isi | 0.1685 |
lw | 0.1596 |
ict | 0.1515 |
edinburgh*# | 0.1467 |
rwth | 0.1448 |
bbn | 0.1433 |
umd-jhu | 0.1419 |
nrc | 0.1404 |
itcirst | 0.1377 |
cmu | 0.1353 |
ntt | 0.1350 |
msr | 0.1280 |
hkust | 0.1161 |
nict | 0.1155 |
qmul | 0.1102 |
upenn | 0.1068 |
upc | 0.1039 |
iscas | 0.0947 |
lcc | 0.0878 |
xmu | 0.0861 |
lingua* | 0.0657 |
kcsl*# | 0.0178 |
ksu | 0.0138 |
Newsgroup BLEU Scores
Site ID | BLEU-4 |
---|---|
google | 0.1365 |
isi | 0.1235 |
edinburgh*# | 0.1140 |
lw | 0.1137 |
ict | 0.1130 |
itcirst | 0.1108 |
nrc | 0.1098 |
nict | 0.1075 |
rwth | 0.1071 |
cmu | 0.1054 |
bbn | 0.1049 |
ntt | 0.1026 |
umd-jhu | 0.0978 |
upenn | 0.0941 |
hkust | 0.0892 |
qmul | 0.0858 |
upc | 0.0851 |
msr | 0.0841 |
lcc | 0.0765 |
iscas | 0.0745 |
lingua* | 0.0687 |
xmu | 0.0681 |
ksu | 0.0249 |
kcsl*# | 0.0177 |
Broadcast News BLEU Scores
Site ID | BLEU-4 |
---|---|
isi | 0.1441 |
google | 0.1409 |
lw | 0.1343 |
rwth | 0.1231 |
itcirst | 0.1193 |
nrc | 0.1192 |
cmu | 0.1159 |
bbn | 0.1146 |
ict | 0.1146 |
edinburgh*# | 0.1110 |
ntt | 0.1096 |
nict | 0.1090 |
umd-jhu | 0.1084 |
hkust | 0.1005 |
upc | 0.0986 |
qmul | 0.0951 |
msr | 0.0922 |
iscas | 0.0891 |
upenn | 0.0882 |
lcc | 0.0814 |
xmu | 0.0705 |
lingua* | 0.0609 |
kcsl*# | 0.0204 |
ksu | 0.0192 |
Broadcast Conversation BLEU Scores
Site ID | BLEU-4 |
---|---|
isi | 0.1280 |
google | 0.1262 |
edinburgh*# | 0.1119 |
lw | 0.1112 |
itcirst | 0.1106 |
nict | 0.1106 |
umd-jhu | 0.1102 |
nrc | 0.1095 |
bbn | 0.1060 |
ntt | 0.1016 |
rwth | 0.1013 |
ict | 0.0990 |
cmu | 0.0973 |
hkust | 0.0891 |
msr | 0.0873 |
qmul | 0.0870 |
upc | 0.0848 |
iscas | 0.0842 |
upenn | 0.0815 |
lcc | 0.0796 |
xmu | 0.0753 |
lingua* | 0.0700 |
ksu | 0.0270 |
kcsl*# | 0.0223 |
Chinese-to-English, Unlimited Data Track, NIST Subset

Overall BLEU Scores
Site ID | BLEU-4 |
---|---|
google | 0.3496 |
rwth | 0.2975 |
edinburgh*# | 0.2843 |
cmu | 0.2449 |
casia | 0.1894 |
xmu | 0.1713 |
Newswire BLEU Scores
Site ID | BLEU-4 |
---|---|
google | 0.3634 |
rwth | 0.2974 |
edinburgh*# | 0.2852 |
cmu | 0.2430 |
casia | 0.1905 |
xmu | 0.1696 |
Newsgroup BLEU Scores
Site ID | BLEU-4 |
---|---|
google | 0.2870 |
edinburgh*# | 0.2450 |
rwth | 0.2307 |
cmu | 0.2004 |
casia | 0.1709 |
xmu | 0.1618 |
Broadcast News BLEU Scores
Site ID | BLEU-4 |
---|---|
google | 0.3649 |
rwth | 0.3509 |
edinburgh*# | 0.3142 |
cmu | 0.2644 |
casia | 0.1889 |
xmu | 0.1818 |
Chinese-to-English, Unlimited Data Track, GALE Subset

Overall BLEU Scores
Site ID | BLEU-4 |
---|---|
google | 0.1526 |
edinburgh*# | 0.1187 |
rwth | 0.1172 |
cmu | 0.1034 |
casia | 0.0900 |
xmu | 0.0793 |
Newswire BLEU Scores
Site ID | BLEU-4 |
---|---|
google | 0.2057 |
edinburgh*# | 0.1465 |
rwth | 0.1436 |
cmu | 0.1158 |
casia | 0.1001 |
xmu | 0.0817 |
Newsgroup BLEU Scores
Site ID | BLEU-4 |
---|---|
google | 0.1432 |
edinburgh*# | 0.1070 |
rwth | 0.1032 |
cmu | 0.1015 |
casia | 0.0916 |
xmu | 0.0782 |
Broadcast News BLEU Scores
Site ID | BLEU-4 |
---|---|
google | 0.1482 |
rwth | 0.1224 |
edinburgh*# | 0.1090 |
cmu | 0.1020 |
casia | 0.0891 |
xmu | 0.0775 |
Broadcast Conversation BLEU Scores
Site ID | BLEU-4 |
---|---|
google | 0.1206 |
edinburgh*# | 0.1157 |
rwth | 0.1010 |
cmu | 0.0957 |
casia | 0.0812 |
xmu | 0.0801 |
Unlimited Plus Data Track

NIST Subset, BLEU-4 Scores

Site ID | Language | Overall | Newswire | Newsgroup | Broadcast News |
---|---|---|---|---|---|
google | Arabic | 0.4569 | 0.5060 | 0.3727 | 0.4076 |
google | Chinese | 0.3615 | 0.3725 | 0.2926 | 0.3859 |
GALE Subset, BLEU-4 Scores

Site ID | Language | Overall | Newswire | Newsgroup | Broadcast News | Broadcast Conversation |
---|---|---|---|---|---|---|
google | Arabic | 0.2024 | 0.2820 | 0.1359 | 0.1932 | 0.1925 |
google | Chinese | 0.1576 | 0.2086 | 0.1454 | 0.1532 | 0.1300 |