Primary Test Sets

These test sets are the primary means of evaluating machine translation quality on this site. If you'd like to participate, consider translating one of these sets.

| Name | Origin | Domain | Related Corpora | Citation | Notes | Test Setups |
| --- | --- | --- | --- | --- | --- | --- |
| it-test2016 (Matrix) (Download) | ACL WMT 2016 | IT | | | A collection of 1000 sentences from the IT domain, in equal amounts from Bulgarian, Czech, German, Spanish, Dutch, and Portuguese. | 56 |
| khresmoi-query-test-2014 (Matrix) (Download) | Khresmoi | Medical | | | English search queries randomly sampled from user query logs of the Health on the Net foundation and the Trip database and professionally translated into Czech, German, and French. | 6 |
| khresmoi-summary-test-2014 (Matrix) (Download) | Khresmoi | Medical | | | Sentences sampled from summaries of English medical documents crawled from the web in 2012 and professionally translated into Czech, French, and German. | 6 |
| newstest2013 (Matrix) (Download) | ACL WMT 2013 | News | | | A collection of 52 news articles, in equal amounts from Czech, English, French, German, Russian, and Spanish. These articles were collected in December 2012 and professionally translated for WMT 2013. | 30 |
| newstest2014 (Matrix) (Download) | ACL WMT 2014 | News | | | Official test set for ACL WMT 2014. | 10 |
| newstest2015 (Matrix) (Download) | ACL WMT 2015 | News | | | A collection of news articles (and, in the case of French, news discussions) in Czech, English, French, German, Finnish, and Russian. These articles were collected in December 2014 and professionally translated for WMT 2015. | 10 |
| newstest2016 (Matrix) (Download) | ACL WMT 2016 | News | | | A collection of news articles in Czech, English, German, Finnish, Romanian, Russian, and Turkish. These articles were collected in December 2014 and professionally translated for WMT 2016. | 12 |
| newstest2017 (Matrix) (Download) | EMNLP WMT 2017 | News | | | A collection of news articles in Chinese, Czech, English, Finnish, German, Latvian, Russian, and Turkish. | 14 |

Other Test Sets

| Name | Origin | Domain | Related Corpora | Citation | Notes | Test Setups |
| --- | --- | --- | --- | --- | --- | --- |
| acquis (Matrix) (Download) | JRC | Laws | JRC-Acquis | 462 Machine Translation Systems for Europe, Philipp Koehn, Alexandra Birch and Ralf Steinberger, MT Summit XII, 2009 | Moses systems trained on the freely available Acquis corpus. Please use the provided development sets for development and ensure that you do not include these in training (see *.info files for document IDs). | 462 |
| nc-test2006 (Matrix) (Download) | NAACL WMT 2006 | News Commentary | News Commentary | Manual and Automatic Evaluation of Machine Translation between European Languages, Philipp Koehn and Christof Monz, NAACL WMT 2006 | This test set contains the 1064 out-of-domain Europarl lines from NAACL WMT 2006. Also released as "nc-devtest2007" for ACL WMT 2007. | 12 |
| nc-test2007 (Matrix) (Download) | ACL WMT 2007 | News Commentary | News Commentary | (Meta-) Evaluation of Machine Translation, Callison-Burch et al., ACL WMT 2007 | Official domain-specific test set from ACL WMT 2007. | 20 |
| nc-test2008 (Matrix) (Download) | ACL WMT 2008 | News Commentary | News Commentary | Further Meta-Evaluation of Machine Translation, Callison-Burch et al., ACL WMT 2008 | This was used only for Czech to and from English for the official evaluation. | 20 |
| newstest2008 (Matrix) (Download) | ACL WMT 2008 | News | | Further Meta-Evaluation of Machine Translation, Callison-Burch et al., ACL WMT 2008 | A collection of 90 news articles, 15 each from Czech, English, French, German, Spanish and Hungarian. These articles were collected in November and December 2007 and human translated for WMT 2008. | 30 |
| newstest2009 (Matrix) (Download) | ACL WMT 2009 | News | Europarl, News Commentary, Hunglish | Findings of the 2009 Workshop on Statistical Machine Translation, Callison-Burch et al., ACL WMT 2009 | A collection of 136 news articles, in equal amounts from Czech, English, French, German, Spanish, Italian and Hungarian. These articles were collected in September and October 2008 and human translated for WMT 2009. | 42 |
| newstest2010 (Matrix) (Download) | ACL WMT 2010 | News | Europarl, News Commentary | Findings of the 2010 Joint Workshop on Statistical Machine Translation and Metrics for Machine Translation, Callison-Burch et al., ACL WMT 2010 | A collection of 119 news articles, in equal amounts from Czech, English, French, German, and Spanish. These articles were collected in December 2009 and human translated for WMT 2010. | 20 |
| newstest2011 (Matrix) (Download) | ACL WMT 2011 | News | Europarl, News Commentary | | A collection of 110 news articles, in equal amounts from Czech, English, French, German, and Spanish. These articles were collected in December 2010 and human translated for WMT 2011. | 20 |
| newstest2012 (Matrix) (Download) | ACL WMT 2012 | News | | | A collection of 99 news articles, in equal amounts from Czech, English, French, German, and Spanish. These articles were collected in December 2011 and human translated for WMT 2012. | 20 |
| syscomb2011 (Matrix) (Download) | ACL WMT 2011 | News | Europarl, News Commentary | | Test set for system combination. Subset of the WMT 2011 translation task test set. | 20 |
| test2005 (Matrix) (Download) | ACL WPT 2005 | European Parliament Proceedings | Europarl | | Official test set for ACL WPT 2005. This is identical to "devtest2006" released for subsequent WMT shared tasks. | 110 |
| test2006 (Matrix) (Download) | NAACL WMT 2006 | European Parliament Proceedings | Europarl | Manual and Automatic Evaluation of Machine Translation between European Languages, Philipp Koehn and Christof Monz, NAACL WMT 2006 | This test set contains the 2000 in-domain Europarl lines from NAACL WMT 2006. | 110 |
| test2007 (Matrix) (Download) | ACL WMT 2007 | European Parliament Proceedings | Europarl | (Meta-) Evaluation of Machine Translation, Callison-Burch et al., ACL WMT 2007 | Official test set from ACL WMT 2007. | 110 |
| test2008 (Matrix) (Download) | ACL WMT 2008 | European Parliament Proceedings | Europarl | Further Meta-Evaluation of Machine Translation, Callison-Burch et al., ACL WMT 2008 | Official Europarl test set from WMT 2008. | 110 |