Name: Europarl

Notes: The Europarl parallel corpus is extracted from the proceedings of the European Parliament. It covers 11 European languages: Romance (French, Italian, Spanish, Portuguese), Germanic (English, Dutch, German, Danish, Swedish), Greek, and Finnish. Version 3 includes data through October 2006.

Citation: Philipp Koehn. "Europarl: A Parallel Corpus for Statistical Machine Translation." MT Summit 2005.

Url: http://www.statmt.org/europarl/
