
Scored Systems

| System | Submitter | System Notes | Constraint | Run Notes | BLEU | BLEU-cased | TER | BEER 2.0 | CharacTER |
|---|---|---|---|---|---|---|---|---|---|
| MSRA.MADL | Microsoft | Multi-Agent dual learning + transformer_big | yes | | 44.3 | 42.8 | 0.462 | 0.665 | 0.424 |
| sharp_deen | sharp | | yes | | 43.2 | 41.8 | 0.473 | 0.661 | 0.431 |
| sharp_deen | sharp | | yes | | 43.0 | 41.5 | 0.475 | 0.661 | 0.432 |
| sharp_deen | sharp | | yes | | 42.6 | 41.1 | 0.482 | 0.660 | 0.433 |
| Facebook FAIR | edunov (Facebook FAIR) | | yes | | 42.3 | 40.8 | failed | 0.655 | 0.440 |
| sharp_deen | sharp | | yes | | 42.1 | 40.6 | 0.487 | 0.659 | 0.435 |
| NEU | NiuTrans (Northeastern University) | Ensemble of 8 deep Transformer (30 layers) models + back-translation with sampling + distillation by ensemble teachers + hypothesis combination | yes | | 42.1 | 40.5 | 0.476 | 0.655 | 0.438 |
| wyx_mt_deen | wyx | zwf big model baseline | yes | | 42.1 | 40.0 | 0.483 | 0.653 | 0.452 |
| UCAM | fstahlberg (University of Cambridge) | | yes | 1 sentence-level LM, 1 document-level LM, 4 NMT models fine-tuned with EWC | 41.3 | 39.7 | 0.490 | 0.648 | 0.456 |
| RWTH Aachen System | pbahar (RWTH Aachen University) | Ensemble of 3 big Transformer models, back-translated data, data filtering | yes | | 41.2 | 39.6 | 0.484 | 0.650 | 0.447 |
| MLLP-UPV | mllp (MLLP group, Univ. Politècnica de València) | Transformer big model; includes 10M sentences from Paracrawl and 44M noisy back-translated sentences; fine-tuned on newstest08-16; single model | yes | | 40.9 | 39.3 | 0.491 | 0.646 | 0.456 |
| test | kxyg (SYU) | ensemble | yes | | 40.6 | 39.0 | 0.496 | 0.644 | 0.459 |
| dfki-nmt | zhangjingyi (dfki) | | yes | | 42.1 | 38.8 | 0.481 | 0.649 | 0.445 |
| JHU | kelly-yash-jhu (Johns Hopkins University) | 3 Transformer base ensemble + filtered back-translation with restricted sampling + filtered ParaCrawl & CommonCrawl + continued training on newstest15-18 + reranking with R2L models and LM | yes | Post-submission De-En work | 40.0 | 38.4 | 0.502 | 0.643 | 0.455 |
| JHU | kelly-yash-jhu (Johns Hopkins University) | 3 Transformer base ensemble + filtered back-translation with restricted sampling + filtered ParaCrawl & CommonCrawl + continued training on newstest15-18 + reranking with R2L models and LM | yes | | 39.6 | 38.1 | 0.506 | 0.642 | 0.457 |
| uedin-de-en-lr4ens1 | ugermann (University of Edinburgh) | | yes | Ensemble of 4 checkpoint L2R big Transformers. A generic big Transformer was trained on available news and other data, including data back-translated with our 2018 systems (all monolingual news data), then tuned on data selected specifically for the test set (considering the source side only). The ensemble consists of the two top BLEU / ce-mean-words checkpoint models as per validation on newstest2018. | 36.7 | 35.0 | 0.514 | 0.626 | 0.486 |
| sharp_deen | sharp | | yes | | 35.9 | 34.2 | 0.534 | 0.617 | 0.500 |
| TartuNLP-c | andre (University of Tartu) | | yes | Baseline | 35.9 | 33.9 | 0.529 | 0.619 | 0.494 |
| PROMT NMT DE-EN | Alex Molchanov (PROMT LLC) | transformer, single model | no | transformer, single model | 33.7 | 32.1 | 0.556 | 0.603 | 0.520 |
| parfda | bicici | | yes | de-en | 32.5 | 30.9 | 0.563 | 0.602 | 0.554 |
| parfda | bicici | | yes | de-en with hyphen splitting | 32.4 | 30.7 | 0.563 | 0.604 | 0.550 |
| parfda | bicici | | yes | de-en with nplm and kenlm | 32.2 | 30.7 | 0.577 | 0.599 | 0.555 |
| parfda | bicici | | yes | de-en with nplm | 30.9 | 29.2 | 0.576 | 0.597 | 0.560 |
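
The BLEU and BLEU-cased columns differ only in case handling: cased BLEU compares hypothesis and reference as-is, while plain BLEU is conventionally computed on lowercased text. As an illustration of what these numbers measure, here is a minimal sketch of corpus-level BLEU (whitespace tokenization, single reference, no smoothing). The `corpus_bleu` helper is my own for this sketch, not the official scoring tool used for this table, so the exact values it produces will not match the matrix scores.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    # Multiset of all n-grams in a token list.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def corpus_bleu(hypotheses, references, max_n=4, lowercase=False):
    """Corpus-level BLEU: geometric mean of modified n-gram precisions
    (orders 1..max_n) multiplied by a brevity penalty.

    lowercase=False mimics cased BLEU; lowercase=True mimics plain BLEU.
    """
    match = [0] * max_n   # clipped n-gram matches per order
    total = [0] * max_n   # hypothesis n-gram counts per order
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        if lowercase:
            hyp, ref = hyp.lower(), ref.lower()
        h, r = hyp.split(), ref.split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            # Clipped matches: intersection of the two n-gram multisets.
            match[n - 1] += sum((ngrams(h, n) & ngrams(r, n)).values())
            total[n - 1] += max(len(h) - n + 1, 0)
    if min(match) == 0:
        return 0.0  # unsmoothed BLEU is zero if any order has no match
    log_prec = sum(math.log(m / t) for m, t in zip(match, total)) / max_n
    bp = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100.0 * bp * math.exp(log_prec)

# A perfect hypothesis scores 100; a case mismatch hurts cased BLEU
# but not lowercased BLEU.
print(corpus_bleu(["the cat sat on the mat"], ["the cat sat on the mat"]))
print(corpus_bleu(["The cat sat on the mat"], ["the cat sat on the mat"],
                  lowercase=True))
```

This also shows why the BLEU column is always at least as high as BLEU-cased for a given run: lowercasing can only add n-gram matches, never remove them.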