Statistical Machine Translation - Nepal Summer School in Advanced Language Engineering

Invitation

The success of statistical machine translation systems such as Moses, Language Weaver, and Google Translate has shown that it is possible to build high-performance machine translation systems with a relatively small amount of effort using statistical learning techniques.

This course will present the basic modeling behind statistical machine translation in a concise way.

Instructor

Alex Fraser

Email Address: SubstituteMyLastName@ims.uni-stuttgart.de

University of Stuttgart

DFG Project: Models of Morphosyntax for Statistical Machine Translation

Institute for Natural Language Processing (IMS/IfNLP)

SFB 732 - Incremental Specification in Context

Schedule

Location: University of Kathmandu, see the Summer School in Advanced Language Engineering web page.

Homework assignments:

Additional Resources:

Lectures:

September 18th Part 6. Translating to morphologically rich languages: case study on German
powerpoint slides
pdf slides
September 17th Part 5. Advanced topics in SMT. Discriminative bitext alignment, morphological processing, syntax
powerpoint slides
pdf slides
Reading: Koehn 10.1, 10.2, 10.3, 11.1
September 16th Part 4. Log-linear Models for SMT and Minimum Error Rate Training
powerpoint slides
pdf slides
Reading: Koehn Chapter 5, 9.1, 9.2, 9.3
September 15th Part 3. Phrase-based Models and Decoding (automatically translating a text given an already learned model)
powerpoint slides
pdf slides
Reading: Koehn 5.1, 5.2, Chapter 6
September 13th Part 2. Bitext alignment (extracting lexical knowledge from parallel corpora)
powerpoint slides
pdf slides
Reading: Koehn Chapter 4
Optional Reading: Kevin Knight's SMT Tutorial (concentrate on Model 1)
September 10th to 11th Part 1. Introduction, basics of statistical machine translation (SMT), evaluation of MT (for BLEU, I switched to slides from Chris Callison-Burch)
powerpoint slides
pdf slides
CCB slides
Reading: Koehn Chapters 1 and 3
OmegaT translation memory