Models of Morphosyntax for Statistical Machine
Translation -- Morphosyntaktische Modelle
für statistische maschinelle Übersetzung
Statistical approaches to machine translation (MT)
have proven effective in the last few years. This does not hold,
however, when translating into a morphologically rich language,
particularly when there is also significant syntactic
divergence between the two languages. The quality of statistical machine translation is poor in
this case because the models of morphology, syntax and translation
are assumed to be independent of one another, an assumption that
does not reflect linguistic reality.
The project uses advances in the automatic linguistic
analysis of syntax and morphology to improve statistical MT. The
dependencies between morphology, syntax and translation are
modeled directly. This leads to translation
models and search algorithms that dramatically improve
translation quality for morphologically rich languages.
Funded by the German Research Foundation
Principal Investigators
Dr. Alexander Fraser
Prof. Dr. Hinrich Schuetze
Present Staff
Anita Ramm (née Gojun)
Marion Di Marco (née Weller)
Past Staff
Fabienne Braune
Fabienne Cap (née Fritzinger)
Nadir Durrani
Patrick Leucht
Hassan Sajjad
Nina Seemann
Renjing Wang
Publications
-
Nadir Durrani, Helmut Schmid, Alexander Fraser, Philipp Koehn, Hinrich Schütze (2015). The Operation Sequence Model - Combining N-Gram-based and Phrase-based Statistical Machine Translation. Computational Linguistics, 41(2), pages 157-186.
-
Marion Weller, Alexander Fraser, Sabine Schulte im Walde (2015). Target-side Generation of Prepositions for SMT. In Proceedings of the 18th Annual Conference of the European Association for Machine Translation (EAMT), pages 177-184, Antalya, Turkey, May.
-
Marion Weller, Sabine Schulte im Walde, Alexander Fraser (2014). Using Noun Class Information to Model Selectional Preferences for Translating Prepositions in SMT. In Proceedings of the Eleventh Biennial Conference of the Association for Machine Translation in the Americas (AMTA), pages 275-287, Vancouver, BC, Canada, October.
-
Nadir Durrani, Philipp Koehn, Helmut Schmid, Alexander Fraser (2014). Investigating the Usefulness of Generalized Word Representations in SMT. In Proceedings of the 25th International Conference on Computational Linguistics (COLING), pages 421-432, Dublin, Ireland, August.
-
Marion Weller, Fabienne Cap, Stefan Müller, Sabine Schulte im Walde, Alexander Fraser (2014). Distinguishing Degrees of Compositionality in Compound Splitting for Statistical Machine Translation. In Proceedings of the First Workshop on Computational Approaches to Compound Analysis (ComaComa) at COLING, pages 81-90, Dublin, Ireland, August.
-
Marion Weller, Alexander Fraser, Ulrich Heid (2014). Combining Bilingual Terminology Mining and Morphological Modeling for Domain Adaptation in SMT. In Proceedings of the Seventeenth Annual Conference of the European Association for Machine Translation (EAMT), pages 11-18, Dubrovnik, Croatia, June.
-
Fabienne Cap, Marion Weller, Anita Ramm, Alexander Fraser (2014). CimS - The CIS and IMS joint submission to WMT 2014: translating from English into German. In Proceedings of the ACL Ninth Workshop on Statistical Machine Translation, pages 71-78, Baltimore, Maryland, USA, June.
-
Fabienne Cap, Alexander Fraser, Marion Weller, Aoife Cahill (2014). How to Produce Unseen Teddy Bears: Improved Morphological Processing of Compounds in SMT. In Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics (EACL), pages 579-587, Gothenburg, Sweden, April.
-
Ales Tamchyna, Fabienne Braune, Alexander Fraser, Marine Carpuat, Hal Daume III, Chris Quirk (2014). Integrating a Discriminative Classifier into Phrase-based and Hierarchical Decoding. The Prague Bulletin of Mathematical Linguistics, No. 101, pages 29-41, April.
-
Alexander Fraser, Marion Weller, Aoife Cahill, Fabienne Cap (2013). NECTAR: Modeling Inflection and Word-Formation in SMT. In Proceedings of the International Conference of the German Society for Computational Linguistics and Language Technology (GSCL), Nectar Track, 6 pages, Darmstadt, Germany, September.
-
Nadir Durrani, Alexander Fraser, Helmut Schmid, Hieu Hoang, Philipp Koehn (2013). Can Markov Models Over Minimal Translation Units Help Phrase-Based SMT? In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (ACL), pages 399-405, Sofia, Bulgaria, August.
-
Marion Weller, Alexander Fraser, Sabine Schulte im Walde (2013). Using Subcategorization Knowledge to Improve Case Prediction for Translation to German. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (ACL), pages 593-603, Sofia, Bulgaria, August.
-
Nadir Durrani, Helmut Schmid, Alexander Fraser, Hassan Sajjad, Richard Farkas (2013). Munich-Edinburgh-Stuttgart Submissions of OSM Systems at WMT13. In Proceedings of the ACL Eighth Workshop on Statistical Machine Translation, pages 122-127, Sofia, Bulgaria, August.
-
Hassan Sajjad, Svetlana Smekalova, Nadir Durrani, Alexander Fraser, Helmut Schmid (2013). QCRI-MES Submission at WMT13: Using Transliteration Mining to Improve Statistical Machine Translation. In Proceedings of the ACL Eighth Workshop on Statistical Machine Translation, pages 219-224, Sofia, Bulgaria, August.
-
Marion Weller, Max Kisselew, Svetlana Smekalova, Alexander Fraser, Helmut Schmid, Nadir Durrani, Hassan Sajjad, Richard Farkas (2013). Munich-Edinburgh-Stuttgart Submissions at WMT13: Morphological and Syntactic Processing for SMT. In Proceedings of the ACL Eighth Workshop on Statistical Machine Translation, pages 232-239, Sofia, Bulgaria, August.
-
Nadir Durrani, Alexander Fraser, Helmut Schmid (2013). Model With Minimal Translation Units, But Decode With Phrases. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL), pages 1-11, Atlanta, Georgia, USA, June.
-
Alexander Fraser, Helmut Schmid, Richard Farkas, Renjing Wang, Hinrich Schuetze (2013). Knowledge Sources for Constituent Parsing of German, a Morphologically Rich and Less-Configurational Language. Computational Linguistics, 39(1), pages 57-85.
-
Hassan Sajjad, Alexander Fraser, Helmut Schmid (2012). A Statistical Model for Unsupervised and Semi-supervised Transliteration Mining. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (ACL), pages 469-477, Jeju Island, Korea, July.
-
Fabienne Braune, Anita Gojun, Alexander Fraser (2012). Long-distance Reordering During Search for Hierarchical Phrase-based SMT. In Proceedings of the 16th Annual Conference of the European Association for Machine Translation (EAMT), pages 177-184, Trento, Italy, May.
-
Alexander Fraser, Marion Weller, Aoife Cahill, Fabienne Cap (2012). Modeling Inflection and Word-Formation in SMT. In Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics (EACL), pages 664-674, Avignon, France, April.
-
Anita Gojun, Alexander Fraser (2012). Determining the Placement of German Verbs in English-to-German SMT. In Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics (EACL), pages 726-735, Avignon, France, April.
-
Hassan Sajjad, Nadir Durrani, Helmut Schmid, Alexander Fraser (2011). Comparing Two Techniques for Learning Transliteration Models Using a Parallel Corpus. In Proceedings of the 5th International Joint Conference on Natural Language Processing (IJCNLP), pages 129-137, Chiang Mai, Thailand, November.
-
Nadir Durrani, Helmut Schmid, Alexander Fraser (2011). A Joint Sequence Translation Model with Integrated Reordering. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics (ACL), pages 1045-1054, Portland, Oregon, USA, June.
-
Hassan Sajjad, Alexander Fraser, Helmut Schmid (2011). An Algorithm for Unsupervised Transliteration Mining with an Application to Word Alignment. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics (ACL), pages 430-439, Portland, Oregon, USA, June.
-
Fabienne Braune, Alexander Fraser (2010). Improved Unsupervised Sentence Alignment for Symmetrical and Asymmetrical Parallel Corpora. In Proceedings of the 23rd International Conference on Computational Linguistics (COLING) - Posters, pages 81-89, Beijing, China, August.
-
Nadir Durrani, Hassan Sajjad, Alexander Fraser, Helmut Schmid (2010). Hindi-to-Urdu Machine Translation Through Transliteration. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL), pages 465-474, Uppsala, Sweden, July.
-
Fabienne Fritzinger, Alexander Fraser (2010). How to Avoid Burning Ducks: Combining Linguistic Analysis and Corpus Statistics for German Compound Processing. In Proceedings of the ACL 2010 Fifth Workshop on Statistical Machine Translation and MetricsMATR, pages 224-234, Uppsala, Sweden, July.
-
Florian Schwarck, Alexander Fraser, Hinrich Schuetze (2010). Bitext-Based Resolution of German Subject-Object Ambiguities. In Proceedings of the 11th Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), Short Papers, pages 737-740, Los Angeles, California, USA, June.