Deep Learning is an interesting new branch of machine learning in which neural networks consisting of multiple layers have shown new generalization capabilities. The seminar will look at advances both in general deep learning approaches and in the specific case of Neural Machine Translation (NMT). NMT is a new paradigm in data-driven machine translation: the entire translation process is posed as an end-to-end supervised classification problem, where the training data consists of sentence pairs and the full sequence-to-sequence task is handled by a single model.
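As a rough illustration of this end-to-end formulation (not tied to any paper in the schedule below), the following minimal PyTorch sketch trains an encoder-decoder model on sentence pairs with a single cross-entropy objective. All vocabulary sizes, dimensions, and tensor shapes are illustrative assumptions.

```python
# Minimal sketch: a single encoder-decoder model trained end-to-end on
# (source, target) sentence pairs. Sizes and data are illustrative only.
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, src_vocab=8000, tgt_vocab=8000, dim=256):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, dim)
        self.encoder = nn.GRU(dim, dim, batch_first=True)
        self.decoder = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, tgt_vocab)

    def forward(self, src_ids, tgt_ids):
        # Encode the source sentence into a hidden state.
        _, h = self.encoder(self.src_emb(src_ids))
        # Decode the target sentence conditioned on that state (teacher forcing).
        dec_out, _ = self.decoder(self.tgt_emb(tgt_ids), h)
        return self.out(dec_out)  # per-position logits over the target vocabulary

# One training step on a toy batch of (source, target) index sequences.
model = Seq2Seq()
src = torch.randint(0, 8000, (2, 7))   # 2 source sentences of length 7
tgt = torch.randint(0, 8000, (2, 9))   # corresponding target sentences of length 9
logits = model(src, tgt[:, :-1])       # predict each next target token
loss = nn.functional.cross_entropy(
    logits.reshape(-1, logits.size(-1)), tgt[:, 1:].reshape(-1)
)
loss.backward()
```

The same framing carries over to Transformer-based NMT; only the encoder and decoder modules change.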
Here is a link to last semester's seminar.
There is a Munich interest group for Deep Learning with an associated mailing list; the paper announcements are sent out on this list. See the link here.
Email Address: SubstituteLastName@cis.uni-muenchen.de
Thursdays at 14:45 (s.t.), location: ZOOM ONLINE
You can install the Zoom client, or click Cancel and use browser support (which might not work in all browsers).
Contact Alexander Fraser if you need the zoom link.
New attendees are welcome. Read the paper and bring a paper or electronic copy with you; you will need to refer to it during the discussion.
Click here for directions to CIS.
If this page appears to be out of date, use your browser's refresh button.
Date | Paper | Links | Discussion Leader |
October 28th | Yi Sun, Yu Zheng, Chao Hao, Hangping Qiu (2021). NSP-BERT: A Prompt-based Zero-Shot Learner Through an Original Pre-training Task--Next Sentence Prediction. arXiv | paper | Sheng Liang |
November 4th | M Saiful Bari, Tasnim Mohiuddin, Shafiq Joty (2020). UXLA: A Robust Unsupervised Data Augmentation Framework for Zero-Resource Cross-Lingual NLP. ACL | paper | Viktor Hangya |
November 25th | Eric Mitchell, Charles Lin, Antoine Bosselut, Chelsea Finn, Christopher D. Manning (2021). Fast Model Editing at Scale. arXiv | paper | Matthias Assenmacher |
December 9th | Victor Sanh, Albert Webson et al. (2021). Multitask Prompted Training Enables Zero-Shot Task Generalization. arXiv | paper | Kerem Şenel |
December 16th | Anonymous. Towards a Unified View of Parameter-Efficient Transfer Learning. ICLR 2022 submission | paper | Alexandra Chronopoulou |
January 27th, 2022 | Michael Matena, Colin Raffel (2021). Merging Models with Fisher-Weighted Averaging. arXiv. | paper | Katharina Hämmerl |
February 3rd | Jason Wei, Maarten Bosma, et al. (2021). Finetuned Language Models Are Zero-Shot Learners. arXiv. | paper | Abdullatif Köksal |
February 20th | Alexei Baevski, Wei-Ning Hsu, Qiantong Xu, Arun Babu, Jiatao Gu, Michael Auli (2022). data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language. arXiv. | paper | Haris Jabbar |
March 3rd | Douwe Kiela, et al. (2021). Dynabench: Rethinking Benchmarking in NLP. NAACL. | paper | Pedro Henrique Luz de Araujo |
March 10th | Yihong Liu, Haris Jabbar, Hinrich Schütze (2022). Flow-Adapter Architecture for Unsupervised Machine Translation. ACL 2022 (draft version) | paper | Yihong and Haris |
March 17th | Pei Zhou, Karthik Gopalakrishnan, et al. (2021). Think Before You Speak: Using Self-talk to Generate Implicit Commonsense Knowledge for Response Generation. arXiv. | paper | Philipp Wicke |
March 31st | Colin Raffel (2021). A Call to Build Models Like We Build Open-Source Software. Blog Post. | blog | Mengjie Zhao |
Further literature:
You can go back through the previous semesters by clicking on the link near the top of the page.