Deep learning is a branch of machine learning in which neural networks consisting of multiple layers have shown strong generalization capabilities. The seminar will look at advances both in general deep learning approaches and in the specific case of Neural Machine Translation (NMT). NMT is a new paradigm in data-driven machine translation. In Neural Machine Translation, the entire translation process is posed as an end-to-end supervised classification problem, where the training data consists of pairs of sentences and the full sequence-to-sequence task is handled by a single model.
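For newcomers, the sequence-to-sequence formulation can be made concrete with a small sketch. The following is a minimal illustrative example, not the model of any specific paper in the schedule; it assumes PyTorch, and the GRU encoder-decoder, vocabulary sizes, and dimensions are placeholder choices.

```python
# Minimal sketch of the end-to-end sequence-to-sequence idea behind NMT.
# Assumes PyTorch; vocabulary sizes and dimensions below are placeholders.
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, src_vocab=8000, tgt_vocab=8000, emb=256, hid=512):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb)
        self.encoder = nn.GRU(emb, hid, batch_first=True)
        self.decoder = nn.GRU(emb, hid, batch_first=True)
        # Per-position classification over the target vocabulary.
        self.out = nn.Linear(hid, tgt_vocab)

    def forward(self, src_ids, tgt_ids):
        # Encode the source sentence into a hidden state.
        _, h = self.encoder(self.src_emb(src_ids))
        # Decode the target sentence conditioned on that state (teacher forcing).
        dec_out, _ = self.decoder(self.tgt_emb(tgt_ids), h)
        return self.out(dec_out)  # logits: (batch, tgt_len, tgt_vocab)

# Training reduces translation to supervised classification of the next target token.
model = Seq2Seq()
src = torch.randint(0, 8000, (2, 7))   # toy batch of source token ids
tgt = torch.randint(0, 8000, (2, 9))   # toy batch of target token ids
logits = model(src, tgt[:, :-1])
loss = nn.CrossEntropyLoss()(logits.reshape(-1, 8000), tgt[:, 1:].reshape(-1))
loss.backward()
```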
Here is a link to last semester's seminar.
There is a Munich interest group for Deep Learning with an associated mailing list; paper announcements are sent out on this list. See the link here.
Email Address: SubstituteLastName@cis.uni-muenchen.de
Thursdays 14:45 (s.t.), location ZOOM ONLINE
You can install the Zoom client, or click cancel and use browser support (this might not work in all browsers).
Contact Alexander Fraser if you need the zoom link.
New attendees are welcome. Read the paper and bring a paper or electronic copy with you; you will need to refer to it during the discussion.
Click here for directions to CIS.
If this page appears to be out of date, use the refresh button of your browser.
Date | Paper | Links | Discussion Leader |
April 1st, 2021 | Swayamdipta et al. Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics. EMNLP 2020 | paper | Antonis Maronikolakis |
April 15th, 2021 | Jonathan H. Clark, Dan Garrette, Iulia Turc, John Wieting (2021). CANINE: Pre-training an Efficient Tokenization-Free Encoder for Language Representation. arXiv 2021. | paper | Ayyoob Imani |
April 29th, 2021 | Koustuv Sinha, Robin Jia, Dieuwke Hupkes, Joelle Pineau, Adina Williams, Douwe Kiela (2021). Masked Language Modeling and the Distributional Hypothesis: Order Word Matters Pre-training for Little. arXiv 2021. | paper | Leonie Weißweiler |
May 6th, 2021 | Andrew Jaegle, Felix Gimeno, Andrew Brock, Andrew Zisserman, Oriol Vinyals, Joao Carreira (2021). Perceiver: General Perception with Iterative Attention. arXiv 2021. | paper | Jindřich Libovický |
May 20th, 2021 | Jonas Pfeiffer, Aishwarya Kamath, Andreas Rücklé, Kyunghyun Cho, Iryna Gurevych (2021). AdapterFusion: Non-Destructive Task Composition for Transfer Learning. EACL 2021. | paper | Viktor Hangya |
May 27th, 2021 | Nikita Nangia, Clara Vania, Rasika Bhalerao, Samuel R. Bowman (2020). CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models. EMNLP 2020. | paper | Victor Steinborn |
June 24th, 2021 | Devendra Singh Sachan, Siva Reddy, William Hamilton, Chris Dyer, Dani Yogatama (2021). End-to-End Training of Multi-Document Reader and Retriever for Open-Domain Question Answering. arXiv 2021. | paper | Masoud Jalili Sabet |
July 1st, 2021 | Qi Dong, Shaogang Gong, Xiatian Zhu (2018). Imbalanced Deep Learning by Minority Class Incremental Rectification. IEEE Trans Pattern Analysis and Machine Intelligence. | paper | Alex Fraser |
July 8th, 2021 | Robert L. Logan IV, Ivana Balažević, Eric Wallace, Fabio Petroni, Sameer Singh, Sebastian Riedel (2021). Cutting Down on Prompts and Parameters: Simple Few-Shot Learning with Language Models. arXiv. | paper | Kerem Şenel |
July 15th, 2021 | Jianing Zhou, Hongyu Gong, Suma Bhat (2021). PIE: Parallel Idiomatic Expression Corpus for Idiomatic Sentence Generation and Paraphrasing. ACL MWE Workshop. | paper | Alex Fraser |
July 22nd, 2021 | Yi Tay, Vinh Q. Tran, Sebastian Ruder, Jai Gupta, Hyung Won Chung, Dara Bahri, Zhen Qin, Simon Baumgartner, Cong Yu, Donald Metzler (2021). Charformer: Fast Character Transformers via Gradient-based Subword Tokenization. arXiv. | paper | Jindřich Libovický |
August 12th, 2021 | Nicola De Cao, Gautier Izacard, Sebastian Riedel, Fabio Petroni (2021). Autoregressive Entity Retrieval. ICLR 2021. | paper | Nora Kassner |
August 19th, 2021 | Mojtaba Komeili, Kurt Shuster, Jason Weston (2021). Internet-Augmented Dialogue Generation. arXiv. | paper | Timo Schick |
September 2nd, 2021 | Boxi Cao, Hongyu Lin, Xianpei Han, Le Sun, Lingyong Yan, Meng Liao, Tong Xue, Jin Xu (2021). Knowledgeable or Educated Guess? Revisiting Language Models as Knowledge Bases. ACL 2021 | paper | Martin Schmitt |
September 16th, 2021 | Linting Xue, Aditya Barua, Noah Constant, Rami Al-Rfou, Sharan Narang, Mihir Kale, Adam Roberts, Colin Raffel (2021). ByT5: Towards a token-free future with pre-trained byte-to-byte models. arXiv. | paper | Valentin Hofmann |
September 23rd, 2021 | Alec Radford et al. (2021). CLIP: Connecting Text and Images. Blog post and arXiv. | blog, paper (see sect. 1 and 2), optional blog | Sophie Henning |
Further literature:
You can go back through the previous semesters by clicking on the link near the top of the page.