Deep Learning is an interesting new branch of machine learning in which neural networks consisting of multiple layers have shown new generalization capabilities. The seminar will look at advances both in general deep learning approaches and in the specific case of Neural Machine Translation (NMT). NMT is a new paradigm in data-driven machine translation. In NMT, the entire translation process is posed as an end-to-end supervised classification problem: the training data consists of pairs of sentences, and the full sequence-to-sequence task is handled in one model.
Here is a link to last semester's seminar.
There is a Munich interest group for Deep Learning with an associated mailing list; paper announcements are sent out on this list. See the link here.
Email Address: Put Last Name Here @cis.uni-muenchen.de
Thursdays 14:45 (s.t.), location ZOOM ONLINE
You can install the Zoom client, or click cancel and use browser support (which might not work in all browsers).
Contact Alexander Fraser if you need the Zoom link.
New attendees are welcome. Read the paper and bring a paper or electronic copy with you, as you will need to refer to it during the discussion.
Click here for directions to CIS.
If this page appears to be out of date, use the refresh button of your browser.
Date | Paper | Links | Discussion Leader |
December 1st, 2022 | Jason Wei, Xuezhi Wang, et al. (2022). Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. NeurIPS. | paper | Viktor Hangya |
January 26th, 2023 | Long Ouyang, Jeff Wu, et al. (2022). Training language models to follow instructions with human feedback. arXiv. | paper, Yoav Goldberg's blog | Sophie Henning |
February 2nd, 2023 | Leshem Choshen, Elad Venezian, Shachar Don-Yehiya, Noam Slonim, Yoav Katz (2022). Where to start? Analyzing the potential value of intermediate models. arXiv. | paper, github | Matthias Aßenmacher |
February 16th, 2023 | Jason Wei, Yi Tay, et al. (2022). Emergent Abilities of Large Language Models. Transactions on Machine Learning Research (TMLR). | paper | Kerem Senel |
February 23rd, 2023 | Gabriel Ilharco, Marco Tulio Ribeiro, et al. (2022). Editing Models with Task Arithmetic. arXiv. | paper | Alexandra Chronopoulou |
March 2nd, 2023 | CIS Internal | Contact Ayyoob | Ayyoob Imani |
March 16th, 2023 | Wenlong Huang, Fei Xia, et al. (2023). Grounded Decoding: Guiding Text Generation with Grounded Models for Robot Control. arXiv. | paper | Shengqiang Zhang |
March 23rd, 2023 | Barun Patra, Saksham Singhal, et al. (2022). Beyond English-Centric Bitexts for Better Multilingual Language Representation Learning. arXiv. | paper | Katharina Hämmerl |
March 30th, 2023 | Sebastien Bubeck, Varun Chandrasekaran, et al. (2023). Sparks of Artificial General Intelligence: Early experiments with GPT-4. arXiv. | paper | Hinrich Schütze |
Further literature:
You can go back through the previous semesters by clicking on the link near the top of the page.