Current Topics in Natural Language Processing (WS 2022-2023)

Summary

Deep learning is a branch of machine learning in which neural networks with multiple layers have shown new generalization capabilities. The seminar will look at advances both in general deep learning approaches and in the specific case of Neural Machine Translation (NMT), a new paradigm in data-driven machine translation. In NMT, the entire translation process is posed as an end-to-end supervised classification problem: the training data consists of pairs of sentences, and the full sequence-to-sequence task is handled in one model.

Here is a link to last semester's seminar.

There is a Munich interest group for Deep Learning with an associated mailing list; paper announcements are sent out on this list. See the link here.

Instructors

Alexander Fraser

Email Address: Put Last Name Here @cis.uni-muenchen.de

CIS, LMU Munich

Hinrich Schütze

CIS, LMU Munich

Schedule

Thursdays 14:45 (s.t.), location ZOOM ONLINE

You can install the Zoom client, or click cancel and use browser support (this might not work for all browsers).

Contact Alexander Fraser if you need the Zoom link.

New attendees are welcome. Read the paper and bring a paper or electronic copy with you; you will need to refer to it during the discussion.

Click here for directions to CIS.

If this page appears to be out of date, use the refresh button of your browser.

Date	Paper	Links	Discussion Leader
December 1st, 2022	Jason Wei, Xuezhi Wang, et al. (2022). Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. NeurIPS.	paper	Viktor Hangya
January 26th, 2023	Long Ouyang, Jeff Wu, et al. (2022). Training language models to follow instructions with human feedback. arXiv.	paper, Yoav Goldberg's blog	Sophie Henning
February 2nd, 2023	Leshem Choshen, Elad Venezian, Shachar Don-Yehiya, Noam Slonim, Yoav Katz (2022). Where to start? Analyzing the potential value of intermediate models. arXiv.	paper, github	Matthias Aßenmacher
February 16th, 2023	Jason Wei, Yi Tay, et al. (2022). Emergent Abilities of Large Language Models. Transactions on Machine Learning Research (TMLR).	paper	Kerem Senel
February 23rd, 2023	Gabriel Ilharco, Marco Tulio Ribeiro, et al. (2022). Editing Models with Task Arithmetic. arXiv.	paper	Alexandra Chronopoulou
March 2nd, 2023	CIS Internal	Contact Ayyoob	Ayyoob Imani
March 16th, 2023	Wenlong Huang, Fei Xia, et al. (2023). Grounded Decoding: Guiding Text Generation with Grounded Models for Robot Control. arXiv.	paper	Shengqiang Zhang
March 23rd, 2023	Barun Patra, Saksham Singhal, et al. (2022). Beyond English-Centric Bitexts for Better Multilingual Language Representation Learning. arXiv.	paper	Katharina Hämmerl
March 30th, 2023	Sebastien Bubeck, Varun Chandrasekaran, et al. (2023). Sparks of Artificial General Intelligence: Early experiments with GPT-4. arXiv.	paper	Hinrich Schütze

Further literature:

You can go back through the previous semesters by clicking on the link near the top of the page.
