Foundation Model Frontiers

Summer semester 2025
Hinrich Schütze, Shengqiang Zhang
Fri 10:15-11:45

Room

Topic

Foundation models have been a highly dynamic research area for the last few years and remain one, in terms of scientific progress, technical innovation, and real-world impact. In this seminar, we will review and discuss the latest developments in foundation models, including new breakthroughs as they happen.

Credit for MSc Computerlinguistik

Schedule

day | topic | resources | details
Apr 25 | introduction | | organization, lectures, student topics
May 2 | synthetic data | | talk by Latif Köksal, DeepMind; assignment of topics
May 9 | memory etc. | NoLiMa | talk by Ali Modaresi
May 16 | multilinguality (1) | Manchu | talk by Peiqin Lin
May 23 | multilinguality (2) | representation & editing | talk by Mingyang Wang
May 30
Jun 6
Jun 13
Jun 20
Jun 27
Jul 4
Jul 11
Jul 18
Jul 25

Topics for presentation (Referat) and term paper (Hausarbeit)

The topics and papers listed here are (somewhat random) examples. Feel free to propose your own topics and papers for your presentation (Referat) or term paper (Hausarbeit).
paper topic
all topics covered in the lectures (see above)
Geiping et al. test-time compute: recurrent depth approach
DeepSeek-AI DeepSeek-R1: reasoning through RL
Gemma Team Gemma 3 technical report
Olsson et al. induction heads
Maini et al. rephrasing the web
Sharkey et al. open problems in mechanistic interpretability
Park et al. linear representation hypothesis
Han et al. (2024) word embeddings are steers
Llama Team (2024) Llama 3
Qwen et al. (2024) Qwen 2.5
Abdin et al. (2024) Phi-4
Makelov et al. (2024) sparse autoencoders (2)
McDougall et al. (2023) copy suppression
Saphra et al. (2024) notion of mechinterp
Dutta et al. (2024) mechinterp: CoT
Geva et al. (2023) factual associations/enrichment
nostalgebraist (2020) logit lens
Chughtai et al. (2024) summing up the facts
Shao et al. (2024) DeepSeekMath
Zhao et al. (2024) Marco-o1
Wu et al. (2024) ReFT
Hübotter et al. (2025) SIFT
Hughes et al. (2024) open-endedness
Turpin et al. (2023) unfaithful CoT
Gottweis et al. (2025) AI co-scientist
Venhoff et al. (2025) steered reasoning
Yu et al. (2024) superweights
Bricken et al. (2023) sparse autoencoders (1)
Packer et al. (2023) MemGPT
Milliere et al. (2024) philosophy of LLMs
StanfordNLP (2024) DSPy
Durrani et al. (2020) analyzing neurons
Voita et al. (2023) dead neurons
De Peuter et al. (2023) human-agent cooperation