
Michael Hahn
@mhahn29
Followers: 1K
Following: 5K
Media: 26
Statuses: 220
Professor at Saarland University @LstSaar @SIC_Saar. Previously PhD at Stanford @stanfordnlp. Machine learning, language, and cognitive science.
Saarbrücken, Germany
Joined June 2012
RT @Nived_Rajaraman: Announcing the first workshop on Foundations of Post-Training (FoPT) at COLT 2025! 📝 Soliciting abstracts/posters exp…
0 · 28 · 0
RT @nouhadziri: 📢 Can LLMs really reason outside the box in math? Or are they just remixing familiar strategies? Remember DeepSeek R1, o1…
0 · 159 · 0
Very excited about this work: deep results from logic shedding light on Transformers and the benefit of depth.
New on arXiv: Knee-Deep in C-RASP, by @pentagonalize, Michael Cadilhac and me. The solid stepped line is our theoretical prediction based on what problems C-RASP can solve, and the numbers/colors are what transformers (no position embedding) can learn.
0 · 3 · 12
RT @tallinzen: I'm hiring at least one post-doc! We're interested in creating language models that process language more like humans than m…
0 · 52 · 0
RT @geoffreyirving: New alignment theory paper! We present a new scalable oversight protocol (prover-estimator debate) and a proof that hon…
0 · 55 · 0
RT @julien_siems: 1/9 There is a fundamental tradeoff between parallelizability and expressivity of Large Language Models. We propose a new…
0 · 34 · 0
RT @MorrisYau: Transformers: ⚡️fast to train (compute-bound), 🐌slow to decode (memory-bound). Can Transformers be optimal in both? Yes! By…
0 · 36 · 0
RT @JQ_Zhu: 1/9 Thrilled to share our recent theoretical paper (with @cocosci_lab) on human belief updating, now published in Psychological…
0 · 13 · 0
RT @broccolitwit: In Transformer theory research, we often use tiny models and toy tasks. A straightforward criticism is that this setting…
0 · 1 · 0
RT @lambdaviking: A fun project with really thorough analysis of how LLMs try and often fail to implement parsing algorithms. Bonus: find…
0 · 3 · 0
RT @agiats_football: 📝 Our #ACL2025 paper is now on arXiv! "Information Locality as an Inductive Bias for Neural Language Models". We quant…
0 · 11 · 0
RT @SonglinYang4: Check out log-linear attention, our latest approach to overcoming the fundamental limitation of RNNs’ constant state size,…
0 · 52 · 0
RT @Aaditya6284: Was super fun to be a part of this work! Felt very satisfying to bring the theory work on ICL with linear attention a bit…
0 · 5 · 0
RT @yuekun_yao: Can language models learn implicit reasoning without chain-of-thought? Our new paper shows: Yes, LMs can learn k-hop reas…
0 · 2 · 0
RT @zzZixuanWang: LLMs can solve complex tasks that require combining multiple reasoning steps. But when are such capabilities learnable vi…
0 · 37 · 0
RT @michaelwhanna: @mntssys and I are excited to announce circuit-tracer, a library that makes circuit-finding simple! Just type in a sent…
0 · 46 · 0