
Jacob Andreas
@jacobandreas
19K Followers · 1K Following · 85 Media · 3K Statuses
Teaching computers to read. Assoc. prof @MITEECS / @MIT_CSAIL / @NLP_MIT (he/him). https://t.co/5kCnXHjtlY https://t.co/2A3qF5vdJw
Cambridge, MA
Joined March 2007
👉 New preprint on a new family of Transformer-type models whose depth scales logarithmically with sequence length. Enables:
- fast training
- fast decoding
- large memory capacity in associative recall
- strong length generalization on state tracking
Transformers: ⚡️fast to train (compute-bound), 🐌slow to decode (memory-bound). Can Transformers be optimal in both? Yes! By exploiting sequential-parallel duality. We introduce Transformer-PSM, with constant-time per-token decoding. 🧐
2 replies · 9 reposts · 80 likes
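The "sequential-parallel duality" in the quoted thread is, at its core, the observation that an associative state update can be evaluated two ways: as a balanced tree of depth O(log n) over the whole sequence (parallel, good for training) or as a left-to-right fold doing constant work per new token (sequential, good for decoding). Below is a minimal Python sketch of that duality using a toy affine recurrence h_t = a_t·h_{t-1} + b_t; the names (combine, parallel_scan, sequential_decode) are illustrative assumptions, not the paper's actual Transformer-PSM architecture.

```python
# Minimal sketch (assumption: NOT the actual Transformer-PSM architecture)
# of sequential-parallel duality for an associative state update.
# Toy recurrence: h_t = a_t * h_{t-1} + b_t. Composing two affine updates
# yields another affine update, and that composition is associative, so the
# same answer can be computed by a parallel tree or by a sequential fold.

from functools import reduce

def combine(f, g):
    """Compose affine updates: apply f = (a1, b1) first, then g = (a2, b2).

    g(f(h)) = a2 * (a1 * h + b1) + b2 = (a2 * a1) * h + (a2 * b1 + b2)
    """
    a1, b1 = f
    a2, b2 = g
    return (a2 * a1, a2 * b1 + b2)

def parallel_scan(updates):
    """Tree reduction: O(log n) depth if the two halves run in parallel."""
    if len(updates) == 1:
        return updates[0]
    mid = len(updates) // 2
    left = parallel_scan(updates[:mid])   # independent subproblems:
    right = parallel_scan(updates[mid:])  # parallelizable across devices
    return combine(left, right)

def sequential_decode(updates):
    """Left fold: constant work per new token, like autoregressive decoding."""
    return reduce(combine, updates)

# One (a_t, b_t) update per token.
updates = [(0.9, 1.0), (1.1, -0.5), (0.7, 2.0), (1.0, 0.3)]
pa, pb = parallel_scan(updates)
sa, sb = sequential_decode(updates)
assert abs(pa - sa) < 1e-12 and abs(pb - sb) < 1e-12  # same composed update
```

In a real model the composed objects would be matrices or memory states rather than scalar pairs, but the algebra is the same: associativity is what lets one computation serve both the compute-bound training regime and the memory-bound decoding regime.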
RT @kpal_koyena: 🚨 Registration is live! 🚨 The New England Mechanistic Interpretability (NEMI) Workshop is happening August 22nd, 2025 at N…
0 replies · 28 reposts · 0 likes
RT @MorrisYau: Transformers: ⚡️fast to train (compute-bound), 🐌slow to decode (memory-bound). Can Transformers be optimal in both? Yes! By…
0 replies · 36 reposts · 0 likes
RT @uzpg_: @kaivu, @atticuswzf, and I were researching long-horizon reasoning (with @jacobandreas). We found existing benchmarks’ hard pro…
0 replies · 10 reposts · 0 likes
RT @interplaywrkshp: 🚨🚨 Studying the INTERPLAY of LMs' internals and behavior? Join our @colmweb.org workshop on comprehensively evaluati…
0 replies · 4 reposts · 0 likes
RT @stanfordnlp: For this week’s NLP Seminar, we are thrilled to host @jacobandreas to talk about “Just Asking Questions”. When: 5/15 Thurs…
0 replies · 12 reposts · 0 likes
RT @LauraRuis: Excited to announce that this fall I'll be joining @jacobandreas's amazing lab at MIT for a postdoc to work on interp. for r…
0 replies · 11 reposts · 0 likes
RT @cedcolas: i just got an art grant from the council for the arts at MIT! *Tangible Dreams* will let visitors experiment and play with a…
0 replies · 12 reposts · 0 likes
RT @nlp_mit: MIT NLP @ ICLR 2025 - catch @MehulDamani2 at poster 219, Thursday 3PM to chat about "Learning How Hard to Think: Input Adaptiv…
0 replies · 1 repost · 0 likes
RT @ShikharMurty: New #NAACL2025 paper! 🚨 Transformer LMs are data-hungry; we propose a new auxiliary loss function (TreeReg) to fix that…
0 replies · 24 reposts · 0 likes
RT @akyurekekin: ✨ Big life updates ✨
- @afeyzaakyurek and I welcomed our baby!
- Successfully defended my PhD and graduated from MIT 🎓
- …
0 replies · 12 reposts · 0 likes
RT @gabe_grand: Tackling complex problems with LMs requires search/planning, but how should test-time compute be structured? Introducing S…
0 replies · 38 reposts · 0 likes
RT @ben_lipkin: New preprint on controlled generation from LMs! I'll be presenting at NENLP tomorrow 12:50-2:00pm. Longer thread coming so…
0 replies · 11 reposts · 0 likes
RT @gabe_grand: New preprint is live! Tweet thread coming 🚧🔜 📅 Excited to present this work in-person:
- 4/11: Poster at New England NLP (…
0 replies · 11 reposts · 0 likes
RT @hou_bairu: 1/ Long chain-of-thought (CoT) reasoning boosts LLM performance, but with a computational overhead. Check out our new paper,…
0 replies · 19 reposts · 0 likes
RT @LChoshen: Human feedback is critical for aligning LLMs, so why don’t we collect it in the open ecosystem? 🧐 We (15 orgs) gathered the ke…
0 replies · 50 reposts · 0 likes
RT @nsaphra: Life update: I'm starting as faculty at Boston University in 2026! BU has SCHEMES for LM interpretability & analysis, so I cou…
0 replies · 23 reposts · 0 likes
RT @nlp_mit: Hello everyone! We are quite a bit late to the Twitter party, but welcome to the MIT NLP Group account! Follow along for the l…
0 replies · 52 reposts · 0 likes