Csordás Róbert
@robert_csordas
Followers
1K
Following
661
Media
24
Statuses
220
Research scientist @OpenAI. Ex-postdoc at Stanford working on systematic generalization and algorithmic reasoning. Ex-IDSIA PhD, ex-@DeepMind intern.
London, England
Joined June 2016
Attending @NeurIPSConf? Stop by our poster "Do Language Models Use Their Depth Efficiently?" with @chrmanning and @ChrisGPotts today at poster #4011 in Exhibit Hall C, D, E from 4:30pm.
2
13
67
Sharing a story of generosity at #NeurIPS25. I had to cancel my trip due to illness. @maxmbeck, @julien_siems and @robert_csordas, who are not coauthors, generously agreed to present my poster. I'm really moved by the kindness of this incredible group of RNN stars.
1
2
7
Come do a PhD with me at Columbia! My lab tackles basic problems in alignment, interpretability, safety, and capabilities of language systems. If you love adventuring in model internals and behaviors, to understand and improve them, let's do it together! pic: a run in Central Park
12
129
948
Good morning Suzhou! @amelia_f_hardy and I will be at @emnlpmeeting to present our work *TODAY, Hall C, 12:30PM; paper number 426*. Come learn: ✅ why it's important to optimize likelihood jointly with attack success ✅ online preference learning tricks for LM falsification
New Paper Day! For EMNLP Findings: in LM red-teaming, we show you have to optimize for **both** perplexity and toxicity to get high-probability, hard-to-filter, natural attacks!
1
9
18
Thrilled to release a new paper: “Scaling Latent Reasoning via Looped Language Models.” TL;DR: We scale looped language models to 2.6 billion parameters, pretrained on >7 trillion tokens. The resulting model is on par with SOTA language models 2 to 3x its size.
21
148
667
Stanford NLP 25th Anniversary🤩🤩🤩
Today, we’re overjoyed to have a 25th Anniversary Reunion of @stanfordnlp. So happy to see so many of our former students back at @Stanford. And thanks to @StanfordHAI for the venue!
9
39
601
@akshatgupta57 @neuranna @GopalaSpeech @berkeley_ai We (@robert_csordas, @chrmanning, and I) have a paper coming out at NeurIPS 2025 with similar findings, though with a less positive picture of the relationship between depth and problem complexity for MQuAKE:
arxiv.org
Modern LLMs are increasingly deep, and depth correlates with performance, albeit with diminishing returns. However, do these models use their depth efficiently? Do they compose more features to...
1
2
19
Today in Nature Machine Intelligence, Kazuki Irie and I discuss 4 classic challenges for neural nets — systematic generalization, catastrophic forgetting, few-shot learning, and reasoning. We argue there is a unifying fix: the right incentives & practice. https://t.co/2MWJ61XweG
2
37
200
🧠 Do Language Models Use Their Depth Efficiently? It was a pleasure attending today’s #BayArea #MachineLearning Symposium — where Prof. Christopher Manning gave an insightful and humorous talk on how #LLMs use their depth. https://t.co/CiWDMqZHMu
0
3
58
The Training team @OpenAI is hiring researchers in London 🚀 Our twin missions are to train better LLMs and serve them more cheaply. Get in touch if you are excited to collaborate on architecture design, reliable scaling, and faster optimization.
11
38
477
@siddarthv66 🫡 MoEUT + Thoughtbubbles would be unstoppable CC @robert_csordas
0
2
2
Introducing 𝘁𝗵𝗼𝘂𝗴𝗵𝘁𝗯𝘂𝗯𝗯𝗹𝗲𝘀: a *fully unsupervised* LM for input-adaptive parallel latent reasoning ✅ Learn yourself a reasoning model with normal pretraining ✅ Better perplexity compared to fixed thinking tokens No fancy loss, no chain of thought labels 🚀
5
47
234
New paper! 🌈 In English, pie = 🥧. In Spanish, pie = 🦶. Multilingual tokenizers often share such overlapping tokens between languages. Do these “False Friends” hurt or help multilingual LMs? We find that overlap consistently improves transfer—even when it seems misleading. 🧵
1
21
100
Can we develop AI responsibly? Yes, and we prove it by example. Two weeks ago, we released our Apertus models, which set a new standard in transparency, inclusivity, and compliance while achieving competitive performance. 🧵
1
4
8
New Paper Day! For EMNLP Findings: in LM red-teaming, we show you have to optimize for **both** perplexity and toxicity to get high-probability, hard-to-filter, natural attacks!
2
15
32
Hello friends! Presenting this poster at @aclmeeting in Vienna🇦🇹 on Monday, 6PM. Come learn about dropout, training dynamics, or just come to hang out. See you there 🫡
New Paper Day! For ACL Findings 2025: You should **drop dropout** when you are training your LMs AND MLMs!
3
10
37
🚀Introducing the Hierarchical Reasoning Model🧠🤖 Inspired by the brain's hierarchical processing, HRM delivers unprecedented reasoning power on complex tasks like ARC-AGI and expert-level Sudoku using just 1k examples, no pretraining or CoT! Unlock the next AI breakthrough with
227
652
4K
Please come to East Building poster #1108 (Ballroom A) rn
ICML ✈️ this week. Open to chat and learn mech interp from you. @aryaman2020 and I have cool ideas about steering; just come to our AxBench poster. New steering blog: https://t.co/ZPIIejq82M Chinese version:
2
8
42