Csordás Róbert Profile
Csordás Róbert

@robert_csordas

Followers 1K · Following 661 · Media 24 · Statuses 220

Research scientist @OpenAI. Ex-postdoc at Stanford working on systematic generalization and algorithmic reasoning. Ex-IDSIA PhD, ex-@DeepMind intern.

London, England
Joined June 2016
@robert_csordas
Csordás Róbert
2 days
Attending @NeurIPSConf? Stop by our poster "Do Language Models Use Their Depth Efficiently?" with @chrmanning and @ChrisGPotts today at poster #4011 in Exhibit Hall C, D, E from 4:30pm.
2 · 13 · 67
@kzkirie
Kazuki Irie
2 days
Sharing a story of generosity at #NeurIPS25. I had to cancel my trip due to illness. @maxmbeck, @julien_siems and @robert_csordas, who are not coauthors, generously agreed to present my poster. I'm really moved by the kindness of this incredible group of RNN stars.
1 · 2 · 7
@johnhewtt
John Hewitt
15 days
Come do a PhD with me at Columbia! My lab tackles basic problems in alignment, interpretability, safety, and capabilities of language systems. If you love adventuring in model internals and behaviors---to understand and improve---let's do it together! pic: a run in central park
12 · 129 · 948
@houjun_liu
Houjun Liu
29 days
Good morning Suzhou! @amelia_f_hardy and I will be at @emnlpmeeting to present our work *TODAY, Hall C, 12:30PM; paper number 426* Come learn:
✅ why it's important to optimize likelihood jointly with attack success
✅ online preference learning tricks for LM falsification
@houjun_liu
Houjun Liu
4 months
New Paper Day! For EMNLP findings—in LM red-teaming, we show you have to optimize for **both** perplexity and toxicity for high-probability, hard to filter, and natural attacks!
1 · 9 · 18
@RidgerZhu
Rui-Jie (Ridger) Zhu ✈️ NeurIPS 25
1 month
Thrilled to release our new paper: “Scaling Latent Reasoning via Looped Language Models.” TL;DR: We scale looped language models up to 2.6 billion parameters, pretrained on more than 7 trillion tokens. The resulting model is on par with SOTA language models 2 to 3x its size.
21 · 148 · 667
@Diyi_Yang
Diyi Yang
1 month
Stanford NLP 25th Anniversary🤩🤩🤩
@stanfordnlp
Stanford NLP Group
1 month
Today, we’re overjoyed to have a 25th Anniversary Reunion of @stanfordnlp. So happy to see so many of our former students back at @Stanford. And thanks to @StanfordHAI for the venue!
9 · 39 · 601
@ChrisGPotts
Christopher Potts
1 month
@akshatgupta57 @neuranna @GopalaSpeech @berkeley_ai We (@robert_csordas @chrmanning and I) have a paper coming out in NeurIPS 2025 with similar findings, though with a less positive picture regarding the relationship between depth and problem complexity for MQuAKE:
arxiv.org
Modern LLMs are increasingly deep, and depth correlates with performance, albeit with diminishing returns. However, do these models use their depth efficiently? Do they compose more features to...
1 · 2 · 19
@LakeBrenden
Brenden Lake @ NeurIPS
2 months
Today in Nature Machine Intelligence, Kazuki Irie and I discuss 4 classic challenges for neural nets — systematic generalization, catastrophic forgetting, few-shot learning, and reasoning. We argue there is a unifying fix: the right incentives & practice. https://t.co/2MWJ61XweG
2 · 37 · 200
@junfanzhu98
Junfan Zhu✈️NeurIPS
2 months
🧠 Do Language Models Use Their Depth Efficiently? It was a pleasure attending today’s #BayArea #MachineLearning Symposium — where Prof. Christopher Manning gave an insightful and humorous talk on how #LLMs use their depth. https://t.co/CiWDMqZHMu
0 · 3 · 58
@SamuelMLSmith
Samuel L Smith
2 months
The Training team @OpenAI is hiring researchers in London 🚀 Our twin missions are to train better LLMs and to serve them more cheaply. Get in touch if you are excited to collaborate on architecture design, reliable scaling, and faster optimization.
11 · 38 · 477
@houjun_liu
Houjun Liu
2 months
@siddarthv66 🫡 MoEUT + Thoughtbubbles would be unstoppable CC @robert_csordas
0 · 2 · 2
@houjun_liu
Houjun Liu
2 months
Introducing 𝘁𝗵𝗼𝘂𝗴𝗵𝘁𝗯𝘂𝗯𝗯𝗹𝗲𝘀: a *fully unsupervised* LM for input-adaptive parallel latent reasoning
✅ Learn yourself a reasoning model with normal pretraining
✅ Better perplexity compared to fixed thinking tokens
No fancy loss, no chain-of-thought labels 🚀
5 · 47 · 234
@JulieKallini
Julie Kallini ✨
2 months
New paper! 🌈 In English, pie = 🥧. In Spanish, pie = 🦶. Multilingual tokenizers often share such overlapping tokens between languages. Do these “False Friends” hurt or help multilingual LMs? We find that overlap consistently improves transfer—even when it seems misleading. 🧵
1 · 21 · 100
@ImanolSchlag
Imanol Schlag
3 months
Can we develop AI responsibly? Yes, and we prove it by example. Two weeks ago, we released our Apertus models, which set a new standard in transparency, inclusivity, and compliance while achieving competitive performance. 🧵
1 · 4 · 8
@cscsch
CSCS Lugano
3 months
@EPFL, @ETH_en and #CSCS today released Apertus, Switzerland's first large-scale, multilingual large language model (LLM). As a fully open LLM, it serves as a building block for developers and organizations to create their own applications: https://t.co/7bJlINiIdn #Apertus #AI
17 · 46 · 163
@houjun_liu
Houjun Liu
4 months
New Paper Day! For EMNLP findings—in LM red-teaming, we show you have to optimize for **both** perplexity and toxicity for high-probability, hard to filter, and natural attacks!
2 · 15 · 32
@houjun_liu
Houjun Liu
4 months
Hello friends! Presenting this poster at @aclmeeting in Vienna🇦🇹 on Monday, 6PM. Come learn about dropout, training dynamics, or just come to hang out. See you there 🫡
@houjun_liu
Houjun Liu
6 months
New Paper Day! For ACL Findings 2025: You should **drop dropout** when you are training your LMs AND MLMs!
3 · 10 · 37
@makingAGI
Guan Wang
5 months
🚀Introducing the Hierarchical Reasoning Model🧠🤖 Inspired by the brain's hierarchical processing, HRM delivers unprecedented reasoning power on complex tasks like ARC-AGI and expert-level Sudoku using just 1k examples, no pretraining or CoT! Unlock the next AI breakthrough with
227 · 652 · 4K
@aryaman2020
Aryaman Arora
5 months
please come to East building poster #1108 (ballroom A) rn
@ZhengxuanZenWu
Zhengxuan Wu
5 months
ICML ✈️ this week. Open to chat and learn mech interp from you. @aryaman2020 and I have cool ideas about steering, just come to our AxBench poster. New steering blog: https://t.co/ZPIIejq82M Chinese version:
2 · 8 · 42