Gabriel Synnaeve

@syhw

17K Followers · 9K Following · 319 Media · 9K Statuses

Nerd & Dad. RL & CodeGen research since before it was cool.

Paris
Joined October 2009
@robertnishihara
Robert Nishihara
2 days
Marin, which @percyliang, @dlwh, and many others are building, is *fully open* (not just open weights) and has been used to build high-quality, competitive models. Come to the talk next week at Ray Summit 😀
@percyliang
Percy Liang
3 days
Open AI means AI that is open.
2
7
64
@tydsh
Yuandong Tian
9 days
Several of my team members and I are impacted by today's layoff. Feel free to connect :)
474
287
7K
@realJessyLin
Jessy Lin
10 days
🧠 How can we equip LLMs with memory that allows them to continually learn new things? In our new paper with @AIatMeta, we show how sparsely finetuning memory layers enables targeted updates for continual learning, w/ minimal interference with existing knowledge. While full
52
295
2K
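A minimal sketch of the general idea (not the paper's method; the layer, names, and top-k rule below are all illustrative): keep the backbone frozen and let each continual-learning step write to only a few slots of a key-value memory layer, so new facts land in targeted parameters and the rest of the model is untouched.

```python
import torch
import torch.nn as nn

class MemoryLayer(nn.Module):
    """Toy key-value memory: inputs attend over learned memory slots."""
    def __init__(self, d_model: int, n_slots: int):
        super().__init__()
        self.keys = nn.Parameter(torch.randn(n_slots, d_model) * 0.02)
        self.values = nn.Parameter(torch.randn(n_slots, d_model) * 0.02)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        attn = torch.softmax(x @ self.keys.T, dim=-1)  # (batch, n_slots)
        return x + attn @ self.values

def sparsify_grads(layer: MemoryLayer, top_k: int = 8) -> None:
    """Keep gradients only on the top-k slots by gradient norm, so one
    continual-learning step edits just a handful of memory entries."""
    with torch.no_grad():
        per_slot = layer.values.grad.norm(dim=-1)        # (n_slots,)
        mask = torch.zeros_like(per_slot, dtype=torch.bool)
        mask[per_slot.topk(top_k).indices] = True
        layer.keys.grad[~mask] = 0.0
        layer.values.grad[~mask] = 0.0

# Usage: backprop as usual, mask, then step only the memory parameters.
layer = MemoryLayer(d_model=64, n_slots=1024)
opt = torch.optim.SGD(layer.parameters(), lr=1e-2)
loss = layer(torch.randn(4, 64)).pow(2).mean()
loss.backward()
sparsify_grads(layer, top_k=8)
opt.step()
```

Masking gradients per slot is what keeps each update sparse; everything outside the selected slots stays frozen, which is the "minimal interference" part.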
@francoisfleuret
François Fleuret
10 days
TL;DR: I made a Transformer that conditions its generation on latent variables. At generation time the model only needs a source of randomness, but for training it needs an encoder, as in a [conditional] VAE. 1/5
20
54
589
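A hedged sketch of that training setup (illustrative code, not the thread's): an encoder produces q(z|x) for a reparameterized ELBO during training, while generation only needs z drawn from the prior.

```python
import torch
import torch.nn as nn

class LatentConditionedLM(nn.Module):
    """Toy conditional VAE: a Transformer conditioned on a latent z.
    Training uses an encoder q(z|x); generation samples z ~ N(0, I)."""
    def __init__(self, vocab: int = 100, d: int = 64, z_dim: int = 16):
        super().__init__()
        self.emb = nn.Embedding(vocab, d)
        self.enc = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d, nhead=4, batch_first=True), 2)
        self.to_mu = nn.Linear(d, z_dim)
        self.to_logvar = nn.Linear(d, z_dim)
        self.z_proj = nn.Linear(z_dim, d)
        # Decoder stack (a real decoder would apply a causal mask; omitted).
        self.dec = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d, nhead=4, batch_first=True), 2)
        self.head = nn.Linear(d, vocab)

    def forward(self, x):
        h = self.enc(self.emb(x)).mean(dim=1)               # pool encoder states
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        cond = self.emb(x) + self.z_proj(z).unsqueeze(1)      # inject latent
        logits = self.head(self.dec(cond))
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()
        return logits, kl

model = LatentConditionedLM()
x = torch.randint(0, 100, (2, 10))
logits, kl = model(x)
recon = nn.functional.cross_entropy(
    logits[:, :-1].reshape(-1, 100), x[:, 1:].reshape(-1))
loss = recon + 0.1 * kl  # ELBO with a free weight on the KL term
```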
@syhw
Gabriel Synnaeve
14 days
AI can be awesome both today and tomorrow, and a ton of work is left to do for a while!
@dwarkesh_sp
Dwarkesh Patel
14 days
The @karpathy interview
0:00:00 – AGI is still a decade away
0:30:33 – LLM cognitive deficits
0:40:53 – RL is terrible
0:50:26 – How do humans learn?
1:07:13 – AGI will blend into 2% GDP growth
1:18:24 – ASI
1:33:38 – Evolution of intelligence & culture
1:43:43 – Why self
4
5
85
@FabianGloeckle
Fabian Gloeckle
17 days
Replicate IMO-Gold in less than 500 lines: https://t.co/XHQXDaJ452
The prover-verifier workflow from Huang & Yang, "Winning Gold at IMO 2025 with a Model-Agnostic Verification-and-Refinement Pipeline" (https://t.co/MD4ZNZeRPF), original code at https://t.co/MJhU5BLEDJ
4
20
158
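A hedged sketch of that verification-and-refinement loop (`generate` is a placeholder stub for any chat-model call, which is what makes the pipeline model-agnostic):

```python
def generate(prompt: str) -> str:
    """Stand-in for any LLM API call; plug in your own model here."""
    raise NotImplementedError

def solve(problem: str, max_rounds: int = 8):
    """Draft a proof, then alternate verification and refinement."""
    solution = generate(f"Solve rigorously:\n{problem}")
    for _ in range(max_rounds):
        verdict = generate(
            f"Problem:\n{problem}\n\nProposed proof:\n{solution}\n\n"
            "List every gap or error, or reply VERDICT: correct.")
        if "VERDICT: correct" in verdict:
            return solution  # verifier found no remaining issues
        solution = generate(  # refine against the verifier's critique
            f"Problem:\n{problem}\n\nDraft:\n{solution}\n\n"
            f"Critique:\n{verdict}\n\nRewrite the proof fixing every issue.")
    return None  # budget exhausted without an accepted proof
```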
@syhw
Gabriel Synnaeve
22 days
This is an excellent history of LLMs; it doesn't miss any seminal paper I know of. It reminds you we're standing on the shoulders of giants, and that giants are still being born today.
12
117
691
@TacoCohen
Taco Cohen
23 days
🚨 Attention aspiring PhD students 🚨 Meta / FAIR is looking for candidates for a joint academic/industry PhD! Keywords: AI for Math & Code, LLMs, RL, formal and informal reasoning. You will be co-advised by Prof. @Amaury_Hayat of École des Ponts and yours truly. You'll have
24
120
896
@jm_alexia
Alexia Jolicoeur-Martineau
24 days
New paper 📜: Tiny Recursion Model (TRM) is a recursive reasoning approach with a tiny 7M-parameter neural network that obtains 45% on ARC-AGI-1 and 8% on ARC-AGI-2, beating most LLMs.
Blog: https://t.co/w5ZDsHDDPE
Code: https://t.co/7UgKuD9Yll
Paper:
Link card (arxiv.org): Hierarchical Reasoning Model (HRM) is a novel approach using two small neural networks recursing at different frequencies. This biologically inspired method beats Large Language Models (LLMs) on...
137
654
4K
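A minimal sketch of the recursion-as-depth idea (illustrative sizes and step counts, not TRM's architecture): the same tiny network is applied repeatedly, refining a latent scratchpad and an answer estimate, so effective depth comes from iteration rather than parameter count.

```python
import torch
import torch.nn as nn

class TinyRecursiveReasoner(nn.Module):
    """Sketch of recursive reasoning: one small MLP applied many times,
    updating a latent scratchpad z and an answer estimate y."""
    def __init__(self, d: int = 128):
        super().__init__()
        self.step = nn.Sequential(
            nn.Linear(3 * d, d), nn.GELU(), nn.Linear(d, d))
        self.answer = nn.Linear(d, d)

    def forward(self, x, n_steps: int = 16):
        z = torch.zeros_like(x)   # latent scratchpad
        y = torch.zeros_like(x)   # running answer estimate
        for _ in range(n_steps):  # same weights every step: depth via recursion
            z = z + self.step(torch.cat([x, y, z], dim=-1))
            y = self.answer(z)
        return y

model = TinyRecursiveReasoner()
y = model(torch.randn(2, 128))
```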
@syhw
Gabriel Synnaeve
25 days
- The mainstream wave has depleted training data; it is moving to more synthetic and posttraining-aligned data, and more execution-trace collection.
- The hipster wave is fed up with Transformers, but we only fund architecture research at "small" scale.
=> Eventually, new data will need new archs.
0
0
3
@syhw
Gabriel Synnaeve
25 days
1
0
1
@syhw
Gabriel Synnaeve
25 days
Also start there if you don't know about abstract interpretation
1
1
3
@syhw
Gabriel Synnaeve
25 days
Code World Model is necessary but not sufficient to do grounded planning. Simple take: pretrain like you'll posttrain (agentic coding). Bright future (research) take: neural concrete interpretation will converge to neural abstract interpretation.
1
1
20
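For the abstract-interpretation pointer above, the classic interval domain is the quickest illustration of the concrete-vs-abstract contrast: interpret the same program over ranges of values instead of single values.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    """Interval abstract domain: a variable is a [lo, hi] range,
    not a concrete value."""
    lo: float
    hi: float

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other):
        prods = [self.lo * other.lo, self.lo * other.hi,
                 self.hi * other.lo, self.hi * other.hi]
        return Interval(min(prods), max(prods))

def f(x):
    return x * x + x

# Concrete interpretation: one input, one output.
print(f(3))                    # 12
# Abstract interpretation: all inputs in [0, 10], covered in one pass.
print(f(Interval(0.0, 10.0)))  # Interval(lo=0.0, hi=110.0)
```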
@syhw
Gabriel Synnaeve
26 days
it's what we do in Code World Model too
@DBahdanau
🇺🇦 Dzmitry Bahdanau
6 months
I am excited to open-source PipelineRL - a scalable async RL implementation with in-flight weight updates. Why wait until your bored GPUs finish all sequences? Just update the weights and continue inference! Code: https://t.co/AgEyxXb7Xi Blog: https://t.co/n4FRxiEcrr
3
13
117
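A toy sketch of the in-flight update pattern (threads and a version counter standing in for a real trainer and inference server): the trainer publishes new weights as soon as a step finishes, so a generating sequence can straddle an update, trading a little off-policyness for throughput.

```python
import queue
import threading
import time

weight_version = 0          # shared "model weights", here just a version number
lock = threading.Lock()
rollouts = queue.Queue()

def trainer():
    """Consume rollouts, 'train', and publish new weights immediately,
    without waiting for in-flight generations to finish."""
    global weight_version
    for _ in range(3):
        rollouts.get()                 # one batch per step in this toy
        time.sleep(0.01)               # pretend to do a gradient step
        with lock:
            weight_version += 1        # in-flight weight update

def actor():
    """Generate token by token, re-reading the weights at every token,
    so one sequence can mix several weight versions."""
    for seq in range(4):
        versions_used = []
        for _ in range(5):             # 5 'tokens' per sequence
            with lock:
                versions_used.append(weight_version)
            time.sleep(0.005)
        rollouts.put((seq, f"seq {seq} used weights {versions_used}"))

t = threading.Thread(target=trainer)
a = threading.Thread(target=actor)
t.start(); a.start(); a.join(); t.join()
print(rollouts.get()[1])  # a leftover rollout, showing versions straddled
```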
@XiaoyangWu_
Xiaoyang Wu
3 days
Introducing Concerto 🎶 Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations. What is it: Concerto is a self-supervised Point Transformer V3 that jointly learns from 2D and 3D modalities, producing rich spatial representations. It can take both point clouds and
3
31
157
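One common shape for such a joint 2D-3D objective, shown with deliberately toy stand-ins (linear encoders and an InfoNCE loss, not Point Transformer V3 or Concerto's actual objective): contrastively align features of paired pixels and points.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

enc_2d = nn.Linear(3, 64)     # per-pixel RGB -> feature (toy encoder)
enc_3d = nn.Linear(3, 64)     # per-point xyz -> feature (toy encoder)

pixels = torch.randn(128, 3)
points = torch.randn(128, 3)  # points[i] assumed to project onto pixels[i]

f2d = F.normalize(enc_2d(pixels), dim=-1)
f3d = F.normalize(enc_3d(points), dim=-1)
logits = f2d @ f3d.T / 0.07             # pairwise similarities, temperature 0.07
labels = torch.arange(128)              # matched pairs sit on the diagonal
loss = F.cross_entropy(logits, labels)  # pull matches together, push the rest
```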
@maxmbeck
Maximilian Beck
28 days
🚀 Excited to share our new paper on scaling laws for xLSTMs vs. Transformers. Key result: xLSTM models Pareto-dominate Transformers in cross-entropy loss.
- At fixed FLOP budgets → xLSTMs perform better
- At fixed validation loss → xLSTMs need fewer FLOPs
🧵 Details in thread
13
39
210
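Both bullets are the same Pareto statement read along different axes. A worked toy with made-up power-law fits L(C) = a·C^(-b) (coefficients purely illustrative, not the paper's):

```python
# Illustrative power laws: loss as a function of training compute C (FLOPs).
def loss_transformer(C):
    return 2.6 * C ** -0.05

def loss_xlstm(C):
    return 2.5 * C ** -0.05

C = 1e21  # a fixed FLOP budget
print(loss_xlstm(C) < loss_transformer(C))  # True: lower loss at equal compute

# Invert L = a * C**-b to C = (a / L)**(1/b) for a fixed target loss L.
L = 1.0
C_transformer = (2.6 / L) ** (1 / 0.05)
C_xlstm = (2.5 / L) ** (1 / 0.05)
print(C_xlstm / C_transformer)  # < 1: fewer FLOPs to reach the same loss
```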
@syhw
Gabriel Synnaeve
29 days
Humans are awesome
0
1
6
@syhw
Gabriel Synnaeve
1 month
Everything I know in RL in one tweet: exploration>exploitation, easy to leverage off-policy positive rewards, hard to leverage off-policy negative rewards, update the policy often, focus on throughput, self-play or find asymmetric grounding, clip everything but check statistics.
11
29
490
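The last clause, clip everything but check statistics, in sketch form: PPO-style ratio clipping plus the diagnostics that reveal when the clip is doing too much work.

```python
import torch

def clipped_pg_loss(logp_new, logp_old, adv, eps: float = 0.2):
    """PPO-style clipped policy-gradient loss, plus the statistics worth
    watching: clip fraction and the raw importance-ratio spread."""
    ratio = (logp_new - logp_old).exp()
    unclipped = ratio * adv
    clipped = ratio.clamp(1 - eps, 1 + eps) * adv
    loss = -torch.minimum(unclipped, clipped).mean()
    stats = {
        "clip_frac": ((ratio - 1).abs() > eps).float().mean().item(),
        "ratio_mean": ratio.mean().item(),
        "ratio_max": ratio.max().item(),
    }
    return loss, stats

logp_new = torch.randn(256) * 0.1
logp_old = logp_new - torch.randn(256) * 0.1
adv = torch.randn(256)
loss, stats = clipped_pg_loss(logp_new, logp_old, adv)
print(stats)  # if clip_frac creeps up, the clip is masking policy drift
```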
@CarinaLHong
Carina Hong
1 month
Today, I am launching @axiommathai At Axiom, we are building a self-improving superintelligent reasoner, starting with an AI mathematician.
184
260
2K