syhw Profile Banner
Gabriel Synnaeve Profile
Gabriel Synnaeve

@syhw

Followers
17K
Following
9K
Media
319
Statuses
9K

Nerd & Dad. RL & CodeGen research since before it was cool.

Paris
Joined October 2009
@tydsh
Yuandong Tian
5 days
Several of my team members and I are impacted by this layoff today. Feel free to connect :)
474
287
7K
@realJessyLin
Jessy Lin
6 days
🧠 How can we equip LLMs with memory that allows them to continually learn new things? In our new paper with @AIatMeta, we show how sparsely finetuning memory layers enables targeted updates for continual learning, w/ minimal interference with existing knowledge. While full
50
285
2K
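For intuition, here is a minimal sketch of what "sparsely finetuning memory layers" could look like in PyTorch. The parameter-naming convention, the top-k gradient sparsification, and all hyperparameters are assumptions for illustration, not the paper's recipe.

```python
# Hedged sketch (not the paper's code): continually learn new facts by
# updating only a sparse subset of the model's memory-layer parameters.
import torch

def sparse_memory_update(model, batch, loss_fn, lr=1e-4, k=64):
    # Freeze everything except memory-layer parameters
    # (assumes memory layers are identifiable by name).
    for name, p in model.named_parameters():
        p.requires_grad = "memory" in name
    opt = torch.optim.SGD([p for p in model.parameters() if p.requires_grad], lr=lr)

    loss = loss_fn(model(batch["input_ids"]), batch["labels"])
    loss.backward()

    # Sparsify: keep only the k largest-magnitude gradient entries per tensor,
    # so the update targets a few slots and interferes less with old knowledge.
    for p in model.parameters():
        if p.requires_grad and p.grad is not None:
            flat = p.grad.abs().flatten()
            if flat.numel() > k:
                thresh = flat.topk(k).values.min()
                p.grad[p.grad.abs() < thresh] = 0.0
    opt.step()
    opt.zero_grad()
```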
@francoisfleuret
François Fleuret
7 days
TL;DR: I made a Transformer that conditions its generation on latent variables. The generative model only needs a source of randomness during generation, but it needs an encoder for training, as a [conditional] VAE. 1/5
19
53
585
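A rough rendering of that setup: train like a conditional VAE, where an encoder infers the latent; at generation time, sample the latent from the prior, so no encoder is needed. All shapes and module choices below are invented, and the causal masking a real decoder needs is omitted for brevity.

```python
# Sketch only: a Transformer whose generation is conditioned on a latent z,
# trained as a conditional VAE (reconstruction loss + KL to the prior).
import torch
import torch.nn as nn

class LatentTransformer(nn.Module):
    def __init__(self, vocab=256, d=128, zdim=16):
        super().__init__()
        self.emb = nn.Embedding(vocab, d)
        self.enc = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d, 4, batch_first=True), 2)
        self.to_mu, self.to_logvar = nn.Linear(d, zdim), nn.Linear(d, zdim)
        self.z_proj = nn.Linear(zdim, d)
        self.dec = nn.TransformerEncoder(  # causal mask omitted for brevity
            nn.TransformerEncoderLayer(d, 4, batch_first=True), 2)
        self.head = nn.Linear(d, vocab)
        self.zdim = zdim

    def forward(self, x):  # training path: encoder infers the latent
        h = self.enc(self.emb(x)).mean(1)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        logits = self.head(self.dec(self.emb(x) + self.z_proj(z)[:, None]))
        kl = 0.5 * (mu**2 + logvar.exp() - 1 - logvar).sum(-1).mean()
        return logits, kl  # train with cross-entropy + beta * kl

    def generate_step(self, x):  # generation path: prior sample, no encoder
        z = torch.randn(x.size(0), self.zdim)
        return self.head(self.dec(self.emb(x) + self.z_proj(z)[:, None]))
```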
@syhw
Gabriel Synnaeve
10 days
AI can both be awesome today, tomorrow, and a ton of work is left to do for a while!
@dwarkesh_sp
Dwarkesh Patel
10 days
The @karpathy interview
0:00:00 – AGI is still a decade away
0:30:33 – LLM cognitive deficits
0:40:53 – RL is terrible
0:50:26 – How do humans learn?
1:07:13 – AGI will blend into 2% GDP growth
1:18:24 – ASI
1:33:38 – Evolution of intelligence & culture
1:43:43 – Why self
4
5
87
@FabianGloeckle
Fabian Gloeckle
13 days
Replicate IMO-Gold in less than 500 lines: https://t.co/XHQXDaJ452 The prover-verifier workflow from Huang & Yang: Winning Gold at IMO 2025 with a Model-Agnostic Verification-and-Refinement Pipeline ( https://t.co/MD4ZNZeRPF), original code at https://t.co/MJhU5BLEDJ
4
20
159
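The prover-verifier workflow is simple enough to sketch. This hedged loop assumes only a generic `ask(prompt) -> str` model call (hence "model-agnostic") and placeholder prompts; for the real thing, see the linked code.

```python
# Sketch of a verification-and-refinement loop in the spirit of the
# Huang & Yang pipeline: propose, grade strictly, refine, repeat.
def prove(problem, ask, max_rounds=8):
    solution = ask(f"Solve this olympiad problem rigorously:\n{problem}")
    for _ in range(max_rounds):
        verdict = ask(
            "You are a strict grader. List every gap or error, "
            f"or reply PASS.\nProblem:\n{problem}\nSolution:\n{solution}"
        )
        if verdict.strip().startswith("PASS"):
            return solution  # verifier accepts the proof
        solution = ask(
            f"Revise the solution to fix these issues:\n{verdict}\n"
            f"Problem:\n{problem}\nPrevious solution:\n{solution}"
        )
    return None  # budget exhausted without a verified proof
```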
@syhw
Gabriel Synnaeve
18 days
This is an excellent history of LLMs; it doesn't miss any seminal papers I know of. It reminds you we're standing on the shoulders of giants, and giants are still being born today.
12
116
692
@TacoCohen
Taco Cohen
19 days
🚨 Attention aspiring PhD students 🚨 Meta / FAIR is looking for candidates for a joint academic/industry PhD! Keywords: AI for Math & Code. LLMs, RL, formal and informal reasoning. You will be co-advised by prof. @Amaury_Hayat from École des Ponts and yours truly. You'll have
24
119
898
@jm_alexia
Alexia Jolicoeur-Martineau
20 days
New paper 📜: Tiny Recursion Model (TRM) is a recursive reasoning approach with a tiny 7M parameters neural network that obtains 45% on ARC-AGI-1 and 8% on ARC-AGI-2, beating most LLMs. Blog: https://t.co/w5ZDsHDDPE Code: https://t.co/7UgKuD9Yll Paper:
arxiv.org
Hierarchical Reasoning Model (HRM) is a novel approach using two small neural networks recursing at different frequencies. This biologically inspired method beats Large Language models (LLMs) on...
136
653
4K
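The core shape of recursive reasoning with a tiny network, as described, might look like the sketch below; the dimensions, update rule, and recursion count are guesses, not TRM's actual architecture.

```python
# Sketch of the recursive-reasoning idea: one tiny network, applied
# repeatedly, refines its own running answer estimate.
import torch
import torch.nn as nn

class TinyRecursive(nn.Module):
    def __init__(self, d=64):
        super().__init__()
        self.step = nn.Sequential(nn.Linear(2 * d, d), nn.ReLU(), nn.Linear(d, d))

    def forward(self, x, n_recursions=16):
        y = torch.zeros_like(x)  # current answer estimate
        for _ in range(n_recursions):  # the same tiny net is reused each step
            y = y + self.step(torch.cat([x, y], dim=-1))  # refine the answer
        return y
```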
@syhw
Gabriel Synnaeve
22 days
- The mainstream wave depleted training data; it's moving into more synthetic and posttraining-aligned data, and more execution-trace collection.
- The hipster wave is fed up with Transformers. But we only fund arch research at "small" scale.
=> Eventually new data will need new archs.
0
0
3
@syhw
Gabriel Synnaeve
22 days
1
0
1
@syhw
Gabriel Synnaeve
22 days
Also start there if you don't know about abstract interpretation
1
1
3
@syhw
Gabriel Synnaeve
22 days
Code World Model is necessary but not sufficient to do grounded planning. Simple take: pretrain like you'll posttrain (agentic coding). Bright future (research) take: neural concrete interpretation will converge to neural abstract interpretation.
1
1
20
@syhw
Gabriel Synnaeve
22 days
it's what we do in Code World Model too
@DBahdanau
🇺🇦 Dzmitry Bahdanau
6 months
I am excited to open-source PipelineRL - a scalable async RL implementation with in-flight weight updates. Why wait until your bored GPUs finish all sequences? Just update the weights and continue inference! Code: https://t.co/AgEyxXb7Xi Blog: https://t.co/n4FRxiEcrr
3
13
118
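A toy sketch of the headline idea, in-flight weight updates: the generator checks for fresh weights between tokens instead of waiting for every sequence to finish. The queues and the `policy` API (`sample_next`, `rl_step`, `state_dict`) are stand-ins, not PipelineRL's actual interfaces.

```python
# Toy async RL loop: the learner publishes weights as soon as it steps,
# and the generator swaps them in mid-sequence, between tokens.
import queue

EOS, MAX_LEN = 0, 128
weights_q, sample_q = queue.Queue(), queue.Queue()

def done(tokens):
    return len(tokens) >= MAX_LEN or (tokens and tokens[-1] == EOS)

def generator(policy, prompts):
    for prompt in prompts:
        tokens = []
        while not done(tokens):
            try:
                # Swap in fresh weights mid-sequence if any arrived.
                policy.load_state_dict(weights_q.get_nowait())
            except queue.Empty:
                pass  # no new weights yet; keep decoding
            tokens.append(policy.sample_next(prompt, tokens))  # assumed API
        sample_q.put((prompt, tokens))

def learner(policy, batch_size=32):
    while True:
        batch = [sample_q.get() for _ in range(batch_size)]
        policy.rl_step(batch)                # assumed: one RL update step
        weights_q.put(policy.state_dict())   # publish new weights immediately
```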
@maxmbeck
Maximilian Beck
25 days
🚀 Excited to share our new paper on scaling laws for xLSTMs vs. Transformers.
Key result: xLSTM models Pareto-dominate Transformers in cross-entropy loss.
- At fixed FLOP budgets → xLSTMs perform better
- At fixed validation loss → xLSTMs need fewer FLOPs
🧵 Details in thread
13
39
212
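The two bullet points are two readings of the same fitted curves. Here's an illustration with invented numbers (not the paper's data): fit a saturating power law L(C) = a·C^(-b) + c per model family, then compare loss at fixed compute, and compute at fixed loss.

```python
# Illustration only, all numbers invented: Pareto dominance read off
# fitted compute-loss scaling laws for two model families.
import numpy as np
from scipy.optimize import curve_fit

def law(C, a, b, c):
    # saturating power law: loss as a function of training compute
    return a * C**-b + c

C = np.array([1.0, 10.0, 100.0, 1000.0])  # compute, in units of 1e18 FLOPs
loss_xlstm = law(C, 3.2, 0.16, 1.8)       # made-up "measurements"
loss_tf    = law(C, 3.6, 0.15, 1.9)

px, _ = curve_fit(law, C, loss_xlstm, p0=(3.0, 0.1, 1.0))
pt, _ = curve_fit(law, C, loss_tf,    p0=(3.0, 0.1, 1.0))

# Reading 1: at a fixed FLOP budget, which family reaches lower loss?
print(law(100.0, *px) < law(100.0, *pt))   # True here: xLSTM better

# Reading 2: at a fixed target loss, which family needs fewer FLOPs?
target = 3.0
flops_needed = lambda p: ((target - p[2]) / p[0]) ** (-1.0 / p[1])
print(flops_needed(px) < flops_needed(pt))  # True here: xLSTM cheaper
```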
@syhw
Gabriel Synnaeve
25 days
Humans are awesome
0
1
6
@syhw
Gabriel Synnaeve
27 days
Everything I know in RL in one tweet: exploration>exploitation, easy to leverage off-policy positive rewards, hard to leverage off-policy negative rewards, update the policy often, focus on throughput, self-play or find asymmetric grounding, clip everything but check statistics.
11
29
489
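Two of these heuristics ("clip everything but check statistics", off-policy importance ratios) are easy to make concrete. This is a generic PPO-style clipped loss with clip-fraction logging, a textbook rendering rather than Gabriel's code.

```python
# Clipped policy-gradient loss that also reports clipping statistics,
# so you notice when the clip is doing too much of the work.
import torch

def clipped_pg_loss(logp_new, logp_old, adv, eps=0.2):
    ratio = (logp_new - logp_old).exp()      # off-policy importance ratio
    clipped = ratio.clamp(1 - eps, 1 + eps)  # clip everything...
    loss = -torch.min(ratio * adv, clipped * adv).mean()
    stats = {                                # ...but check statistics
        "clip_frac": ((ratio - clipped).abs() > 1e-8).float().mean().item(),
        "ratio_mean": ratio.mean().item(),
    }
    return loss, stats
```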
@CarinaLHong
Carina Hong
27 days
Today, I am launching @axiommathai At Axiom, we are building a self-improving superintelligent reasoner, starting with an AI mathematician.
184
261
2K
@MariaLomeli_
Maria Lomeli
29 days
🚨 New paper: Stochastic activations
We introduce stochastic activations, a novel strategy that randomly selects among several non-linear functions in the feed-forward layers of a large language model.
8
16
123
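A minimal sketch of the stated strategy: an FFN block that samples its non-linearity per forward pass during training. The candidate set, sampling granularity, and inference-time choice below are guesses, not the paper's exact recipe.

```python
# Sketch: feed-forward block with a stochastically chosen activation.
import random
import torch.nn as nn
import torch.nn.functional as F

class StochasticActivationFFN(nn.Module):
    def __init__(self, d, hidden, acts=(F.silu, F.relu, F.gelu)):
        super().__init__()
        self.up, self.down = nn.Linear(d, hidden), nn.Linear(hidden, d)
        self.acts = acts

    def forward(self, x):
        # Sample an activation per forward pass while training;
        # fall back to a fixed one at inference (an assumption here).
        act = random.choice(self.acts) if self.training else self.acts[0]
        return self.down(act(self.up(x)))
```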