Gabriel Synnaeve
@syhw
Followers 17K · Following 9K · Media 319 · Statuses 9K
Nerd & Dad. RL & CodeGen research since before it was cool.
Paris · Joined October 2009
            
Marin, which @percyliang, @dlwh, and many others are building, is *fully open* (not just open weights) and has been used to build high-quality, competitive models. Come to the talk next week at Ray Summit 😀
💬 2 · 🔁 7 · ❤️ 64
Several of my team members and I are impacted by this layoff today. Feel free to connect :)
💬 474 · 🔁 287 · ❤️ 7K
🧠 How can we equip LLMs with memory that allows them to continually learn new things? In our new paper with @AIatMeta, we show how sparsely finetuning memory layers enables targeted updates for continual learning, w/ minimal interference with existing knowledge. While full […]
💬 52 · 🔁 295 · ❤️ 2K
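The core idea — update only the memory slots a query actually touches, so the rest of the stored knowledge is untouched — can be sketched in a toy numpy form. Everything here (`MemoryLayer`, the slot count, the update rule) is my own stand-in for illustration, not the paper's architecture:

```python
import numpy as np

class MemoryLayer:
    """Toy key-value memory with sparse top-k access and sparse updates."""

    def __init__(self, n_slots=1024, dim=16, k=4, seed=0):
        rng = np.random.default_rng(seed)
        self.keys = rng.standard_normal((n_slots, dim))
        self.values = rng.standard_normal((n_slots, dim))
        self.k = k

    def lookup(self, query):
        # Select the top-k slots by key similarity; only these are "active".
        scores = self.keys @ query
        top = np.argpartition(scores, -self.k)[-self.k:]
        weights = np.exp(scores[top] - scores[top].max())
        weights /= weights.sum()
        return top, weights, weights @ self.values[top]

    def sparse_update(self, query, target, lr=0.1):
        # Continual-learning step: nudge ONLY the k active value slots
        # toward the target; all other slots (old knowledge) stay intact.
        top, weights, out = self.lookup(query)
        err = target - out
        self.values[top] += lr * np.outer(weights, err)
        return top

mem = MemoryLayer()
query, target = np.ones(16), np.zeros(16)
before = mem.values.copy()
touched = mem.sparse_update(query, target)
changed = np.where(np.any(mem.values != before, axis=1))[0]
# `changed` is a subset of the k touched slots: the update is sparse.
```

The sparsity is what limits interference: a gradient step through this lookup only reaches the k selected rows, which is the property the tweet is pointing at.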
TL;DR: I made a Transformer that conditions its generation on latent variables. To do so, the Transformer only needs a source of randomness during generation, but it needs an encoder for training, as in a [conditional] VAE. 1/5
💬 20 · 🔁 54 · ❤️ 589
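The asymmetry in the tweet — an encoder at training time, plain noise at generation time — is the standard CVAE pattern. A minimal schematic (the `encode`/`generate` functions are trivial stand-ins for the encoder and decoder Transformers, not the paper's model):

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(source, target):
    # Training-time only: infer the latent from (source, target),
    # like a conditional-VAE posterior (variance omitted for brevity).
    return np.tanh(source + target).mean(keepdims=True)

def generate(source, z):
    # The decoder conditions on the latent z regardless of where z came from.
    return np.tanh(source) + z

x = rng.standard_normal(8)

# Generation: no encoder needed, only a source of randomness.
z_gen = rng.standard_normal(1)
sample = generate(x, z_gen)

# Training: the encoder provides z from the ground-truth target,
# so the decoder learns to reconstruct y conditioned on z.
y = rng.standard_normal(8)
z_train = encode(x, y)
recon = generate(x, z_train)
```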
AI can be awesome today and tomorrow, and a ton of work is left to do for a while!
  The @karpathy interview
  0:00:00 – AGI is still a decade away
  0:30:33 – LLM cognitive deficits
  0:40:53 – RL is terrible
  0:50:26 – How do humans learn?
  1:07:13 – AGI will blend into 2% GDP growth
  1:18:24 – ASI
  1:33:38 – Evolution of intelligence & culture
  1:43:43 – Why self […]
💬 4 · 🔁 5 · ❤️ 85
              
Replicate IMO-Gold in less than 500 lines: https://t.co/XHQXDaJ452. The prover-verifier workflow from Huang & Yang, "Winning Gold at IMO 2025 with a Model-Agnostic Verification-and-Refinement Pipeline" (https://t.co/MD4ZNZeRPF); original code at https://t.co/MJhU5BLEDJ
💬 4 · 🔁 20 · ❤️ 158
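The verification-and-refinement loop has a simple shape: propose a proof, check it, and feed the critique back until the verifier accepts. A hedged sketch of that control flow — `prove`, `verify`, and `refine` are stand-in functions, not the linked repository's API:

```python
def prove(problem):
    # In the real pipeline this calls a prover LLM.
    return f"proof draft for {problem}"

def verify(proof):
    # A verifier model (or checker) returns (ok, critique).
    ok = "refined" in proof
    return ok, None if ok else "gap found in an intermediate step"

def refine(proof, critique):
    # Feed the critique back to the prover for a repaired attempt.
    return f"refined {proof} (fixing: {critique})"

def solve(problem, max_rounds=5):
    proof = prove(problem)
    for _ in range(max_rounds):
        ok, critique = verify(proof)
        if ok:
            return proof
        proof = refine(proof, critique)
    return None  # give up after the round budget

result = solve("IMO 2025 P1")
```

The model-agnostic part is that `prove` and `verify` are just two roles: any model can fill either slot, and the loop does the rest.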
              
This is an excellent history of LLMs; it doesn't miss any seminal paper I know of. It reminds you that we're standing on the shoulders of giants, and that giants are still being born today.
💬 12 · 🔁 117 · ❤️ 691
              
🚨 Attention aspiring PhD students 🚨 Meta / FAIR is looking for candidates for a joint academic/industry PhD! Keywords: AI for Math & Code, LLMs, RL, formal and informal reasoning. You will be co-advised by Prof. @Amaury_Hayat from École des Ponts and yours truly. You'll have […]
💬 24 · 🔁 120 · ❤️ 896
              
New paper 📜: Tiny Recursion Model (TRM) is a recursive reasoning approach with a tiny 7M-parameter neural network that obtains 45% on ARC-AGI-1 and 8% on ARC-AGI-2, beating most LLMs. Blog: https://t.co/w5ZDsHDDPE Code: https://t.co/7UgKuD9Yll Paper: […]

arxiv.org
Hierarchical Reasoning Model (HRM) is a novel approach using two small neural networks recursing at different frequencies. This biologically inspired method beats Large Language Models (LLMs) on...
💬 137 · 🔁 654 · ❤️ 4K
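The recursion idea — reuse one tiny network many times to refine a latent state, instead of stacking many layers — can be illustrated schematically. This mirrors the shape of TRM/HRM-style recursive reasoning, not the actual architecture (the update function and step count are mine):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 8)) * 0.1  # the ONE small weight matrix

def step(state, x):
    # One recursion: update the latent state from the input and itself.
    return np.tanh(x + state @ W)

def recursive_reason(x, n_steps=16):
    # The same tiny network is applied repeatedly, so depth of computation
    # grows with the number of recursions, not with parameter count.
    state = np.zeros_like(x)
    for _ in range(n_steps):
        state = step(state, x)
    return state

x = rng.standard_normal(8)
out = recursive_reason(x)
```

The trade this buys is more sequential compute per parameter, which is why a 7M-parameter model can be competitive on reasoning-style benchmarks.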
- The mainstream wave has depleted training data and is moving into more synthetic and posttraining-aligned data, and more execution-trace collection.
- The hipster wave is fed up with Transformers, but we only fund architecture research at "small" scale.
=> Eventually, new data will need new archs.
💬 0 · 🔁 0 · ❤️ 3
Also start there if you don't know about abstract interpretation.
💬 1 · 🔁 1 · ❤️ 3
              
Code World Model is necessary but not sufficient to do grounded planning. Simple take: pretrain like you'll posttrain (agentic coding). Bright-future (research) take: neural concrete interpretation will converge to neural abstract interpretation.
💬 1 · 🔁 1 · ❤️ 20
              
Good analysis of Code World Model:

artificialintelligencemadesimple.com
How execution traces expose both the promise and brittleness of world models for code
💬 1 · 🔁 3 · ❤️ 10
              
It's what we do in Code World Model too.
  I am excited to open-source PipelineRL - a scalable async RL implementation with in-flight weight updates. Why wait until your bored GPUs finish all sequences? Just update the weights and continue inference! Code: https://t.co/AgEyxXb7Xi Blog: https://t.co/n4FRxiEcrr
💬 3 · 🔁 13 · ❤️ 117
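"In-flight weight updates" means generation does not pause when the trainer pushes a new policy: tokens already sampled keep their old policy version, and later tokens simply come from the newer one. A minimal sketch of that behavior (illustrative classes, not the PipelineRL API):

```python
class Policy:
    """Stand-in for an inference engine whose weights can be swapped live."""

    def __init__(self):
        self.version = 0

    def next_token(self, prefix):
        # Tag each token with the policy version that produced it.
        return f"tok@v{self.version}"

def generate(policy, n_tokens, update_at):
    seq = []
    for t in range(n_tokens):
        if t == update_at:
            policy.version += 1  # trainer pushed new weights mid-generation
        seq.append(policy.next_token(seq))
    return seq

seq = generate(Policy(), n_tokens=4, update_at=2)
# The head of the sequence comes from v0, the tail from v1: no GPU sat idle
# waiting for all sequences to finish before the update landed.
```

In real async RL the mixed-version sequences then need off-policy corrections on the trainer side, which connects back to the importance-ratio hygiene discussed elsewhere in this feed.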
              
Introducing Concerto 🎶 Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations. What is it: Concerto is a self-supervised Point Transformer V3 that jointly learns from 2D and 3D modalities, producing rich spatial representations. It can take both point clouds and […]
💬 3 · 🔁 31 · ❤️ 157
              
🚀 Excited to share our new paper on scaling laws for xLSTMs vs. Transformers. Key result: xLSTM models Pareto-dominate Transformers in cross-entropy loss.
- At fixed FLOP budgets → xLSTMs perform better
- At fixed validation loss → xLSTMs need fewer FLOPs
🧵 Details in thread
💬 13 · 🔁 39 · ❤️ 210
              
Everything I know in RL in one tweet: exploration > exploitation; easy to leverage off-policy positive rewards, hard to leverage off-policy negative rewards; update the policy often; focus on throughput; self-play or find asymmetric grounding; clip everything but check statistics.
💬 11 · 🔁 29 · ❤️ 490
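"Clip everything but check statistics" has a concrete reading: clip the importance ratio PPO-style, but log how often the clip fires, since a high clipped fraction means the policy has drifted too far off the behavior policy. A small illustration (numbers and function name are mine, not from any specific codebase):

```python
import numpy as np

def clipped_ratio(logp_new, logp_old, eps=0.2):
    # PPO-style clipped importance ratio pi_new / pi_old.
    ratio = np.exp(logp_new - logp_old)
    clipped = np.clip(ratio, 1 - eps, 1 + eps)
    # The statistic to watch: fraction of samples where clipping fired.
    frac_clipped = np.mean(ratio != clipped)
    return clipped, frac_clipped

logp_old = np.array([-1.0, -1.0, -1.0])
logp_new = np.array([-0.2, -1.05, -2.0])
r, frac = clipped_ratio(logp_new, logp_old)
# r[0] hits the upper clip (1.2), r[2] the lower clip (0.8),
# r[1] passes through; 2 of 3 samples were clipped.
```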
              
Today, I am launching @axiommathai. At Axiom, we are building a self-improving superintelligent reasoner, starting with an AI mathematician.
💬 184 · 🔁 260 · ❤️ 2K
              
            
              