Ollie Liu
@olliezliu
493 Followers · 940 Following · 16 Media · 136 Statuses
Research Intern with FAIR @AIatMeta 👨‍🍳 Oliver IRL; #ML PhD student @CSatUSC 🎓 Alum @mldcmu @MSFTResearch 🧐 Foundation Models, AI4Science, Decision Making
∂Conv { LA, NYC }
Joined March 2022
Introducing METAGENE-1🧬, an open-source 7B-parameter metagenomics foundation model pretrained on 1.5 trillion base pairs. Built for pandemic monitoring, pathogen detection, and biosurveillance, with SOTA results across many genomics tasks. 🧵1/
4 replies · 12 reposts · 68 likes
Check out Walrus, a foundation model for learning physical dynamics, shipped by the AMAZING team at Polymathic!
1/ Today with my colleagues @PolymathicAI, I'm excited to release our latest project, Walrus, a cross-domain foundation model for physical dynamics, into the world. https://t.co/ihv1MZGQM3 Paper: https://t.co/d6ah9LO4ud Git: https://t.co/s3p8qGhZQR HF: https://t.co/RufaBD9eJk
0 replies · 0 reposts · 2 likes
AI for mathematical discovery at Axiom. A dream come true! ... and yours truly with his French accent and manifest stage fright ...
Introducing Axiom’s discovery team, led by @f_charton: We build models that will create novel constructions, map problems into solutions and intuitions, and learn the structure of entire mathematical worlds. Built to attack hard open problems, one at a time.
1 reply · 2 reposts · 23 likes
We present Olmo 3, our next family of fully open, leading language models. This family of 7B and 32B models represents: 1. The best 32B base model. 2. The best 7B Western thinking & instruct models. 3. The first 32B (or larger) fully open reasoning model. This is a big
79 replies · 336 reposts · 2K likes
some hypotheses for what “better pretraining” could mean - integration with other training stages: i’m guessing they’re finally at a point where post-training perf (eg SWE-Bench) can be used as signal for pretraining eng decisions - filtering: scaling approaches like influence
The secret behind Gemini 3? Simple: Improving pre-training & post-training 🤯 Pre-training: Contra the popular belief that scaling is over—which we discussed in our NeurIPS '25 talk with @ilyasut and @quocleix—the team delivered a drastic jump. The delta between 2.5 and 3.0 is
14 replies · 21 reposts · 310 likes
Very proud to release AION-1 and work with an incredible team!
I’m excited to announce that @PolymathicAI’s new astrophysics model, AION-1, has been accepted to #neurips2025! 🎉 Come see our poster on Dec 5 (Friday) at 1-4pm PT, poster session 5, OR hear our talk from @liamhparker and Francois Lanusse at the #AI4Science workshop. It’s the
0 replies · 0 reposts · 1 like
🚀We’re looking for 2026 interns at @PolymathicAI (NYC)! Want to work on scientific foundation models + ML for physics, biology, astronomy, & more? Want to contribute to frontier research with a brilliant, fun, and friendly team? Please sign up on our interest form 👉
22 replies · 46 reposts · 474 likes
Another great work from Yu!
LLM CoT reasoning looks smart but can be logically flawed or... just made up. It's time to hold reasoning accountable! We built VeriCoT to do just that. VeriCoT extracts the core argument of the CoT using well-formed symbolic notions of logical support. It formalizes every CoT
0 replies · 0 reposts · 1 like
I crossed an interesting threshold yesterday, which I think many other mathematicians have been crossing recently as well. In the middle of trying to prove a result, I identified a statement that looked true and that would, if true, be useful to me. 1/3
64 replies · 305 reposts · 3K likes
In a new paper with @sang_yun_lee and @giuliacfanti, we study how to make use of negative reward signal in sparse reward tasks, including GSM8K examples giving 0 reward. The idea is to push the policy's new samples away from the observed negative samples with a Bayesian posterior
Can ML reliably solve big problems that humans cannot? We’ve seen post-training methods that learn from correct or successful samples. But we still don’t have good algorithms that learn solely from failures! Introducing BaNEL: a method for post-training from zero-reward samples.
1 reply · 16 reposts · 92 likes
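The quoted BaNEL thread describes learning purely from zero-reward samples by pushing the policy away from observed failures. The paper's actual method uses a Bayesian posterior; the toy below is only my own illustration of the core intuition, on a made-up categorical policy over five candidate answers, where each observed zero-reward sample has its log-probability pushed down.

```python
import numpy as np

# Toy sketch of negative-only learning (my illustration, NOT the BaNEL
# algorithm): a categorical policy over 5 candidate answers is updated by
# lowering the log-probability of samples that earned zero reward.
def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

logits = np.zeros(5)            # uniform policy to start
negatives = [0, 0, 1]           # indices of observed zero-reward samples
lr = 0.5

for bad in negatives:
    p = softmax(logits)
    # gradient descent on log p[bad]: d(log p[bad])/dz = onehot(bad) - p,
    # so stepping along (p - onehot) pushes mass off the failed sample
    grad = p.copy()
    grad[bad] -= 1.0
    logits += lr * grad

p = softmax(logits)             # mass has shifted toward untried answers
```

Note the signal never says which answer is right; probability simply drains from known failures onto the remaining candidates.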
Announcing 🔭✨Hubble, a suite of open-source LLMs to advance the study of memorization! Pretrained models up to 8B params, with controlled insertion of texts (e.g., book passages, biographies, test sets, and more!) designed to emulate key memorization risks 🧵
2 replies · 41 reposts · 124 likes
Another great work from Deqing 🥳 Be sure to check it out!
Why do Transformers fail at algorithmic reasoning? We find it's not a lack of power, but a capacity mismatch. Our new preprint proves a tight, non-asymptotic bound: an L-layer model can only solve graph connectivity on graphs with a diameter up to exactly 3^L.
0 replies · 0 reposts · 6 likes
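The diameter bound in the quoted preprint is concrete enough to check mechanically: if an L-layer model can decide connectivity only up to diameter 3^L, then the layers required grow like log base 3 of the diameter. A small sketch of that arithmetic (the bound is paraphrased from the tweet; `min_layers` is my own hypothetical helper, not from the paper):

```python
from collections import deque

def diameter(adj):
    # longest shortest path over all reachable pairs, via BFS from each node
    best = 0
    for src in adj:
        dist = {src: 0}
        q = deque([src])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        best = max(best, max(dist.values()))
    return best

def min_layers(adj):
    # smallest L with 3**L >= diameter, per the claimed capacity bound
    d, L = diameter(adj), 0
    while 3 ** L < d:
        L += 1
    return L

# path graph on 10 nodes: diameter 9, so 2 layers suffice (3**2 = 9)
path = {i: [j for j in (i - 1, i + 1) if 0 <= j < 10] for i in range(10)}
```

Doubling the path length to 20 nodes (diameter 19) would already demand a third layer, which is the "capacity mismatch" flavor of the claim.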
New work: “GLASS Flows: Transition Sampling for Alignment of Flow and Diffusion Models”. GLASS generates images by sampling stochastic Markov transitions with ODEs - allowing us to boost text-image alignment for large-scale models at inference time! https://t.co/unsuG3mYer [1/7]
4 replies · 61 reposts · 249 likes
We now know that LoRA can match full-parameter RL training (from https://t.co/pGxoMLFIGv and our Tina paper https://t.co/dkXdxV3eNj), but what about DoRA, QLoRA, and more? We are releasing a clean LoRA-for-RL repo to explore them all. https://t.co/AsWWG1kmKt
LoRA makes fine-tuning more accessible, but it's unclear how it compares to full fine-tuning. We find that the performance often matches closely---more often than you might expect. In our latest Connectionism post, we share our experimental results and recommendations for LoRA.
13 replies · 71 reposts · 567 likes
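For readers new to the LoRA-vs-full-fine-tuning comparison in the quoted post: LoRA freezes the pretrained weight and trains only a low-rank additive delta. A minimal numpy sketch of that parameterization (generic LoRA, not the Thinking Machines or Tina code; shapes and the alpha/r scaling follow the original LoRA paper's convention):

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 64, 32, 4, 8     # rank r << min(d_in, d_out)

W = rng.normal(size=(d_out, d_in))       # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01    # trainable down-projection
B = np.zeros((d_out, r))                 # trainable up-projection, zero-init

def lora_forward(x):
    # base path plus scaled low-rank update; since B = 0 at init,
    # the adapted model starts out identical to the pretrained one
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=d_in)
```

The accessibility win is the parameter count: here the adapter trains r*(d_in + d_out) = 384 values versus 2048 for the full matrix, and that gap widens at transformer scale.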
When @ethansdyer and I joined Anthropic last Dec and spearheaded the discovery team, we decided to focus on unlocking computer-use as a bottleneck for scientific discovery. It has been incredible to work on improving computer-use and witness the fast progress. In OSWorld for
18 replies · 76 reposts · 699 likes
LoRA is real for Reasoning. https://t.co/pGxoMLFIGv
2 replies · 3 reposts · 186 likes
Efficient training of neural networks is difficult. Our second Connectionism post introduces Modular Manifolds, a theoretical step toward more stable and performant training by co-designing neural net optimizers with manifold constraints on weight matrices.
118 replies · 463 reposts · 3K likes
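The Modular Manifolds idea above pairs an optimizer with hard manifold constraints on weight matrices. The post's actual construction is more sophisticated; as a heavily simplified stand-in, the sketch below uses a fixed-Frobenius-norm constraint with a renormalization retraction after each step, just to show the project-after-update pattern:

```python
import numpy as np

# Simplified sketch of constrained training (my stand-in constraint, NOT the
# Connectionism post's construction): after every gradient step, retract the
# weight matrix back onto the manifold of fixed-Frobenius-norm matrices, so
# its scale cannot drift no matter how many steps are taken.
rng = np.random.default_rng(1)
W = rng.normal(size=(8, 8))
target = 4.0
W *= target / np.linalg.norm(W)          # start on the manifold

for _ in range(100):
    grad = rng.normal(size=W.shape)      # stand-in for a real loss gradient
    W -= 0.1 * grad                      # plain SGD step leaves the manifold
    W *= target / np.linalg.norm(W)      # retraction: renormalize back

```

The stability argument is that the constraint removes one failure mode (weight-norm blow-up or collapse) from the optimizer's job entirely.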
New from Meta FAIR: Code World Model (CWM), a 32B-parameter research model designed to explore how world models can transform code generation and reasoning about code. We believe in advancing research in world modeling and are sharing CWM under a research license to help empower
104 replies · 226 reposts · 1K likes
Keynote spotlight #2: COLM's first-day afternoon session will go polymathic 🔭 with @cosmo_shirley 🌌 from the Flatiron Institute
1 reply · 3 reposts · 13 likes
Super cool work!! We had a similar idea, but from the perspective of steering across modalities (https://t.co/J6Hf6NEFCL). Great to see these interpretability results!
arxiv.org: Steering methods have emerged as effective and targeted tools for guiding large language models’ (LLMs) behavior without modifying their parameters. Multimodal large language models (MLLMs), ...
Check out our COLM 2025 (oral) 🎤 SAEs reveal that VLM embedding spaces aren’t just "image vs. text" cones. They contain stable conceptual directions, some forming surprising bridges across modalities. 1/2
0 replies · 0 reposts · 4 likes
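The abstract quoted above defines steering as guiding behavior without touching parameters. In its most generic form (my illustration only, not the linked paper's or the COLM paper's method), that means adding a fixed concept direction to a hidden activation at inference time:

```python
import numpy as np

# Generic activation-steering sketch (illustrative, not a specific paper's
# method): nudge a hidden state along a unit "concept" direction at inference
# time; model weights are never modified.
rng = np.random.default_rng(2)
d = 16
h = rng.normal(size=d)                   # hidden activation at some layer
concept = rng.normal(size=d)
concept /= np.linalg.norm(concept)       # unit steering direction
alpha = 3.0                              # steering strength

h_steered = h + alpha * concept          # the entire intervention
```

Because the direction is unit-norm, the steered state's projection onto the concept grows by exactly alpha, which is what makes the intervention "targeted" in the abstract's sense.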