Armin W. Thomas
@athmsx
Followers
929
Following
2K
Media
37
Statuses
647
Co-Founder @RadicalNumerics | Prev: @LiquidAI_ and Data Science Fellow @StanfordData working with @HazyResearch and @russpoldrack | He/him
San Francisco, CA
Joined August 2016
Update: I co-founded @RadicalNumerics with @MichaelPoli6, @Massastrello, @exnx, and a stellar team. AI is changing everything — except itself. We’re building the engine for recursive self-improvement: AI that designs and refines AI, accelerating discovery in science+industry.
Diffusion is poised to overtake autoregression in language modeling: parallel, order-flexible generation makes inference faster and more steerable. With RND1, we’re taking a step in that direction: the largest open diffusion LM to date. Sparse MoE (30B, 3B active). Releasing the model, training recipe, …
Introducing RND1, the most powerful base diffusion language model (DLM) to date. RND1 (Radical Numerics Diffusion) is an experimental DLM with 30B params (3B active) and a sparse MoE architecture. We are making it open source, releasing weights, training details, and code to …
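For readers new to diffusion LMs, here is a rough sketch of one parallel denoising step (my illustration of the generic masked-diffusion recipe, not RND1's inference code; the model call, the confidence threshold, and the fallback rule are all assumptions):

import torch

def diffusion_decode(model, tokens, mask_id, steps=8, conf_threshold=0.9):
    # tokens: (batch, seq_len) long tensor; masked positions hold mask_id
    for _ in range(steps):
        masked = tokens == mask_id
        if not masked.any():
            break
        logits = model(tokens)  # (batch, seq_len, vocab): every position scored in parallel
        conf, pred = logits.softmax(dim=-1).max(dim=-1)
        # Commit only positions the model is confident about; the order is free,
        # which is what makes generation parallel and steerable.
        unmask = masked & (conf > conf_threshold)
        if not unmask.any():
            # Fallback: commit the single most confident masked position per sequence
            conf = conf.masked_fill(~masked, float("-inf"))
            unmask[torch.arange(tokens.size(0)), conf.argmax(dim=-1)] = True
        tokens = torch.where(unmask, pred, tokens)
    return tokens

Unlike left-to-right decoding, every masked position is scored in a single forward pass, and the unmasking schedule is a free design choice.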
Grafting Diffusion Transformers accepted to #NeurIPS2025 as an Oral! We have lots of interesting analysis, a test bed for model grafting, and insights🚀 📄Paper: https://t.co/OjsrOZi7in 🌎Website:
arxiv.org
Designing model architectures requires decisions such as selecting operators (e.g., attention, convolution) and configurations (e.g., depth, width). However, evaluating the impact of these...
1/ Model architectures have been mostly treated as fixed post-training. 🌱 Introducing Grafting: A new way to edit pretrained diffusion transformers, allowing us to customize architectural designs on a small compute budget. 🌎 https://t.co/fjOTVqfVZr Co-led with @MichaelPoli6
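As a rough illustration of the grafting idea (placeholders of my own, not the paper's code: the DepthwiseConvMixer operator and the plain activation-regression calibration are both assumptions):

import torch
import torch.nn as nn

class DepthwiseConvMixer(nn.Module):
    # Hypothetical replacement operator: a short depthwise conv over the sequence axis
    def __init__(self, dim, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv1d(dim, dim, kernel_size, padding=kernel_size // 2, groups=dim)

    def forward(self, x):  # x: (batch, seq_len, dim)
        return self.conv(x.transpose(1, 2)).transpose(1, 2)

def graft(old_block, new_block, calib_batches, steps=100, lr=1e-3):
    # Calibrate the cheap operator to reproduce the pretrained block's activations
    opt = torch.optim.Adam(new_block.parameters(), lr=lr)
    old_block.eval()
    for _ in range(steps):
        for x in calib_batches:  # a handful of (batch, seq_len, dim) activation tensors
            with torch.no_grad():
                target = old_block(x)
            loss = nn.functional.mse_loss(new_block(x), target)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return new_block

The point is that only the swapped-in operator is trained against the frozen block it replaces, which is why the compute budget stays small.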
In April '25, I shared the origin story of Evo on the TED stage. I talked about the motivation behind generating DNA with AI and how it could change what’s possible. It was an incredible experience. full video: https://t.co/tstT6cQI6u
Find out more here: https://t.co/jw2oXNeFIv ++ reach out if this resonates with you; we are hiring!
Life update: I started Radical Numerics with Stefano Massaroli, Armin Thomas, Eric Nguyen, and a fantastic team of engineers and researchers. We are building the engine for recursive self-improvement (RSI): AI that designs and refines AI, accelerating discovery across science and …
✨ Excited to share a few life updates! 🎤 My TED Talk is now live! I shared the origin story of Evo, titled "How AI could generate new life forms". TED talk: https://t.co/dh7iWcPaBu ✍️ I wrote a blog post about what it’s *really* like to deliver a TED talk. blog: …
ted.com
If DNA is just a string of letters, could AI learn to read it … or even write it? Bioengineering researcher Eric Nguyen reveals how AI has upended the rules of biology, potentially creating a future...
LFM2 is live. We keep state coefficients time-invariant because Hyena gated short convs already supply the adaptive dynamics—and the data agree. Same accuracy, leaner compute budget. Efficiency in practice, not on paper. #LiquidAI
Today, we release the 2nd generation of our Liquid foundation models, LFM2. LFM2 sets the bar for quality, speed, and memory efficiency in on-device AI. Built for edge devices like phones, laptops, AI PCs, cars, wearables, satellites, and robots, LFM2 delivers the fastest …
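To make "gated short convolutions" concrete, here is a minimal sketch of one such block (the layout, kernel size, and projections are my assumptions, not the released LFM2 operator):

import torch
import torch.nn as nn

class GatedShortConv(nn.Module):
    def __init__(self, dim, kernel_size=3):
        super().__init__()
        self.in_proj = nn.Linear(dim, 2 * dim)  # value and gate paths
        # Short, causal, depthwise conv with a fixed (time-invariant) kernel
        self.conv = nn.Conv1d(dim, dim, kernel_size, groups=dim, padding=kernel_size - 1)
        self.out_proj = nn.Linear(dim, dim)

    def forward(self, x):  # x: (batch, seq_len, dim)
        v, g = self.in_proj(x).chunk(2, dim=-1)
        v = self.conv(v.transpose(1, 2))[..., : x.size(1)].transpose(1, 2)  # trim to causal length
        # The input-dependent gate supplies the adaptive dynamics,
        # so the conv kernel itself can stay time-invariant.
        return self.out_proj(v * torch.sigmoid(g))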
It's easy (and fun!) to get nerd-sniped by complex architecture designs. But over the years, I've seen hybrid gated convolutions always come out on top in the right head-to-head comparisons. The team brings a new suite of StripedHyena-style decoder models, in the form of SLMs …
Liquid AI open-sources a new generation of edge LLMs! 🥳 I'm so happy to contribute to the open-source community with this release on @huggingface! LFM2 is a new architecture that combines best-in-class inference speed and quality into 350M, 700M, and 1.2B models.
We just released our new generation of Liquid Foundation Models, focused on edge devices, delivering great quality and best-in-class latencies, made possible by gated short convolutions and our automated architecture design pipeline 🌟. Super proud of the team and everyone …
🧵1/6 How well does a model use its context? Even models with the same memory capacity (cache-size/state-size) may utilize their context differently, influencing recall capabilities, cache compressibility, etc. Proposing effective state-size (ESS), a proxy metric for memory …
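One way to make this concrete, paraphrasing the idea rather than quoting the exact ESS definition from the thread: for a linear(ized) sequence mixer with token-to-token operator T, the rank of the block that maps past inputs to future outputs bounds how much state the layer actually uses at position t.

import torch

def state_size_proxy(T, t, tol=1e-5):
    # T: (seq_len, seq_len) token-to-token operator of a linear(ized) sequence mixer
    block = T[t:, :t]  # how outputs after position t depend on inputs before t
    if block.numel() == 0:
        return 0
    s = torch.linalg.svdvals(block)
    return int((s > tol * s.max()).sum())  # numerical rank of the past-to-future block

Two layers with identical cache sizes can differ sharply on this proxy, which is the gap between memory capacity and memory utilization the thread points at.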
We showcase the first example of a model architecture optimized for smartphones: Hyena Edge. We used our automated model design framework (STAR, Oral at ICLR 2025) to sift through convolution-based multi-hybrid architectures. STAR iteratively evolved the population of designs, …
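As a toy picture of what an evolutionary architecture search loop does (the genome encoding, mutation operator, and scoring function are placeholders I made up; the real STAR pipeline is considerably more involved):

import random

def evolve(init_population, score_fn, mutate_fn, generations=20, keep=8):
    population = list(init_population)  # each genome encodes one candidate architecture
    for _ in range(generations):
        ranked = sorted(population, key=score_fn, reverse=True)
        parents = ranked[:keep]  # designs with the best quality/latency trade-off survive
        children = [mutate_fn(random.choice(parents)) for _ in range(len(population) - keep)]
        population = parents + children
    return max(population, key=score_fn)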
[Liquid AI Research] Today, we introduce a Liquid architecture called Hyena Edge, a convolution-based multi-hybrid model that not only matches but outperforms strong Transformer-based baselines in computational efficiency and model quality on edge hardware, benchmarked on the …
8/ Find out more about Hyena Edge in our blog post: https://t.co/7oTuX1YRyS And at our ICLR talk today at 10:42h local time in session 3C:
7/8 The result: Hyena Edge outperforms a highly optimized Transformer++ baseline on the S24 Ultra, with up to 30% lower latency and reduced memory usage. Hyena Edge also outperforms the Transformer baseline across common language modeling benchmarks for small models (Wiki, …
6/8 STAR showed Hyena-Y delivered the best balance of latency, memory, and performance. So we rewired our model with Hyena-Y at its core.