Armin W. Thomas

@athmsx

Followers: 929 · Following: 2K · Media: 37 · Statuses: 647

Co-Founder @RadicalNumerics | Prev: @LiquidAI_ and Data Science Fellow @StanfordData working with @HazyResearch and @russpoldrack | He/him

San Francisco, CA
Joined August 2016
@athmsx
Armin W. Thomas
3 months
Update: I co-founded @RadicalNumerics with @MichaelPoli6, @Massastrello, @exnx, and a stellar team. AI is changing everything — except itself. We’re building the engine for recursive self-improvement: AI that designs and refines AI, accelerating discovery in science+industry.
@athmsx
Armin W. Thomas
1 month
Diffusion is poised to overtake autoregression in language modeling: parallel, order-flexible generation makes inference faster and more steerable. With RND1, we’re taking a step: the largest open diffusion LM. Sparse MoE (30B, 3B active). Releasing the model, training recipe,
@RadicalNumerics
Radical Numerics
1 month
Introducing RND1, the most powerful base diffusion language model (DLM) to date. RND1 (Radical Numerics Diffusion) is an experimental DLM with 30B params (3B active) and a sparse MoE architecture. We are making it open source, releasing weights, training details, and code to
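The claim worth unpacking above is that a diffusion LM scores every position in parallel and commits tokens in whatever order it is most confident about, rather than strictly left to right. Below is a minimal sketch of that decoding pattern; the `mask_id` token, the confidence-based unmasking rule, and the fixed step budget are illustrative assumptions, not RND1's actual sampler (whose denoiser, per the quoted tweet, is a sparse MoE with 30B total and 3B active parameters).

```python
import torch

def diffusion_decode(model, prompt_ids, gen_len, mask_id, steps=8):
    """Toy masked-diffusion sampler: start from a fully masked continuation,
    then repeatedly denoise and commit the most confident positions in parallel.
    `model` is any callable mapping (B, L) token ids -> (B, L, V) logits."""
    x = torch.cat([prompt_ids,
                   torch.full((prompt_ids.shape[0], gen_len), mask_id,
                              dtype=prompt_ids.dtype, device=prompt_ids.device)], dim=1)
    gen = slice(prompt_ids.shape[1], x.shape[1])

    for step in range(steps):
        still_masked = x[:, gen] == mask_id
        n_left = int(still_masked[0].sum())      # identical for every row: rows unmask in lockstep
        if n_left == 0:
            break
        logits = model(x)                        # one forward pass scores all positions at once
        conf, pred = logits.softmax(-1).max(-1)  # per-position confidence and argmax token
        # Unmask a fraction of the remaining masked positions each step,
        # highest confidence first: parallel and order-flexible, no left-to-right scan.
        n_unmask = max(1, n_left // (steps - step))
        conf_gen = conf[:, gen].masked_fill(~still_masked, float("-inf"))
        idx = conf_gen.topk(n_unmask, dim=-1).indices
        x[:, gen].scatter_(1, idx, pred[:, gen].gather(1, idx))
    return x
```

This is where the speed and steerability arguments come from: the number of forward passes is the step budget rather than the sequence length, and any position can be constrained or revised before it is committed.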
@keshigeyan
Keshigeyan Chandrasegaran
2 months
Grafting Diffusion Transformers accepted to #NeurIPS2025 as an Oral! We have lots of interesting analysis, a test bed for model grafting, and insights🚀 📄Paper: https://t.co/OjsrOZi7in 🌎Website:
arxiv.org
Designing model architectures requires decisions such as selecting operators (e.g., attention, convolution) and configurations (e.g., depth, width). However, evaluating the impact of these...
@keshigeyan
Keshigeyan Chandrasegaran
5 months
1/ Model architectures have been mostly treated as fixed post-training. 🌱 Introducing Grafting: A new way to edit pretrained diffusion transformers, allowing us to customize architectural designs on a small compute budget. 🌎 https://t.co/fjOTVqfVZr Co-led with @MichaelPoli6
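Read together, the two grafting tweets describe editing a pretrained diffusion transformer by swapping one operator for another on a small compute budget. A rough sketch of that idea follows, assuming the recipe amounts to regressing a new operator onto the old operator's input/output behavior before swapping it in; the `block.mixer` attribute, calibration set, and MSE objective are hypothetical stand-ins, not the paper's exact procedure.

```python
import torch
import torch.nn as nn

def graft_operator(block, new_op, calib_acts, lr=1e-4, steps=200):
    """Hypothetical grafting step: replace `block.mixer` (e.g., self-attention)
    with `new_op` (e.g., a local conv mixer). The new operator is first fit to
    mimic the old one's input -> output map on a small set of cached activations,
    so the rest of the pretrained network can stay frozen."""
    old_op = block.mixer
    opt = torch.optim.AdamW(new_op.parameters(), lr=lr)
    for i in range(steps):
        x = calib_acts[i % len(calib_acts)]       # (B, L, D) activations entering the block
        with torch.no_grad():
            target = old_op(x)                    # behavior the new operator should match
        loss = nn.functional.mse_loss(new_op(x), target)
        opt.zero_grad()
        loss.backward()
        opt.step()
    block.mixer = new_op                          # swap the grafted operator into place
    return block
```

A light end-to-end finetune of the edited model would typically follow; the point of the sketch is only that an architectural edit can reuse the pretrained computation instead of training the new design from scratch.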
@exnx
Eric Nguyen
3 months
In April '25, I shared the origin story of Evo on the TED stage. I talked about the motivation behind generating DNA with AI and how it could change what’s possible. It was an incredible experience. full video: https://t.co/tstT6cQI6u
@athmsx
Armin W. Thomas
3 months
Also: Excited that this coincides with the release of our Co-Founder @exnx's @TEDTalks talk on how AI could generate new life forms:
@athmsx
Armin W. Thomas
3 months
Find out more here: https://t.co/jw2oXNeFIv ++ reach out if this resonates with you; we are hiring!
@MichaelPoli6
Michael Poli
3 months
Life update: I started Radical Numerics with Stefano Massaroli, Armin Thomas, Eric Nguyen, and a fantastic team of engineers and researchers. We are building the engine for recursive self-improvement (RSI): AI that designs and refines AI, accelerating discovery across science and
@exnx
Eric Nguyen
3 months
✨ Excited to share a few life updates! 🎤 My TED Talk is now live! I shared the origin story of Evo, titled: "How AI could generate new life forms" TED talk: https://t.co/dh7iWcPaBu ✍️ I wrote a blog post about what it’s *really* like to deliver a TED talk blog:
ted.com
If DNA is just a string of letters, could AI learn to read it … or even write it? Bioengineering researcher Eric Nguyen reveals how AI has upended the rules of biology, potentially creating a future...
@athmsx
Armin W. Thomas
3 months
Also: Excited that this announcement coincides with the release of our Co-Founder @exnx's @TEDTalks talk on how AI could generate new life forms:
@Massastrello
Stefano Massaroli
4 months
LFM2 is live. We keep state coefficients time-invariant because Hyena gated short convs already supply the adaptive dynamics—and the data agree. Same accuracy, leaner compute budget. Efficiency in practice, not on paper. #LiquidAI
@LiquidAI_
Liquid AI
4 months
Today, we release the 2nd generation of our Liquid foundation models, LFM2. LFM2 sets the bar for quality, speed, and memory efficiency in on-device AI. Built for edge devices like phones, laptops, AI PCs, cars, wearables, satellites, and robots, LFM2 delivers the fastest
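For readers unfamiliar with the operator named above: a Hyena-style gated short convolution pairs a depthwise causal convolution with a small, time-invariant kernel against an elementwise, input-dependent gate, which is the "adaptive dynamics" the tweet refers to. A generic sketch follows; LFM2's actual block layout, kernel size, and gate placement may differ.

```python
import torch
import torch.nn as nn

class GatedShortConv(nn.Module):
    """Generic gated short-convolution mixer (Hyena-style sketch).
    The depthwise causal conv has fixed (time-invariant) weights; the
    sigmoid gate supplies the input-dependent modulation."""
    def __init__(self, dim, kernel_size=3):
        super().__init__()
        self.in_proj = nn.Linear(dim, 2 * dim)            # value and gate branches
        self.conv = nn.Conv1d(dim, dim, kernel_size,
                              groups=dim,                  # depthwise: one filter per channel
                              padding=kernel_size - 1)     # pad enough to keep outputs causal
        self.out_proj = nn.Linear(dim, dim)

    def forward(self, x):                                  # x: (B, L, D)
        v, g = self.in_proj(x).chunk(2, dim=-1)
        v = self.conv(v.transpose(1, 2))[..., : x.shape[1]]  # trim right overhang -> causal
        return self.out_proj(v.transpose(1, 2) * torch.sigmoid(g))
```

With the gate carrying the input dependence, the state coefficients elsewhere in the block can stay time-invariant, which is the design choice the tweet reports the data supporting.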
@MichaelPoli6
Michael Poli
4 months
It's easy (and fun!) to get nerdsniped by complex architecture designs. But over the years, I've seen hybrid gated convolutions always come out on top in the right head-to-head comparisons. The team brings a new suite of StripedHyena-style decoder models, in the form of SLMs
@maximelabonne
Maxime Labonne
4 months
Liquid AI open-sources a new generation of edge LLMs! 🥳 I'm so happy to contribute to the open-source community with this release on @huggingface! LFM2 is a new architecture that combines best-in-class inference speed and quality into 350M, 700M, and 1.2B models.
@athmsx
Armin W. Thomas
4 months
We just released our new generation of Liquid Foundation Models, focused on edge devices, delivering great quality and best-in-class latencies, made possible by gated short convolutions and our automated architecture design pipeline 🌟. Super proud of the team and everyone
@LiquidAI_
Liquid AI
4 months
Today, we release the 2nd generation of our Liquid foundation models, LFM2. LFM2 sets the bar for quality, speed, and memory efficiency in on-device AI. Built for edge devices like phones, laptops, AI PCs, cars, wearables, satellites, and robots, LFM2 delivers the fastest
@romu_nishi
Rom Parnichkun
6 months
🧵1/6 How well does a model use its context? Even models with the same memory capacity (cache-size/state-size) may utilize their context differently, influencing recall capabilities, cache compressibility, etc. Proposing effective state-size (ESS), a proxy metric for memory
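The thread introduces effective state-size (ESS) only by name here, so the sketch below is one plausible way to operationalize it, not necessarily the paper's estimator: materialize the causal token-mixing operator as a lower-triangular matrix T and, for each split point t, measure the numerical rank of the block T[t:, :t], which is the only path from past inputs to future outputs.

```python
import torch

def effective_state_size(T, tol=1e-3):
    """Illustrative memory-utilization proxy for a causal token mixer.
    T: (L, L) lower-triangular matrix with y = T @ x (e.g., a materialized
    attention map or an unrolled SSM/conv operator). The sub-block T[t:, :t]
    is the only route from inputs before t to outputs at or after t, so its
    numerical rank bounds how much past information is actually carried forward."""
    L = T.shape[0]
    ranks = []
    for t in range(1, L):
        sv = torch.linalg.svdvals(T[t:, :t])
        ranks.append(int((sv > tol * sv.max()).sum()))
    return sum(ranks) / len(ranks)        # average effective rank over split points
```

Two mixers with the same nominal cache or state size can score very differently under a measure like this, which is the gap between memory capacity and memory utilization the thread points at.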
@MichaelPoli6
Michael Poli
7 months
We showcase the first example of a model architecture optimized for smartphones: Hyena Edge. We used our automated model design framework (STAR, Oral at ICLR 2025) to sift through convolution-based multi-hybrid architectures. STAR iteratively evolved the population of designs,
@LiquidAI_
Liquid AI
7 months
[Liquid AI Research] Today, we introduce a Liquid architecture called Hyena Edge, a convolution-based multi-hybrid model that not only matches but outperforms strong Transformer-based baselines in computational efficiency and model quality on edge hardware, benchmarked on the
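STAR is described above only as evolving a population of convolution-based multi-hybrid designs, so the following is a generic sketch of that kind of search, not STAR's implementation; the genome encoding, mutation rule, and stand-in fitness function are all placeholder assumptions.

```python
import random

def evolve_architectures(init_population, score, mutate, generations=10, keep=4):
    """Toy evolutionary search over architecture 'genomes' (e.g., lists of
    operator choices per block). `score` returns a fitness; `mutate` perturbs
    a genome. Each generation keeps the best designs and refills by mutation."""
    population = list(init_population)
    for gen in range(generations):
        ranked = sorted(population, key=score, reverse=True)
        parents = ranked[:keep]                        # select the best designs
        children = [mutate(random.choice(parents))     # refill the population via mutation
                    for _ in range(len(population) - keep)]
        population = parents + children
    return max(population, key=score)

# Example genome: one operator tag per block of a small multi-hybrid decoder.
ops = ["short_conv", "gated_conv", "attention"]
pop = [[random.choice(ops) for _ in range(8)] for _ in range(12)]
best = evolve_architectures(
    pop,
    score=lambda g: -g.count("attention"),             # stand-in objective (prefer cheaper ops)
    mutate=lambda g: [random.choice(ops) if random.random() < 0.2 else o for o in g],
)
```

In the real setting the fitness would combine model quality with measured on-device latency and memory, which is how a search of this kind would land on designs like Hyena Edge.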
@athmsx
Armin W. Thomas
7 months
8/ Find out more about Hyena Edge in our blog post: https://t.co/7oTuX1YRyS And at our ICLR talk today at 10:42h local time in session 3C:
@athmsx
Armin W. Thomas
7 months
7/8 The result: a Hyena Edge model that outperforms a highly optimized Transformer++ baseline on the S24 Ultra, with up to 30% faster latency and lower memory usage. Hyena Edge also outperforms the Transformer baseline across common language modeling benchmarks for small models (Wiki,
@athmsx
Armin W. Thomas
7 months
6/8 STAR showed Hyena-Y delivered the best balance of latency, memory, and performance. So we rewired our model with Hyena-Y at its core.