Ai2

@allen_ai

77K Followers · 3K Following · 627 Media · 3K Statuses

Breakthrough AI to solve the world's biggest problems.
Join us: https://t.co/MjUpZpKPXJ
Newsletter: https://t.co/k9gGznstwj

Seattle, WA
Joined September 2015
@allen_ai
Ai2
3 days
Olmo 3.1 is here. We extended our strongest RL run and scaled our instruct recipe to 32B—releasing Olmo 3.1 Think 32B & Olmo 3.1 Instruct 32B, our most capable models yet. 🧵
@allen_ai
Ai2
24 minutes
We're excited about what's next, including scaling byteifying to larger models. 📝 Blog: https://t.co/pQS0CJpQ9d ⬇️ Download Bolmo 7B: https://t.co/KMaN1gND4C | 1B: https://t.co/LUObc0BRCw 📄 Report:
@allen_ai
Ai2
24 minutes
On our eval suite & character-focused benchmarks like CUTE & EXECUTE, Bolmo matches/surpasses subword models while excelling at character-level reasoning. Once you byteify a base model, you can import capabilities from post-trained checkpoints via weight arithmetic.
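For intuition, here is a minimal sketch of the general weight-arithmetic (task-vector) idea the tweet alludes to: treat post-training as a parameter delta and add it to the byteified base. Function and variable names are hypothetical, and this is not necessarily Ai2's exact recipe.

```python
import torch

def import_capabilities(byteified_base: dict, subword_base: dict,
                        subword_posttrained: dict) -> dict:
    """Task-vector-style weight arithmetic (illustrative only):
    graft the post-training delta onto the byteified base model.
    Only parameters shared by all three state dicts are touched."""
    merged = dict(byteified_base)
    for name, base_w in subword_base.items():
        if name in subword_posttrained and name in merged:
            delta = subword_posttrained[name] - base_w  # what post-training learned
            merged[name] = merged[name] + delta         # apply it to the byte model
    return merged

# Hypothetical usage with torch.load'ed state dicts:
# merged_sd = import_capabilities(bolmo_sd, olmo_base_sd, olmo_instruct_sd)
```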
@allen_ai
Ai2
24 minutes
We keep Olmo 3's original backbone & capabilities, adding a lightweight byte stack so Bolmo can reason over bytes without discarding prior work. The result: a byte-level model with Olmo 3's strengths + finer-grained text understanding. 🚀
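As a rough sketch of what a "lightweight byte stack" in front of a retained backbone could look like (module names and sizes are hypothetical, not the actual Bolmo architecture):

```python
import torch
import torch.nn as nn

class ByteStack(nn.Module):
    """Hypothetical byte front-end: embed raw UTF-8 bytes, mix them with a
    small local encoder, then hand hidden states to the retained backbone.
    An illustration only, not Ai2's actual Bolmo architecture."""
    def __init__(self, d_model: int, n_local_layers: int = 2):
        super().__init__()
        self.byte_embed = nn.Embedding(256, d_model)  # one row per byte value
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.local_encoder = nn.TransformerEncoder(layer, num_layers=n_local_layers)

    def forward(self, byte_ids: torch.Tensor) -> torch.Tensor:
        # (batch, n_bytes) -> (batch, n_bytes, d_model), ready for the backbone
        return self.local_encoder(self.byte_embed(byte_ids))

# ids = torch.tensor([list("hello".encode("utf-8"))])
# hidden = ByteStack(d_model=512)(ids)  # then feed the pretrained Olmo layers
```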
@allen_ai
Ai2
24 minutes
Bolmo takes an existing Olmo 3 7B checkpoint and retrofits it into a fast, flexible byte-level architecture. It skips hand-engineered vocabularies and operates directly on UTF-8 bytes, handling spelling, edge cases, & multilingual scripts naturally.
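A quick illustration of why no hand-engineered vocabulary is needed: UTF-8 already maps any text, in any script, to IDs in the range 0..255, losslessly.

```python
def to_byte_ids(text: str) -> list[int]:
    """A byte-level 'tokenizer' needs no learned vocabulary:
    every string maps to IDs in 0..255 via UTF-8."""
    return list(text.encode("utf-8"))

print(to_byte_ids("naïve"))   # [110, 97, 195, 175, 118, 101]; 'ï' spans two bytes
print(to_byte_ids("日本語"))   # nine bytes, three per CJK character
print(bytes(to_byte_ids("naïve")).decode("utf-8"))  # lossless round trip
```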
@allen_ai
Ai2
24 minutes
Most LMs still speak in subword tokens (e.g., ▁inter + national + ization). They work, but struggle with character-level edits, whitespace, rare words, & multilingual support—and every token gets the same compute, regardless of complexity.
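To make the contrast concrete, a small sketch using the tweet's own example (the subword pieces are hardcoded for illustration):

```python
# The tweet's example, as SentencePiece-style pieces (hardcoded, illustrative):
subword_pieces = ["▁inter", "national", "ization"]

# A character-level question ("how many n's?") crosses piece boundaries,
# so a subword model never observes individual letters directly.
word = "internationalization"
print(word.count("n"))  # 4; trivial on characters, awkward across opaque pieces

# Byte-level input exposes every character to the model:
byte_ids = list(word.encode("utf-8"))
print(len(byte_ids), byte_ids[:5])  # 20 bytes, one per ASCII letter
```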
@allen_ai
Ai2
24 minutes
Introducing Bolmo, a new family of byte-level language models built by "byteifying" our open Olmo 3—and to our knowledge, the first fully open byte-level LM to match or surpass SOTA subword models across a wide range of tasks. 🧵
@allen_ai
Ai2
3 days
@AllenInstitute NeuroDiscoveryBench was built on openly available Allen Institute datasets—resources that have become foundational for the field. We're inviting researchers to help advance AI-assisted neuroscience discovery. 🔬 📂 Dataset: https://t.co/EHgjB3dYdi 📝 Read more:
@allen_ai
Ai2
3 days
@AllenInstitute We also found that raw, unprocessed datasets were much harder for AI agents, which struggled with the data transformations + complex joins required before analysis could even begin. Data wrangling remains a major challenge for AI in biology.
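A toy example of the kind of wrangling the tweet describes, using pandas with hypothetical column names: raw measurements must be joined with metadata before any real analysis can start.

```python
import pandas as pd

# Hypothetical miniature of the wrangling step: raw recordings must be
# joined with cell metadata before any analysis can begin.
recordings = pd.DataFrame({
    "cell_id": [101, 102, 103],
    "firing_rate_hz": [4.2, 9.8, 1.1],
})
metadata = pd.DataFrame({
    "cell_id": [101, 102, 103],
    "brain_region": ["VISp", "VISp", "CA1"],
})

merged = recordings.merge(metadata, on="cell_id", how="inner")
print(merged.groupby("brain_region")["firing_rate_hz"].mean())
```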
@allen_ai
Ai2
3 days
@AllenInstitute The answers to questions in NeuroDiscoveryBench can't be retrieved from memory or web search. AI systems have to actually analyze the data. Our baseline tests confirm this—models without data access score poorly, while data analysis agents perform substantially better. 📈
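Schematically, the baseline contrast looks something like the sketch below (all names hypothetical): both conditions answer the same questions, but only one can actually run the analysis.

```python
def score(predictions: dict[str, str], gold: dict[str, str]) -> float:
    """Exact-match accuracy over the benchmark's question-answer pairs."""
    correct = sum(predictions[q] == a for q, a in gold.items())
    return correct / len(gold)

# Hypothetical harness: same questions, with and without the data.
# closed_book = {q: llm_answer(q) for q in gold}             # memory/web only
# with_data   = {q: agent_answer(q, dataset) for q in gold}  # runs the analysis
# print(score(closed_book, gold), score(with_data, gold))
```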
@allen_ai
Ai2
3 days
@AllenInstitute NeuroDiscoveryBench includes ~70 question-answer pairs drawn from major Allen Institute publications. These aren't simple factoid questions—they require deep data analysis to answer.
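A hedged sketch of consuming such a benchmark; the actual file layout in allenai/neurodiscoverybench may differ, and the filename and field names here are assumptions.

```python
import json

# Filename and field names are assumed; check the repo for the real layout.
with open("tasks.json") as f:
    tasks = json.load(f)

for task in tasks[:3]:
    # Each entry pairs a research question with a data-grounded answer.
    print(task.get("question"), "->", task.get("answer"))
```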
@allen_ai
Ai2
3 days
🧠 Introducing NeuroDiscoveryBench. Built with @AllenInstitute, it’s the first benchmark for evaluating AI systems like our Asta DataVoyager agent on neuroscience data. The benchmark tests whether AI can truly extract insights from complex brain datasets.
@VictoriaWGraf
Victoria Graf @NeurIPS ☀️
3 days
Olmo 3 Instruct is now bigger and better 🚀 Olmo 3 Think? Better too. Check out Olmo 3.1! ✨
Quoted tweet (@allen_ai, Ai2, 3 days):
Olmo 3.1 is here. We extended our strongest RL run and scaled our instruct recipe to 32B—releasing Olmo 3.1 Think 32B & Olmo 3.1 Instruct 32B, our most capable models yet. 🧵
@faeze_brh
Faeze Brahman
3 days
We’re dropping Olmo 3.1 as a little end-of-year surprise. Think of it as Olmo 3, but with holiday upgrades. 🎁🎄
Quoted tweet (@allen_ai, Ai2, 3 days):
Olmo 3.1 is here. We extended our strongest RL run and scaled our instruct recipe to 32B—releasing Olmo 3.1 Think 32B & Olmo 3.1 Instruct 32B, our most capable models yet. 🧵
@hamishivi
Hamish Ivison
3 days
If u chatted to me at NeurIPS and I got distracted looking at my computer, it was cuz I was babysitting this run! Here are the full curves from our in-loop evaluations. Sit and wait and the model just gets better (no changes from the initial recipe we announced, just run for longer!)
Quoted tweet (@allen_ai, Ai2, 3 days):
Olmo 3.1 is here. We extended our strongest RL run and scaled our instruct recipe to 32B—releasing Olmo 3.1 Think 32B & Olmo 3.1 Instruct 32B, our most capable models yet. 🧵
@allen_ai
Ai2
3 days
Alongside 3.1 Think & Instruct, we’re also upgrading our RL Zero 7B models for math & code with Olmo 3.1 RL Zero 7B Code & Olmo 3.1 RL Zero 7B Math. Both benefit from longer & more stable training runs—delivering stronger results + better baselines for RL researchers.
@allen_ai
Ai2
3 days
🛠️ Olmo 3.1 Instruct 32B is our best fully open 32B instruction-tuned model. It’s optimized for chat, tool use, & multi-turn dialogue—making it a much more performant sibling of Olmo 3 Instruct 7B and ready for real-world applications.
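For example, a standard transformers chat loop might look like the sketch below; the model ID is an assumption, so check the Olmo 3.1 collection on Hugging Face for the exact name.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/Olmo-3.1-Instruct-32B"  # assumed ID; verify on Hugging Face
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Briefly explain byte-level language models."}]
inputs = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

out = model.generate(inputs, max_new_tokens=200)
print(tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```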
@allen_ai
Ai2
3 days
🧠 After the initial Olmo 3 Think 32B release, we extended RL training for 21 days with extra epochs on our Dolci-Think-RL dataset. Olmo 3.1 Think 32B gains +5 AIME, +4 ZebraLogic, & +20 IFBench vs Olmo 3 Think 32B—making it the strongest fully open reasoning model.
@allen_ai
Ai2
7 days
Now anyone can use DataVoyager as a transparent AI partner for data-driven discovery. Try it at https://t.co/qjwKAgDm3D → select “Analyze data,” upload a dataset, & start asking questions. Learn more in our updated blog: https://t.co/qmLMNS2sTe