Ai2
@allen_ai
Breakthrough AI to solve the world's biggest problems. › Join us: https://t.co/MjUpZpKPXJ › Newsletter: https://t.co/k9gGznstwj
Seattle, WA
Joined September 2015
Olmo 3.1 is here. We extended our strongest RL run and scaled our instruct recipe to 32B—releasing Olmo 3.1 Think 32B & Olmo 3.1 Instruct 32B, our most capable models yet. 🧵
We're excited about what's next, including scaling byteification to larger models. 📝 Blog: https://t.co/pQS0CJpQ9d ⬇️ Download Bolmo 7B: https://t.co/KMaN1gND4C | 1B: https://t.co/LUObc0BRCw 📄 Report:
On our eval suite & character-focused benchmarks like CUTE & EXECUTE, Bolmo matches/surpasses subword models while excelling at character-level reasoning. Once you byteify a base model, you can import capabilities from post-trained checkpoints via weight arithmetic.
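The "weight arithmetic" idea above can be sketched in a few lines. This is a minimal illustration of the general task-vector technique, not Ai2's actual merging code: the delta between a post-trained subword checkpoint and its base is added onto the byteified base. The checkpoint names and the `import_capabilities` helper are hypothetical, and real weights would be tensors rather than floats.

```python
# Sketch of capability transfer via weight arithmetic (task vectors).
# Assumes all three checkpoints share the same backbone parameter names;
# byte-stack parameters have no subword counterpart and are kept as-is.
def import_capabilities(byteified_base, base_subword, posttrained_subword):
    merged = {}
    for name, w in byteified_base.items():
        if name in base_subword and name in posttrained_subword:
            # Post-training "delta" layered on top of the byteified weights.
            delta = posttrained_subword[name] - base_subword[name]
            merged[name] = w + delta
        else:
            # New byte-stack parameter: no delta to import.
            merged[name] = w
    return merged
```

With real models the same arithmetic runs per-tensor over the state dicts; the sketch uses scalars only to keep the idea visible.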
We keep Olmo 3's original backbone & capabilities, adding a lightweight byte stack so Bolmo can reason over bytes without discarding prior work. The result: a byte-level model with Olmo 3's strengths + finer-grained text understanding. 🚀
Bolmo takes an existing Olmo 3 7B checkpoint and retrofits it into a fast, flexible byte-level architecture. It skips hand-engineered vocabularies and operates directly on UTF-8 bytes, handling spelling, edge cases, & multilingual scripts naturally.
Most LMs still speak in subword tokens (e.g., ▁inter + national + ization). They work, but struggle with character-level edits, whitespace, rare words, & multilingual support—and every token gets the same compute, regardless of complexity.
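The subword-vs-byte contrast above is easy to see concretely. A minimal sketch, using only Python's built-in UTF-8 codec: a byte-level model sees text as a sequence of 0–255 values, so character-level questions (counts, spelling, edits) and multilingual scripts fall out naturally, with no engineered vocabulary.

```python
# Byte-level view of text: no learned vocabulary, just UTF-8 code units.
word = "internationalization"
byte_ids = list(word.encode("utf-8"))

# ASCII text maps one byte per character, so character-level structure
# is directly visible to the model.
assert len(byte_ids) == len(word) == 20
assert word.count("n") == 4  # trivial over characters/bytes

# Non-Latin scripts need no special vocabulary entries: each CJK
# character is simply three UTF-8 bytes.
multilingual = "国際化"
assert len(multilingual.encode("utf-8")) == 9
```

A subword tokenizer, by contrast, might split the same word into opaque pieces like `▁inter + national + ization`, hiding the individual characters from the model.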
Introducing Bolmo, a new family of byte-level language models built by "byteifying" our open Olmo 3—and to our knowledge, the first fully open byte-level LM to match or surpass SOTA subword models across a wide range of tasks. 🧵
@AllenInstitute NeuroDiscoveryBench was created on openly available Allen Institute datasets—resources that have become foundational for the field. We're inviting researchers to help advance AI-assisted neuroscience discovery. 🔬 📂 Dataset: https://t.co/EHgjB3dYdi 📝 Read more:
@AllenInstitute We also found that raw, unprocessed datasets were much harder for AI agents, which struggled with the data transformations + complex joins required before analysis could even begin. Data wrangling remains a major challenge for AI in biology.
@AllenInstitute The answers to questions in NeuroDiscoveryBench can't be retrieved from memory or web search. AI systems have to actually analyze the data. Our baseline tests confirm this—models without data access score poorly, while data analysis agents perform substantially better. 📈
@AllenInstitute NeuroDiscoveryBench includes ~70 question-answer pairs drawn from major Allen Institute publications. These aren't simple factoid questions—they require deep data analysis to answer.
🧠 Introducing NeuroDiscoveryBench. Built with @AllenInstitute, it’s the first benchmark for evaluating AI systems like our Asta DataVoyager agent on neuroscience data. The benchmark tests whether AI can truly extract insights from complex brain datasets.
If you chatted with me at NeurIPS and I got distracted looking at my computer, it was because I was babysitting this run! Here are the full curves from our in-loop evaluations. Sit and wait, and the model just gets better (no changes from the initial recipe we announced, just a longer run!)
Olmo 3.1 offers the full model flow: weights, data, training recipes, & more. 💻 Download: https://t.co/3aKebBhFlD ➡️ Try: https://t.co/PL325bk3wn 📚 Blog: https://t.co/a8i7eTlwxU ✏️ Report:
Alongside 3.1 Think & Instruct, we’re also upgrading our RL-Zero 7B models for math & code with Olmo 3.1 RL Zero 7B Code & Olmo 3.1 RL Zero 7B Math. Both benefit from longer & more stable training runs—delivering stronger results + better baselines for RL researchers.
🛠️ Olmo 3.1 Instruct 32B is our best fully open 32B instruction-tuned model. It’s optimized for chat, tool use, & multi-turn dialogue—making it a much more performant sibling of Olmo 3 Instruct 7B and ready for real-world applications.
🧠 After the initial Olmo 3 Think 32B release, we extended RL training for 21 days with extra epochs on our Dolci-Think-RL dataset. Olmo 3.1 Think 32B gains +5 AIME, +4 ZebraLogic, & +20 IFBench vs Olmo 3 Think 32B—making it the strongest fully open reasoning model.
Now anyone can use DataVoyager as a transparent AI partner for data-driven discovery. Try it at https://t.co/qjwKAgDm3D → select “Analyze data,” upload a dataset, & start asking questions. Learn more in our updated blog: https://t.co/qmLMNS2sTe