Ai2
@allen_ai
Followers: 77K
Following: 3K
Media: 623
Statuses: 3K
Breakthrough AI to solve the world's biggest problems.
Join us: https://t.co/MjUpZpKPXJ
Newsletter: https://t.co/k9gGznstwj
Seattle, WA
Joined September 2015
Olmo 3.1 is here. We extended our strongest RL run and scaled our instruct recipe to 32B—releasing Olmo 3.1 Think 32B & Olmo 3.1 Instruct 32B, our most capable models yet. 🧵
22
105
703
@AllenInstitute NeuroDiscoveryBench was built on openly available Allen Institute datasets—resources that have become foundational for the field. We're inviting researchers to help advance AI-assisted neuroscience discovery. 🔬 📂 Dataset: https://t.co/EHgjB3dYdi 📝 Read more:
github.com
Contribute to allenai/neurodiscoverybench development by creating an account on GitHub.
1
1
5
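For anyone poking at the repo, here is a minimal sketch of loading the benchmark's question-answer pairs after cloning github.com/allenai/neurodiscoverybench. The file path and field names (questions.json, "question", "answer", "dataset") are assumptions for illustration, not the repo's documented layout—check the README for the actual format.

```python
# Hypothetical sketch: load NeuroDiscoveryBench question-answer pairs.
# File path and field names are assumptions -- see the
# allenai/neurodiscoverybench repo for the actual layout.
import json
from pathlib import Path

BENCH_DIR = Path("neurodiscoverybench")  # assumed location of the cloned repo

def load_questions(path: Path = BENCH_DIR / "questions.json"):
    """Return a list of QA records, each tying a question to a source dataset."""
    with open(path) as f:
        return json.load(f)

if __name__ == "__main__":
    for record in load_questions()[:3]:
        # Assumed keys: "question", "answer", "dataset"
        print(record.get("dataset"), "->", record.get("question"))
```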
@AllenInstitute We also found that raw, unprocessed datasets were much harder for AI agents, which struggled with the data transformations + complex joins required before analysis could even begin. Data wrangling remains a major challenge for AI in biology.
1
0
3
@AllenInstitute The answers to questions in NeuroDiscoveryBench can't be retrieved from memory or web search. AI systems have to actually analyze the data. Our baseline tests confirm this—models without data access score poorly, while data analysis agents perform substantially better. 📈
1
0
4
@AllenInstitute NeuroDiscoveryBench includes ~70 question-answer pairs drawn from major Allen Institute publications. These aren't simple factoid questions—they require deep data analysis to answer.
1
0
5
🧠 Introducing NeuroDiscoveryBench. Built with @AllenInstitute, it’s the first benchmark for evaluating AI systems like our Asta DataVoyager agent on neuroscience data. The benchmark tests whether AI can truly extract insights from complex brain datasets.
2
21
101
If you chatted with me at NeurIPS and I got distracted looking at my computer, it was because I was babysitting this run! Here are the full curves from our in-loop evaluations. Sit and wait and the model just gets better (no changes from the initial recipe we announced, just run for longer!)
Olmo 3.1 is here. We extended our strongest RL run and scaled our instruct recipe to 32B—releasing Olmo 3.1 Think 32B & Olmo 3.1 Instruct 32B, our most capable models yet. 🧵
3
24
105
Olmo 3.1 offers the full model flow: weights, data, training recipes, & more. 💻 Download: https://t.co/3aKebBhFlD ➡️ Try: https://t.co/PL325bk3wn 📚 Blog: https://t.co/a8i7eTlwxU ✏️ Report:
allenai.org
Our new flagship Olmo 3 model family empowers the open source community with not only state-of-the-art open models, but the entire model flow and full traceability back to training data.
0
3
32
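For readers who want to try the weights locally, here is a minimal sketch using Hugging Face transformers. The repository ID allenai/Olmo-3.1-Think-32B is an assumption based on the naming in this thread; confirm the exact identifier on the linked download page before running.

```python
# Minimal sketch: run Olmo 3.1 Think 32B locally with Hugging Face transformers.
# The repo ID below is assumed from the naming in this thread -- confirm the
# exact model card on the Ai2 download page.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "allenai/Olmo-3.1-Think-32B"  # assumed repository ID

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

prompt = "Prove that the sum of two even integers is even."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```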
Alongside 3.1 Think & Instruct, we’re also upgrading our RL-Zero 7B models for math & code with Olmo 3.1 RL Zero 7B Code & Olmo 3.1 RL Zero 7B Math. Both benefit from longer & more stable training runs—delivering stronger results + better baselines for RL researchers.
1
1
25
🛠️ Olmo 3.1 Instruct 32B is our best fully open 32B instruction-tuned model. It’s optimized for chat, tool use, & multi-turn dialogue—making it a much more performant sibling of Olmo 3 Instruct 7B and ready for real-world applications.
1
0
27
🧠 After the initial Olmo 3 Think 32B release, we extended RL training for 21 days with extra epochs on our Dolci-Think-RL dataset. Olmo 3.1 Think 32B gains +5 AIME, +4 ZebraLogic, & +20 IFBench vs Olmo 3 Think 32B—making it the strongest fully open reasoning model.
1
0
39
Now anyone can use DataVoyager as a transparent AI partner for data-driven discovery. Try it at https://t.co/qjwKAgDm3D → select “Analyze data,” upload a dataset, & start asking questions. Learn more in our updated blog: https://t.co/qmLMNS2sTe
0
0
10
DataVoyager is designed to be intuitive—whether you're comfortable with data analysis tooling or not. Every result includes the underlying assumptions, step-by-step methodology, the accompanying code, & clear visualizations you can cite or adapt.
1
0
4
Update: DataVoyager, which we launched in Preview early this fall, is now available in Asta. 🎉 You can upload real datasets, ask complex research questions in natural language, & get back reproducible answers + visualizations. 🔍📊
6
16
65
To celebrate, we’ve partnered with @parasail_io to offer free access to Olmo 3-Think (32B), our flagship fully open reasoning model, through Dec 22. Try it here: https://t.co/bOm2fd5NO0 & 👇
openrouter.ai
Olmo 3 32B Think is a large-scale, 32-billion-parameter model purpose-built for deep reasoning, complex logic chains and advanced instruction-following scenarios. Its capacity enables strong perfor...
0
3
12
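Since the free endpoint is served through OpenRouter's OpenAI-compatible API, a minimal sketch of querying it might look like the following. The model slug "allenai/olmo-3-32b-think" is an assumption based on the listing name; check openrouter.ai for the exact identifier.

```python
# Minimal sketch: query Olmo 3 Think 32B via OpenRouter's OpenAI-compatible API.
# The model slug is an assumption -- verify it on openrouter.ai.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],  # your OpenRouter API key
)

response = client.chat.completions.create(
    model="allenai/olmo-3-32b-think",  # assumed slug for the free listing
    messages=[{"role": "user", "content": "Walk through 17 * 24 step by step."}],
)
print(response.choices[0].message.content)
```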
We're at #NeurIPS2025 with papers, posters, workshops, fireside chats, & talks across the conference. Come learn about our latest research + see live demos!
3
7
41
Have a tough research question? Submit it to SciArena, compare citation-grounded answers from models like GPT-5.1 and Gemini 3 Pro Preview, & cast your vote. Every vote updates the leaderboard → https://t.co/IK4ZqkUmeU
0
1
7
Open models hold their ground: GPT-OSS-120B ranks among the leaders on natural-science questions, keeping open-weight systems competitive even as new proprietary models claim most top-5 slots.
1
0
3