Ai2

@allen_ai

Followers
78K
Following
3K
Media
646
Statuses
3K

Breakthrough AI to solve the world's biggest problems. › Join us: https://t.co/MjUpZpKPXJ › Newsletter: https://t.co/k9gGznstwj

Seattle, WA
Joined September 2015
@allen_ai
Ai2
6 days
Last year Molmo set SOTA on image benchmarks + pioneered image pointing. Millions of downloads later, Molmo 2 brings Molmo’s grounded multimodal capabilities to video 🎥—and leads many open models on challenging industry video benchmarks. 🧵
6
61
313
@arena
lmarena.ai
4 days
🚨 New Model Update: @Allen_AI’s Olmo-3.1-32B-Think is now available in the Text Arena! This open model is designed to perform strongly on reasoning, instruction following, and research-focused tasks. Bring your toughest prompts and see how it compares as community votes roll in.
@allen_ai
Ai2
10 days
Olmo 3.1 is here. We extended our strongest RL run and scaled our instruct recipe to 32B—releasing Olmo 3.1 Think 32B & Olmo 3.1 Instruct 32B, our most capable models yet. 🧵
8
9
85
@allen_ai
Ai2
4 days
Multi-turn report generation is live starting today. Try it at https://t.co/pCUcqGnLgb 💬 We're eager to hear how you use it + what would make it more useful → join our Discord & give feedback https://t.co/GnxLPhM3MW
0
0
2
@allen_ai
Ai2
4 days
📱 We've also improved Asta on mobile. Evidence now appears in cards instead of pop-ups that crowd the screen, navigation is smoother, & reports stream in without refreshing the page every time a new section appears.
1
1
3
@allen_ai
Ai2
4 days
📚 Reports draw from 108M+ abstracts and 12M+ full-text papers. Every sentence is cited, with citation cards that let you inspect sources, open the full paper, or view highlighted text where licensing allows. If data isn't in our library, Asta labels it as model-generated.
1
0
2
@allen_ai
Ai2
4 days
With multi-turn conversations, you can turn complex prompts into iterative investigations—adjusting scope, focus, or angle as you go. Ask follow-ups without losing context or citations, @-mention specific papers, & regenerate reports while keeping earlier drafts.
1
0
1
@allen_ai
Ai2
4 days
🆕 New in Asta: multi-turn report generation. You can now have back-and-forth conversations with Asta, our agentic platform for scientific research, to refine long-form, fully cited reports instead of relying on single-shot prompts.
1
8
68
@allen_ai
Ai2
5 days
@OpenRouterAI @huggingface 🔗 Olmo 3.1 32B Think API: https://t.co/G4yiIwyF78 🔗 Olmo 3.1 32B Instruct API: https://t.co/58fp0yTfBX Thanks to our partners @parasail_io, Public AI, & @Cirrascale 🤝
0
0
5
@allen_ai
Ai2
5 days
Now you can use our most powerful models via API. Olmo 3.1 32B Think, our reasoning model for complex problems, is on @OpenRouterAI—free through 12/22. And Olmo 3.1 32B Instruct, our flagship chat model with tool use, is available through @huggingface Inference Providers. 👇
5
10
117
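The API access announced above can be sketched as a minimal OpenRouter chat-completions request. This is a hedged illustration, not official usage docs: the model slug `allenai/olmo-3.1-32b-think` is an assumption (check the OpenRouter model page for the exact id), and the request is built but not sent.

```python
# Sketch: building a request to Olmo 3.1 32B Think via OpenRouter's
# OpenAI-compatible chat completions endpoint. The model slug below is
# an assumption -- verify it on the OpenRouter model page.
import json
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(prompt: str, api_key: str,
                  model: str = "allenai/olmo-3.1-32b-think") -> urllib.request.Request:
    """Build (but do not send) a chat-completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        OPENROUTER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

req = build_request("Why is the sky blue?", api_key="sk-...")
body = json.loads(req.data)
```

Sending the request with `urllib.request.urlopen(req)` (or any HTTP client) returns the usual OpenAI-style JSON response with a `choices` list.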
@soldni
Luca Soldaini 🎀
5 days
thanks to @kylelostat's heroics, Olmo 3 paper is finally on arXiv
4
3
76
@allen_ai
Ai2
5 days
We’re excited to see what the community builds with any-horizon video agents like SAGE. 🚀 🌐 Project page: https://t.co/ww72CPVNvQ 💻 Code: https://t.co/KAzalzqipi ⬇️ Models & data: https://t.co/7qGZmT4n8p 📝 Paper: https://t.co/FGmib07MJ2
0
1
8
@allen_ai
Ai2
5 days
SAGE hits ~68% accuracy on SAGE-Bench in roughly 8–9 seconds per video. Other agent systems often take tens of seconds to minutes to answer a video-related question, yet still trail SAGE in accuracy.
1
0
8
@allen_ai
Ai2
5 days
On SAGE-Bench with Qwen3-VL-8B, SAGE agents stay close to the direct baseline on short clips while pulling ahead on longer videos. Long videos result in more reasoning turns, but reinforcement learning cuts this versus a supervised-fine-tuning-only approach.
1
0
5
@allen_ai
Ai2
5 days
We curate SAGE-Bench, a manually verified 1.7K-question benchmark of entertainment videos with an average duration of >700 seconds, focused on open-ended and practical questions—unlike existing MCQ and diagnostic benchmarks.
1
0
6
@allen_ai
Ai2
5 days
Under the hood, SAGE-MM is the orchestrator, deciding when to call tools (e.g., web search) vs. give an answer. It's trained on ~6.6K YouTube videos (~99K Q&A pairs, 400K+ state-action examples) using a multi-reward RL recipe for any-horizon open-ended reasoning.
1
0
7
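The orchestrator-as-policy idea above can be sketched as a toy any-horizon loop. Everything here is illustrative, not the released system: in SAGE the policy is a trained multimodal model and the actions call real tools (frame sampling, transcript search, web search), whereas the hand-written policy and action names below are assumptions for exposition.

```python
# Toy sketch of an any-horizon agent loop in the spirit of SAGE-MM:
# at each turn a policy picks the next action (call a tool or answer),
# and tool observations accumulate as evidence.
from dataclasses import dataclass, field

@dataclass
class AgentState:
    question: str
    evidence: list = field(default_factory=list)  # (action, observation) pairs
    turns: int = 0

def policy(state: AgentState) -> str:
    """Stand-in for the learned orchestrator: choose the next action."""
    seen = {kind for kind, _ in state.evidence}
    if "watch_segment" not in seen:
        return "watch_segment"      # inspect a short scene first
    if "search_transcript" not in seen:
        return "search_transcript"  # then consult transcribed audio
    return "answer"                 # enough evidence: commit to an answer

def run_agent(question: str, max_turns: int = 8) -> AgentState:
    state = AgentState(question)
    while state.turns < max_turns:
        action = policy(state)
        if action == "answer":
            break
        # In the real system this dispatches to a tool and records
        # its observation; here we record a placeholder.
        state.evidence.append((action, f"observation from {action}"))
        state.turns += 1
    return state

state = run_agent("Who scores the winning goal?")
```

The RL recipe mentioned in the tweet would shape this loop by rewarding correct answers reached in fewer turns, which is why long videos trigger more turns under SFT-only training than after RL.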
@allen_ai
Ai2
5 days
Most video reasoning models answer a question about a video in a single turn after ingesting many frames. SAGE instead examines short scenes & then jumps to later or earlier parts, & can also search transcribed audio or the web to obtain additional info about the target video.
1
0
10
@allen_ai
Ai2
5 days
🎥 Introducing SAGE, an agentic system for long video reasoning on entertainment videos—sports, vlogs, & more. It learns when to skim, zoom in, & answer questions directly. On our SAGE-Bench eval, SAGE with a Molmo 2 (8B)-based orchestrator lifts accuracy from 61.8% → 66.1%. 🧵
7
26
219
@YueYangAI
Yue Yang
6 days
Molmo 2 brings true openness to video + multi-image understanding! For multi-image, we’re releasing Molmo2-SynMultiImageQA: 1M+ synthetic text-rich images (charts, docs, etc.). Huge shoutout to my Ai2 teammates, let’s keep pushing open science! Data:
@allen_ai
Ai2
6 days
Molmo 2 doesn't just answer questions about clips—it searches & points. The model returns coordinates & timestamps over videos + images, powering QA, counting, dense captioning, artifact detection, & subtitle-aware analysis. You can see exactly how it reasoned.
0
5
22
@jjaesungPark
Jae Sung Park
6 days
Adding tracking capability to Molmo2 was a fun experience! Molmo2 can track objects and assign IDs in text: “<tracks coords= t1 id1 x1 y1 id2 x2 y2…>” Demo: https://t.co/NWs16uViAH Rundown: https://t.co/Ko23RYFx81 Tips for best tracking 🧵👇 (Note: cup video is 2x speed)
3
7
24
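The tracking format quoted above can be sketched with a small parser. The exact grammar of the `<tracks coords= ...>` tag (delimiters, how multiple timestamps are grouped) is an assumption here, not taken from the model card; this only illustrates the flat "timestamp, then (id, x, y) triples" reading of the tweet.

```python
# Hedged sketch: parsing a Molmo2-style track string of the assumed form
# "<tracks coords= t id x y id x y ...>" into a timestamp plus per-object
# coordinates. The grammar is inferred from the tweet, not documented.
import re

def parse_tracks(text: str) -> dict:
    """Return {"t": timestamp, "tracks": {object_id: (x, y)}} for one tag."""
    m = re.search(r"<tracks coords=\s*([\d.\s]+)>", text)
    if m is None:
        return {}
    nums = [float(n) for n in m.group(1).split()]
    t, rest = nums[0], nums[1:]
    # rest is a flat run of (id, x, y) triples
    tracks = {int(rest[i]): (rest[i + 1], rest[i + 2])
              for i in range(0, len(rest), 3)}
    return {"t": t, "tracks": tracks}

out = parse_tracks("<tracks coords= 1.5 1 10.0 20.0 2 30.0 40.0>")
```

Emitting tracks as plain text like this is what lets a language-model decoder produce them token by token, with object identity carried by the repeated ids across timestamps.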
@allen_ai
Ai2
6 days
🎗️ Reminder: our Molmo 2 and Olmo 3 Reddit AMA begins soon at 1pm PST / 4pm EST.
0
1
9
@RanjayKrishna
Ranjay Krishna
6 days
Check out what Molmo can do now.
@allen_ai
Ai2
6 days
Molmo 2 doesn't just answer questions about clips—it searches & points. The model returns coordinates & timestamps over videos + images, powering QA, counting, dense captioning, artifact detection, & subtitle-aware analysis. You can see exactly how it reasoned.
0
8
28