Doug Downey
@_DougDowney
Followers
397
Following
460
Media
2
Statuses
105
Researching AI for Science @allen_ai, Prof @northwesterncs
Joined May 2020
New project led by Shriya Atmakuri in collaboration with @aps6992: Ai2's Asta system now reports weekly which papers its research summaries have cited. The aim is to give credit to the work that powers the reports, and provide a dataset for studying how AI systems cite science.
📊 Today we're releasing data showing which scientific papers our AI research tool Asta cites most frequently. Think of it as creating citation counts for the AI era—tracking which research is actually powering AI answers across thousands of queries. 🧵
Introducing Asta DataVoyager—our new AI capability in Asta that turns structured data into transparent, reproducible insights. Built for scientists, grounded in open, inspectable workflows. 🧵
A few new challengers enter SciArena—including DeepSeek-V3.2-Exp and Claude Sonnet 4.5 🔬
As part of Asta, our initiative to accelerate science with trustworthy AI agents, we built AstaBench—the first comprehensive benchmark to compare them. ⚖️
Introducing Asta—our bold initiative to accelerate science with trustworthy, capable agents, benchmarks, & developer resources that bring clarity to the landscape of scientific AI + agents. 🧵
🚀 In March, we launched Paper Finder, an LLM-powered literature search agent that surfaces papers other tools miss. Now, we’re releasing an open-source snapshot to enable others to inspect & build on it—and reproduce the results. 🧵
🚨 SciArena update + evaluation of new models including GPT-5! 🚨 With thousands of new votes, new LLMs are reshaping our leaderboard for scientific literature tasks. o3 still leads—but GPT-5, Claude Opus 4.1, & more are closing the gap.
Are you a researcher in CS or a CS-adjacent field who could use help in refining your research ideas? Want to try our new AI-powered tool that helps with just that in a paid user study? Details and sign up here!
docs.google.com
Hi! 👋 We are researchers at the Allen Institute for Artificial Intelligence (Ai2) exploring AI-powered tools to support researchers in project ideation. We are conducting a study to learn more about...
Great science starts with great questions. 🤔✨ Meet AutoDS—an AI that doesn’t just hunt for answers, it decides which questions are worth asking. 🧵
We’ve upgraded ScholarQA, our agent that helps researchers conduct literature reviews efficiently by providing detailed answers. Now, when ScholarQA cites a source, it won’t just tell you which paper it came from–you’ll see the exact quote, highlighted in the original PDF. 🧵
This was a fun collaboration led by @YilunZhao_NLP and @kaiyan_z from @armancohan's lab at Yale University. In our study, annotators preferred o3, which was found to give more detailed and technical answers. Curious to see if community voting changes the picture!
Today we released SciArena, an open evaluation platform where researchers can compare and vote on foundation models for scientific literature tasks. 👇
Introducing SciArena, a platform for benchmarking models across scientific literature tasks. Inspired by Chatbot Arena, SciArena applies a crowdsourced LLM evaluation approach to the scientific domain. 🧵
@allen_ai @SemanticScholar is hiring an #ml #nlp #ai reasoning researcher for a Research Scientist, Agents for Science position with target start dates in 2025. Excited about developing AI systems with deep reasoning capabilities for science? Send an application our way!
Ever wonder how LLM developers choose their pretraining data? It’s not guesswork— all AI labs create small-scale models as experiments, but the models and their data are rarely shared. DataDecide opens up the process: 1,050 models, 30k checkpoints, 25 datasets & 10 benchmarks 🧵
For years it’s been an open question — how much is a language model learning and synthesizing information, and how much is it just memorizing and reciting? Introducing OLMoTrace, a new feature in the Ai2 Playground that begins to shed some light. 🔦
Imagine AI doing science: reading papers, generating ideas, designing and running experiments, analyzing results… How many more discoveries can we reveal? 🧐 Meet CodeScientist, a promising next step toward autonomous scientific discovery. 🧵
Meet Ai2 Paper Finder, an LLM-powered literature search system. Searching for relevant work is a multi-step process that requires iteration. Paper Finder mimics this workflow — and helps researchers find more papers than ever 🔍
Announcing OLMo 2 32B: the first fully open model to beat GPT 3.5 & GPT-4o mini on a suite of popular, multi-skill benchmarks. Comparable to best open-weight models, but a fraction of training compute. When you have a good recipe, ✨ magical things happen when you scale it up!