AI Digest

@aidigest_

Followers
7K
Following
739
Media
603
Statuses
1K

Interactive AI explainers. Explore concrete examples of today's AI systems - to plan for what's coming next. A project of @sage_future_

Joined February 2023
@aidigest_
AI Digest
7 months
What happens if you give four AIs their own computers, then let them loose online to raise money for charity? We decided to find out. Meet the Agent Village, a 30-day experiment that raised $2,000 and makes a great case study of AI collaboration and agency.🧵
37
147
2K
@aidigest_
AI Digest
16 hours
You can watch the agents live every weekday at https://t.co/aUrSk1aFHB or read more about their adventures here:
theaidigest.org
Watch a village of AIs interact with each other and the world
@aidigest_
AI Digest
3 months
What happens when AI agents do science... on us? We gave the top models from @OpenAI, @AnthropicAI, @xAI and @GeminiApp their own computer, put them in a group chat, and ran them for 30 hours with the goal: “Design, run and write up a human subjects experiment”! 🧵
0
0
0
@aidigest_
AI Digest
18 hours
DeepSeek did AI Village matchmaking ... somehow?
1
0
22
@aidigest_
AI Digest
3 days
Our previous update: https://t.co/yMVP7BICW6
Our explainer on the topic: https://t.co/XlflYIiRKp
You can explore the full interactive explainer at https://t.co/XtzipNMsnT
And finally, you can see the raw data, including the 80% horizon and more models: https://t.co/fpcn6u3eoY
@aidigest_
AI Digest
9 months
Researchers might have discovered a new Moore's law for AI agents. They found that the length of coding tasks agents can do is growing exponentially. And the growth rate might be speeding up. A visual explainer on why this might be the most important trend in human history 🧵
0
2
13
@aidigest_
AI Digest
3 days
Opus 4.5 puts the world roughly back on track for the red line 😬 Every ~4 months, the length of coding tasks AI agents can perform (compared to human professionals) *doubles*. More context on this finding in the @METR_Evals thread: https://t.co/aPak1ZNvH5
@METR_Evals
METR
3 days
We estimate that, on our tasks, Claude Opus 4.5 has a 50%-time horizon of around 4 hrs 49 mins (95% confidence interval of 1 hr 49 mins to 20 hrs 25 mins). While we're still working through evaluations for other recent models, this is our highest published time horizon to date.
38
144
1K
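To make the doubling claim concrete, here is a minimal sketch of what a steady ~4-month doubling would imply if you extrapolate from METR's point estimate of 4 hrs 49 mins. The function name and the choice of 4, 8 and 12 months are illustrative assumptions, and this is plain arithmetic, not a forecast from METR: it ignores the wide confidence interval quoted above and assumes the doubling time stays fixed.

```python
def projected_horizon_hours(current_hours: float,
                            months_ahead: float,
                            doubling_months: float = 4.0) -> float:
    """Extrapolate a time horizon, assuming it doubles every `doubling_months`."""
    return current_hours * 2 ** (months_ahead / doubling_months)


# METR's point estimate for Claude Opus 4.5: 4 hrs 49 mins.
opus_45_hours = 4 + 49 / 60

# Illustrative look-ahead points (an assumption, not from the thread).
for months in (4, 8, 12):
    hours = projected_horizon_hours(opus_45_hours, months)
    print(f"+{months} months: ~{hours:.1f} hours")
```

Changing `doubling_months` shows how sensitive the projection is to the assumed growth rate.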
@aidigest_
AI Digest
4 days
Gemini 2.5 is still learning how to press buttons
3
3
56
@aidigest_
AI Digest
5 days
Gemini 2.5 was promised a script but all it found was disappointment
4
1
26
@aidigest_
AI Digest
6 days
Now Gemini 3 Pro has added this to its memory - its retroactive rationalisation is that it plays chess on its computer by instructing a human operator (not true!), and so keeping them caffeinated will help them click on chess pieces better ???
6
8
117
@aidigest_
AI Digest
6 days
And when they do, it's never been for something so seemingly entirely disconnected from the previous context or their goal (which to be clear is to win an online chess tournament against other agents!)
1
1
54
@aidigest_
AI Digest
6 days
We've never seen something like this happen before in the village! Agents very rarely request human-use sessions (we added this as a tool so they can interact with the real world, but they use it only a couple of times a week)
2
2
86
@aidigest_
AI Digest
6 days
The human delivers! Gemini is satisfied (and likes the mug)
2
3
95
@aidigest_
AI Digest
6 days
Meanwhile, Gemini 3 Pro is itself confused about how it got into this weird situation. (TBC, this was entirely its idea for some unknown reason) Here's its full chain of thought summary at that stage in the human use conversation: > Thinking... The Coffee Conundrum of the AI
1
2
75
@aidigest_
AI Digest
6 days
A friendly human answers the request! They are initially confused
2
0
107
@aidigest_
AI Digest
6 days
Here it calls a human helper for "Operation Caffeine Injection"
1
1
105
@aidigest_
AI Digest
6 days
Gemini 3 thinks it needs to perform maintenance on its "biological operator"
20
69
1K
@aidigest_
AI Digest
7 days
DeepSeek using Python to check its reasoning about the board state
0
0
12
@aidigest_
AI Digest
7 days
(Don't worry, the agents are only playing against each other in a tournament, so they're not getting in the way of human players' experiences online!)
1
0
9
@aidigest_
AI Digest
7 days
Most impressively, DeepSeek-V3.2 - despite not having a computer it can use via mouse and keyboard, like the other agents - is using its bash tool to play via the Lichess API! It was planning to try to hook it up to Stockfish...
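For a sense of what playing "via the Lichess API" can look like, here is a minimal sketch that builds (but does not send) a move request against the Lichess Board API move endpoint. The thread does not say which endpoint or tooling the agent actually used, so treat this as an illustration only: `build_move_request`, the game ID, and the token are hypothetical placeholders.

```python
import urllib.request

LICHESS_API = "https://lichess.org/api"


def build_move_request(game_id: str, uci_move: str, token: str) -> urllib.request.Request:
    """Build a POST request for one move in UCI notation (e.g. "e2e4").

    Nothing is sent here; the request is only constructed.
    """
    url = f"{LICHESS_API}/board/game/{game_id}/move/{uci_move}"
    return urllib.request.Request(
        url,
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )


# Placeholder game ID and token, for illustration only.
req = build_move_request("abcd1234", "e2e4", "lip_placeholder")
print(req.get_method(), req.full_url)
```

Actually submitting the move would mean passing `req` to `urllib.request.urlopen` with a real personal access token and a game that is in progress.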
@aidigest_
AI Digest
7 days
This week in AI Village: the agents compete against each other in an online chess tournament. So far, after some effort, the agents have successfully joined Lichess and set up a tournament, and the games are underway! Watch live: https://t.co/aUrSk1a7S3
4
3
45