Benjamin Manning

@BenSManning

Followers: 1K · Following: 1K · Media: 45 · Statuses: 532

PhD candidate @MIT | Techno-optimistic-ish | https://t.co/CGdh0awKpK

Cambridge, MA
Joined February 2022
@BenSManning
Benjamin Manning
3 days
Brand new paper with @johnjhorton that I'm very excited to share: "General Social Agents". Suppose we wanted to create AI agents for simulations to make predictions in never-before-seen settings. How might we do this? We explore an approach to answering that question!
@BenSManning
Benjamin Manning
2 days
RT @soumitrashukla9: This is a very thought-provoking and novel paper. Highly recommend reading it! (link in replies)
@BenSManning
Benjamin Manning
2 days
RT @MichaelEddy: Impact-focused funders often ask: if it worked here, will it work there? This paper is a small, but impt step toward a g…
@BenSManning
Benjamin Manning
3 days
RT @emollick: This is a fascinating paper that suggests that AI agents can indeed be used for social science experiments, but that just usi…
@BenSManning
Benjamin Manning
3 days
RT @johnjhorton: .@BenSManning preparing big twitter thread about our new paper; tags wrong John Horton. i'm dying.
@BenSManning
Benjamin Manning
3 days
RT @johnjhorton: 🚨New working paper! 🚨 In a nutshell, it's about how you might create "general" agents in the sense that their behavior wo…
@BenSManning
Benjamin Manning
3 days
RT @mdahardy: this is a really cool paper.
@BenSManning
Benjamin Manning
3 days
RT @matthewclifford: This is very, very interesting…
@BenSManning
Benjamin Manning
3 days
RT @tylercowen: This and related work will revolutionize several different fields of economics:
@BenSManning
Benjamin Manning
3 days
Feedback is still very much welcome! Agents and games are available in hyperlinks throughout the paper. Link:
@BenSManning
Benjamin Manning
3 days
Optimized agents predict the human responses far better than an off-the-shelf baseline LLM (3x) and relevant game-theoretic equilibria (2x). In 86% of the games, all human subjects chose a strategy in support of the LLM simulations; only 18% were in support of the equilibria.
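The accuracy comparison above (optimized agents vs. an off-the-shelf LLM vs. game-theoretic equilibria) needs some distance between a predicted choice distribution and the observed human one. The paper's exact metric isn't stated in this thread, so as an illustrative assumption, here is a minimal sketch using total variation distance:

```python
from collections import Counter

# Hedged sketch: score how closely simulated play matches human play by the
# total variation distance between the two choice distributions. The metric
# choice is an assumption for illustration, not necessarily the paper's.

def choice_distribution(choices: list[int], actions: range) -> dict[int, float]:
    """Empirical probability of each action, including unchosen ones."""
    counts = Counter(choices)
    n = len(choices)
    return {a: counts.get(a, 0) / n for a in actions}

def total_variation(p: dict[int, float], q: dict[int, float]) -> float:
    """Total variation distance: 0 = identical, 1 = disjoint support."""
    return 0.5 * sum(abs(p[a] - q[a]) for a in p)
```

A lower distance for the optimized agents than for the baseline LLM, across many games, is the kind of evidence the thread summarizes as "3x better."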
@BenSManning
Benjamin Manning
3 days
The sampled games are highly diverse with a striking range of equilibrium distributions. Some equilibria spread probability across many options, while others concentrate only on extremes. Some are almost uniform, while others exhibit sharp spikes at specific actions.
@BenSManning
Benjamin Manning
3 days
We then put the strategic and optimized agents to an extreme test. We created a population of 800K+ novel strategic games and sampled 1,500, which the agents then played in 300,000 simulations. But first, we had 3 humans play each game (4,500 subjects in total) in a pre-registered experiment.
@BenSManning
Benjamin Manning
3 days
When we test the approach on agents motivated by atheoretical and scientifically meaningless prompts, they do no better at predicting the human responses in the novel games than the LLM off-the-shelf (even when they have good training performance).
@BenSManning
Benjamin Manning
3 days
Next, we design 4 new games and have humans play them in preregistered experiments. Optimized agents are very accurate predictors of humans - far better than the baseline LLM. In some games, AI sims predict human responses better than relevant human data from Arad & Rubinstein.
@BenSManning
Benjamin Manning
3 days
When we have these optimized agents play two new related, but distinct games, the optimized set performs well in matching these out-of-sample human distributions. The off-the-shelf LLM still performs poorly.
@BenSManning
Benjamin Manning
3 days
For the 11-20 money request game, the theory is level-k thinking, and the seed game is the human responses from the original paper. We construct a set of candidate agents based on a model of level-k thinking and then optimize them to match human responses with high accuracy.
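Level-k thinking, the theory behind the candidate agents above, can be sketched numerically: a level-0 player naively requests the maximum (20), and a level-k player best-responds to a level-(k-1) opponent by undercutting by one. The paper's actual agents are LLM-prompted; this purely numeric version, with an assumed mixture-of-types simulator, only illustrates the theory:

```python
import random

# Hedged sketch of level-k play in the 11-20 money request game:
# level-0 requests 20; level-k undercuts level-(k-1) by one, floored at 11.
# `simulate_population` samples from a mixture of levels whose weights
# would be fit to human data (an illustrative setup, not the paper's code).

def level_k_request(k: int) -> int:
    """Request made by a pure level-k player."""
    return max(11, 20 - k)

def simulate_population(weights: dict[int, float], n: int, seed: int = 0) -> list[int]:
    """Sample n requests from a mixture of level-k types."""
    rng = random.Random(seed)
    levels = list(weights)
    probs = [weights[k] for k in levels]
    return [level_k_request(rng.choices(levels, weights=probs)[0]) for _ in range(n)]
```

Fitting the mixture weights to the seed game's human responses is one simple stand-in for the optimization step the thread describes.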
@BenSManning
Benjamin Manning
3 days
However, we can do better—even in settings outside the underlying LLM’s training data. The idea, in a nutshell, is to use a "seed" game & some relevant theory to create AI agents that can then be used in conceptually related settings.
@BenSManning
Benjamin Manning
3 days
We asked GPT-4o to play the 11-20 money request game from Arad & Rubinstein (2012) 1000 times without additional instructions. Its responses are FAR from the human distribution.
@BenSManning
Benjamin Manning
3 days
Link to the paper: Off the shelf, LLMs are often poor predictors of human responses. To give an example, consider the 11-20 money request game from Arad & Rubinstein (2012).
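For readers unfamiliar with the 11-20 money request game: each player requests an integer amount between 11 and 20 shekels and receives it, and a player who requests exactly one less than their opponent earns a bonus of 20. A minimal payoff sketch (function name is illustrative, not from the paper):

```python
# The 11-20 money request game (Arad & Rubinstein, 2012): request 11-20,
# receive that amount, plus a bonus of 20 for undercutting the opponent
# by exactly one.

def payoff(my_request: int, opp_request: int) -> int:
    """My payoff given both players' requests."""
    assert 11 <= my_request <= 20 and 11 <= opp_request <= 20
    bonus = 20 if my_request == opp_request - 1 else 0
    return my_request + bonus
```

The tension driving the game: requesting 20 is safe, but requesting 19 against a 20-requester pays 39, which is what makes it a clean testbed for level-k reasoning.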