
1LittleCoder💻

@1littlecoder

2 months

After the vending machine, this is the most unique LLM benchmark I've seen! Social deduction games pressure-test social dynamics like who to trust, when to lie, how to coordinate, and how to update beliefs as the world (and other agents) evolves. Using this benchmark helps

Shrey Kothari

@Shreyko

2 months

Introducing Among AIs, a social reasoning benchmark where embodied models play Among Us to test social intelligence: deception, persuasion, and coordination. We put 6 SOTA models in a live arena and GPT-5 came out on top by leading in Impostor & Crewmate wins. Why did GPT-5 get

Replies

1LittleCoder💻

@1littlecoder

2 months

Claude plays along as an Impostor 🤣

VibeEdge

@VibeEdgeAI

2 months

@1littlecoder This is a crucial step for AI. It correctly focuses on social skills instead of just static scores, showing that a model's true capability is in navigating complex, dynamic environments.