Ian Gemp
@drimgemp
Followers: 433 · Following: 291 · Media: 1 · Statuses: 67
Research scientist @deepmind. It's all multiagent.
Joined July 2018
For more details (including how to train an RNN-based policy using PPO inside the "imagination" of the learned CWM), please see our paper: https://t.co/L9N7CeAB9s Joint work with @WLehrach, Daniel Hennes, @lazarox8, @heroesneverdie, Carter Wendelken, Zun Li, @antoine_dedieu,
We're running back our AAMAS 2025 tutorial at SAGT in Bath, UK! If you're interested in general evaluation of AI agents, you should check out our tutorial today (Sep 2nd)! The website is here https://t.co/HAqiPcQFAj, including some draft notes!
sites.google.com
Artificial Intelligence (AI) and machine learning (ML), in particular, have emerged as scientific disciplines concerned with understanding and building single and multi-agent systems with the ability...
If you're attending @AAMASconf 2025 and are interested in general evaluation of AI agents, you should check our tutorial on May 19th! The website is here https://t.co/aFllw5KRYc, including some draft notes! Co-organized with Marc Lanctot, @drimgemp, and @kateslarson 1/2
📄 Convex Markov games – @drimgemp et al. Just as you can 'convexify' MDPs, so can you for Markov games. These two papers lay out a general framework + algorithms for the zero-sum version. 🔗 https://t.co/d4jbN27GwH 🔗 https://t.co/ReVEIwJxa6 5/n
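A one-line sketch of the 'convexify' idea (my gloss, based on the convex-MDP literature rather than the linked papers): a standard MDP maximizes a linear function of the policy's state-action occupancy measure, and a convex MDP replaces that linear objective with a general concave utility, which Markov games then inherit per player:

```latex
\max_\pi \ \langle r, \mu^\pi \rangle
\quad\longrightarrow\quad
\max_\pi \ f(\mu^\pi), \qquad f \text{ concave},
```

where $\mu^\pi(s,a)$ is the discounted state-action visitation distribution; taking $f$ linear recovers the usual expected return.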
How should we rank generalist agents across a wide set of benchmarks and tasks? Honored to receive the AAMAS best paper award for SCO, a scheme based on voting theory that minimizes mistakes in predicting agent comparisons from the evaluation data. https://t.co/iV2pjwDxoU
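For intuition on voting-theoretic ranking (a toy sketch with hypothetical names; the paper's SCO scheme is more refined than a plain Borda count): treat each benchmark task as a voter that ranks the agents by score, then aggregate positional points across tasks.

```python
# Illustrative only: a Borda-count aggregation over per-task agent scores.
from collections import defaultdict

def borda_rank(scores_per_task):
    """scores_per_task: list of dicts {agent: score}, one dict per task.
    Each task 'votes' by ranking agents; Borda points are summed."""
    totals = defaultdict(int)
    for scores in scores_per_task:
        ranked = sorted(scores, key=scores.get)  # worst → best
        for points, agent in enumerate(ranked):  # 0 points worst, n-1 best
            totals[agent] += points
    return sorted(totals, key=totals.get, reverse=True)

tasks = [
    {"A": 0.9, "B": 0.7, "C": 0.5},
    {"A": 0.6, "B": 0.8, "C": 0.4},
    {"A": 0.7, "B": 0.2, "C": 0.9},
]
print(borda_rank(tasks))  # → ['A', 'B', 'C']
```

Note that agent A wins overall despite not topping every task; positional aggregation trades off per-task wins against consistency.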
@AAMASconf 2025 was very special for us! We had the opportunity to present a tutorial on general evaluation of AI agents, and we got a best paper award! Congrats to Marc, @kateslarson, @qberthet, @drimgemp and the rest of the team!
Frontier models are often compared on crowdsourced user prompts, but user prompts can be low-quality, biased, and redundant, making "performance on average" hard to trust. Come find us at #ICLR2025 to discuss game-theoretic evaluation (https://t.co/niF1p9C0iL)! See you in Singapore!
siqi.fr
A case study using the livebench.ai leaderboard.
Thanks John Schultz from @GoogleDeepMind for the wonderful talk. Mastering Board Games by External and Internal Planning with Language Models https://t.co/2J32J1ajGl
https://t.co/HtBO34w5Jx
Title: Mastering Board Games by External and Internal Planning with Language Models. Speaker: John Schultz, DeepMind. Time: Jan 16, 2-3 pm EST. Please mark your calendar!
We did - Joint work with @marnezhurina @LuciaCKun @mehdidc @laion_ai - no wonder this bunch led straight to Wonderland. Code: https://t.co/AdjASXICa8 Homepage: https://t.co/2F99fRwZ0B Paper:
arxiv.org
Large Language Models (LLMs) are often described as instances of foundation models that possess strong generalization obeying scaling laws, and therefore transfer robustly across various...
Yet another opportunity to point out that reasoning abilities and common sense should not be confused with an ability to store and approximately retrieve many facts.
📌 This paper investigates the dramatic breakdown of state-of-the-art LLMs' reasoning capabilities when confronted with a simple common sense problem called the "Alice In Wonderland (AIW) problem". This is despite their strong performance on standardized reasoning benchmarks.
AI often struggles to give correct and coherent answers to questions, but could game theory clue these models in? Mimicking a cryptic messaging game, MIT CSAIL’s “Consensus Game” improves the reliability of language models. To do this, one part of the AI system generates
Super cool to see our work on game theory + LLMs mentioned in Quanta magazine! Thanks Steve Nadis for covering it along with Athul and team! https://t.co/APEMKXtaEE
I’m very bullish on game theory x language models. Both for improving multi-agentic reasoning and for improving language models themselves. So many exciting directions! Thankful for the coverage of our work with @Yikang_Shen, @gabrfarina and @jacobandreas!
Great session chaired by @CarloDeramo! 🤓🧑‍💻 🔹Improving Convergence and Generalization... @BoZhao__ @gowerrobert @RobinSFWalters @yuqirose 🔹Meta Continual Learning Revisited... Y.Wu, Y.Wei, + 🔹Approximating Nash Equilibria... @drimgemp @MarrisLuke +
Our work on approximating Nash equilibria via stochastic optimization received an honorable mention!
Announcing the #ICLR2024 Outstanding Paper Awards: https://t.co/0k9LJwNE6o Shoutout to the awards committee: @eunsolc, @katjahofmann, @liu_mingyu, @nanjiang_cs, @guennemann, @optiML, @tkipf, @CevherLIONS
Stoked to announce an Agentic Markets workshop @agenticmarkets at #ICML 2024! @icmlconf 📇 Details: https://t.co/wOdgjzmXOb ✏️ Call for papers: due May 17th 📅 Conference: July 26/27th featuring GOAT speakers like Tuomas Sandholm, Gillian Hadfield, @drimgemp @KonstDaskalakis
📢 Excited for our UMD MARL talk @ Mar 26, 12:00 pm ET 📢 by Ian Gemp @drimgemp, Research Scientist, @GoogleDeepMind, on "Approximating Nash Equilibria in Normal-Form Games". In person: IRB-5165. Virtually: https://t.co/iJ7GpUN920…
@johnpdickerson
@ml_umd
#RL #AI #MultiAgentAI
sites.google.com
The aim of the Multi-Agent Reinforcement Learning (MARL) reading group at the Computer Science Department of the University of Maryland, College Park is to pursue foundational research discussions,...
Calling all #GameTheory and #MARL enthusiasts! This Friday (15 Mar) @drimgemp will be talking about ‘Approximating Nash Equilibria via Stochastic Optimization’. Come check it out!
What do haggling, debate, and convincing your kids to go to bed all have in common with Poker? With #LLMs, we map them all onto the framework of #gametheory; we then generate conversational strategies using the same methods that beat top Poker pros. https://t.co/iXUqQTfFQJ
Tired of using FID for evaluating generative models? Come to our #NeurIPS2023 poster on FLS, a new complete metric for generative models that also penalizes overfitting! https://t.co/bSf7s3UXIv
https://t.co/qVMYtUcvBp
@bose_joey @drimgemp Chongli Qin @yorambac @gauthier_gidel
github.com
PyTorch code for FLD (Feature Likelihood Divergence), FID, KID, Precision, Recall, etc. using DINOv2, InceptionV3, CLIP, etc. - marcojira/fld
How can metrics for evaluating generative models take into account generalization? In our new paper, we propose a new sample-based metric to address exactly this challenge: the Feature Likelihood Score (FLS). Paper: https://t.co/Q8OiyteiL9 Github: https://t.co/qVMYtUcvBp 1/12
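To illustrate the likelihood-based flavor of FLS (a NumPy toy with made-up names and a fixed bandwidth, not the official FLD code linked above): fit a simple Gaussian mixture centered on generated features and score held-out real features under it; a generator that misses the data distribution scores lower.

```python
import numpy as np

def avg_loglik(gen, test, sigma=0.5):
    """Average log-likelihood of `test` under an isotropic Gaussian mixture
    with one component per generated sample (a crude stand-in for FLS's
    fitted per-sample bandwidths)."""
    d = gen.shape[1]
    # pairwise squared distances, shape (n_test, n_gen)
    d2 = ((test[:, None, :] - gen[None, :, :]) ** 2).sum(-1)
    log_kernel = -d2 / (2 * sigma**2) - 0.5 * d * np.log(2 * np.pi * sigma**2)
    # numerically stable log of the mean over mixture components
    m = log_kernel.max(axis=1, keepdims=True)
    log_p = m.squeeze(1) + np.log(np.exp(log_kernel - m).mean(axis=1))
    return log_p.mean()

rng = np.random.default_rng(0)
real = rng.normal(size=(200, 2))
good_gen = rng.normal(size=(200, 2))          # matches the real distribution
bad_gen = rng.normal(loc=5.0, size=(200, 2))  # mode-shifted generator
assert avg_loglik(good_gen, real) > avg_loglik(bad_gen, real)
```

The real metric additionally penalizes memorization (copying training samples), which this sketch does not attempt.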
I just published the story of how I created the world’s first No-Limit Holdem poker solver and made $500k by age 23: https://t.co/rsi4r6Uqfk I had kept the story secret since 2013, but now you can read how I went from near broke to reshaping the world’s toughest poker games.
medium.com
In 2013, I developed a program that calculates Nash Equilibrium, the optimal strategy for No-Limit Holdem, the most popular variation of…