I've been listening to
@lexfridman
's podcast for a long time, but it was truly an amazing experience to sit down with him myself and talk about our latest research in multi-agent AI for Poker, Diplomacy, & more!
Here's my conversation with Noam Brown (
@polynoamial
), co-creator of AI systems that achieve superhuman-level performance in poker and in Diplomacy, a game that involves strategic negotiation with humans. This was a fascinating, technical conversation.
3 years ago my teammates and I set out toward a goal that seemed like science fiction: to build an AI that could strategically outnegotiate humans *in natural language* in Diplomacy. Today, I’m excited to share our Science paper showing we’ve succeeded! 🧵
Meta AI presents CICERO — the first AI to achieve human-level performance in Diplomacy, a strategy game which requires building trust, negotiating and cooperating with multiple players.
Learn more about
#CICERObyMetaAI
:
12 years ago I tried making my first poker AI in college and dreamed of beating the world's best pros. After seven years of a PhD, I'm excited to announce that I finally did it! It's been quite an adventure. Looking forward to the next one!
Facebook AI and
@CarnegieMellon
researchers have built Pluribus, the first AI bot to beat elite poker pros in 6 player Texas Hold’em. This breakthrough is the first major benchmark outside of 2 player games and we’re sharing specifics on how we built it.
I’m thrilled to share that I've joined
@OpenAI
! 🚀 For years I’ve researched AI self-play and reasoning in games like Poker and Diplomacy. I’ll now investigate how to make these methods truly general. If successful, we may one day see LLMs that are 1,000x better than GPT-4 🌌 1/
About 650 / 770 signed at this moment. As people start waking up, more will come. All the efforts started after 1:30 AM, 500+ within two hours and all of this after 2 crazy days with very little sleep.
After building on years of work from MILA, DeepMind, ourselves, and others, our AIs are now expert-human-level in no-press Diplomacy and Hanabi! Unlike Go and Dota, Diplomacy/Hanabi involve *cooperation*, which breaks naive RL. 🧵👇
6 years ago today, AlphaGo beat Lee Sedol in a milestone for AI. Typically deep learning gets the credit, but it's important to know that nobody has *ever* trained a raw NN that's superhuman in Go. All superhuman Go bots require tree search. IMO planning is underappreciated in AI
If you haven't disabled voice authentication for your bank account and had a conversation with your family about AI voice impersonation yet, now would be a good time.
We're sharing our learnings from a small-scale preview of Voice Engine, a model which uses text input and a single 15-second audio sample to generate natural-sounding speech that closely resembles the original speaker.
This is similar to how I found
@EricSteinb
. He was just an undergrad, but he did research on his own and put a solo paper on arxiv that followed up on my work. I was impressed, so I invited him to work with me. There's a lot of talent out there with non-traditional backgrounds.
How
@_sholtodouglas
got scouted by Google DeepMind:
“Every night from 10 PM till 2 AM, I would do my own research.
@jekbradbury
saw some of my questions online and was like, ‘I thought I knew all the people in the world who were asking these questions. Who on Earth are you?’”
My teammates and I at
@OpenAI
are hiring ML engineers for research on LLM multi-step reasoning! We recently hit a SOTA 78% on MATH: . Our new plans are even more ambitious. If you want to pursue this with us send a resume/linkedin to michelle.kim@openai.com
A good example is
@_sholtodouglas
at
@GoogleDeepMind
. He's quiet on Twitter, doesn't have any flashy first-author publications, and has only been in the field for ~1.5 years, but people in AI know he was one of the most important people behind Gemini's success
@eladgil
@patrickc
In AI at least, the real 30 under 30 imo you have never heard of. They are 5 layers down the org chart from the CEO. They are usually not on Twitter, they have an unmaintained LinkedIn, they don’t go on podcasts, and they maybe published at one point but don’t do so anymore. They…
8 years ago today, AlphaGo beat Lee Sedol in a milestone for AI. Unlike typical neural nets, AlphaGo spent ~1 minute per move improving its policy via search. This boosted its Elo by more than a 1000x bigger model. Even today, nobody has trained a raw NN that is superhuman in Go.
Pro tip for PhD students looking for a research internship: cold emailing works! Find someone you think would be a good fit, specify topics they've worked on that interest you (point to papers and customize the email!), and make the case for why you're a strong candidate.
I've successfully defended my PhD thesis! It's amazing to see the idea of beating top humans at poker evolve from science fiction to reality over the course of my PhD.
Thesis doc:
Defense slides:
We trained a neural network to competently play Minecraft by pre-training on a large unlabeled video dataset of human Minecraft play and a small amount of labeled contractor data.
This is an under-appreciated problem in academia. A LOT of PhD students at top universities are the children of professors. Why? Because they know how the game is played.
Oh it’s also a barrier to entry for first generation students! I had no clue where to go and what to do when looking for my first lab job, the whole experience was confusing and disheartening. Most of the professors didn’t even email back.
There's an unfortunate tendency to undervalue the work of PhD students and assign excessive credit to the senior professor. This is especially bad with journalists (one even called me a "worker bee"). The title "Research Assistant" shouldn't be taken literally.
I'm excited to share that I've joined Facebook AI Research permanently as a Research Scientist in NYC! I'm looking forward to working with extremely talented
@facebookai
researchers like
@alex_peys
,
@adamlerer
,
@j_foerst
, and many others on multi-agent AI and strategic reasoning!
Our new text-to-image model, DALL·E 3, can translate nuanced requests into extremely detailed and accurate images.
Coming soon to ChatGPT Plus & Enterprise, which can help you craft amazing prompts to bring your ideas to life:
Last year I said superhuman poker AIs would be running on smartphones in 5 years. That timeline may have been pessimistic. Our new paper "Depth-Limited Solving for Imperfect-Information Games" shows how to develop a master poker AI on a 4-core CPU:
Frontier models capping out at ~90% on MMLU isn't a sign of AI hitting a wall. It's a sign that a lot of MMLU questions are busted. The field desperately needs better evals.
All those prior methods are specific to the game. But if we can discover a general version, the benefits could be huge. Yes, inference may be 1,000x slower and more costly, but what inference cost would we pay for a new cancer drug? Or for a proof of the Riemann Hypothesis? 4/
This
@xkcdComic
came out 10 years ago. 6 years ago, AlphaGo beat Lee Sedol. Today is the 5-year anniversary of our bot Libratus beating top humans in Poker. StarCraft fell 3 years ago. I wonder what games we'd see in an updated version of this comic today?
Many have pointed out that LLM benchmarks are broken and gamed. Happy to see my former resident
@hughbzhang
,
@summeryue0
, and the great
@scale_AI
folks do something about it! They made a private version of GSM8k and evaled GPT-4, Claude, Mixtral, Phi, etc:
Applying for a PhD in CS/ML? Pro tip: unless you ONLY want to research deep learning or have an amazing background in it, don't emphasize it in your research statement. Lots of strong applications are tossed because they focus on DL and there aren't enough advisors for it.
AI challenges humans in another game!
@DeepMind
just posted a paper to arXiv on AI for Stratego, a complex imperfect-information game. They place 3rd in a human league. Great work by Julien Perolat, Bart de Vylder,
@karl_tuyls
, and many others!
An example of persuasion from our new Diplomacy AI agent Cicero, published in Science today. The bot (Italy) successfully convinces France that it's not planning to attack.
Introducing DORA, an AI that learns no-press Diplomacy from scratch with no human data! Our
#NeurIPS2021
paper shows DORA is superhuman in 1v1 Diplomacy. In 7p Diplomacy, the results are more subtle. Joint work w/
@anton_bakhtin
, David Wu, and
@adamlerer
:
Excited to announce our NeurIPS paper on ReBeL, an algorithm similar to AlphaZero that plays *imperfect-information* games like poker and Liar's Dice! Joint work with
@anton_bakhtin
,
@adamlerer
, Qucheng Gong.
YouTube Video:
Paper:
AI bots have bested humans in both chess and poker but the algorithms used to win each game were very different. Today we introduce ReBeL, a major step towards a single AI algorithm that can play all games including chess, Go, poker, Liar's Dice and more.
We made an AI that persuades, coordinates, and negotiates with real humans *in natural language* in Diplomacy! Here's a 🧵 of dialogue examples with real human players. To put it mildly, players were surprised when they learned their conversations had been with an AI, not humans!
In 10 years & 2,700+ episodes I’ve never been so excited for a release of an episode.
@sama
@bradlightcap
@OpenAI
:
🤷‍♂️ Will models become commoditized?
💻 How to solve the fundamental challenge of compute?
🔒 Open vs closed?
💵 Scaling to $2BN in revenue.
Tomorrow 👇
Branching factor of Chess: ~30
Branching factor of Go: ~300
Branching factor of Diplomacy: ~10^20
But that's even before accounting for the *natural language* aspect of the game. In truth, the branching factor of Diplomacy is the breadth of human language.
Meta AI’s
@polynoamial
and
@anton_bakhtin
talk about strategic reasoning and how it enables
#CICERObyMetaAI
to predict moves from billions of possibilities.
Want to know how CICERO uses planning to find opportunities for mutually beneficial cooperation? Read more on our blog ⬇️
Introducing Sora, our text-to-video model.
Sora can create videos of up to 60 seconds featuring highly detailed scenes, complex camera motion, and multiple characters with vibrant emotions.
Prompt: “Beautiful, snowy…
I'm giving lessons on game theory this week and today I'll discuss Braess's paradox: the bizarre result that building roads can sometimes *worsen* traffic. 🤯 In the pic, n=4000 drivers. Before adding the red road, each commute is 65 mins. After, it's 80.
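The 65 and 80 minute figures can be reproduced with a small sketch. The edge costs are an assumption here (they come from the standard textbook version of the network, not from the tweet text): Start->A and B->End each take (cars on the edge)/100 minutes, A->End and Start->B each take a fixed 45 minutes, and the added road A->B is free. The function name `commute_minutes` is hypothetical.

```python
def commute_minutes(n_drivers: int, shortcut_open: bool) -> float:
    """Equilibrium commute time in the classic Braess network.

    Assumed edge costs (not stated in the tweet):
      Start->A and B->End: (cars on the edge) / 100 minutes
      A->End and Start->B: fixed 45 minutes
      A->B shortcut:       0 minutes
    """
    if not shortcut_open:
        # By symmetry, drivers split evenly between
        # Start->A->End and Start->B->End.
        per_route = n_drivers / 2
        return per_route / 100 + 45

    # With the free shortcut, each congestible edge costs at most
    # n_drivers / 100. As long as that is no more than 45, every driver
    # prefers Start->A->B->End, so all traffic piles onto both
    # congestible edges.
    if n_drivers / 100 > 45:
        raise ValueError("sketch only covers the regime n_drivers <= 4500")
    return n_drivers / 100 + n_drivers / 100


before = commute_minutes(4000, shortcut_open=False)  # 65.0
after = commute_minutes(4000, shortcut_open=True)    # 80.0
print(before, after)
```

Each driver's choice is individually rational once the shortcut exists, yet everyone ends up 15 minutes worse off, which is the paradox.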
.
@OpenAI
is hiring an AI researcher for a new team working toward solving reasoning! I've worked alongside
@giambattista92
for several months now and have been very impressed with what he's done and what the team plans to do. If this area excites you, I 100% recommend applying.
I’m looking for one more strong research or software engineer excited to work on (ideally solve) reasoning, joining my team at
@OpenAI
🤖 Send resume to michelle.kim@openai.com ✨
Someone on the admissions committee for a top CS PhD program told me they no longer filter based on paper count because too many of the applicants already have multiple publications. Instead, they now filter by citation count. Not sure if he was joking but I believed it.
PhD admits considering potential advisors: talk to the advisor's current/former students! Don't be afraid to email former students. For the most honest opinion, talk to folks on the phone. Also, keep in mind students might be scared to criticize, so read between the lines.
The Repeated Prisoner's Dilemma is one of the most enlightening parables in game theory. If you want to learn about it, I recommend this short game. It communicates the intuition beautifully. In fact, I made playing it a homework assignment for my class.
Today we're excited to introduce Devin, the first AI software engineer.
Devin is the new state-of-the-art on the SWE-Bench coding benchmark, has successfully passed practical engineering interviews from leading AI companies, and has even completed real jobs on Upwork.
Devin is…
I'm at
#NeurIPS2023
all week! Looking forward to catching up with old friends, making new ones, and flashing this badge at anyone that asks about the GPT-5 release date
Credit to
@kipperrii
and
@_rlys
for the badge idea
I'm honored to have been named one of MIT
@techreview
's 35 Innovators Under 35! I'm lucky to have had such a wonderful collection of mentors, collaborators, and friends at
@SCSatCMU
during my PhD.
#35InnovatorsUnder35
Introducing AlphaGeometry: an AI system that solves Olympiad geometry problems at a level approaching a human gold-medalist. 📐
It was trained solely on synthetic data and marks a breakthrough for AI in mathematical reasoning. 🧵
In 2016, AlphaGo beat Lee Sedol in a milestone for AI. But key to that was the AI's ability to "ponder" for ~1 minute before each move. How much did that improve it? For AlphaGoZero, it's the equivalent of scaling pretraining by ~100,000x (~5200 Elo with search, ~3000 without) 2/
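To get a feel for those numbers, here is a rough back-of-the-envelope sketch. The Elo expected-score formula is standard; the "Elo per 10x" figure additionally assumes that Elo grows linearly in the log of pretraining scale, which is a simplifying assumption and not something the tweet states.

```python
import math


def elo_expected_score(rating_gap: float) -> float:
    """Standard Elo expected score for the stronger side, given its rating lead."""
    return 1.0 / (1.0 + 10 ** (-rating_gap / 400))


# Figures quoted in the tweet (approximate).
elo_with_search = 5200
elo_without_search = 3000
scale_equivalent = 100_000  # claimed pretraining-scale equivalent of search

gap = elo_with_search - elo_without_search

# ASSUMPTION: Elo is linear in log10(scale). Under that assumption,
# this is the Elo gained per 10x increase in pretraining scale.
elo_per_10x = gap / math.log10(scale_equivalent)

print(f"Elo gap from search: {gap}")
print(f"Implied Elo per 10x scale (assumed log-linear): {elo_per_10x:.0f}")
print(f"Expected score of search vs. raw network: {elo_expected_score(gap):.6f}")
```

Under these assumptions the version with search would be expected to win essentially every game against the raw network.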
I just watched this video and was super impressed by how well
@ykilcher
communicated the essence of our paper. If you want to understand why AlphaZero can't play poker and why ReBeL can, this is a great video to watch!
♠️♥️New Video♥️♠️ReBeL is the AlphaZero of Poker! It combines RL+Tree Search to solve imperfect information games and provably(!) converges to Nash Equilibrium🔥Superhuman Poker AI with almost no domain knowledge🎲💪
@polynoamial
@anton_bakhtin
@adamlerer
ChatGPT can now browse the internet to provide you with current and authoritative information, complete with direct links to sources. It is no longer limited to data before September 2021.
The AI techniques used in Go, Poker, and StarCraft are not enough to deal with mixed cooperative/competitive environments. This is a nice article summarizing recent research from
@Mila_Quebec
,
@DeepMind
, and our own research at
@facebookai
on how to overcome that!
Not easy, but if we can figure out how to extend this to LLMs the impact would be huge. Imagine having access to models that take 5 minutes to ponder each response but the output is as good as a model that's 1,000x larger and trained for 1,000x longer than GPT-4
I keep revisiting this great paper from
@andy_l_jones
: “Scaling scaling laws with board games”. It shows how training compute and inference compute of MCTS can be traded off against each other. 10x more MCTS steps is almost the same as training 10x more.
The Gemini era is here. Thrilled to launch Gemini 1.0, our most capable & general AI model. Built to be natively multimodal, it can understand many types of info. Efficient & flexible, it comes in 3 sizes each best-in-class & optimized for different uses
6 months ago we showed with
#Pluribus
that search was the key to beating top humans in poker. Today
@adamlerer
, Hengyuan Hu,
@j_foerst
, and I are announcing new results showing that a similar search algorithm can conquer cooperative partially observable games like Hanabi as well!
To advance research on AI that can understand others’ points of view and collaborate effectively, Facebook AI has developed a bot that sets a new state of the art in Hanabi, a card game in which all players work together.
"I would like to specialize in AI, particularly Machine Learning, but from what I've read it seems AI is somewhat of a dead field" <-- From an email I sent to a professor 10 years ago asking for advice on applying to grad school. It's amazing how much AI has changed in a decade
A tip for incoming PhD students considering potential advisors: talk to their current and former students! Don't be afraid to reach out even if the student exited the program a long time ago. For the most honest assessment, talk with them by voice or video rather than email.
2 years ago I invested in
@magicailabs
as my first-ever seed-round investment. I felt comfortable doing it because I'd worked with
@EricSteinb
for years and believed he'd succeed. I'm so impressed with the progress they've made and can't wait to see what they do next!
We've raised $117M from
@natfriedman
and others to build an AI software engineer.
Code generation is both a product and a path to AGI, requiring new algorithms, lots of CUDA, frontier-scale training, RL, and a new UI.
We are hiring!
@jgrayatwork
@adamlerer
@anton_bakhtin
and I are thrilled to share our latest work: a human-level no-press Diplomacy bot! Unlike prior AI benchmarks, Diplomacy involves a complex mix of both cooperation and competition. Thanks
@webdiplomacy
for your help!
AlphaZero can play Go and Chess, but it can't play poker. Why not? How can we fix this? Come see our
#NeurIPS2020
poster on ReBeL, an algorithm that unifies past AI techniques for games like Go and games like Poker, and is superhuman in no-limit Texas hold'em. Thursday 9-11am PT.
Defeating top humans in every benchmark game has so far required real-time search/planning. Chess needed alpha-beta pruning search. Go needed MCTS. Poker needed subgame solving via CFR. I would have been very surprised if OpenAI Five won its matches with just raw Deep RL.
I'm incredibly honored to receive the Victor Lesser Distinguished Dissertation Award! I look forward to talking about the research behind it at
@aamas2021
The Victor Lesser Distinguished Dissertation Award is given for dissertations that show originality, depth, impact, as well as quality of writing. We are happy to announce that the 2020 recipient is Dr. Noam Brown. We look forward to his keynote talk!
10/ We’re excited to see how researchers can build on top of this work, so we’re making all our code and models available to researchers. We’ve also partnered with to make the training data available to researchers as well!
Nice new paper from
@huggingface
investigating LLM scaling laws once data is the bottleneck (which will eventually be the case for large models). They show little degradation for training with up to 4 epochs
6/ Our agent, CICERO, couples a dialogue module with a strategic reasoning engine. Each turn, CICERO models the other players’ policies based on the game state and shared dialogue. It forms a plan, and the dialogue module generates messages conditional on the plan.
2/ Diplomacy is a 7-player game best described as a mix of Risk, poker, and Survivor. It was JFK’s favorite game.
@demishassabis
is a former champion in it. And it’s been a decades-old, seemingly impossible grand challenge for AI. Why?
Introducing Imagen, a new text-to-image synthesis model that can generate high-fidelity, photorealistic images from a deep level of language understanding. Learn more and check out some examples of
#imagen
at
8/ Over 40 Diplomacy games with 82 human players involving 5,277 messages over 72 hours of gameplay, CICERO achieved more than double the average score of the other players and ranked in the top 10% of players! This is the first-ever human-level AI for Diplomacy.
I'm truly honored to receive the AAAI/ACM-SIGAI Dissertation Award! I remember feeling elated when my first paper ever was published at AAAI back in 2014. Taking that line of research from one paper to a full dissertation has been the ride of a lifetime.
The 2020 AAAI/ACM-SIGAI Dissertation Award recognizes superior research and writing by doctoral candidates in artificial intelligence. Congratulations to Noam Brown for his work on Equilibrium Finding for Large Adversarial Imperfect-Information Games.
#AAAI2022
Improved capabilities are always risky but if this research succeeds it could be valuable for safety research as well. Imagine being able to spend $1 million on inference to see what a more capable future model might look like. It would give us a warning that we otherwise lack 5/
Our paper "Safe and Nested Subgame Solving for Imperfect-Information Games" won BEST PAPER at
#NIPS2017
! The talk is at 2:50PM on Tuesday Dec 5th and we'll demo our poker AI Libratus at 7PM. Come play against it! Paper: 3-Min Video:
"Deep Counterfactual Regret Minimization" will be an oral presentation at the
#NeurIPS2018
Deep RL workshop! It modifies the tabular CFR algorithm popular for games like poker to use deep learning function approximation. w/ A. Lerer, S. Gross, T. Sandholm
We have reached an agreement in principle for Sam Altman to return to OpenAI as CEO with a new initial board of Bret Taylor (Chair), Larry Summers, and Adam D'Angelo.
We are collaborating to figure out the details. Thank you so much for your patience through this.
🚨We found adversarial suffixes that completely circumvent the alignment of open source LLMs. More concerningly, the same prompts transfer to ChatGPT, Claude, Bard, and LLaMA-2…🧵
Website:
Paper:
@togelius
Mom: "How are you going to get a job with a PhD on games?"
Me: "Game *theory*, mom. How am I going to get a job with a PhD on game *theory*."
The FAIR multi-agent learning group is looking for several research interns for next year! If you're a grad student interested in this area and the research we've done, consider applying and also contacting the relevant researchers directly
Also in 2016, I observed a similar phenomenon in poker. That insight led to our Libratus poker AI that beat top humans for the first time.
@andy_l_jones
investigated the train-time/test-time compute tradeoff in detail in Hex and found a similar pattern: 3/
I'm excited to meet folks at
#NeurIPS2022
! I'll be there all week; my DMs are open. We'll present CICERO and do Q&A on Monday Nov 28th at the
@MetaAI
expo:
Also
@adamlerer
will give an amazing talk on CICERO at the
@LaReL2022
workshop on Friday Dec 2nd!
Advisor abuse of power is a serious problem in academia. I know victims who are reluctant to discuss it for fear of damaging their careers, especially if they are on a visa. I'm really happy to see this MIT org pushing to do something about it. Hopefully others follow.
As we approach deadlines for PhD applications, I recommend reading this brutally honest guide to applying, written by a CMU CS prof: . I especially recommend reading Section 3.3 on how to write a Personal Statement. It might help you avoid common mistakes.