
Michael Dennis
@MichaelD1729
3K Followers · 3K Following · 23 Media · 1K Statuses
Open-Endedness RS @GoogleDeepMind. Building for an unspecifiable world | Unsupervised Environment Design, Game & Decision Theory, RL, AIS. prev @CHAI_Berkeley
Joined November 2019
It’s been a crazy two years seeing so many amazingly talented researchers bring GENerative Interactive Environments to life in Genie 1 and 2. The future is agents in generative environments.
Excited to reveal Genie 2, our most capable foundation world model that, given a single prompt image, can generate an endless variety of action-controllable, playable 3D worlds. Fantastic cross-team effort by the Open-Endedness Team and many other teams at @GoogleDeepMind! 🧞
RT @_rockt: Great opportunity at Google Research for folks interested in AutoML, evolutionary methods, meta-learning, and open-endedness: ….
Looks like this could be a cool venue for people working on Unsupervised Environment Design: the Scaling Environments for Agents Workshop @ NeurIPS.
🚨 [Call for Papers] SEA Workshop @ NeurIPS 2025 🚨
📅 December 6, 2025 | 📍 San Diego, USA
🌐 Environments are the "data" for training agents, and they are largely missing from the open-source ecosystem. We are hosting Scaling Environments for Agents (SEA)…
RT @RLFrameWorkshop: We have a really exciting lineup of invited speakers this year 🔥 Kicking us off we have Prof. Erin Talvitie (Harvey Mu….
RT @RLFrameWorkshop: The Most Thought-Provoking paper award goes to Thinking is Another Form of Control 🏆 Congratulations to @JosiahHanna….
RT @Karim_abdelll: I will be presenting the work below at @RL_Conference on mitigating goal misgeneralization. The talk will take place o….
RT @RLFrameWorkshop: Thanks to everyone who submitted; we enjoyed reviewing another fantastic set of papers this year! Check out all 27 acc….
sites.google.com
Talks: “Thinking is Another Form of Control,” Josiah P. Hanna and Nicholas E. Corrado (awarded Most Thought-Provoking Paper); “Analogy making as amortised model construction,” David G. Nagy, Tingke Shen, …
This was a fun visual of the robustness of an RL policy. It made very clear where things were working. I hope more UED papers include visuals like this!
We also visualize the performance of our agents in a maze for each possible location of the goal in the environment. The results show that agents trained with the regret objective achieve near-maximum return for almost all goal locations.
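A minimal sketch of how such a per-goal visual might be produced: sweep every candidate goal cell, record the episodic return, and render the grid as a heatmap. The greedy-walker `episodic_return` below is a self-contained stand-in for a real policy rollout, not the paper's code.

```python
import numpy as np
import matplotlib.pyplot as plt

# Stand-in for "evaluate the trained policy with the goal at (r, c)".
# In a real UED codebase this would roll out the agent in the maze; a greedy
# walker on an open grid keeps the example self-contained and runnable.
def episodic_return(goal, start=(0, 0), size=9, max_steps=30):
    pos, steps = list(start), 0
    while tuple(pos) != tuple(goal) and steps < max_steps:
        axis = 0 if pos[0] != goal[0] else 1          # pick an axis to move on
        pos[axis] += np.sign(goal[axis] - pos[axis])  # one greedy step
        steps += 1
    return max_steps - steps  # higher return = goal reached faster

size = 9
returns = np.array([[episodic_return((r, c), size=size) for c in range(size)]
                    for r in range(size)])

plt.imshow(returns, cmap="viridis")
plt.colorbar(label="episodic return")
plt.title("Return for each possible goal location")
plt.show()
```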
RT @aditimavalankar: On my way to #ICML2025 to present our algorithm that strongly scales with inference compute, in both performance and s….
RT @MatthewFdashR: At least for me, the big-picture motivation behind our RLC paper is a research vision for scalable AI alignment via mini….
RT @Karim_abdelll: *New AI Alignment Paper*. 🚨 Goal misgeneralization occurs when AI agents learn the wrong reward function, instead of the….
In a rare combination of theory and empirics, @Karim_abdelll & @MatthewFdashR show that UED via minimax regret can mitigate goal misgeneralization. They're the first to connect UED to AI Safety; expect more to follow! There's gold for UED/AIS researchers in the 30-page appendix 🚀
*New AI Alignment Paper* 🚨 Goal misgeneralization occurs when AI agents learn the wrong reward function instead of the human's intended goal. 😇 We show that training with a minimax regret objective provably mitigates it, promoting safer and better-aligned RL policies!
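For readers outside UED, here is a toy sketch of what a minimax-regret objective looks like, assuming one-step bandit environments where the optimal value is known in closed form; this is far simpler than the paper's setting and is not its code.

```python
import jax
import jax.numpy as jnp

# Each "environment" is a one-step bandit with a known reward vector, so the
# optimal value V*(env) is simply the best arm's reward.
envs = jnp.eye(3)  # env i rewards only arm i

def regret(theta, rewards):
    v_star = rewards.max()                  # best achievable value in this env
    v_pi = jax.nn.softmax(theta) @ rewards  # the policy's expected value
    return v_star - v_pi                    # regret = V* - V_pi

def worst_case_regret(theta):
    # The adversary picks the environment that makes the policy look worst.
    return jax.vmap(regret, in_axes=(None, 0))(theta, envs).max()

theta = jnp.array([2.0, 0.0, -1.0])  # start from a skewed policy
for _ in range(500):                 # gradient descent on the worst case
    theta = theta - 0.1 * jax.grad(worst_case_regret)(theta)

print(jax.nn.softmax(theta))  # ≈ uniform: no env can impose high regret
```

Minimizing worst-case regret rather than worst-case return is what keeps the objective meaningful: an environment where even the optimal policy scores poorly contributes little regret, so the agent is not punished for unavoidable failure.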
RT @gdrtodd_: Excited to introduce the first version of Ludax, a domain-specific language for board games that compiles directly into JAX c….
Finally, a way to save compute when hyperparameter sweeping in JAX 🎉 Great to see an open-source solution to a very common problem 🚀
🚀 Excited to announce Hyperoptax, a library for parallel hyperparameter tuning in JAX. Implements grid, random, and Bayesian search in pure JAX so that you can rapidly search across parameter configurations in parallel. 📦 pip install hyperoptax
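For a sense of why a pure-JAX sweep saves compute, here is a minimal sketch of the underlying idea: write training as a pure function of the hyperparameters, then `jax.vmap` it so every configuration runs in one batched, jitted computation. This is illustrative only, not Hyperoptax's actual API; the quadratic `train` function is a stand-in for a real training run.

```python
import jax
import jax.numpy as jnp

def train(lr):
    # Tiny "training run": minimize (w - 3)^2 by gradient descent with
    # learning rate lr, and report the final loss.
    def step(w, _):
        w = w - lr * jax.grad(lambda w: (w - 3.0) ** 2)(w)
        return w, None
    w_final, _ = jax.lax.scan(step, 0.0, None, length=100)
    return (w_final - 3.0) ** 2

lrs = jnp.logspace(-3, 0, 16)           # the grid of configurations
losses = jax.jit(jax.vmap(train))(lrs)  # all 16 runs in one batched call
print(lrs[jnp.argmin(losses)])          # best learning rate on this toy task
```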
Glad to see more work getting RL to maintain plasticity on non-stationary PCG levels! It's been a folk theory for a while that plasticity loss reduces the effectiveness of unsupervised environment design, but I haven't seen it confirmed in a paper. Anyone trying this in UED?
(1/8) 🔥 Excited to share that our paper “Mitigating Plasticity Loss in Continual Reinforcement Learning by Reducing Churn” has been accepted to #ICML2025! 🎉 RL agents struggle to adapt in continual learning. Why? We trace the problem to something subtle: churn. 👇🧵 @Mila_Quebec
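For readers new to the term, here is a minimal sketch of one way churn is often quantified, assuming it means the movement of a network's outputs on held-out data caused by an update on a different batch; the linear "Q-network" and all names below are illustrative, not the paper's code.

```python
import jax
import jax.numpy as jnp

def q_values(params, states):
    return states @ params  # linear stand-in for a Q-network: (batch, actions)

def loss(params, states, targets):
    return jnp.mean((q_values(params, states) - targets) ** 2)

params = jax.random.normal(jax.random.PRNGKey(0), (4, 2))      # 4 features, 2 actions
batch = jax.random.normal(jax.random.PRNGKey(1), (32, 4))      # update batch
targets = jax.random.normal(jax.random.PRNGKey(2), (32, 2))
held_out = jax.random.normal(jax.random.PRNGKey(3), (128, 4))  # off-batch states

before = q_values(params, held_out)
params = params - 0.1 * jax.grad(loss)(params, batch, targets)  # one SGD step
after = q_values(params, held_out)

churn = jnp.mean(jnp.abs(after - before))  # how far off-batch outputs moved
print(churn)
```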
RT @MartinKlissarov: As AI agents face increasingly long and complex tasks, decomposing them into subtasks becomes increasingly appealing.….
A cool exploration of the historical rhymes around open-endedness, from cybernetics and von Neumann replicators to evolutionary algorithms, AlphaZero, and the modern push for open-ended AI.
Most AI systems today follow the same predictable pattern: they're built for specific tasks and optimized for objectives rather than exploration. Meanwhile, humans are an open-ended species—driven by curiosity and constantly questioning the unknown. From inventing new musical…
Sounds like a great opportunity to get serious about making RL work in reality. Whoever does this will gain knowledge and experience that few others have. Also, @EugeneVinitsky is a great person and amazing to work with.
We now know RL agents can zero-shot crush driving benchmarks. Can we put them on a car and replace the planning stack? We're hiring a postdoc at NYU to find out! Email me if interested, and please help us get the word out.
RT @EugeneVinitsky: We now know RL agents can zero-shot crush driving benchmarks. Can we put them on a car and replace the planning stack?….