Johan Obando-Ceron @ NeurIPS’25 👍🏽
@johanobandoc
Followers: 2K · Following: 48K · Media: 49 · Statuses: 3K
Graduate student @Mila_Quebec @UMontrealDIRO | RL/Deep Learning/AI | From Cali, Colombia to the world 🇨🇴 | #JuntosProsperamos⚡#TogetherWeThrive | 🌱🌎
Montréal, Québec
Joined February 2017
I’m very happy to share our paper, Revisiting Rainbow, accepted at the Deep RL workshop @Neurips_Conf. I’ve been working with @pcastr for a few months, and I wanted to share my story and how this work came to be, in the hope that it motivates others in situations similar to mine.
Happy to share "Revisiting Rainbow" w/ @JS_Obando where we argue small/mid-scale envs can promote more insightful & inclusive deep RL research. 📜Paper: https://t.co/d5I62kAqdc ✍️🏾Blog: https://t.co/WMVJJjPaLm 🐍Code: https://t.co/WdtgsZFP84 📽️Video: https://t.co/VfJFQqsGds 🧵1/X
This #NeurIPS2025 was tiring, but it was fantastic to connect with so many friends and colleagues! I was so busy I didn't get a chance to promote our papers at the conference, so I'll remedy that with this post-hoc thread: 👇🏾
It was a great experience talking at the @SEAWorkshop at #NeurIPS. This was my first workshop talk, and it’s all thanks to the organizers @lawhy_X, @guohao_li, and @robertarail for recommending me, and to @AlexDGoldie for leading DiscoBench.
Invited Talk 4: "Automated Algorithmic Discovery for AI Research" by Deepak Nathani @deepaknathani11 (UC Santa Barbara)
We've been deeply thinking about deep thinking with LLMs. Come chat with us about RSA, test-time scaling and RL with LLMs at the FoRLM workshop!
We’ll be presenting this at the FoRLM workshop from 10:15-11:30am in room 33 tomorrow! Drop by if you’d like to chat about this paper, or RL for LLMs in general (I’ve got some juicy new insights)
Definitely check out our work on improving reasoning through test-time scaling! Drop by if you're curious about RL post-training from the ground-level fundamentals, without the often-unnecessary something-something POs.
We’ll be presenting this at the FoRLM workshop from 10:15-11:30am in room 33 tomorrow! Drop by if you’d like to chat about this paper, or RL for LLMs in general (I’ve got some juicy new insights)
We’ll be presenting this at the FoRLM workshop from 10:15-11:30am in room 33 tomorrow! Drop by if you’d like to chat about this paper, or RL for LLMs in general (I’ve got some juicy new insights)
NO verifiers. NO Tools. Qwen3-4B-Instruct can match DeepSeek-R1 and o3-mini (high) with ONLY test-time scaling. Presenting Recursive Self-Aggregation (RSA) — the strongest test-time scaling method I know of! Then we use aggregation-aware RL to push further!! 📈📈 🧵below!
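A minimal sketch of what recursive self-aggregation plausibly looks like in code, based only on the description above: keep a population of candidate solutions and repeatedly prompt the model to merge small subsets into improved candidates. Everything here (`generate`, `pop_size`, the prompt wording, the final reduction step) is an assumption for illustration, not the paper's recipe.

```python
import random

def recursive_self_aggregation(question, generate, pop_size=8,
                               subset_size=3, rounds=4):
    """Sketch of RSA-style test-time scaling. `generate` is any
    prompt -> completion callable, e.g. a chat-API wrapper."""
    # Round 0: a population of independent attempts at the question.
    population = [generate(question) for _ in range(pop_size)]
    for _ in range(rounds):
        new_population = []
        for _ in range(pop_size):
            # Aggregate a random subset of candidates into one improved candidate.
            subset = random.sample(population, subset_size)
            prompt = (
                f"{question}\n\nCandidate solutions:\n\n"
                + "\n\n---\n\n".join(subset)
                + "\n\nCombine their correct ideas, fix their mistakes, "
                "and write one improved solution."
            )
            new_population.append(generate(prompt))
        population = new_population
    # Reduce the final population to a single answer.
    return generate(
        f"{question}\n\nChoose or produce the best final answer from:\n\n"
        + "\n\n---\n\n".join(population)
    )
```

The aggregation-aware RL mentioned in the tweet would then, presumably, fine-tune the model to be good at exactly these merge prompts.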
Amazing opportunity to work with @ermgrant
Thrilled to announce I'll start in 2026 as faculty in Psych & CS @UAlberta + @AmiiThinks Fellow!! 🥳 Recruiting students to develop theories of cognition in natural and artificial systems 🤖💭🧠. Find me at #NeurIPS2025 workshops (talk at @CogInterp & organising @DataOnBrainMind)
Fourth #runconference at #NeurIPS2025 had the best turnout yet! Help us beat it tomorrow, same place, same time (follow tweet thread for details)! 🤖🏃🏾
Third #runconference at #NeurIPS2025 was great, including special guest @JeffDean ! Same place, same time, tomorrow morning if you want to join! 🤖🏃🏾
🧊 Off-policy RL for LLMs is hard. Dr. GRPO collapses at 10 steps off-policy. TBA doesn't. @Kimi_Moonshot K2's approach is robust too – both independently landed on the same key ingredients 🤝 We ablate RL recipe ingredients + show the 2 small changes giving off-policy …
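If the "TB" in TBA is the GFlowNet-style trajectory-balance objective, which the linked bbartoldson repo suggests but the tweet doesn't spell out, the core loss looks roughly like the sketch below. All names and the reward scaling are my assumptions, not the authors' code.

```python
import torch

def trajectory_balance_loss(log_z, logprob_sum, reward, beta=1.0):
    """Assumed trajectory-balance loss for LLM RL:
      (log Z_phi(x) + sum_t log pi_theta(y_t | x, y_<t) - r(x, y) / beta)^2
    log_z:       learned per-prompt log partition-function estimate
    logprob_sum: summed token log-probs of the sampled response under pi_theta
    reward:      scalar reward for the (prompt, response) pair
    """
    return (log_z + logprob_sum - reward / beta).pow(2).mean()
```

Because the regression target is a fixed reward rather than the sampling policy's own probability ratio, a loss of this shape has no PPO-style ratio to blow up, which is one hedged explanation for the off-policy robustness the thread describes.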
📍 Come see our poster at NeurIPS, Friday @ 11! Paper v2: https://t.co/JTVTWuU2P5 Code v2: https://t.co/1lwaZk9dbj Website: https://t.co/ldKGC6pnJa Thanks again to my amazing collaborators at @Mila_Quebec, @kaist_ai, @Livermore_Comp: @siddarthv66, James Diffenderfer, …
github.com · bbartoldson/tba-prime: testing TBA in prime-rl, a codebase for decentralized async RL training at scale
@creus_roger and @johanobandoc will be presenting this #NeurIPS2025 spotlight paper today from 11:00-2:30 at poster #310! Also, if you are looking for someone amazing at making GPUs go 🔥 for large-scale DeepRL or agentic model training, @creus_roger is looking for an internship.
🚨 Excited to share our new work: "Stable Gradients for Stable Learning at Scale in Deep Reinforcement Learning"! 📈 We propose gradient interventions that enable stable, scalable learning, achieving significant performance gains across agents and environments! Details below 👇
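The tweet doesn't say which gradient interventions the paper uses, so the snippet below is purely illustrative: the most common such intervention in deep RL codebases, global gradient-norm clipping, inserted between the backward pass and the optimizer step.

```python
import torch

def train_step(model, optimizer, loss, max_norm=1.0):
    # Illustrative gradient intervention (not necessarily the paper's):
    # clip the global gradient norm so update magnitudes stay bounded
    # as network size and batch size scale up. max_norm=1.0 is arbitrary.
    optimizer.zero_grad()
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)
    optimizer.step()
```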
Here is a one-minute summary of UdeM/Mila PhD students Roger Creus Castanyer (@creus_roger) and Johan S. Obando C. (@johanobandoc)’s work, “Stable Gradients for Stable Learning at Scale in Deep Reinforcement Learning”. Come see the spotlight poster at @NeurIPSConf today at 11 am:
🚀I’m excited to present our work at NeurIPS! I will be presenting our poster, "Normalizing Flows are Capable Models for Continuous Control". We show that Normalizing Flows can be used as plug-and-play models in imitation learning, offline RL, and unsupervised RL algorithms to reap …
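As a concrete (hypothetical) illustration of why flows make convenient plug-and-play policies: an invertible, state-conditioned transform gives exact action log-probabilities, which is all that likelihood-based imitation and offline-RL objectives need. The single affine layer below is my minimal example, not the paper's architecture; a real flow would stack several such invertible layers.

```python
import torch
import torch.nn as nn

class AffineFlowPolicy(nn.Module):
    """Toy state-conditioned flow policy: a Gaussian base sample pushed
    through a state-dependent affine map, with exact log-density."""
    def __init__(self, state_dim, action_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * action_dim),  # per-dim shift and log-scale
        )

    def forward(self, state):
        shift, log_scale = self.net(state).chunk(2, dim=-1)
        base = torch.distributions.Normal(
            torch.zeros_like(shift), torch.ones_like(shift))
        z = base.sample()
        action = shift + log_scale.exp() * z
        # Change of variables gives the exact log-density:
        # log p(a|s) = log p(z) - sum_i log_scale_i
        log_prob = (base.log_prob(z) - log_scale).sum(-1)
        return action, log_prob
```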
Interested in the theory of agents / definitions / RL under minimal assumptions? I'll be at poster #512 today (Thurs.) in the 4.30pm-7.30pm session. Come on by 🙂
Thrilled to share our new #NeurIPS2025 paper done at @GoogleDeepMind: "Plasticity as the Mirror of Empowerment". We prove every agent faces a trade-off between its capacity to adapt (plasticity) and its capacity to steer (empowerment). Paper: https://t.co/prWpkdPojb 🧵🧵🧵👇
📢Thrilled to dock ⚓️ at #NeurIPS2025 in San Diego! Come say ahoy to SAILOR ⛵️ at our spotlight poster #2407 (11:00 AM – 2:00 PM PST)! Paper: https://t.co/A2UU3unmA8 Code: https://t.co/pq7FlKtuuG
Say ahoy to 𝚂𝙰𝙸𝙻𝙾𝚁⛵: a new paradigm of *learning to search* from demonstrations, enabling test-time reasoning about how to recover from mistakes w/o any additional human feedback! 𝚂𝙰𝙸𝙻𝙾𝚁 ⛵ outperforms Diffusion Policies trained via behavioral cloning on 5-10x data!
At #NeurIPS2025, I’m presenting our paper “Tight Lower Bounds and Improved Convergence in Performative Prediction.” We introduce the first lower bounds for iterative retraining and show that history helps: using past snapshots can expand the convergence region. 🧵👇
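For context, "iterative retraining" in performative prediction standardly means repeated risk minimization in the sense of Perdomo et al. (2020): retrain on the data distribution induced by the currently deployed model. The notation below is that standard setup, not necessarily this paper's; "history helps" then means letting the update depend on several past snapshots rather than only the latest deployment.

```latex
% Repeated risk minimization (standard performative-prediction setup):
% deploy \theta_t, observe data z drawn from the induced distribution
% \mathcal{D}(\theta_t), and retrain:
\theta_{t+1} \;=\; \operatorname*{arg\,min}_{\theta}\;
  \mathbb{E}_{z \sim \mathcal{D}(\theta_t)}\bigl[\ell(\theta; z)\bigr]
```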
.@creus_roger and @johanobandoc did an awesome job presenting our paper at the @_LXAI workshop at #NeurIPS. If you missed it, come to our poster #310 on Friday at 11am, Exhibit Hall C/D/E.
Accepted to NeurIPS 2025 as a spotlight (top ~3% of submissions)!!! ✨🌟 Paper: https://t.co/krTCwlpoXW Code: …
Another fantastic new initiative at IFM, built around exciting new staff members. We’re thrilled to have @ssahoo_ building with us. I’m selfishly anticipating doing some great RL on top of the diffusion LLMs we build.
Building a new team focused on @diffusion_llms at the @mbzuai Institute of Foundation Models (@llm360). ✨Looking for FT hires and interns with experience in LLM architecture or post-training. (Discrete diffusion experience is a bonus, not required.) ✨Come say hi at Exhibit …
Excited to share that I’m at #NeurIPS2025 this year! 🧠 I’ll be presenting my work on Social Tokens at the UniReps Workshop with @su1001v and @thisismyhat, where we explore how to inject socially relevant visual features into language models to improve social reasoning and …