Kianté Brantley @NeurIPS
@xkianteb
2K Followers · 3K Following · 12 Media · 3K Statuses
Assistant Professor at Harvard @KempnerInst and SEAS | Fitness enthusiast | (He/Him/His)
Joined May 2009
Exploration seems to emerge from self-supervised RL ... but why?🤔 @MBastankhah and @GraceLiu78's poster tomorrow at the ARLET Workshop @ #NeurIPS2025 helps explain why! 3:30pm in Upper Level Room 31ABC.
I’m excited to present our work “Demystifying the Mechanisms Behind Emergent Exploration in Goal-Conditioned RL” at the Aligning RL Experimentalists & Theorists workshop at #NeurIPS2025! 📅 Dec 6 🕞 3:30 PM 📍 Poster Session, Upper Level Room 31ABC @princeton_rl @GraceLiu78
I’m at NeurIPS 2025. Feel free to DM me if you’d like to meet or chat. I’m hiring PhD students and a postdoc in RL for LLMs this year. I have a few directions I am interested in pursuing, ranging from general RL optimization for LLMs to AI alignment issues.
If you're at NeurIPS, don't miss this! :)
Happening this Tuesday 1:30 PST @ NeurIPS: Foundations of Imitation Learning: From Language Modeling to Continuous Control A tutorial with Adam Block & Max Simchowitz (@max_simchowitz).
🧐🧐 Why do we pretrain LLMs with log likelihood? Why does action chunking work so well in robotics? Why is EMA so ubiquitous? And could there be a mathematical basis for Moravec’s paradox? 🤖🤖 Come check out our NeurIPS 2025 Tutorial “Foundations of Imitation Learning” with
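The EMA mentioned in the tutorial teaser above typically refers to an exponential moving average of model weights kept alongside the optimizer's live weights. A minimal generic sketch, not code from the tutorial; the `decay` value and parameter dict are illustrative:

```python
# Exponential moving average (EMA) of model parameters: blend the current
# weights into a slowly-moving shadow copy after every optimizer step.
# Purely illustrative names and values, not from the tutorial itself.

def ema_update(ema_params, params, decay=0.999):
    """Blend current parameters into the running average in place."""
    for name, value in params.items():
        ema_params[name] = decay * ema_params[name] + (1.0 - decay) * value
    return ema_params

# Usage: keep a shadow copy of the weights and update it each step.
params = {"w": 1.0}
ema = dict(params)
for step in range(3):
    params["w"] += 1.0                       # pretend an optimizer step ran
    ema = ema_update(ema, params, decay=0.9)
```

At evaluation time one typically loads the EMA weights rather than the live ones, since the averaged iterates tend to sit in flatter, better-generalizing regions.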
Some RL researchers agree that current paradigms lack elements necessary for lifelong learning. Michael Bowling @MichaelHBowling has an interesting perspective ( https://t.co/OTwME45Q5f).
@nanjiang_cs @roydanroy @canondetortugas I will try to clarify my original post, which was really about a misalignment between the high-level motivation and the actual research lines being pursued. I can see many reasons to pursue the research lines people are working on, but all of them for me differ substantially
Making value functions work well is really important. Although it’s not exclusively a policy-gradient concept, improving the quality of the value function nicely complements policy gradient methods.
Don't throw the baby out with the bathwater: lots of useful ideas / concepts in RL beyond policy gradients -- for example, getting value functions to work being the one in zeitgeist
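One concrete way a value function complements policy gradients, as the exchange above alludes to: subtracting a learned baseline V(s) from the sampled return G gives the advantage A = G − V(s), which recenters the gradient signal and reduces its variance without adding bias. A toy sketch with illustrative numbers, not from any specific paper:

```python
# Advantage estimation: how much better each sampled return was than the
# value function's expectation. The baseline shifts gradient weights from
# "all positive" to centered around zero, cutting variance.
# All numbers below are made up for illustration.

def advantages(returns, values):
    """A_t = G_t - V(s_t) for each timestep."""
    return [g - v for g, v in zip(returns, values)]

returns = [10.0, 2.0, 6.0]   # sampled Monte Carlo returns G_t
values  = [6.0, 6.0, 6.0]    # baseline V(s_t), here a constant for simplicity

adv = advantages(returns, values)   # [4.0, -4.0, 0.0]
```

With the raw returns, every trajectory's gradient gets a positive weight; with advantages, only better-than-expected outcomes are reinforced, which is why a good value function matters even in policy-gradient-centric pipelines.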
Wild imposter-syndrome moment: easily the least qualified guest on this podcast, and super honored to chat about my work with @DeltaInstitutes! Huge thanks for having me on and letting me share a slot with people whose research I seriously look up to :)
Huge thanks to Jeffrey Ma for coming on the Delta Podcast! Check out the podcast episode here: https://t.co/LxTqfoE1HI
Excited for tomorrow, where Pure Exploration in RL kicks off at Boston University CDS 🎉! Lectures will be recorded & uploaded - stay tuned. Many thanks to @BU_CDS & @aldopacchiano for making this opportunity possible. Website: https://t.co/ojwAWZVIuD
#BU #RL #PureExploration
I’m recruiting PhD and Masters students at the University of British Columbia! Looking to work with students interested in LLM interpretability, and intersections with cog ling. The UBC NLP group is growing and awesome, and Vancouver is great. Come to 🇨🇦!
Do you want to understand how language models work, and how they can change language science? I'm recruiting PhD students at UBC Linguistics! The research will be fun, and Vancouver is lovely. So much cool NLP happening at UBC across both Ling and CS! https://t.co/IxKvy4Um1I
Congrats @ruijie_zheng12
Life update: I’ve successfully defended my PhD thesis today and will soon be joining the GEAR Lab as a research scientist to build humanoid robot foundation models. It’s been such a wonderful journey at Maryland, already starting to miss it!
Alexander’s tutorial is really awesome! I learned a lot. Thanks for putting it together and presenting.
The slides from our INFORMS tutorial on "The Gittins Index: A Design Principle for Decision-making Under Uncertainty" - specifically for my part - are now online! If you're interested - check them out - link below.
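For intuition on the Gittins index from the tutorial above: for a deterministic reward stream it reduces to the best achievable ratio of discounted reward to discounted time over all stopping horizons. A minimal sketch under that simplifying assumption; the discount factor is illustrative:

```python
# Gittins index of a deterministic reward sequence: the supremum over
# stopping horizons tau of  sum_{t<tau} beta^t * r_t  /  sum_{t<tau} beta^t.
# This is only the special case for known, deterministic rewards; the
# general stochastic index needs dynamic programming over belief states.

def gittins_deterministic(rewards, beta=0.9):
    """Best discounted-reward-per-discounted-time ratio over horizons."""
    best = float("-inf")
    num = den = 0.0
    discount = 1.0
    for r in rewards:
        num += discount * r     # discounted reward collected so far
        den += discount         # discounted "time" spent so far
        discount *= beta
        best = max(best, num / den)
    return best

# An arm paying 0 now but 10 next step still earns a positive index,
# because the index looks ahead past the early zero.
index = gittins_deterministic([0.0, 10.0], beta=0.9)   # 9/1.9 ≈ 4.74
```

The punchline of the Gittins theorem is that playing the arm with the highest such index is optimal for discounted multi-armed bandits, turning a coupled planning problem into independent per-arm computations.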
Congrats @codezakh, well deserved!
🥳 Honored and grateful to be awarded an NDSEG Fellowship in Computer Science! 💫🇺🇸 Big thanks to my advisor @mohitban47 for his guidance, and shoutout to my lab mates at @unc_ai_group, collaborators, internship advisors, and mentors for their support 🤗 Excited to continue
How can an agent reverse engineer the underlying laws of an unknown, hostile & stochastic environment in “one life”, without millions of steps + human-provided goals / rewards? In our work, we: 1️⃣ infer an executable symbolic world model (a probabilistic program capturing
SSMs promised efficient language modeling for long context, but so far seem to underperform compared to Transformers in many settings. Our new work suggests that this is not a problem with SSMs, but with how we are currently using them. Arxiv: https://t.co/bCzxawF452 🧵
We found a new way to get language models to reason. 🤯 No RL, no training, no verifiers, no prompting. ❌ With better sampling, base models can achieve single-shot reasoning on par with (or better than!) GRPO while avoiding its characteristic loss in generation diversity.
1/6 Introducing Seesaw: a principled batch size scheduling algo. Seesaw achieves theoretically optimal serial run time given a fixed compute budget and also matches the performance of cosine annealing at fixed batch size.
1/8 Second Order Optimizers like SOAP and Muon have shown impressive performance on LLM optimization. But are we fully utilizing the potential of second order information? New work: we show that a full second order optimizer is much better than existing optimizers in terms of
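The "full second order" idea in the thread above can be illustrated on the simplest possible case: a Newton step rescales the gradient by the inverse Hessian, so on a quadratic it jumps to the minimum in one step, where a fixed-learning-rate first-order step generally does not. A toy 1-D sketch, not the optimizer from the paper:

```python
# One Newton step on f(x) = a * x**2 / 2, where grad = a*x and hess = a.
# Preconditioning the gradient by the inverse Hessian removes the need to
# tune a learning rate to the curvature: the step size adapts automatically.
# Purely illustrative; real second-order LLM optimizers approximate the
# Hessian rather than inverting it exactly.

def newton_step(x, grad, hess):
    """x_new = x - H^{-1} g (scalar case)."""
    return x - grad / hess

a, x = 4.0, 3.0
x_new = newton_step(x, grad=a * x, hess=a)   # 3 - 12/4 = 0.0, the minimum
```

Methods like SOAP and Muon sit between plain SGD and this full Newton update, using structured curvature estimates that are cheap enough for LLM-scale training.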
New work with BU student @Dragazis_Spyros and @TBaharav!
When should you run an expensive test vs. making a prediction? Our new algorithm provides no-regret guarantees for safe online classification with costly labels. With @aldopacchiano and @TBaharav. https://t.co/V6aDNmGhcT