Max Simchowitz Profile
Max Simchowitz

@max_simchowitz

Followers: 1K
Following: 227
Media: 6
Statuses: 67

Assistant Professor @mldcmu. Formerly: Postdoc @MITEECS, PhD @Berkeley_EECS, Math Undergrad @Princeton. New to Twitter. https://t.co/67bMOAyqK6

Joined May 2024
@max_simchowitz
Max Simchowitz
3 months
There’s a lot of awesome research about LLM reasoning right now. But how is learning in the physical world 🤖 different than in language 📚? In a new paper, we show that imitation learning in continuous spaces can be exponentially harder than for discrete state spaces, even when
3
37
212
@max_simchowitz
Max Simchowitz
27 days
Very cool! In addition to optimizing inference-time search as a learning desideratum, this really speaks to the power of building reward models purely from expert trajectories, via discriminative objectives. Excited to see how far this can go!
@g_k_swamy
Gokul Swamy
27 days
Say ahoy to 𝚂𝙰𝙸𝙻𝙾𝚁⛵: a new paradigm of *learning to search* from demonstrations, enabling test-time reasoning about how to recover from mistakes w/o any additional human feedback! 𝚂𝙰𝙸𝙻𝙾𝚁 ⛵ out-performs Diffusion Policies trained via behavioral cloning on 5-10x data!
1
4
33
@max_simchowitz
Max Simchowitz
2 months
RT @GuanyaShi: I am giving a talk "From Sim2Real 1.0 to 4.0 for Humanoid Whole-Body Control and Loco-Manipulation" at the RoboLetics 2.0 wo…
0
13
0
@max_simchowitz
Max Simchowitz
2 months
RT @NicholasEPfaff: Want to scale robot data with simulation, but don’t know how to get large numbers of realistic, diverse, and task-relev…
0
24
0
@max_simchowitz
Max Simchowitz
2 months
RT @canondetortugas: RL and post-training play a central role in giving language models advanced reasoning capabilities, but many algorithm…
0
7
0
@max_simchowitz
Max Simchowitz
2 months
RT @maiamindel: The Chicago Pope implies the existence of a New Keynesian Pope and a Behavioral Pope.
0
2K
0
@max_simchowitz
Max Simchowitz
2 months
RT @nmboffi: really excited about this one -- please submit your best work, and come join us in beautiful Lyon to talk about machine learni…
0
1
0
@max_simchowitz
Max Simchowitz
2 months
RT @CMU_Robotics: Congrats to Andrea Bajcsy (@andrea_bajcsy) on receiving the NSF CAREER award! 👏 Her work: “Formalizing Open World Safet…
0
7
0
@max_simchowitz
Max Simchowitz
2 months
RT @aleks_madry: Building AI systems is now a fragmented process spanning multiple organizations & entities. In new work (w/ @aspenkhopkin…
0
25
0
@max_simchowitz
Max Simchowitz
2 months
Update: 2pm is the likely time.
1
0
2
@max_simchowitz
Max Simchowitz
2 months
Please check my account for updates on the actual time! Unfortunately, I was advised not to post a link here, but if you google “RL Theory Virtual Seminar” and click “Next Seminar” on the sidebar, there will be a registration form to view the talk through Google Meet.
1
0
1
@max_simchowitz
Max Simchowitz
2 months
Regarding time: The seminar is listed both at 6pm UTC (which would be 2pm ET) and 1pm ET; these seem to be off by a daylight saving time adjustment. We are actively in contact with the organizers to resolve the issue.
1
1
1
@max_simchowitz
Max Simchowitz
2 months
Hey Everyone!! I will be giving a lecture at the RL Theory Virtual Seminar tomorrow, on my new paper about the “Pitfalls of Imitation Learning” in continuous action spaces. 🧵 below; please read because the time is somewhat TBD. 🧐
4
9
90
@max_simchowitz
Max Simchowitz
2 months
RT @aviral_kumar2: Before the (exciting) workshops on Sun, catch Vincent’s oral talk at the #ICLR2025 main conference on this paper today a…
0
5
0
@max_simchowitz
Max Simchowitz
3 months
RT @LerrelPinto: So excited for this!!! The key technical breakthrough here is that we can control joints and fingertips of the robot **w…
0
14
0
@max_simchowitz
Max Simchowitz
3 months
Check out our paper to find out more, or join in on the RL theory seminar on April 29th at 1pm ET (follow for more details).
3
4
42
@max_simchowitz
Max Simchowitz
3 months
What does make a difference is using a richer policy parametrization! We give evidence that even for imitating this smooth, deterministic (i.e., unimodal) expert, diffusion and action chunking have surprising benefits!
1
2
15
@max_simchowitz
Max Simchowitz
3 months
This result holds for imitating a smooth, deterministic expert demonstrator in a smooth, exponentially stable control system. Exponential compounding error can't be avoided by using a fancier learning algorithm: BC, Adversarial IL, and Offline RL all suffer.
1
0
11
@max_simchowitz
Max Simchowitz
4 months
RT @Princeton: The Porter Ogden Jacobus Fellowship, the University’s top honor for @PrincetonGrad students, is awarded to one Ph.D. student…
0
4
0
@max_simchowitz
Max Simchowitz
5 months
RT @abhishekunique7: So we did a bunch of projects with real world reinforcement learning - but it was often too inefficient to be practica…
0
30
0