Erdem Bıyık
            
            @ebiyik_
Followers: 3K
              Following: 4K
              Media: 120
              Statuses: 1K
              Asst Prof @CSatUSC (cc @USC, @USCViterbi). Research on AI/ML for Robotics & HRI. Previously @CHAI_Berkeley, @StanfordAILab, @Google, @BilkentUniv.
              
              Los Angeles, CA
            
            
              
              Joined May 2010
            
            
           Actor-critic RL but there is no actor 🤯 because the critic can control the system even with a continuous action space! The result: more stable RL and better robustness against local optima (because there is no separate training for an actor). Check out our NeurIPS paper :) 👇 
           Can Q-learning alone handle continuous actions? Value-based RL (like DQN) is simple & stable, but typically limited to discrete actions. Continuous control usually needs actor-critic methods (DDPG, TD3, SAC) that are powerful but unstable & can get stuck in local optima. 
            
                
                1
              
              
                
                15
              
              
                
                132
              
             Combinatorial complexity is often the bane of imitation learning - including VLA models! @Jesse_Y_Zhang and @memmelma proposed a way around this, using VLMs to perform problem reduction for imitation. The insight is simple - 1) High-level VLM takes a complex scene/task and 
          
                
                3
              
              
                
                24
              
              
                
                146
              
             We’re excited to release the code for our CoRL 2025 (Oral) paper: “ReWiND: Language-Guided Rewards Teach Robot Policies without New Demonstrations.” 🌐 Website:  https://t.co/0yLFfgUn5O  📄 Arxiv:  https://t.co/H0EU1IpfT6  💻 Code:  https://t.co/BPgbtsRkNm 
             https://t.co/OoeZSiFE8E 
          
           Reward models that help real robots learn new tasks—no new demos needed! ReWiND uses language-guided rewards to train bimanual arms on OOD tasks in 1 hour! Offline-to-online, lang-conditioned, visual RL on action-chunked transformers. 🧵 
            
                
                4
              
              
                
                34
              
              
                
                149
              
             How can we help *any* image-input policy generalize better? 👉 Meet PEEK 🤖 — a framework that uses VLMs to decide *where* to look and *what* to do, so downstream policies — from ACT, 3D-DA, or even π₀ — generalize more effectively! 🧵 
          
                
                1
              
              
                
                31
              
              
                
                120
              
             How can we help *any* image-input policy generalize better to visual and semantic variations? 👉 Meet PEEK 🤖 — a framework that uses VLMs to decide *where* to look and *what* to do, so downstream policies — from ACT, 3D-DA, or even π₀ — generalize more effectively! 
          
                
                2
              
              
                
                13
              
              
                
                38
              
            
                
              
             Thrilled to share that ReWiND kicks off CoRL as the very first oral talk! 🥳 📅 Sunday, 9AM — don’t miss it! @_abraranwar and I dive deeper into specializing robot policies in our USC RASC blog post (feat. ReWiND + related work): 👉 
           Reward models that help real robots learn new tasks—no new demos needed! ReWiND uses language-guided rewards to train bimanual arms on OOD tasks in 1 hour! Offline-to-online, lang-conditioned, visual RL on action-chunked transformers. 🧵 
            
                
                2
              
              
                
                6
              
              
                
                47
              
            
            @yusen_2001 @_abraranwar And I am looking forward to seeing my Korean-dubbed video! 😬
          
          
                
                0
              
              
                
                0
              
              
                
                3
              
             Today we were interviewed by journalists from Donga Science, the longest-running science magazine of South Korea, for this work. Getting video-interviewed still feels a little strange, but I am happy that this work is getting the attention it deserves :) @yusen_2001 @_abraranwar
          
           Reward models that help real robots learn new tasks—no new demos needed! ReWiND uses language-guided rewards to train bimanual arms on OOD tasks in 1 hour! Offline-to-online, lang-conditioned, visual RL on action-chunked transformers. 🧵 
            
                
                1
              
              
                
                6
              
              
                
                31
              
             This paper has now received the "Outstanding Paper Award on Empirical Reinforcement Learning Research" at #rlc2025 @RL_Conference🥳 Congratulations to all my co-authors! If you're interested in recruiting a best-paper-award-winning student, Xinhu Li will apply for PhD programs this year! 
           At @RL_Conference🍁, I'm presenting a talk and a poster on Aug 6, Track 1: Reinforcement Learning Algorithms. We find that Deterministic Policy Gradient methods like TD3 often get stuck at local optima under complex Q-functions, and propose a novel actor architecture! 🧵 
            
                
                0
              
              
                
                5
              
              
                
                63
              
             Honored that our @RL_Conference paper won the Outstanding Paper Award on Empirical Reinforcement Learning Research! 📜Mitigating Suboptimality of Deterministic Policy Gradients in Complex Q-Functions 📎  https://t.co/owm0hVVsUK  Grateful to my advisors @JosephLim_AI and @ebiyik_! 
           At @RL_Conference🍁, I'm presenting a talk and a poster on Aug 6, Track 1: Reinforcement Learning Algorithms. We find that Deterministic Policy Gradient methods like TD3 often get stuck at local optima under complex Q-functions, and propose a novel actor architecture! 🧵 
            
                
                9
              
              
                
                10
              
              
                
                72
              
             🚀 New Paper at IROS 2025! 🚀 "RAILGUN: A Unified Convolutional Policy for Multi-Agent Path Finding Across Different Environments and Tasks" I'm excited to share our latest work, RAILGUN, which proposes the first centralized learning-based method for solving the MAPF problem. 
          
                
                1
              
              
                
                1
              
              
                
                6
              
             At @RL_Conference🍁, I'm presenting a talk and a poster on Aug 6, Track 1: Reinforcement Learning Algorithms. We find that Deterministic Policy Gradient methods like TD3 often get stuck at local optima under complex Q-functions, and propose a novel actor architecture! 🧵 
          
                
                1
              
              
                
                5
              
              
                
                24
              
             👀Teach your robots to see what you see—turns out, they get a lot smarter. 🎉Excited to share that our paper "GABRIL: Gaze-Based Regularization for Mitigating Causal Confusion in Imitation Learning" has been accepted to #IROS2025! (1/7) 
          
                
                1
              
              
                
                2
              
              
                
                9
              
             I keep updating the course material every year. The Fall 2025 version will be up soon. If anyone has any feedback, I would love to hear it. And if you use our course material and publicly acknowledge us, please let me know (it makes me feel good 🙂) 
           Interested in robot learning but not sure where to start? I found 3 university courses with online materials (links below): 1. CMU: Introduction to Robot Learning @LeCARLab 2. USC: Robot Learning by @ebiyik_ and @Ishika_S_ 3. TU Berlin: Robot Learning by @Marc__Toussaint
            
          
                
                2
              
              
                
                0
              
              
                
                27
              
             My 1y: (grabs my hands and claps them) Me: Oh, sweetie, when a measure becomes a target, it ceases to be a good measure 
          
                
                21
              
              
                
                196
              
              
                
                3K
              
             In my undergrad, I had a professor whose website said he had an Erdős number of three, which is unusually low in his field. I remember thinking it was one of the coolest things. I realized my Erdős number also became three recently. 20-year-old me would be proud 😌 
          
                
                0
              
              
                
                0
              
              
                
                48
              
             Thanks @yigitkkorkmaz @ayano_hiranaka @aliangdw @hagenowrobotics Takayuki Kanda @daniel_s_brown @maegan_tucker1 for the workshop organization! Congrats to @JiahuiZhang__32 @yusen_2001 @_abraranwar @SOTA_kke @JosephLim_AI @_jessethomason_ @Jesse_Y_Zhang for the award! 
          
                
                0
              
              
                
                0
              
              
                
                13