Anirudha Majumdar (@Majumdar_Ani)
6K Followers · 3K Following · 196 Media · 540 Statuses
Associate Professor in Robotics @Princeton. 20% Research Scientist @GoogleDeepMind in Princeton.
Princeton, NJ · Joined March 2020
            
Interested in working on generalist robots that are safe, trustworthy, and capable? 🤖 📢 My group at @Princeton is looking for PhD students and postdocs this cycle! PhD: please apply through the MAE department (Dec. 1). Postdocs: please email me directly!
          
                
5 replies · 31 reposts · 185 likes
              
How Confident are Video Models? Empowering Video Models to Express their Uncertainty
Website: https://t.co/Cwk0lnihuw
Paper (arxiv.org): "Generative video models demonstrate impressive text-to-video capabilities, spurring widespread adoption in many real-world applications. However, like large language models (LLMs), video…"
                
0 replies · 0 reposts · 1 like
              
📸 How can we quantify the uncertainty of video models? They are prone to hallucinations and cannot express:
⏩ that they don't know something (epistemic uncertainty)
⏩ that the prompt is too ambiguous (aleatoric uncertainty).
Check out our new paper on video model UQ! 👇
          
                
2 replies · 2 reposts · 39 likes
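The thread doesn't show the estimator itself, so here is a minimal sketch of the generic sample-and-measure-dispersion recipe for generative-model UQ, assuming hypothetical `generate_video` and `embed` callables (not the paper's exact method):

```python
import numpy as np

def predictive_uncertainty(prompt, generate_video, embed, n_samples=8):
    """Sample several generations for one prompt and use their dispersion
    in an embedding space as an uncertainty proxy: tightly clustered
    samples -> low uncertainty, scattered samples -> high uncertainty."""
    videos = [generate_video(prompt) for _ in range(n_samples)]
    embs = np.stack([embed(v) for v in videos])          # (n, d)
    embs /= np.linalg.norm(embs, axis=1, keepdims=True)  # unit-normalize rows
    sims = embs @ embs.T                                 # pairwise cosine sims
    mean_sim = (sims.sum() - np.trace(sims)) / (n_samples * (n_samples - 1))
    return 1.0 - mean_sim  # ~0 = consistent generations; larger = more uncertain
```

Sampling-based proxies like this capture total predictive uncertainty; separating the epistemic and aleatoric parts is exactly what the thread goes on to address.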
              
Large-scale (imperfect) sim evals + small-scale real evals ➡️ statistically valid inferences on real policy performance. Check out our new paper! 👇
Robotic manipulation has seen tremendous progress in recent years, but rigorous evaluation of robot policies remains a challenge! We present our work: "Reliable and Scalable Robot Policy Evaluation with Imperfect Simulators"! 🧵
            
                
0 replies · 5 reposts · 40 likes
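The tweet doesn't name the estimator; prediction-powered inference is one standard recipe for combining a large imperfect-sim eval with a small paired real eval, sketched below under that assumption (the paper's actual procedure may differ):

```python
import numpy as np
from scipy import stats

def ppi_success_rate(sim_large, sim_paired, real_paired, alpha=0.05):
    """Prediction-powered-style estimate of real policy success rate.

    sim_large:   simulator outcomes (0/1) on many episodes, no real labels
    sim_paired:  simulator outcomes (0/1) on episodes also run on the robot
    real_paired: real-robot outcomes (0/1) on those same episodes
    """
    sim_large, sim_paired, real_paired = map(
        np.asarray, (sim_large, sim_paired, real_paired))
    # Large-scale sim mean, corrected by the sim-vs-real bias ("rectifier")
    # measured on the small paired set.
    rectifier = real_paired - sim_paired
    est = sim_large.mean() + rectifier.mean()
    # Normal-approximation confidence interval.
    se = np.sqrt(sim_large.var(ddof=1) / len(sim_large)
                 + rectifier.var(ddof=1) / len(rectifier))
    z = stats.norm.ppf(1 - alpha / 2)
    return est, (est - z * se, est + z * se)
```

The rectifier term is what lets the large-but-imperfect sim sample tighten the interval without biasing the estimate toward the simulator.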
              
Check out our paper, project website, and code for more information:
📑 https://t.co/BMVHW75ILa
🌐 https://t.co/uCTpDTQhWM
💻 github.com: irom-princeton/spine (Geometry Meets Vision: Revisiting Pretrained Semantics in Distilled Fields)
                
0 replies · 0 reposts · 9 likes
              
Pastor Charles blesses us once again with this wonderful practical sermon that clarifies the many ways in which people who declare themselves as Christians will often rationalize away sinful behaviors (e.g. even anti-semitism, viewing pornography, committing adultery, etc.). They…
          
                
1 reply · 3 reposts · 21 likes
              
Do visual-geometry semantics offer an edge? They
🗼 contain finer structural detail,
🏟 perform similarly to visual-only features in semantic localization,
📷 underperform in radiance field inversion, *unlike* what one would expect!
          
                
1 reply · 0 reposts · 4 likes
              
👁️ Geometry Meets Vision: Revisiting Pretrained Semantics in Distilled Fields
We find that visual-only features (DINO) outperform visual-geometry features (VGGT) in spatial tasks! 👇
          
                
7 replies · 31 reposts · 243 likes
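As a concrete picture of the semantic-localization task these comparisons use: a minimal sketch of the usual distilled-field recipe, scoring 3D points by cosine similarity between their distilled features (DINO- or VGGT-derived) and a query embedding. The inputs here are assumptions, not the paper's exact pipeline.

```python
import numpy as np

def localize(query_emb, points, feats):
    """Semantic localization in a distilled feature field.

    points: (N, 3) 3D positions sampled from the field
    feats:  (N, d) distilled per-point features (e.g., DINO or VGGT)
    Returns the best-matching point and the per-point scores.
    """
    f = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    q = query_emb / np.linalg.norm(query_emb)
    scores = f @ q                      # cosine similarity per point
    return points[np.argmax(scores)], scores
```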
              
📉 Can S-QUBED estimate calibrated uncertainty? On the Panda-70M dataset, S-QUBED's uncertainty estimates are negatively correlated with accuracy at the 99.9% confidence level, indicating calibration.
          
                
0 replies · 0 reposts · 0 likes
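A check of the kind this claim implies might look like the following: a rank correlation between per-prompt uncertainty and accuracy, tested one-sided for negativity. Purely illustrative, with synthetic stand-in data, not the paper's code.

```python
import numpy as np
from scipy import stats

# Synthetic stand-ins for per-prompt uncertainty scores and accuracies.
rng = np.random.default_rng(0)
uncertainty = rng.random(500)
accuracy = 1 - uncertainty + 0.3 * rng.random(500)

rho, p_two_sided = stats.spearmanr(uncertainty, accuracy)
p_one_sided = p_two_sided / 2 if rho < 0 else 1 - p_two_sided / 2
print(f"rho={rho:.3f}, one-sided p={p_one_sided:.2e}")
print("negative correlation at 99.9% confidence:",
      rho < 0 and p_one_sided < 0.001)
```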
              
             Our key idea is to lift the prompt to a latent space, eliminating input vagueness from total predictive uncertainty. What remains is lack of knowledge! 
          
                
1 reply · 0 reposts · 0 likes
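One standard way to write this decomposition (notation mine; the paper's exact formulation may differ): lifting the prompt x to a latent Z splits total predictive entropy, by the chain rule, into a prompt-ambiguity term and a residual lack-of-knowledge term.

```latex
% Lifting prompt x to a latent Z splits total predictive uncertainty:
\[
\underbrace{H(Y \mid x)}_{\text{total uncertainty}}
= \underbrace{I(Y; Z \mid x)}_{\text{aleatoric: prompt ambiguity}}
+ \underbrace{\mathbb{E}_{z \sim p(z \mid x)}\big[ H(Y \mid Z = z,\, x) \big]}_{\text{epistemic: lack of knowledge}}
\]
```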
              
🏆 Congrats to the best paper winner at CoRL '25's Eval&Deploy Workshop:
📃 **Reliable and Scalable Robot Policy Evaluation with Imperfect Simulators** by @ApurvaBadithela!
🔗 Paper: https://t.co/o8IiUliDDP
Thanks to @JasonMa2020 and @DynaRobotics for sponsoring the prize!
           Are current eval/deployment practices enough for today’s robot policies? Announcing the Eval&Deploy workshop at CoRL 2025 @corl_conf, where we'll explore eval + deployment in the robot learning lifecycle and how to improve it!  https://t.co/aGOqGEAZ85  🗓️ Submissions due Aug 30 
            
                
3 replies · 13 reposts · 126 likes
              
             Kicking off our last session with an exciting talk from @Majumdar_Ani on robot safety! #NERC2025
          
          
                
0 replies · 3 reposts · 15 likes
              
🏟 Does geometry-grounding improve semantic object localization? We find that visual-geometry semantic features (🏔VGGT) perform similarly to visual-only features (🦕DINOv2 and 🦖DINOv3), which suggests that spatial grounding does not boost object-class semantic feature…
          
                
0 replies · 0 reposts · 1 like
              
🏰 Do visual-geometry semantic features contain higher-fidelity spatial content? We find that visual-geometry semantics (🏔VGGT) provide finer spatial content, with more prominent structural detail, e.g., sharper edges, more accurate subpart decomposition, etc., which could be…
          
                
1 reply · 0 reposts · 1 like
              
Geometry Meets Vision: Revisiting Pretrained Semantics in Distilled Fields
Zhiting Mei, Ola Shorinwa, @Majumdar_Ani
tl;dr: who cares, look at those dino icons! OK, distilling DINO into NeRF -> better object localization than VGGT. https://t.co/1REAIAvksd
          
          
                
3 replies · 27 reposts · 208 likes
              
Life Update: I'm excited to announce I've joined @GoogleDeepMind! I'll be focusing on robotics to help push the boundaries of physical AGI. A huge thank you to my former colleagues at Apple: it was a blast to work with such an excellent team this past year. Ready for the new…
          
                
58 replies · 46 reposts · 2K likes
              
Our data-centric approach offers a simple path toward building generalist policies by closing the gap between large-scale foundation models and real-world robotic control. Work led by @AsherJHancock in collaboration with @cindy_x_wu, @LihanZha, @orussakovsky.
          
                
1 reply · 1 repost · 9 likes
              
An ablation shows our "actions as language" representation is crucial: a model trained to produce arbitrary action tokens (VLM2VLA-AT) from the VLM's vocabulary retained its VQA abilities but failed on OOD robotic tasks.
          
                
1 reply · 1 repost · 7 likes
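For a concrete picture of "actions as language" versus arbitrary action tokens: a hypothetical rendering of a low-level end-effector action as a natural-language target string (the format is my assumption, not necessarily VLM2VLA's exact scheme).

```python
def action_to_language(delta_xyz, gripper):
    """Render a low-level action as plain natural language, so a VLM can
    be fine-tuned on it without introducing new out-of-vocabulary action
    tokens (the 'actions as language' idea; format is hypothetical)."""
    dx, dy, dz = delta_xyz
    gripper_str = "close the gripper" if gripper < 0 else "open the gripper"
    return (f"Move the gripper {dx:+.2f} m along x, {dy:+.2f} m along y, "
            f"{dz:+.2f} m along z, then {gripper_str}.")

# Example supervised fine-tuning target string:
print(action_to_language((0.05, -0.02, 0.10), gripper=-1))
```

Because the target stays in the VLM's native vocabulary and register, fine-tuning on it is less likely to overwrite the pretrained language abilities than training on arbitrary new tokens.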
              
Where VLM2VLA truly shines is on out-of-distribution tasks. The knowledge preserved by LoRA fine-tuning enables novel instruction following (e.g., in other languages) and zero-shot reasoning about open-world concepts.
          
                
1 reply · 3 reposts · 9 likes