
Rui Yang
@RuiYang70669025
Followers
354
Following
1K
Media
19
Statuses
122
🤖 Can MLLM agents reason about spatial relationships and plan atomic actions for navigation & manipulation? 🔥 Meet EmbodiedBench 🏆, the first fine-grained benchmark for MLLM-based embodied agents! 📄 Paper: 🌐 Website & code:
4
36
170
RT @Yong18850571: 🔥 Our Goedel-Prover-V2-32B topped the PutnamBench Leaderboard by solving 86 problems, nearly 2× more than the previous SO…
0
15
0
RT @HolarisSun: 🚀 RL is powering breakthroughs in LLM alignment, reasoning, and agentic apps. Are you ready to dive into the RL x LLM front…
huggingface.co
0
12
0
RT @hc81Jeremy: Grateful for the chance to present EmbodiedBench at ICML as an Oral. A rewarding experience full of learning. Thanks for @…
0
3
0
RT @RickyRDWang: 🚀 Introducing MA-LoT Theorem Framework: An open-source multi-agent framework utilizing the Long Chain-of-Thought to boost…
0
9
0
RT @Yong18850571: (1/4) 🚨 Introducing Goedel-Prover V2 🚨 🔥🔥🔥 The strongest open-source theorem prover to date. 🥇 #1 on PutnamBench: Solves 6…
0
85
0
My coauthor @hc81Jeremy will present EmbodiedBench at ICML 2025! 🤖 Oral Session 6A: 📍 West Hall C, 🕧 July 17, 3:30-3:45 pm PDT. 📌 Poster Session: 📍 East Hall A-B #E-2411, 🕜 July 17, 4:30-7 pm PDT. Come say hi and let’s talk about VLM agent training, evaluation, and benchmarking! 😀
3
3
10
RT @zhenhailongW: Learning to perceive while learning to reason! We introduce PAPO: Perception-Aware Policy Optimization, a direct upgrade…
0
14
0
RT @tinner_he: 🤩 Mind-blowing discovery: Random policies can be surprisingly powerful for decision-making! Our ICML 2025 paper reveals how s…
0
3
0
RT @5000hui: @LiJunnan0409 Awesome work! 🥂 I feel like the design of our GUI-Actor, which can propose multiple candidate regions in one fo…
0
1
0
Insightful post on the scalability of off-policy RL.
Q-learning is not yet scalable. I wrote a blog post about my thoughts on scalable RL algorithms. To be clear, I'm still highly optimistic about off-policy RL and Q-learning! I just think we haven't found the right solution yet (the post discusses why).
1
0
7
RT @jackbai_jkb: 🧵 1/7 Should AI agents "think more" or "do more"? 🤔 The current trend is to scale test-time compute, making agents genera…
0
17
0
RT @FengLuo895614: 🚀 Can LLMs stop overthinking when detailed reasoning isn't needed? Excited to share our latest work on LLM reasoning: Au…
0
4
0
Excited to share that EmbodiedBench was selected for an Oral at ICML 2025! We recently added results for new models (InternVL3, Gemma3, Ovis2) and released a large agent trajectory dataset on 🤗: Try training and evaluating your MLLM for embodied agents!
🤖 Can MLLM agents reason about spatial relationships and plan atomic actions for navigation & manipulation? 🔥 Meet EmbodiedBench 🏆, the first fine-grained benchmark for MLLM-based embodied agents! 📄 Paper: 🌐 Website & code:
2
21
93
Thanks for sharing our work! GUI-Actor is a new GUI grounding method that combines an attention-based action head with a grounding verifier, unlike previous methods that predict coordinates as text.
0
3
15
RT @shizhediao: Does RL truly expand a model’s reasoning 🧠 capabilities? Contrary to recent claims, the answer is yes, if you push RL training…
0
66
0
RT @qiancheng1231: 📢 New Paper Drop: From Solving to Modeling! LLMs can solve math problems, but can they model the real world? 🌍 📄 arXiv…
0
30
0
RT @hendrydong: 🚀 A unified strategy for parallel decoding: Fractured CoT Reasoning. We explore three dims of sampling: - Reasoning trajecto…
0
23
0