Zihan Wang ✈️ NeurIPS
@wzihanw
23K Followers · 4K Following · 99 Media · 900 Statuses
PhD student @NorthwesternU & student researcher @Microsoft. Ex @yutori_ai @deepseek_ai @uiuc_nlp @RUC. I work on Reasoning Agent / RL / efficiency.
Joined March 2022
Why does your RL training always collapse? In our new paper, RAGEN, we explore what breaks when you train LLM *Agents* with multi-turn reinforcement learning, and possibly how to fix it. 📄 https://t.co/z0U0612HWT 🌐 https://t.co/4DUfaees48 1/🧵👇
8 replies · 90 reposts · 438 likes
I'll be at #NeurIPS2025 next week presenting ViGoRL! If you're interested in visual reasoning, RL, or agents, let's schedule a chat.
How can we get VLMs to move their eyes and reason step-by-step in visually grounded ways? 👀 We introduce ViGoRL, an RL method that anchors reasoning to image regions. 🎯 It outperforms vanilla GRPO and SFT across grounding, spatial tasks, and visual search (86.4% on V*). 👇🧵
2 replies · 5 reposts · 30 likes
VAGEN poster 𝐭𝐨𝐦𝐨𝐫𝐫𝐨𝐰 at #NeurIPS! 🎮🧠 - 🕚 11am–2pm Wed - 📍 Exhibit Hall C,D,E #5502 We had much fun exploring: • How 𝐰𝐨𝐫𝐥𝐝 𝐦𝐨𝐝𝐞𝐥𝐢𝐧𝐠 helps VLM RL agents learn better policies • 𝐌𝐮𝐥𝐭𝐢-𝐭𝐮𝐫𝐧 𝐏𝐏𝐎 credit assignment via 𝐭𝐰𝐨-𝐥𝐞𝐯𝐞𝐥
🚀Excited to share our NeurIPS 2025 paper VAGEN, a scalable RL framework that trains VLM agents to reason as world models. VLM agents often act without tracking the world: they lose state, fail to anticipate effects, and RL wobbles under sparse, late rewards. Our solution is
0 replies · 9 reposts · 58 likes
I’m thrilled to announce that I’m launching a new startup dedicated to patient-centric AI for drug discovery, and we’re hiring Founding AI Engineers who are passionate about advancing healthcare through cutting-edge AI. Apply here by Jan 10:
2 replies · 32 reposts · 344 likes
I gradually realized just how lucky I was to have an advisor with such a first reaction. Let us protect our community together.
5 replies · 15 reposts · 351 likes
Congrats!
After three years at CRV, I am stepping onto Striker Venture Partners' founding team, leading the firm's AI investments. Thanks to @BusinessInsider for covering the move.
1 reply · 0 reposts · 8 likes
Most VLM benchmarks watch the world; few ask how actions *change* it from a robot's-eye view. Embodied cognition tells us that intelligence isn't just watching: it's enacted through interaction. 👉We introduce ENACT: a benchmark that tests if VLMs can track the evolution of a
7 replies · 56 reposts · 235 likes
Couldn’t agree more — being part of a lab where “aliveness” is protected is rare and precious. Grateful to grow in an environment where curiosity, weirdness, and ambitious ideas are actually encouraged. Excited for year 2 at MLL🎉🥳
We are looking for PhDs and Postdocs! So proud of my students on achieving so many amazing things during their "very first year". I have been asked many times how I like being faculty, especially with funding cuts. My answer is always "it is the perfect job for me"! Still
1 reply · 5 reposts · 91 likes
Introducing Yutori Navigator. 31 years ago, the modern web era began with Netscape Navigator. Today, we’re introducing Yutori Navigator — a web agent that autonomously navigates websites on its own cloud browser to complete tasks for you. Navigator achieves Pareto domination
28 replies · 47 reposts · 245 likes
While discussing spatial intelligence of "VLMs", I wanted to share an interesting finding from our ICML25 paper: we actually open the black box of why VLMs fail at even the simplest spatial question "where is A relative to B" - 90% of tokens are visual, yet they get only ~10% of the
🚀🔥 Thrilled to announce our ICML25 paper: "Why Is Spatial Reasoning Hard for VLMs? An Attention Mechanism Perspective on Focus Areas"! We dive into the core reasons behind spatial reasoning difficulties for Vision-Language Models from an attention mechanism view. 🌍🔍 Paper:
10 replies · 87 reposts · 580 likes
Spatial intelligence has long been one of the biggest bottlenecks for VLMs. Two years ago in Sept 2023, when I had just started my postdoc, I still remember vividly how excited we were about GPT-4V and how our “What GPT-4V still can’t do” slides were completely dominated by geometric
AI’s next frontier is Spatial Intelligence, a technology that will turn seeing into reasoning, perception into action, and imagination into creation. But what is it? Why does it matter? How do we build it? And how can we use it? Today, I want to share with you my thoughts on
14 replies · 126 reposts · 677 likes
🚀Introducing Lumine, a generalist AI agent trained within Genshin Impact that can perceive, reason, and act in real time, completing hours-long missions and following diverse instructions within complex 3D open-world environments.🎮 Website: https://t.co/UxSwNKGZml 1/6
32 replies · 153 reposts · 908 likes
Intelligence will inevitably evolve from language to the physical world, unlocking spatial intelligence for multi-modal perception, reasoning, generation, and action—essential for true AGI. I'm working on building this at @XiaomiMiMo, spearheading a creative and talented team!
23 replies · 18 reposts · 250 likes
🔥Our #NeurIPS challenge on Foundation Models meet Embodied Agents released the final eval for “Embodied Agent Interface". 🚀Come test your LLMs for Embodied Agent tasks! ⚒️We've newly annotated ~5000 data points for: - Goal Interpretation - Subgoal Decomposition - Action
2 replies · 9 reposts · 25 likes
Introducing Nested Learning: A new ML paradigm for continual learning that views models as nested optimization problems to enhance long context processing. Our proof-of-concept model, Hope, shows improved performance in language modeling. Learn more: https://t.co/fpdDlYaleL
142 replies · 816 reposts · 5K likes
I will join Northwestern University Computer Science as an Assistant Professor in Fall 2026! I am actively recruiting PhD students and seeking collaborations in robotics, human-robot interaction, brain-computer interfaces, cognitive science, societal impact of AI & automation,
75 replies · 210 reposts · 2K likes
🔥The deadline (Nov 3, 2025 AoE) for 𝐍𝐞𝐮𝐫𝐈𝐏𝐒 𝟐𝟎𝟐𝟓 𝐖𝐨𝐫𝐤𝐬𝐡𝐨𝐩 𝐨𝐧 𝐒𝐨𝐜𝐢𝐚𝐥𝐥𝐲 𝐑𝐞𝐬𝐩𝐨𝐧𝐬𝐢𝐛𝐥𝐞 𝐚𝐧𝐝 𝐓𝐫𝐮𝐬𝐭𝐰𝐨𝐫𝐭𝐡𝐲 𝐅𝐨𝐮𝐧𝐝𝐚𝐭𝐢𝐨𝐧 𝐌𝐨𝐝𝐞𝐥𝐬 (𝐑𝐞𝐬𝐩𝐨𝐧𝐬𝐢𝐛𝐥𝐞𝐅𝐌) is approaching!🔥 📍 Hybrid (Hilton Mexico City Reforma +
0 replies · 15 reposts · 40 likes