Zihan Wang ✈️ NeurIPS

@wzihanw

Followers
23K
Following
4K
Media
99
Statuses
900

PhD student @NorthwesternU & student researcher @Microsoft. Ex @yutori_ai @deepseek_ai @uiuc_nlp @RUC. I work on Reasoning Agent / RL / efficiency.

Joined March 2022
@wzihanw
Zihan Wang ✈️ NeurIPS
8 months
Why does your RL training always collapse? In our new paper, RAGEN, we explore what breaks when you train LLM *Agents* with multi-turn reinforcement learning, and possibly how to fix it. 📄 https://t.co/z0U0612HWT 🌐 https://t.co/4DUfaees48 1/🧵👇
8
90
438
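The thread above concerns multi-turn RL for LLM agents, where rewards are sparse and arrive late. As a hedged illustration of that setup (the toy environment and every name here are invented for this sketch, not the RAGEN API), a minimal multi-turn rollout loop looks like:

```python
import random

def run_episode(policy, env, max_turns=5):
    """Roll out one multi-turn episode: the agent acts, the environment
    responds, and reward only arrives when the task succeeds -- the
    sparse, delayed-reward regime where agent RL tends to collapse."""
    history, total_reward = [], 0.0
    state = env["reset"]()
    for _ in range(max_turns):
        action = policy(state, history)
        state, reward, done = env["step"](action)
        history.append((action, reward))
        total_reward += reward
        if done:
            break
    return history, total_reward

def make_guess_env(target=3):
    """Toy environment: reward 1.0 only when the agent guesses the target."""
    def reset():
        return "start"
    def step(action):
        hit = action == target
        return ("done" if hit else "miss"), (1.0 if hit else 0.0), hit
    return {"reset": reset, "step": step}

random.seed(0)
policy = lambda state, history: random.randint(0, 3)
trajectory, ret = run_episode(policy, make_guess_env())
```

A trainer would score many such trajectories and update the policy from them; watching per-turn reward statistics across rollouts is one simple way to notice when training is starting to collapse.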
@GabrielSarch
Gabriel Sarch @ NeurIPS
7 days
I'll be at #NeurIPS2025 next week presenting ViGoRL! If you're interested in visual reasoning, RL, or agents, let's schedule a chat.
@GabrielSarch
Gabriel Sarch @ NeurIPS
6 months
How can we get VLMs to move their eyes—and reason step-by-step in visually grounded ways? 👀 We introduce ViGoRL, an RL method that anchors reasoning to image regions. 🎯 It outperforms vanilla GRPO and SFT across grounding, spatial tasks, and visual search (86.4% on V*). 👇🧵
2
5
30
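"Anchoring reasoning to image regions" can be pictured concretely: each reasoning step carries the coordinates and pixels it refers to, rather than free-floating text. A minimal sketch under stated assumptions (nested lists stand in for an image; this is not the ViGoRL implementation):

```python
def crop(image, box):
    """Crop region (x0, y0, x1, y1) from a nested-list 'image'."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]

def grounded_step(image, thought, box):
    """One visually grounded reasoning step: the textual thought is
    paired with the exact region it refers to, so later steps (or a
    reward signal) can check the claim against pixels."""
    return {"thought": thought, "box": box, "region": crop(image, box)}

# 3x3 toy "image": 1 = bright pixel.
image = [[0, 0, 1],
         [0, 1, 1],
         [1, 1, 1]]
step = grounded_step(image, "bright patch in the lower right", (1, 1, 3, 3))
```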
@wzihanw
Zihan Wang ✈️ NeurIPS
3 days
VAGEN poster tomorrow at #NeurIPS! 🎮🧠 - 🕚 11am–2pm Wed - 📍 Exhibit Hall C,D,E #5502 We had much fun exploring: • How world modeling helps VLM RL agents learn better policies • Multi-turn PPO credit assignment via two-level
@wzihanw
Zihan Wang ✈️ NeurIPS
2 months
🚀Excited to share our NeurIPS 2025 paper VAGEN, a scalable RL framework that trains VLM agents to reason as world models. VLM agents often act without tracking the world: they lose state, fail to anticipate effects, and RL wobbles under sparse, late rewards. Our solution is
0
9
58
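The tweet above frames the fix as agents that "reason as world models": predict the effect of an action before committing to it, instead of losing track of state. A toy 1-D sketch of that idea (illustrative only, not the VAGEN method):

```python
def predict_next(pos, action):
    """Toy world model on a 1-D line: compute where an action *would*
    put the agent, without actually taking it."""
    dx = {"left": -1, "right": 1, "stay": 0}[action]
    return pos + dx

def choose_action(pos, goal):
    """Pick the action whose predicted next state is closest to the
    goal -- acting through the world model rather than blindly."""
    return min(("left", "right", "stay"),
               key=lambda a: abs(predict_next(pos, a) - goal))

pos, goal = 0, 3
for _ in range(5):
    pos = predict_next(pos, choose_action(pos, goal))
```

Because every action is evaluated against its predicted effect first, the agent reaches the goal and then stays there, even though no step was ever rewarded along the way.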
@hengjinlp
Heng Ji
3 days
I’m thrilled to announce that I’m launching a new startup dedicated to patient-centric AI for drug discovery, and we’re hiring Founding AI Engineers who are passionate about advancing healthcare through cutting-edge AI. Apply here by Jan 10:
2
32
344
@ManlingLi_
Manling Li
6 days
I gradually realized just how lucky I was to have an advisor with such a first reaction. Let us protect our community together
@hengjinlp
Heng Ji
6 days
Re the openreview mess, I sent this to my students and wanted to share:
5
15
351
@wzihanw
Zihan Wang ✈️ NeurIPS
11 days
Congrats!
@brianzhan1
Brian Zhan
11 days
After three years at CRV, I am stepping onto Striker Venture Partners' founding team, leading the firm's AI investments. Thanks to @BusinessInsider for covering the move.
1
0
8
@qineng_wang
Qineng Wang @ NeurIPS
11 days
Most VLM benchmarks watch the world; few ask how actions *change* it from a robot's eye. Embodied cognition tells us that intelligence isn't just watching – it's enacted through interaction. 👉We introduce ENACT: A benchmark that tests if VLMs can track the evolution of a
7
56
235
@wzihanw
Zihan Wang ✈️ NeurIPS
12 days
Couldn’t agree more — being part of a lab where “aliveness” is protected is rare and precious. Grateful to grow in an environment where curiosity, weirdness, and ambitious ideas are actually encouraged. Excited for year 2 at MLL🎉🥳
@ManlingLi_
Manling Li
12 days
We are looking for PhDs and Postdocs! So proud of my students for achieving so many amazing things during their "very first year". I have been asked many times how I like being faculty, especially with funding cuts. My answer is always "it is the perfect job for me"! Still
1
5
91
@DhruvBatra_
Dhruv Batra ✈️ NeurIPS
16 days
Introducing Yutori Navigator. 31 years ago, the modern web era began with Netscape Navigator. Today, we’re introducing Yutori Navigator — a web agent that autonomously navigates websites on its own cloud browser to complete tasks for you. Navigator achieves Pareto domination
28
47
245
@ManlingLi_
Manling Li
17 days
While discussing spatial intelligence of "VLMs", wanted to share an interesting finding from our ICML25 paper: we actually open the black box of why VLMs fail at even the simplest spatial question "where is A to B" - 90% of tokens are visual, yet they get only ~10% of the
@shiqi_chen17
Shiqi Chen
7 months
🚀🔥 Thrilled to announce our ICML25 paper: "Why Is Spatial Reasoning Hard for VLMs? An Attention Mechanism Perspective on Focus Areas"! We dive into the core reasons behind spatial reasoning difficulties for Vision-Language Models from an attention mechanism view. 🌍🔍 Paper:
10
87
580
@ManlingLi_
Manling Li
18 days
Spatial intelligence has long been one of the biggest bottlenecks for VLMs. Two years ago, in Sept 2023, when I had just started my postdoc, I still remember vividly how excited we were about GPT-4V and how our “What GPT-4V still can’t do” slides were completely dominated by geometric
@drfeifei
Fei-Fei Li
26 days
AI’s next frontier is Spatial Intelligence, a technology that will turn seeing into reasoning, perception into action, and imagination into creation. But what is it? Why does it matter? How do we build it? And how can we use it? Today, I want to share with you my thoughts on
14
126
677
@WeihaoTan64
Weihao Tan
23 days
🚀Introducing Lumine, a generalist AI agent trained within Genshin Impact that can perceive, reason, and act in real time, completing hours-long missions and following diverse instructions within complex 3D open-world environments.🎮 Website: https://t.co/UxSwNKGZml 1/6
32
153
908
@luo_fuli14427
Fuli Luo
24 days
Intelligence will inevitably evolve from language to the physical world, unlocking spatial intelligence for multi-modal perception, reasoning, generation, and action—essential for true AGI. I'm working on building this at @XiaomiMiMo, spearheading a creative and talented team!
23
18
250
@ManlingLi_
Manling Li
24 days
🔥Our #NeurIPS challenge on Foundation Models meet Embodied Agents released the final eval for “Embodied Agent Interface". 🚀Come test your LLMs for Embodied Agent tasks! ⚒️We've newly annotated ~5000 data points for: - Goal Interpretation - Subgoal Decomposition - Action
2
9
25
@GoogleResearch
Google Research
28 days
Introducing Nested Learning: A new ML paradigm for continual learning that views models as nested optimization problems to enhance long context processing. Our proof-of-concept model, Hope, shows improved performance in language modeling. Learn more: https://t.co/fpdDlYaleL
142
816
5K
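"Models as nested optimization problems" can be shown in miniature: an outer problem selects a hyperparameter by evaluating a full inner optimization run. The tweet gives no details of Hope or the Nested Learning formulation, so everything below is a generic two-level stand-in:

```python
def inner_opt(theta, lr, steps, grad):
    """Inner problem: plain gradient descent for a fixed step budget."""
    for _ in range(steps):
        theta = theta - lr * grad(theta)
    return theta

def outer_loss(lr):
    """Outer problem: how good is this learning rate, judged by the
    inner loss f(x) = (x - 2)^2 after the inner run finishes?"""
    grad = lambda x: 2 * (x - 2)
    theta = inner_opt(0.0, lr, steps=10, grad=grad)
    return (theta - 2) ** 2

# Crude outer search: the outer level picks among candidate learning
# rates by running the inner optimizer to completion for each one.
best_lr = min([0.1, 0.3, 0.5], key=outer_loss)
```

For this quadratic, lr=0.5 solves the inner problem exactly in one step, so the outer search selects it; real nested-learning systems replace the grid search with gradient-based outer updates.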
@RuohanZhang76
Ruohan Zhang ✈️ NeurIPS
26 days
I will join Northwestern University Computer Science as an Assistant Professor in Fall 2026! I am actively recruiting PhD students and seeking collaborations in robotics, human-robot interaction, brain-computer interfaces, cognitive science, societal impact of AI & automation,
75
210
2K
@CanyuChen3
Canyu Chen✈️NeurIPS
1 month
🔥The deadline (Nov 3, 2025 AoE) for the NeurIPS 2025 Workshop on Socially Responsible and Trustworthy Foundation Models (ResponsibleFM) is approaching!🔥 📍 Hybrid (Hilton Mexico City Reforma +
0
15
40