Jianwei Yang
@jw2yang4ai
Followers
4K
Following
2K
Media
76
Statuses
441
RS @Meta SuperIntelligence Lab; ex-MSR; Core contributor of Project Florence, Phi-3V, Omniparser; (Co-)Inventor of FocalNet, SEEM, SoM, DeepStack and Magma.
Redmond, WA
Joined July 2016
Life Update: Now that I have finished presenting my last @MSFTResearch project, Magma, at @CVPR, I am excited to share that I have joined @AIatMeta as a research scientist to further push forward the boundary of multimodal foundation models! I have always been passionate
52
6
384
If you have been impacted by today's layoffs at Meta's AI teams, please know that Microsoft Zurich is hiring for positions in multimodal foundation models & robot learning. You and your teams have given so much to the AI community; I hope we can all give back and support you now
0
3
100
🚀Excited to see Qwen3-VL released as the new SOTA open-source vision-language model! What makes it extra special is that it’s powered by DeepStack, a technique I co-developed with Lingchen, who is now a core contributor to Qwen3-VL. When Lingchen and I developed this technique
🚀 We're thrilled to unveil Qwen3-VL — the most powerful vision-language model in the Qwen series yet! 🔥 The flagship model Qwen3-VL-235B-A22B is now open-sourced and available in both Instruct and Thinking versions: ✅ Instruct outperforms Gemini 2.5 Pro on key vision
4
21
284
Great work! Building a coherent representation for the complicated world - visual and semantic, 2D and 3D, spatial and temporal - is challenging but critical. Having a single tokenizer for all is definitely a great stepping stone toward the next generation of multimodal models!
Vision tokenizers are stuck in 2020🤔while language models revolutionized AI🚀 Language: One tokenizer for everything Vision: Fragmented across modalities & tasks Introducing AToken: The first unified visual tokenizer for images, videos & 3D that does BOTH reconstruction AND
0
1
9
🎉 Excited to share RecA: Reconstruction Alignment Improves Unified Multimodal Models 🔥 Post-train w/ RecA: 8k images & 4 hours (8 GPUs) → SOTA UMMs: GenEval 0.73→0.90 | DPGBench 80.93→88.15 | ImgEdit 3.38→3.75 Code: https://t.co/yFEvJ0Algw 1/n
6
30
80
VLMs struggle badly to interpret 3D from 2D observations, but what if they have a good mental model of the world? Check out our MindJourney - a test-time scaling approach for spatial reasoning in the 3D world. Without any specific training, MindJourney imagines (acts mentally) step-by-step
Test-time scaling nailed code & math—next stop: the real 3D world. 🌍 MindJourney pairs any VLM with a video-diffusion World Model, letting it explore an imagined scene before answering. One frame becomes a tour—and the tour leads to new SOTA in spatial reasoning. 🚀 🧵1/
0
3
31
VLMs often struggle with physical reasoning tasks such as spatial reasoning. Excited to share how we can use world models + test-time search to zero-shot improve spatial reasoning in VLMs!
3
24
189
Wow, this is so cool! I have been dreaming of building agents that can interact with humans via language communication, and with the world via physical interaction (locomotion, manipulation, etc.). Definitely a great stepping stone and playground!
World Simulator, reimagined — now alive with humans, robots, and their vibrant society unfolding in 3D real-world geospatial scenes across the globe! 🚀 One day soon, humans and robots will co-exist in the same world. To prepare, we must address: 1️⃣ How can robots cooperate or
0
3
12
📢 Join us tomorrow morning at our CVPR 2025 poster session (#340, ExHall D, 10:30am–12:30pm) to chat about Project Magma 👉 https://t.co/Jtx3RaQhcs This is a big team effort to build a multimodal agentic model capable of understanding and acting in both digital and physical
4
16
80
Check out our poster at #240 in Exhibition Hall D at 10:30 today!
(1/10) 🔥Thrilled to introduce OneDiffusion—our latest work in unified diffusion modeling! 🚀 This model bridges the gap between image synthesis and understanding, excelling in a wide range of tasks: T2I, conditional generation, image understanding, identity preservation,
0
2
18
Our afternoon session with Prof. @RanjayKrishna is about to start in Room 101B!
🔥@CVPR2025 CVinW 2025 is about to take place very soon!! We have plenty of great talks and spotlight talks coming up (@BoqingGo, @RanjayKrishna @furongh @YunzhuLiYZ @sainingxie @CordeliaSchmid, Shizhe Chen). Looking forward to seeing you all at 101B from 9am-5pm, June 11th!
0
1
10
🔥@CVPR2025 CVinW 2025 is about to take place very soon!! We have plenty of great talks and spotlight talks coming up (@BoqingGo, @RanjayKrishna @furongh @YunzhuLiYZ @sainingxie @CordeliaSchmid, Shizhe Chen). Looking forward to seeing you all at 101B from 9am-5pm, June 11th!
🚀 Excited to announce our 4th Workshop on Computer Vision in the Wild (CVinW) at @CVPR 2025! 🔗 https://t.co/Z5r48oh6iv ⭐We have invited a great lineup of speakers: Prof. Kaiming He, Prof. @BoqingGo, Prof. @CordeliaSchmid, Prof. @RanjayKrishna, Prof. @sainingxie, Prof.
0
9
39
Our community-led Computer Vision group is thrilled to host @jw2yang4ai, Principal Researcher at Microsoft Research, for a session on "Magma: A Foundation Model for Multimodal AI Agents". Thanks to @cataluna84 and @Arkhymadhe for organizing this speaker session 👏
2
2
24
Hope you all had a great #NeurIPS2025 submission push and can get some good rest! We are still open to submissions for our CVinW workshop at @CVPR! You're welcome to share your work at our workshop with just a few clicks! 👉Submit Portal:
openreview.net
Welcome to the OpenReview homepage for CVPR 2025 Workshop CVinW
🚀 Excited to announce our 4th Workshop on Computer Vision in the Wild (CVinW) at @CVPR 2025! 🔗 https://t.co/Z5r48oh6iv ⭐We have invited a great lineup of speakers: Prof. Kaiming He, Prof. @BoqingGo, Prof. @CordeliaSchmid, Prof. @RanjayKrishna, Prof. @sainingxie, Prof.
0
4
54
The latest episode of the Derby Mill Podcast is just out, focused on the "Era of Experience" paper by David Silver and me. Substack: https://t.co/7jw3PcX2bL Spotify: https://t.co/2X4jsKKKAa Apple: https://t.co/RwdGlENz3f YouTube: https://t.co/fJxrnFYFg6
1
46
206
Introducing Phi-4-reasoning, adding reasoning models to the Phi family of SLMs. The model is trained with both supervised finetuning (using a carefully curated dataset of reasoning demonstrations) and reinforcement learning. 📌Competitive results on reasoning benchmarks with
4
34
140
We only need ONE example for RLVR on LLMs to achieve significant improvement on math tasks! 📍RLVR with one training example can boost: - Qwen2.5-Math-1.5B: 36.0% → 73.6% - Qwen2.5-Math-7B: 51.0% → 79.2% on MATH500. 📄 Paper: https://t.co/D65XR9mMs2
14
89
424
We have a large group of organizers for our workshop and challenges! 🙌 Feel free to reach out to any of them to get more information about our workshop! ✉️ Kudos to the whole team! 🎉
0
0
4