Zhiding Yu Profile
Zhiding Yu (@ZhidingYu)

Followers: 8K · Following: 363 · Media: 32 · Statuses: 179

Working to make machines understand the world like human beings. Words are my own.

Santa Clara
Joined July 2020
@ZhidingYu
Zhiding Yu
5 months
Thank you AK! Excited to introduce Eagle 2.5, NVIDIA’s latest vision-language model that brings strong long-context capabilities across both image and video understanding — all with just 8B parameters. Most existing VLMs struggle with high-res inputs and long video contexts.
@arankomatsuzaki
Aran Komatsuzaki
5 months
Nvidia presents Eagle 2.5!
- A family of frontier VLMs for long-context multimodal learning.
- Eagle 2.5-8B matches the results of GPT-4o and Qwen2.5-VL-72B on long-video understanding.
1 reply · 10 reposts · 49 likes
@ZhidingYu
Zhiding Yu
19 days
RT @wonmin_byeon: 🚀 New paper: STORM — Efficient VLM for Long Video Understanding. STORM cuts compute costs by up to 8× and reduces decodin…
0 replies · 26 reposts · 0 likes
@ZhidingYu
Zhiding Yu
1 month
RT @cihangxie: 🚀 Excited to share GPT-Image-Edit-1.5M — our new large-scale, high-quality, fully open image editing dataset for the researc…
0 replies · 50 reposts · 0 likes
@ZhidingYu
Zhiding Yu
2 months
RT @shizhediao: New tech report out! 🚀 Scaling Up RL: Unlocking Diverse Reasoning in LLMs via Prolonged Training. An expanded version of our…
arxiv.org
Recent advancements in reasoning-focused language models such as OpenAI's O1 and DeepSeek-R1 have shown that scaling test-time computation, through chain-of-thought reasoning and iterative…
0 replies · 14 reposts · 0 likes
@ZhidingYu
Zhiding Yu
2 months
RT @FuEnYang1: 🤖 How can we teach embodied agents to think before they act? 🚀 Introducing ThinkAct — a hierarchical Reasoning VLA framewor…
0 replies · 26 reposts · 0 likes
@ZhidingYu
Zhiding Yu
2 months
And today we have just open-sourced the Eagle 2.5 model. You are welcome to download it and give it a try! We will also open source the fine-tuning code for Eagle 2/2.5 soon. Stay tuned.
@ZhidingYu
Zhiding Yu
2 months
I did not notice this until just now. Thank you @andimarafioti for the recommendation! Very glad that even though Eagle 2 is not our latest work, people still find it very useful.
@andimarafioti
Andi Marafioti
3 months
The Eagle 2 paper from Nvidia is such a goldmine.
1 reply · 3 reposts · 18 likes
@ZhidingYu
Zhiding Yu
3 months
Come to the T4V Workshop this Thursday (June 12th) and check out the latest developments in Transformers!
@CMHungSteven
Min-Hung (Steve) Chen
3 months
@CVPR is around the corner!! Join us at the Workshop on T4V at #CVPR2025 with a great speaker lineup (@MikeShou1, @jw2yang4ai, @WenhuChen, @roeiherzig, Yuheng Li, Kristen Grauman) covering diverse topics! Website: #CVPR #Transformer #Vision #T4V2025 #T4V
0 replies · 2 reposts · 18 likes
@ZhidingYu
Zhiding Yu
3 months
Document and Enterprise Intelligence is arguably one of the most important applications of VLMs and cloud services. NVIDIA VLM technologies help build commercial-grade models excelling in this area. The Eagle VLM Team, together with other colleagues at NVIDIA, are proud to be…
@NVIDIAAIDev
NVIDIA AI Developer
3 months
🥇Our NVIDIA Llama Nemotron Nano VL model is #1 on the OCRBench V2 leaderboard. Designed for advanced intelligent document processing and understanding, this model extracts diverse info from complex documents with precision, all on a single GPU. 📗 Get the technical details
0 replies · 3 reposts · 17 likes
@ZhidingYu
Zhiding Yu
3 months
RT @shizhediao: Does RL truly expand a model’s reasoning 🧠 capabilities? Contrary to recent claims, the answer is yes—if you push RL training…
0 replies · 67 reposts · 0 likes
@ZhidingYu
Zhiding Yu
4 months
RT @rohanpaul_ai: Cool paper from @nvidia. Prior methods for training LLMs for tool use rely on imitation or distilled reasoning, limiting…
0 replies · 44 reposts · 0 likes
@ZhidingYu
Zhiding Yu
4 months
Check out this super cool work done by our intern @ShaokunZhang1: RL + tool use is the future of LLM agents! Before joining NVIDIA, Shaokun was a contributor to the famous multi-agent workflow framework #AutoGen. Now the age of agent learning is coming, beyond workflow control!
@ShaokunZhang1
Shaokun Zhang
4 months
Tool-using LLMs can learn to reason—without reasoning traces. 🔥 We present Nemotron-Research-Tool-N1, a family of tool-using reasoning LLMs trained entirely via rule-based reinforcement learning—no reasoning supervision, no distillation. 📄 Paper: 💻
1 reply · 4 reposts · 40 likes
@ZhidingYu
Zhiding Yu
4 months
RT @CMHungSteven: The 4th Workshop on Transformers for Vision (T4V) at CVPR 2025 is soliciting self-nominations for reviewers. If you're in…
0 replies · 11 reposts · 0 likes
@ZhidingYu
Zhiding Yu
4 months
“Once the war is over, what would they make their money on?”
@JiaweiShen2568
渡边君
4 months
The Warlords (《投名状》) is a first-rate masterpiece of a film. Masterpieces all share one flaw: they do poorly at the box office on release, because they are too complex for many viewers to understand. But as time settles, they grow more and more popular. The three old men are far scarier than the photos and portraits left over from the late Qing dynasty. This film's artistic level is very high, as high as a three- or four-story building.
0
0
3
@ZhidingYu
Zhiding Yu
4 months
Congrats @angli_ai and team!
@SimularAI
Simular
4 months
The Simular team is proud to share: 🎉 Agent S has won the Best Paper Award at the Agentic AI for Science Workshop at #ICLR2025 @iclr_conf! 🎉 It’s the first open-source computer-use agent, and the first to surpass 20% on OSWorld at the time of its…
1 reply · 0 reposts · 4 likes
@ZhidingYu
Zhiding Yu
5 months
If you are interested, do not hesitate to DM us, or come to our poster!
0 replies · 0 reposts · 2 likes
@ZhidingYu
Zhiding Yu
5 months
[9/9] Strong image task performance. Eagle 2.5 shows consistent improvement over Eagle 2 thanks to the better vision encoder and mixed image-video training. An interesting observation here is that joint training with video also helps image understanding, which echoes the need to…
1 reply · 0 reposts · 2 likes
@ZhidingYu
Zhiding Yu
5 months
[8/9] Excellent long-context scaling. While certain public models show diminishing gains or even degraded results on longer inputs, Eagle 2.5 benefits from increased input length, leading to consistent improvement.
1 reply · 0 reposts · 1 like