HanRong YE

@leoyerrrr

Followers: 892
Following: 1K
Media: 62
Statuses: 371

@Nvidia Research Scientist | Open-Source Omni-Modality LLMs

Hong Kong
Joined October 2012
@leoyerrrr
HanRong YE
18 days
OmniVinci is now #1 paper on Huggingface!!! 🤗 Building omni-modal LLMs is MORE than just mixing tokens 😉 At @NVIDIA, we explored deeper possibilities in building truly omni-modal systems — leading to OmniVinci-9B, which introduces three key innovations: - OmniAlignNet – a
11
27
148
@leoyerrrr
HanRong YE
5 days
Well, our research finds that RL using FP4 + LoRA actually achieves faster reward growth and higher final accuracy than 16-bit LoRA and QLoRA: https://t.co/SmcEQMPiFv
@zzlccc
Zichen Liu
7 days
BF16 -> FP16 is such a simple (one configuration change in Oat) yet fundamental fix for inference-training mismatch. With FP16, the most basic importance sampling PG outperforms all algorithmic fixes in BF16. Let's rethink RL stability from the precision perspective.🔎
0
0
4
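The inference-training mismatch in the quoted thread comes down to the per-token importance weight π_train/π_rollout drifting away from 1 when the rollout engine and the trainer round log-probabilities differently. A minimal sketch of that effect (not the Oat implementation; `to_bf16` is a crude truncation-based emulation of BF16 rounding):

```python
import math
import struct

def to_bf16(x: float) -> float:
    """Emulate bfloat16 by zeroing the low 16 mantissa bits of the
    float32 representation (truncation; good enough for illustration)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]

def importance_weight(train_lp: float, rollout_lp: float) -> float:
    """pi_train / pi_rollout for one token, from log-probabilities."""
    return math.exp(train_lp - rollout_lp)

lp = -2.3456789  # a token log-probability under the current policy

# Same precision on both sides: the ratio is exactly 1, so even the most
# basic importance-sampling policy gradient is well behaved.
print(importance_weight(lp, lp))            # 1.0

# BF16 rollout vs full-precision trainer: the rounded logprob no longer
# matches, so the per-token weight drifts off 1 -- compounded over long
# sequences, this is the inference-training mismatch.
print(importance_weight(lp, to_bf16(lp)))   # close to, but not exactly, 1
```

The same arithmetic explains why moving both sides to a common, higher-fidelity format (the BF16 -> FP16 change above) stabilizes training without any algorithmic fix.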
@leoyerrrr
HanRong YE
5 days
0
0
1
@leoyerrrr
HanRong YE
5 days
📈 According to the official test results from OmniVideoBench using our open-sourced model weights, OmniVinci has once again claimed #1 performance in the 7B LLM category! In addition, the model has been out for just two weeks and downloads have already soared past 6K 🌟
2
0
4
@leoyerrrr
HanRong YE
6 days
560B-A27B omni-modal MoE has landed 🤗 …but OmniVinci-9B is still my favorite size 😃
@Meituan_LongCat
Meituan LongCat
6 days
🔥 LongCat-Flash-Omni: Multimodal + Low-Latency 🏆 Leading Performance among Open-Source Omni-modal Models ☎️ Real-time Spoken Interaction: Millisecond-level E2E latency 🕒 128K context + Supports > 8min real-time AV interaction 🎥 Multimodal I/O: Arbitrary Combination of
0
0
2
@zy27962986
Zongyu Lin
7 days
🚀 Really excited to see this amazing arch change (KDA) finally coming out! Replacing global attention with a linear hybrid arch: better pretraining ppls, long-context evals, downstream math & code & STEM evals after RL, >6× throughput at 1M to unblock more downstream potentials to
@Kimi_Moonshot
Kimi.ai
8 days
Kimi Linear Tech Report is dropped! 🚀 https://t.co/LwNB2sQnzM Kimi Linear: A novel architecture that outperforms full attention with faster speeds and better performance—ready to serve as a drop-in replacement for full attention, featuring our open-sourced KDA kernels! Kimi
1
18
55
@leoyerrrr
HanRong YE
9 days
And we at #NVIDIA Research are still seeking research interns to explore omni-modal LLMs across a variety of domains, including robotics (VLA), visual agentic tool use, world modeling, and unified understanding and generation. Drop me an email if you are interested!
@leoyerrrr
HanRong YE
18 days
OmniVinci is now #1 paper on Huggingface!!! 🤗 Building omni-modal LLMs is MORE than just mixing tokens 😉 At @NVIDIA, we explored deeper possibilities in building truly omni-modal systems — leading to OmniVinci-9B, which introduces three key innovations: - OmniAlignNet – a
0
1
12
@leoyerrrr
HanRong YE
10 days
Jensen is streaming live - AI, 6G, Quantum, Models, Enterprise, Robotics, Factories 💚 #NVIDIAGTC #NVIDIA https://t.co/he1nst3nI9
0
1
6
@leoyerrrr
HanRong YE
10 days
🚀 📷🍌
@zhegan4
Zhe Gan
15 days
🎁🎁 We release Pico-Banana-400K, a large-scale, high-quality image editing dataset distilled from Nano-Banana across 35 editing types. 🔗 Data link: https://t.co/mi06ddf3mN 🔗Paper link: https://t.co/AaZM02xcJr It includes 258K single-turn image editing data, 72K multi-turn
0
0
4
@leoyerrrr
HanRong YE
10 days
Your model can do CoT reasoning, but is it actually correct? 👿
@sueqian111
Yusu Qian
10 days
🧩 New paper out! We introduce PRISM-Bench, a diagnostic benchmark for puzzle-based multimodal reasoning. Unlike standard VQA, PRISM-Bench tests not only if models can solve visual puzzles, but how their reasoning unfolds. 💡 Models must spot the first error in a
0
0
8
@leoyerrrr
HanRong YE
12 days
0
0
0
@tydsh
Yuandong Tian
15 days
Several of my team members + myself are impacted by this layoff today. Welcome to connect :)
475
284
7K
@leoyerrrr
HanRong YE
18 days
@nvidia Joint work of @leoyerrrr , @yin_hongxu , @huckiyang , @goelarushi27 , @AaronWeiHuang , @LigengZhu , Yuanhang Su†, Sean Lin†, @anjjei , Zhen Wan†, @MXzBFhjFpS1jyMI , @YumingLou , Dong Yang†, @zhijianliu_ , @yukangchen_ , @AmbrishDantrey, @ehsanjjjjj , @SreyanG , Daguang Xu,
0
0
1
@leoyerrrr
HanRong YE
18 days
@nvidia 📺Also check the nice video made by @huckiyang
0
1
2
@leoyerrrr
HanRong YE
18 days
@nvidia 🔗 OmniVinci by NVIDIA Research 🌐 Webpage: https://t.co/qRscvzkSkh 💻 GitHub: https://t.co/2efUT4oS4C 🤖 Model: https://t.co/06nWW6gYO2 📄 Paper:
1
1
6
@leoyerrrr
HanRong YE
20 days
Off to ICCV! Also, we have an omni-modal LLM reveal coming next Monday… straight from Hawaiiiii 🌴
1
4
34
@shizhediao
Shizhe Diao
21 days
Proud to see NVIDIA recognized by AI World as a leader in the open-source AI ecosystem. From Nemotron and BioNeMo to Cosmos, GR00T, and Canary, our contributions span foundation models, scientific computing, and agentic reasoning. I feel sooo excited to be part of the Nemotron
6
7
37
@leoyerrrr
HanRong YE
1 month
If you’re interested, please send your CV and cover letter to hanrongy@nvidia.com
0
0
1