Lu Ling
@LuLing26466911
Followers: 301 · Following: 1K · Media: 16 · Statuses: 144
@NVIDIA research intern | PhD @PurdueCS | #AI | #ComputerVision | Agentic AI | 4D/3D GenAI | Multimodal
Indiana, USA
Joined October 2018
Our DL3DV-10K dataset paper has been accepted to #CVPR2024 🎉! It provides scene-level videos at 4K resolution, RGB images, camera poses, and point clouds. DL3DV-3K is available now, with more versions coming soon. Feel free to check out our project page: https://t.co/llXwVpx5Oi
8 · 23 · 167
Tested several scenes using SAM 3D. Very nice instance quality! A lot of effort clearly went into curating the training dataset. One question comes to mind: how much training data do we need for a 3D foundation model? Do we see a clear scaling law in 3D?
Today we’re excited to unveil a new generation of Segment Anything Models: 1️⃣ SAM 3 enables detecting, segmenting and tracking of objects across images and videos, now with short text phrases and exemplar prompts. 🔗 Learn more about SAM 3: https://t.co/tIwymSSD89 2️⃣ SAM 3D
0 · 1 · 6
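The scaling-law question above can be made concrete: fit a power law L(N) = a · N^(-b) to (dataset size, error) pairs and see whether the exponent is stable. A minimal stdlib-only sketch; the three data points are invented for illustration, not measurements from any real 3D model:

```python
import math

# Hypothetical (dataset size, validation error) pairs -- illustrative only.
data = [(1_000, 0.40), (10_000, 0.20), (100_000, 0.10)]

# A power law L(N) = a * N**(-b) is linear in log-log space:
# log L = log a - b * log N. Fit by least squares on the logs.
xs = [math.log(n) for n, _ in data]
ys = [math.log(err) for _, err in data]
count = len(data)
mx, my = sum(xs) / count, sum(ys) / count
b = -sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
a = math.exp(my + b * mx)

print(f"L(N) ≈ {a:.3f} * N^(-{b:.3f})")
# Extrapolate the toy fit to 1M scenes.
print(f"L(1e6) ≈ {a * 1e6 ** (-b):.4f}")
```

With these made-up points the error halves per decade of data, so the fitted exponent is b ≈ 0.30; a real answer would need many (dataset size, error) measurements from actual 3D training runs.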
Amazing test of Gemini 3’s multimodal reasoning capabilities: try generating a threejs voxel art scene using only an image as input Prompt: I have provided an image. Code a beautiful voxel art scene inspired by this image. Write threejs code as a single-page
81 · 255 · 3K
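The core preprocessing step behind the image-to-voxel-art prompt above is just quantizing an image into a coarse voxel grid, where each filled cell would become one box in the rendered three.js scene. A minimal Python sketch; the 4×4 brightness grid and the brightness-to-height rule are invented for illustration:

```python
# Map a tiny grayscale "image" (brightness 0..255) to voxel columns:
# brighter pixels become taller stacks -- the usual voxel-art heightmap trick.
image = [
    [  0,  64, 128, 255],
    [ 32,  96, 160, 224],
    [ 64, 128, 192, 255],
    [  0,  32,  64, 128],
]

MAX_HEIGHT = 4  # voxels in the tallest column

def voxelize(img, max_h=MAX_HEIGHT):
    """Return (x, y, z) positions for every filled voxel."""
    voxels = []
    for z, row in enumerate(img):
        for x, value in enumerate(row):
            height = round(value / 255 * max_h)
            for y in range(height):
                voxels.append((x, y, z))
    return voxels

scene = voxelize(image)
print(len(scene), "voxels")  # each would become one box mesh in three.js
```

In the actual task the model also has to pick colors, lighting, and composition, which is what makes it a good multimodal-reasoning test rather than a pure geometry exercise.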
Introducing Marble by World Labs: a foundation for a spatially intelligent future. Create your world at https://t.co/V267VJu1H9
289 · 589 · 3K
AI’s next frontier is Spatial Intelligence, a technology that will turn seeing into reasoning, perception into action, and imagination into creation. But what is it? Why does it matter? How do we build it? And how can we use it? Today, I want to share with you my thoughts on
170 · 629 · 3K
Introducing RTFM (Real-Time Frame Model): a highly efficient World Model that generates video frames in real time as you interact with it, powered by a single H100 GPU. RTFM renders persistent and 3D consistent worlds, both real and imaginary. Try our demo of RTFM today!
54 · 225 · 1K
Generated 3D worlds aren't just for viewing; you can also play projectile Jenga inside. Link below.
27 · 44 · 384
It feels like magic — now everyone can step inside a hobbit’s house and explore it in 3D!
0 · 0 · 2
Generate persistent 3D worlds from a single image, bigger and better than ever! We’re excited to share our latest results and invite you to try out our world generation model in a limited beta preview.
210 · 529 · 4K
In genAI, a picture is now worth more than a thousand words: it can be turned into a full 3D world! And you can stroll through this garden for as long as you like; it will still be there.
148 · 345 · 3K
What if you could not only watch a generated video, but explore it too? 🌐 Genie 3 is our groundbreaking world model that creates interactive, playable environments from a single text prompt. From photorealistic landscapes to fantasy realms, the possibilities are endless. 🧵
831 · 3K · 14K
We're thrilled to release & open-source Hunyuan3D World Model 1.0! This model enables you to generate immersive, explorable, and interactive 3D worlds from just a sentence or an image. It's the industry's first open-source 3D world generation model, compatible with CG pipelines
179 · 606 · 3K
If you have limited computing resources, come by and see how we can help test your methods on #DL3DV-140!
0 · 0 · 2
We provide an easy-to-use and flexible interface for benchmarking on #DL3DV-140. Just send us a request via Hugging Face or GitHub — we'll help you visualize and compare your method against SOTA methods such as triangle-splat, 2D/3DGS, and ZipNeRF. Save time. Benchmark smarter!
1 · 0 · 2
Congrats to "VGGT" on winning the Best Paper Award at #CVPR2025! We are happy that #DL3DV benefits VGGT and the community. We will host the #DL3DV demo session this afternoon from 4:00-6:00 pm. Come by and see what's new in DL3DV!
Many Congratulations to @jianyuan_wang, @MinghaoChen23, @n_karaev, Andrea Vedaldi, Christian Rupprecht and @davnov134 for winning the Best Paper Award @CVPR for "VGGT: Visual Geometry Grounded Transformer" 🥇🎉 🙌🙌 #CVPR2025!!!!!!
1 · 1 · 4
🚀 Struggling with the lack of high-quality data for AI-driven human-object interaction research? We've got you covered! Introducing HUMOTO, a groundbreaking 4D dataset for human-object interaction, developed with a combination of wearable motion capture, SOTA 6D pose
12 · 156 · 822
Supervised learning has held 3D vision back for too long. Meet RayZer — a self-supervised 3D model trained with zero 3D labels: ❌ No supervision of camera & geometry ✅ Just RGB images And the wild part? RayZer outperforms supervised methods (as 3D labels from COLMAP are noisy)
5 · 81 · 436
Nvidia just released Cosmos-Transfer1 on Hugging Face Conditional World Generation with Adaptive Multimodal Control
8 · 97 · 514
Large spatial model code updated! We now enable scalable, feed-forward 3D reconstruction with near real-time semantic understanding. By integrating #DUSt3R, #GaussianSplatting, and #CLIP, we’re pushing semantic 3D mapping to the next level. Try it now: https://t.co/RAH40SAitG
0 · 21 · 133
One reason generative 3D is hard is that the world has barely any 3D training data. Today, most humans on earth will snap a few photos and write paragraphs of text. In contrast, earth has maybe ~50k 3D artists, and each will make maybe ~1k photoreal 3D models over their career.
35 · 31 · 391