Vidit Goel Profile
Vidit Goel

@ViditGoel7

Followers
480
Following
4K
Media
18
Statuses
307

GenerativeML and 3D @Snap prev @PicsartAI | @IITKGP '21 | Computer Vision, Deep learning

New York, USA
Joined November 2018
@ViditGoel7
Vidit Goel
2 years
Check our latest updates and improved model for PAIR Diffusion: A Comprehensive Multimodal Object-Level Image Editor🚀🚀 Project page: https://t.co/6aIxv2MAy2 ArXiv: https://t.co/ShNOGF7Ntz We show that 👇👇
@_akhaliq
AK
3 years
PAIR-Diffusion: Object-Level Image Editing with Structure-and-Appearance Paired Diffusion Models @Gradio demo is out on @huggingface Spaces demo: https://t.co/hR63tPMl5i
3
25
105
@ViditGoel7
Vidit Goel
12 days
Possibly, we can store only high-level information, such as semantics and some 3D representation, in an efficient manner and use generative models to decode it into a high-quality 3D representation of the world where our system can take actions.
0
0
0
@ViditGoel7
Vidit Goel
12 days
Even with this large storage, representation is highly important. If we record 30 fps video at 512 x 512 resolution, we will only be able to record ~1-2 years of data even if we use the brain's entire storage of 1-2 PB. Generative models might be useful for compressing data efficiently.
1
0
0
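The storage estimate in the post above can be checked with a rough back-of-the-envelope calculation. This is only a sketch under assumptions the post does not state: uncompressed 8-bit RGB frames (3 bytes per pixel), decimal petabytes (1 PB = 10^15 bytes), 365-day years, and continuous recording. The function names are illustrative.

```python
# Rough check of the "30 fps at 512 x 512 fills 1-2 PB in ~1-2 years" estimate.
# Assumptions (not from the post): uncompressed 8-bit RGB, decimal petabytes,
# 365-day years, recording around the clock.
BYTES_PER_PIXEL = 3
WIDTH, HEIGHT, FPS = 512, 512, 30
SECONDS_PER_YEAR = 365 * 24 * 60 * 60
PETABYTE = 10**15


def bytes_per_year(width=WIDTH, height=HEIGHT, fps=FPS,
                   bytes_per_pixel=BYTES_PER_PIXEL):
    """Raw bytes produced by one year of continuous video."""
    return width * height * bytes_per_pixel * fps * SECONDS_PER_YEAR


def years_of_video(capacity_pb):
    """Years of raw video that fit into `capacity_pb` petabytes."""
    return capacity_pb * PETABYTE / bytes_per_year()


if __name__ == "__main__":
    for capacity_pb in (1, 2):
        print(f"{capacity_pb} PB -> {years_of_video(capacity_pb):.2f} years")
```

Under these assumptions one year of raw video takes about 0.74 PB, so 1-2 PB holds roughly 1.3 to 2.7 years, which is in line with the figure quoted in the post. Any real video codec would change the numbers by orders of magnitude, which is the post's point about compression.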
@ViditGoel7
Vidit Goel
12 days
Interesting read. Further, I recently noticed that although the brain only draws about 20 W of power, it can have 1-2 petabytes of storage! Moving forward, I think we should relax some constraints on the long-term memory a world model can store.
@AjdDavison
Andrew Davison
13 days
Representation representation representation #SpatialAI See the SLAM Handbook Chapter 18 for my views! https://t.co/EdTa9zcl5F
1
0
14
@humphrey_shi
Humphrey Shi
2 months
This morning on the way to school, my 8-year-old daughter and I talked about fame and impact. We started with MrBeast and internet celebrities—whose work she knows well—but then I introduced Einstein, whose discoveries shaped the technology we use every day. Her big question:
1
1
26
@ViditGoel7
Vidit Goel
3 months
RT @humphrey_shi: Multi-agent coding systems (e.g., Claude Code) are sweeping the world like a storm this summer. The success rests on a si…
0
1
0
@ViditGoel7
Vidit Goel
5 months
Hi all, I will be at CVPR in Nashville from 10-15 June. Let's meet! Also drop by our paper Wonderland: Navigating 3D Scenes from a Single Image https://t.co/Qe43iqeNrJ When - Friday morning session Where - ExHall D Poster #59 #CVPR25
snap-research.github.io
Wonderland: Navigating 3D Scenes from a Single Image
1
0
2
@ViditGoel7
Vidit Goel
6 months
talk by @jon_barron. Completely agree. Further, if we move towards spatial computing, 3D would definitely be needed, but again it is a long-term bet. Full talk: https://t.co/7tUzEKrA3K
0
0
2
@AliHassaniJr
Ali Hassani
7 months
Wondering what's happening with NATTEN in 2025? Check out Generalized Neighborhood Attention! Spoiler: NATTEN gets a new stride parameter, we made a simulator for all your analytical studies, AND a Blackwell kernel! Keep reading for more... (1 / 5)
1
6
27
@ViditGoel7
Vidit Goel
10 months
Hi @CVPR, all of my saved reviews for all the papers have been deleted. Can you please look into it? It would be very difficult for me to write them again in the next few days.
1
0
1
@ViditGoel7
Vidit Goel
10 months
https://t.co/yPqUUUt3M9 They show that we can represent fine details of identity in a SINGLE token. The key idea is to extract query-dependent values from the learned single token using attention and then use them in cross-attention, hence the name Nested Attention.
0
0
6
@jasonzada
Jason Zada
11 months
Introducing, The Heist. Every shot was done via text-to-video with Google Veo 2. I did all the sound design, editing and music. I can't wait to show you what's in store next year at @secret__level! 4K version here: https://t.co/oLgxWxYKCC
494
806
7K
@ViditGoel7
Vidit Goel
11 months
I have been trying to find a specific library that I used but can't remember the name. I just described what the library did in https://t.co/T4wK7V1nCC and voilà, within seconds I had the exact GitHub link of the library that I was looking for. Can't imagine my work life without such tools.
perplexity.ai
Perplexity is a free AI-powered answer engine that provides accurate, trusted, and real-time answers to any question.
0
0
2
@ViditGoel7
Vidit Goel
11 months
Is anyone working on open DUSt3R? @janusch_patas
1
0
0
@ViditGoel7
Vidit Goel
11 months
Check out this amazing work! Now, you can precisely control the sequence of events using a video generation model.
@Dazitu_616
Ziyi Wu @ ICCV
11 months
📢MinT: Temporally-Controlled Multi-Event Video Generation📢 https://t.co/gEnm4DnkAC TL;DR: We identify a fundamental failure mode of existing video generators: they cannot produce videos with sequential events. MinT unlocks this capability with temporal grounding of events. 🧵
0
0
1
@ViditGoel7
Vidit Goel
1 year
Introducing 3D Capture in the latest Lens Studio 🎉. Convert any object video capture to splats and create AR experiences using Snapchat Lenses. Use it for fun, advertising and much more #GaussianSplatting #snapchat
@LensStudioDev
Lens Studio
1 year
New in Lens Studio: 3D Capture! 🎉 Take a video of any real life object and Lens Studio will reconstruct it as a Gaussian splat to use in your Lenses. Download the latest version of Lens Studio and start building now: https://t.co/B01USJitGF
0
0
6
@Ir1dXD
Dejia Xu
1 year
We present Cavia, the first framework that enables users to generate multiple videos of the same scene with precise control over camera motion, while simultaneously preserving object motion. ✨ https://t.co/id9p1OKlti (1/9)
3
25
138
@ViditGoel7
Vidit Goel
1 year
LLMs when used in real life 🙃
0
1
3
@ViditGoel7
Vidit Goel
1 year
5. VideoGen would help to get 3D representations easily rather than relying on skilled captures. What are your thoughts?
0
0
4
@ViditGoel7
Vidit Goel
1 year
3. AR platforms such as @Snap would benefit from 3D. 4. Interactive scenes like games would definitely benefit from 3D. Rendering every viewpoint using VideoGen is very inefficient compared to having a few MB of assets that can be rendered from infinite camera trajectories.
1
0
4