Vidit Goel @ViditGoel7 X Profile

Vidit Goel

@ViditGoel7

Followers

480

Following

4K

Media

18

Statuses

307

GenerativeML and 3D @Snap prev @PicsartAI | @IITKGP '21 | Computer Vision, Deep learning

https://t.co/otPutBuDgs

New York, USA

Joined November 2018


Vidit Goel

@ViditGoel7

2 years

Check our latest updates and improved model for PAIR Diffusion: A Comprehensive Multimodal Object-Level Image Editor🚀🚀 Project page: https://t.co/6aIxv2MAy2 ArXiv: https://t.co/ShNOGF7Ntz We show that 👇👇

AK

@_akhaliq

3 years

PAIR-Diffusion: Object-Level Image Editing with Structure-and-Appearance Paired Diffusion Models @Gradio demo is out on @huggingface Spaces demo: https://t.co/hR63tPMl5i

3

25

105

Vidit Goel

@ViditGoel7

12 days

Possibly, we can store only high-level information, such as semantics and some 3D representation, in an efficient manner, and use generative models to decode it into a high-quality 3D representation of the world in which our system can take actions.

0

Vidit Goel

@ViditGoel7

12 days

Even with this much storage, representation is highly important. If we record 30 fps video at 512 x 512 resolution, we will only be able to record ~1-2 years of data even if we use the brain's whole storage of 1-2 PB. Generative models might be useful for compressing data efficiently.
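The estimate in this post can be checked with a few lines of arithmetic. The sketch below is only illustrative and rests on assumptions the post does not state: raw, uncompressed RGB frames at 8 bits per channel, 1 PB taken as 10^15 bytes, and a 365-day year. The function names are made up for this example.

```python
# Back-of-the-envelope check of the storage estimate above.
# Assumptions (not stated in the post): uncompressed RGB frames at
# 8 bits per channel, 1 PB = 10**15 bytes, and a 365-day year.

BYTES_PER_PIXEL = 3                      # assumed: 8-bit R, G, B
WIDTH, HEIGHT, FPS = 512, 512, 30        # figures taken from the post
SECONDS_PER_YEAR = 365 * 24 * 60 * 60
PETABYTE = 10**15


def bytes_per_year(width=WIDTH, height=HEIGHT, fps=FPS,
                   bytes_per_pixel=BYTES_PER_PIXEL):
    """Raw bytes produced by one year of continuous recording."""
    return width * height * bytes_per_pixel * fps * SECONDS_PER_YEAR


def years_of_video(capacity_pb, **kwargs):
    """Years of raw video that fit into `capacity_pb` petabytes."""
    return capacity_pb * PETABYTE / bytes_per_year(**kwargs)


if __name__ == "__main__":
    for capacity in (1, 2):
        years = years_of_video(capacity)
        print(f"{capacity} PB holds about {years:.2f} years of raw video")
```

Under these assumptions one year of raw video takes roughly 0.74 PB, so 1-2 PB holds about 1.3 to 2.7 years, which is in line with the rough "~1-2 years" figure. Any real video codec would change the numbers substantially, which is the point the post makes about representation.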

1

0

Vidit Goel

@ViditGoel7

12 days

Interesting read. Further, I recently noticed that although the brain uses only 20W of power, it can have 1-2 petabytes of storage! Moving forward, I think we should relax some constraints on the long-term memory a world model can store.

Andrew Davison

@AjdDavison

13 days

Representation representation representation #SpatialAI See the SLAM Handbook Chapter 18 for my views! https://t.co/EdTa9zcl5F

1

0

14

Humphrey Shi

@humphrey_shi

2 months

This morning on the way to school, my 8-year-old daughter and I talked about fame and impact. We started with MrBeast and internet celebrities—whose work she knows well—but then I introduced Einstein, whose discoveries shaped the technology we use every day. Her big question:

1

26

Vidit Goel

@ViditGoel7

3 months

RT @humphrey_shi: Multi-agent coding systems (e.g., Claude Code) are sweeping the world like a storm this summer. The success rests on a si…

0

1

0

Vidit Goel

@ViditGoel7

5 months

Hi all, I will be at CVPR in Nashville from 10-15 June. Let's meet! Also drop by our paper Wonderland: Navigating 3D Scenes from a Single Image https://t.co/Qe43iqeNrJ When - Friday morning session Where - ExHall D Poster #59 #CVPR25

snap-research.github.io

Wonderland: Navigating 3D Scenes from a Single Image

1

0

2

Vidit Goel

@ViditGoel7

6 months

talk by @jon_barron. Completely agree; further, if we move towards spatial computing, 3D would definitely be needed, but again it is a long-term bet. Full talk: https://t.co/7tUzEKrA3K

0

2

Ali Hassani

@AliHassaniJr

7 months

Wondering what's happening with NATTEN in 2025? Check out Generalized Neighborhood Attention! Spoiler: NATTEN gets a new stride parameter, we made a simulator for all your analytical studies, AND a Blackwell kernel! Keep reading for more... (1 / 5)

1

6

27

Vidit Goel

@ViditGoel7

10 months

Hi @CVPR, all of my saved reviews for all the papers have been deleted. Can you please look into it? It would be very difficult for me to write them again in the next few days.

1

0

1

Vidit Goel

@ViditGoel7

10 months

https://t.co/yPqUUUt3M9 They show that we can represent fine details of identity in a SINGLE token. The key idea is to extract query-dependent values from the learned single token using attention and then use them in Cross-Attention, hence the name Nested Attention.

0

6

Jason Zada

@jasonzada

11 months

Introducing, The Heist. Every shot was done via text-to-video with Google Veo 2. I did all the sound design, editing, and music. I can't wait to show you what's in store next year at @secret__level! 4K version here: https://t.co/oLgxWxYKCC

494

806

7K

Vidit Goel

@ViditGoel7

11 months

I had been trying to find a specific library that I used but can't remember the name of. I just described what the library did in https://t.co/T4wK7V1nCC and voilà, within secs I had the exact GitHub link of the library I was looking for. Can't imagine my work life without such tools.

perplexity.ai

Perplexity is a free AI-powered answer engine that provides accurate, trusted, and real-time answers to any question.

0

2

Vidit Goel

@ViditGoel7

11 months

Is anyone working on open DUSt3R? @janusch_patas

1

0

Vidit Goel

@ViditGoel7

11 months

Check out this amazing work! Now, you can precisely control the sequence of events using a video generation model.

Ziyi Wu @ ICCV

@Dazitu_616

11 months

📢MinT: Temporally-Controlled Multi-Event Video Generation📢 https://t.co/gEnm4DnkAC TL;DR: We identify a fundamental failure mode of existing video generators: they cannot produce videos with sequential events. MinT unlocks this capability with temporal grounding of events. 🧵

0

1

Vidit Goel

@ViditGoel7

1 year

Introducing 3D Capture in the latest Lens Studio 🎉. Convert any object video capture to splats and create an AR experience using a Snapchat Lens. Use it for fun, advertising and much more #GaussianSplatting #snapchat

Lens Studio

@LensStudioDev

1 year

New in Lens Studio: 3D Capture! 🎉 Take a video of any real life object and Lens Studio will reconstruct it as a Gaussian splat to use in your Lenses. Download the latest version of Lens Studio and start building now: https://t.co/B01USJitGF

0

6

Dejia Xu

@Ir1dXD

1 year

We present Cavia, the first framework that enables users to generate multiple videos of the same scene with precise control over camera motion, while simultaneously preserving object motion. ✨ https://t.co/id9p1OKlti (1/9)

3

25

138

Vidit Goel

@ViditGoel7

1 year

LLMs when used in real life 🙃

🌱

@senyoramaris

1 year

https://t.co/n1gdcEclkh

0

1

3

Vidit Goel

@ViditGoel7

1 year

5. VideoGen would help to get a 3D representation easily rather than relying on skilled captures. What are your thoughts?

0

4

Vidit Goel

@ViditGoel7

1 year

3. AR platforms such as @Snap would benefit from 3D 4. Interactive scenes like games would definitely benefit from 3D. Rendering every viewpoint using VideoGen is so inefficient compared to having a few MB of assets that can be rendered from infinite camera trajectories

1

0

4