Rahul Venkatesh @Rahul_Venkatesh X Profile

Rahul Venkatesh

@Rahul_Venkatesh

Followers

192

Following

73

Media

23

Statuses

67

CS Ph.D. student at Stanford @NeuroAILab @StanfordAILab

https://t.co/tDR2WHmtGN

Joined November 2009

Don't wanna be here? Send us removal request.

Rahul Venkatesh

@Rahul_Venkatesh

5 months

AI models segment scenes based on how things appear, but babies segment based on what moves together. We utilize a visual world model that our lab has been developing, to capture this concept — and what's cool is that it beats SOTA models on zero-shot segmentation and physical

6

15

55

Daniel Yamins

@dyamins

2 months

A couple of months ago @KordingLab trolled (in the best possible sense of that term) my happy-go-lucky @StanfordBrain podcast on Brain simulations. Actually he had some interesting points.... We decided to have an in-depth "podcast" about it, for your listening pleasure:

3

15

94

Xihui Liu

@XihuiLiu

2 months

Our part-aware 3D generation work, OmniPart, is accepted by Siggraph Asia 2025. Code and model released! Paper: https://t.co/vEAyV5kqD2 Project page: https://t.co/ovnAysSa7I Code: https://t.co/dToyRki7R8 Demo: https://t.co/9gcBmo2NdP

1

40

299

Rishubh Parihar

@RishubhParihar

2 months

“Make it red.” “No! More red!” “Ughh… slightly less red.” “Perfect!” ♥️ 🎚️Kontinuous Kontext adds slider-based control over edit strength to instruction-based image editing, enabling smooth, continuous transformations!

16

36

158

Keshigeyan Chandrasegaran

@keshigeyan

3 months

Super excited about this line of work! 🚀 A simple, scalable recipe for training diffusion language models using autoregressive models. We're releasing our tech report, model weights, and inference code!

Radical Numerics

@RadicalNumerics

3 months

Introducing RND1, the most powerful base diffusion language model (DLM) to date. RND1 (Radical Numerics Diffusion) is an experimental DLM with 30B params (3B active) with a sparse MoE architecture. We are making it open source, releasing weights, training details, and code to

0

9

68

Imran Thobani

@cogphilosopher

3 months

1/x Our new method, the Inter-Animal Transform Class (IATC), is a principled way to compare neural network models to the brain. It's the first to ensure both accurate brain activity predictions and specific identification of neural mechanisms. Preprint: https://t.co/hPqo5PrZoc

3

14

46

Daniel Bear

@recursus

3 months

These are the most impressive examples of physical understanding I've seen from a computer vision model (let alone one that's not hooked up to an LLM.) And IMO the first good explanation of *how* physical understanding can arise without supervision.

Klemen Kotar

@KlemenKotar

3 months

PSI enables some cool zero-shot applications: visual Jenga, physical video editing, and motion estimation for robotics.

0

2

12

Klemen Kotar

@KlemenKotar

3 months

1/ A good world model should be promptable like an LLM, offering flexible control and zero-shot answers to many questions. Language models have benefited greatly from this fact, but it's been slow to come to vision. We introduce PSI: a path to truly interactive visual world

3

35

131

Klemen Kotar

@KlemenKotar

3 months

PSI enables some cool zero-shot applications: visual Jenga, physical video editing, and motion estimation for robotics.

1

2

13

Rahul Venkatesh

@Rahul_Venkatesh

3 months

(4/) If you'd like to explore this more, we provide code and models here with detailed documentation:

github.com

Contribute to neuroailab/SpelkeNet development by creating an account on GitHub.

0

1

Rahul Venkatesh

@Rahul_Venkatesh

3 months

(3/) It also turns out that our segments are really useful for complex physical object manipulation.

1

0

Rahul Venkatesh

@Rahul_Venkatesh

3 months

(2/) This capability helps discover more physically meaningful object segments, compared to those from state-of-the-art models like SegmentAnything (SAM).

1

0

1

Rahul Venkatesh

@Rahul_Venkatesh

3 months

(1/) Once optical flow is integrated, it allows us to interact with the scene through virtual pokes.

1

0

1

Rahul Venkatesh

@Rahul_Venkatesh

3 months

Excited to share PSI — our new framework for building pure vision foundation world models via Probabilistic Structure Integration! World models that rely on language conditioning often fall short in enabling physical interaction. Integrating structured signals like optical flow

Klemen Kotar

@KlemenKotar

3 months

1/ A good world model should be promptable like an LLM, offering flexible control and zero-shot answers to many questions. Language models have benefited greatly from this fact, but it's been slow to come to vision. We introduce PSI: a path to truly interactive visual world

1

3

4

Daniel Yamins

@dyamins

3 months

Here is our best thinking about how to make world models. I would apologize for it being a massive 40-page behemoth, but it's worth reading.

Klemen Kotar

@KlemenKotar

3 months

1/ A good world model should be promptable like an LLM, offering flexible control and zero-shot answers to many questions. Language models have benefited greatly from this fact, but it's been slow to come to vision. We introduce PSI: a path to truly interactive visual world

5

41

220

Greta Tuckute

@GretaTuckute

4 months

Humans largely learn language through speech. In contrast, most LLMs learn from pre-tokenized text. In our #Interspeech2025 paper, we introduce AuriStream: a simple, causal model that learns phoneme, word & semantic information from speech. Poster P6, Aug 19 at 13:30, Foyer 2.2!

8

32

194

Judy Fan

@judyefan

5 months

Thrilled to welcome members of @cogsci_soc to the SF/Bay area for #CogSci2025 this week! Here's a preview of what the Cognitive Tools Lab 🧠🛠️ @Stanford @StanfordPsych will be presenting!

3

16

149

Gordon Wetzstein

@GordonWetzstein

5 months

🚀 Just published in Nature Photonics: synthetic aperture waveguide holography—a new path toward ultra-thin, high-quality 3D mixed reality displays. 📄 https://t.co/sjX7HpqvrT #Photonics #Holography #MR 1/5

10

65

362

Rahul Venkatesh

@Rahul_Venkatesh

5 months

I’m one of those people who still enjoys the archaic thrill of coding without AI tools—just me and the editor. But I recently tried Anycoder by @_akhaliq and was genuinely impressed. You describe it and it builds the app you want, and deploys on huggingface:

0

1

14

AK

@_akhaliq

5 months

Discovering and using Spelke segments

1

3

39