Siddarth Venkatraman
@siddarthv66
Followers: 1K · Following: 1K · Media: 29 · Statuses: 486
PhD at Mila | Reasoning with RL, previously diffusion and flows
Montréal, Québec
Joined September 2023
NO verifiers. NO Tools. Qwen3-4B-Instruct can match DeepSeek-R1 and o3-mini (high) with ONLY test-time scaling. Presenting Recursive Self-Aggregation (RSA) — the strongest test-time scaling method I know of! Then we use aggregation-aware RL to push further!! 📈📈 🧵below!
22 · 102 · 790
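To make the loop concrete, here is a minimal sketch of recursive self-aggregation as described above: keep a population of candidate solutions, then repeatedly sample small subsets and ask the model to aggregate each subset into an improved candidate. The `generate` callable, the population size `n`, the subset size `k`, and the round count are placeholders, not the paper's actual prompts or hyperparameters.

```python
import random

def rsa(problem, generate, n=16, k=4, rounds=3):
    """Recursive Self-Aggregation (sketch, hypothetical parameters).

    generate(prompt) -> str is a stand-in for any LLM sampling call.
    Start from n independent candidates, then for a few rounds replace
    each candidate with an aggregation of a random size-k subset.
    """
    # Round 0: independent candidate solutions.
    population = [generate(f"Solve the problem:\n{problem}") for _ in range(n)]

    for _ in range(rounds):
        new_population = []
        for _ in range(n):
            subset = random.sample(population, k)
            context = "\n\n".join(f"Candidate {i+1}:\n{c}" for i, c in enumerate(subset))
            prompt = (
                f"Problem:\n{problem}\n\n{context}\n\n"
                "Combine the correct parts of these candidates, fix their "
                "mistakes, and produce a single improved solution."
            )
            new_population.append(generate(prompt))
        population = new_population

    # Return the final population; a single answer can be picked downstream
    # (e.g., by majority vote over extracted final answers).
    return population
```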
For a long time, Yann LeCun and others believed in gradient-based planning, but it didn’t work very well … until now. Here’s how we did it using incredibly simple techniques. But first, an introduction to gradient-based planning: 🧵1/11
24 · 176 · 1K
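For readers new to the idea, here is a generic sketch of gradient-based planning, not the thread's actual method: roll a candidate action sequence through a differentiable world model, score the visited states with a cost, and update the actions themselves by gradient descent. `world_model` and `cost` are assumed placeholders.

```python
import torch

def plan(world_model, cost, s0, horizon=20, steps=100, lr=0.1):
    """Gradient-based planning (generic sketch, placeholder components).

    world_model(state, action) -> next_state and cost(state) -> scalar are
    assumed differentiable; the action sequence is optimized directly by
    backpropagating the summed rollout cost.
    """
    actions = torch.zeros(horizon, requires_grad=True)
    opt = torch.optim.Adam([actions], lr=lr)

    for _ in range(steps):
        s = s0
        total_cost = 0.0
        for t in range(horizon):
            s = world_model(s, actions[t])   # imagined rollout
            total_cost = total_cost + cost(s)
        opt.zero_grad()
        total_cost.backward()                # gradients w.r.t. the actions
        opt.step()

    return actions.detach()
```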
Total French cultural domination @MistralAI ?
'Clair Obscur: Expedition 33' won 9 awards at #TheGameAwards — the most for any game ever 🎮 • Game of the Year • Best Narrative • Best Game Direction • Best RPG • Best Art Direction • Best Score & Music • Best Performance • Best Debut Indie Game • Best Independent
0 · 0 · 1
If you win too much as the underdog, people turn on you. Def true with both AI and video games, at least
0 · 0 · 2
The real danger with this sort of journalism is that by using faulty/partially correct arguments to shit on AI, they lose the ability to criticize the tech seriously. And I do think there’s a lot to criticize.
0 · 0 · 7
as a reminder: @moorehn cannot generate knowledge. She cannot create knowledge. She cannot find new information. She can only mix information that has already been found and written and input into computers by other journalists who don’t understand AI.
as a reminder: AI cannot generate knowledge. It cannot create knowledge. It cannot find new information. It can only mix information that has already been found and written and input into computers by humans.
1 · 1 · 11
Want to highlight this: as someone who, just a few weeks ago, was convinced we needed to figure out value functions for LLM RL, my priors have shifted. LLMs might just have “implicit value functions” that already reduce effective variance
@JoshPurtell I am saying that explicit value learning could plausibly be interpreted (partially) as an artifact of how old RL theory was originally developed, when policy networks were *too small* to implicitly learn a notion or representation of what may or may not be a valuable action
0 · 0 · 31
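One concrete way to read “implicit value functions”: LLM policies are now commonly trained without any learned critic, using only a group-relative baseline over rollouts of the same prompt for variance reduction. Below is a minimal sketch of that critic-free baseline; it is a standard trick, not something claimed in the tweets above.

```python
import numpy as np

def group_relative_advantages(rewards):
    """Critic-free advantages for a group of rollouts on one prompt (sketch).

    Instead of a learned value function, each rollout's baseline is the
    leave-one-out mean of the other rollouts' rewards.
    """
    rewards = np.asarray(rewards, dtype=float)
    n = len(rewards)
    baselines = (rewards.sum() - rewards) / (n - 1)   # leave-one-out mean
    return rewards - baselines

# Example: 4 rollouts of one prompt with binary verifier rewards.
print(group_relative_advantages([1.0, 0.0, 0.0, 1.0]))
# ≈ [ 0.667 -0.667 -0.667  0.667]
```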
We’ll be presenting this at the FoRLM workshop between 10:15-11:30am in room 33 tomorrow! Drop by if you’d like to chat about this paper, or RL for LLMs in general (I’ve got some juicy new insights)
NO verifiers. NO Tools. Qwen3-4B-Instruct can match DeepSeek-R1 and o3-mini (high) with ONLY test-time scaling. Presenting Recursive Self-Aggregation (RSA) — the strongest test-time scaling method I know of! Then we use aggregation-aware RL to push further!! 📈📈 🧵below!
3 · 8 · 29
Amazing work by Brian. Essentially contextualizes all the off-policy RL objectives you’ve probably seen recently
🧊 Off-policy RL for LLMs is hard. Dr. GRPO collapses at 10 steps off-policy. TBA doesn't. @Kimi_Moonshot K2's approach is robust too – both independently landed on the same key ingredients 🤝 We ablate RL recipe ingredients + show the 2 small changes giving off-policy
1 · 5 · 73
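For context on what these off-policy objectives have in common, here is the generic clipped importance-sampling surrogate most of them build on. This is a textbook PPO-style sketch, not Dr. GRPO, TBA, or K2's specific recipe.

```python
import torch

def clipped_pg_loss(logp_new, logp_old, advantages, eps=0.2):
    """Generic clipped surrogate over tokens (PPO-style sketch).

    logp_new:    per-token log-probs under the current policy (requires grad)
    logp_old:    per-token log-probs under the behavior policy that generated the rollout
    advantages:  per-token (or broadcast per-sequence) advantage estimates
    """
    ratio = torch.exp(logp_new - logp_old)                     # importance weight
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1 - eps, 1 + eps) * advantages
    # Pessimistic minimum keeps updates bounded when the policies drift apart.
    return -torch.minimum(unclipped, clipped).mean()
```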
(1/5) New post: "Mismatch Praxis: Rollout Settings and IS Corrections". We pressure-tested solutions for the inference/training mismatch, which in modern RL frameworks creates a hidden off-policy problem. To resolve the mismatch, various engineering (e.g., FP16
6 · 41 · 120
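As a rough illustration of the kind of correction the post discusses, here is a hedged sketch of truncated importance sampling between the inference engine's reported log-probs and the trainer's recomputed log-probs. The function name, the `cap` value, and how the weights get applied are assumptions, not the post's actual recipe.

```python
import torch

def truncated_is_weights(logp_trainer, logp_sampler, cap=2.0):
    """Correct inference/training mismatch with truncated importance sampling (sketch).

    logp_trainer: per-token log-probs recomputed by the training framework
    logp_sampler: per-token log-probs from the inference engine that actually
                  generated the tokens (possibly a different precision/kernel)
    The ratio exp(logp_trainer - logp_sampler) reweights each token toward the
    trainer's distribution; the cap keeps a few badly mismatched tokens from
    dominating the gradient.
    """
    log_ratio = logp_trainer - logp_sampler
    weights = torch.exp(log_ratio).clamp(max=cap)
    return weights.detach()   # used as fixed multipliers on the per-token loss
```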
Related question, has “primacy bias” through plasticity loss been observed with LLM policies during longer RL runs? https://t.co/ojoiNZ1Jjr
arxiv.org
This work identifies a common flaw of deep reinforcement learning (RL) algorithms: a tendency to rely on early interactions and ignore useful evidence encountered later. Because of training on...
0 · 0 · 7
EvoOpt and SGD are totally sufficient for RLVR. The finetuning loss landscape appears to be really easy to optimize, and robust to the noise and variance of the policy gradients. Why does pretraining result in weights that are easy to finetune?
🚨New Blog Alert: Is AdamW overkill for RLVR? We found that vanilla SGD is (1) as performant as AdamW and (2) naturally 36x more parameter efficient (much more than a rank-1 LoRA) 🤯 Looks like a "free lunch". Maybe it's time to rethink the optimizers for RLVR 🧵
1 · 3 · 26
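Mechanically, the optimizer swap the blog describes is a one-line change; the sketch below uses placeholder learning rates rather than the blog's actual settings.

```python
import torch

def make_optimizer(model, use_sgd=True):
    """Swap AdamW for plain SGD in an RLVR finetuning loop (sketch, placeholder LRs)."""
    if use_sgd:
        # Plain SGD keeps no per-parameter moment estimates, so optimizer
        # memory is a fraction of AdamW's two extra states per parameter.
        return torch.optim.SGD(model.parameters(), lr=1e-5)
    return torch.optim.AdamW(model.parameters(), lr=1e-6, weight_decay=0.0)
```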
Given it’s cool to be bearish right now, some thoughts on RL (most ideas from @dwarkesh_sp’s post):
- During early pretraining, you receive ~log2(1/(1/vocab_size)) bits of information (e.g. for a 256k vocab, ~18 bits) PER forward pass.
- During RL, given a rollout of 32k tokens,
15 · 24 · 327
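A quick check of the arithmetic in the first bullet above: with a uniform prior over the vocabulary, one observed next token carries log2(1/(1/vocab_size)) = log2(vocab_size) bits, which for a ~256k vocabulary is about 18 bits.

```python
import math

vocab_size = 256_000                                # roughly 2**18
bits_per_token = math.log2(1 / (1 / vocab_size))    # = log2(vocab_size)
print(round(bits_per_token, 1))                     # ≈ 18.0
```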