Siddarth Venkatraman Profile
Siddarth Venkatraman

@siddarthv66

Followers: 270 · Following: 646 · Media: 12 · Statuses: 196

PhD at Mila | RL and other stuff I find interesting

Joined September 2023
@siddarthv66
Siddarth Venkatraman
2 days
RT @PrimeIntellect: Introducing the Environments Hub. RL environments are the key bottleneck to the next wave of AI progress, but big labs…
@siddarthv66
Siddarth Venkatraman
6 days
RT @IAmTimNguyen: I respectfully disagree with Ed. Was Kepler's planetary analysis "real" mathematics or just astronomy? Are IMO problems…
@siddarthv66
Siddarth Venkatraman
10 days
This is so fucking cool.
@Clad3815
Clad3815
10 days
GPT-5 Plays Pokémon Crystal - Update 🔥. GPT-5 earned its 7th badge at 3,321 steps. A huge improvement over o3's 11,910 steps! That's roughly a 3× speedup, similar to what we saw in the Pokémon Red run. From my observations, spatial reasoning is what makes GPT-5 so much faster
@siddarthv66
Siddarth Venkatraman
15 days
RT @Clad3815: GPT-5 has reached Victory Road! This is the last challenge before the Elite Four. GPT-5 reached this part almost three times…
@siddarthv66
Siddarth Venkatraman
15 days
How do the last-layer image features of large generative multimodal VLMs (like GPT-4o) compare against models like DINOv3?
@AIatMeta
AI at Meta
15 days
Introducing DINOv3: a state-of-the-art computer vision model trained with self-supervised learning (SSL) that produces powerful, high-resolution image features. For the first time, a single frozen vision backbone outperforms specialized solutions on multiple long-standing dense
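A standard way to make this kind of feature comparison concrete is linear CKA (centered kernel alignment) between the two models' feature matrices over the same images. A minimal sketch with random stand-in arrays — the features here are placeholders, not actual GPT-4o or DINOv3 outputs:

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between feature matrices X (n, d1) and Y (n, d2)
    extracted from the same n images. Returns a value in [0, 1]."""
    X = X - X.mean(axis=0)  # center each feature dimension
    Y = Y - Y.mean(axis=0)
    # ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    num = np.linalg.norm(Y.T @ X, "fro") ** 2
    den = np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro")
    return num / den

rng = np.random.default_rng(0)
feats_a = rng.normal(size=(128, 768))  # stand-in: VLM last-layer features
feats_b = rng.normal(size=(128, 384))  # stand-in: DINOv3 backbone features
print(round(linear_cka(feats_a, feats_a), 3))  # identical features -> 1.0
print(linear_cka(feats_a, feats_b))            # unrelated features -> near 0
```

Linear CKA is invariant to orthogonal rotations and isotropic scaling of either feature space, which is why it is commonly preferred over raw cosine similarity when the two backbones have different dimensionalities.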
@siddarthv66
Siddarth Venkatraman
17 days
That’s actually super cool! So if this works without finetuning, does that mean LLM layers share a sort of “universal functional structure” where weights from different models can be swapped and still make sense computationally? This seems important…
@snwy_me
snwy
18 days
it was not a waste of time; i've successfully made a super weird Qwen thing! it's approx 14.5B params, made from Qwen3-8B and Qwen3-235B-A22B mixed together by doing a super cursed process; this was only possible because both of these models share the same hidden size! (1/2)
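The tweet doesn't spell out the "cursed process", but the precondition it names — two checkpoints sharing a hidden size — is what makes naive layer interleaving possible at all. A hypothetical sketch with toy weight matrices standing in for transformer layers (real merging would operate on the actual model state dicts):

```python
import numpy as np

HIDDEN = 64  # both "models" must share this dimension for layers to compose

rng = np.random.default_rng(0)
# Toy stand-ins: each model is a list of (HIDDEN, HIDDEN) layer weights.
model_a = [rng.normal(size=(HIDDEN, HIDDEN)) / HIDDEN**0.5 for _ in range(8)]
model_b = [rng.normal(size=(HIDDEN, HIDDEN)) / HIDDEN**0.5 for _ in range(8)]

def interleave(layers_a, layers_b):
    """Alternate layers from two models. Only valid because every layer's
    input/output dimension (the hidden size) matches across both models."""
    merged = []
    for i in range(max(len(layers_a), len(layers_b))):
        src = layers_a if i % 2 == 0 else layers_b
        merged.append(src[i % len(src)])
    return merged

def forward(layers, x):
    for w in layers:
        x = np.tanh(x @ w)  # toy "layer": linear map + nonlinearity
    return x

merged = interleave(model_a, model_b)
out = forward(merged, rng.normal(size=(1, HIDDEN)))
print(out.shape)  # (1, 64): the frankenmodel still composes end to end
```

Shape compatibility only guarantees the merged network runs; whether the result computes anything coherent is exactly the empirical surprise the thread is pointing at.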
@siddarthv66
Siddarth Venkatraman
17 days
RT @WenzeChen2: [0/3] 🚀 Introducing Verlog – an open-source RL framework built specifically for training long-horizon, multi-turn LLM agen…
@siddarthv66
Siddarth Venkatraman
19 days
RNN supremacy.
@zhaisf
Shuangfei Zhai
19 days
Unlike an RNN, one attention block alone cannot model anything interesting. And it’s the stacking of it that does wonders. Understanding this compositionality should be at least as important as understanding the attn module itself.
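The claim above is easy to see mechanically: a single softmax-attention block only re-weights and mixes its input vectors, and expressive power comes from composing such mixers. A minimal single-head self-attention in numpy, stacked twice with residual connections (dimensions and the two-block depth are purely illustrative):

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def attn_block(x, wq, wk, wv):
    """One self-attention block: each output token is a convex combination
    of (projected) input tokens -- a data-dependent mixing operation."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = softmax(q @ k.T / np.sqrt(k.shape[-1]))
    return x + scores @ v  # residual connection

rng = np.random.default_rng(0)
d = 16
x = rng.normal(size=(5, d))  # 5 tokens, d dims each
params = [tuple(rng.normal(size=(d, d)) / d**0.5 for _ in range(3))
          for _ in range(2)]  # parameters for two stacked blocks

h = x
for wq, wk, wv in params:  # stacking is just function composition
    h = attn_block(h, wq, wk, wv)
print(h.shape)  # (5, 16)
```

One block computes a single round of mixing; only the composition of blocks (with residuals, and in real transformers with interleaved MLPs) builds up the nested computations the quoted tweet is gesturing at.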
@siddarthv66
Siddarth Venkatraman
20 days
So is Muon the best optimizer right now for RL too?
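For context on what Muon does differently: it orthogonalizes each momentum/update matrix with a Newton–Schulz iteration before applying it. Below is a sketch of the classical cubic Newton–Schulz orthogonalization — Muon itself uses tuned quintic coefficients for speed, so treat this as the textbook variant of the idea:

```python
import numpy as np

def newton_schulz_orthogonalize(g, steps=15):
    """Push the singular values of g toward 1 without an explicit SVD.
    Cubic iteration X <- 1.5 X - 0.5 (X X^T) X, which converges once the
    spectral norm is small enough; scaling by the Frobenius norm ensures
    the spectral norm starts <= 1."""
    x = g / np.linalg.norm(g)
    for _ in range(steps):
        x = 1.5 * x - 0.5 * (x @ x.T) @ x
    return x

# A diagonal test matrix makes the effect easy to read off:
g = np.diag([2.0, 1.0, 0.5, 0.25])
o = newton_schulz_orthogonalize(g)
print(np.round(np.linalg.svd(o, compute_uv=False), 3))  # all ~1.0
```

The iteration leaves the singular vectors untouched and drives every singular value to 1, so the optimizer applies an update with the direction of the momentum matrix but equalized "strength" across its spectral components.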
@siddarthv66
Siddarth Venkatraman
21 days
RT @lchen915: Self-Questioning Language Models: LLMs that learn to generate their own questions and answers via asymmetric self-play RL. T…
@siddarthv66
Siddarth Venkatraman
24 days
RT @DimitrisPapail: It was fun when we had llama and qwen 2.5 that one could run RL on, and see reward go up; now all the reasoning models…
@siddarthv66
Siddarth Venkatraman
25 days
This is straight magic. Deep learning is alchemy.
@DimitrisPapail
Dimitris Papailiopoulos
26 days
This is the cleanest transfer we could get, even when training a small transformer from scratch: if task A and B are "computationally similar", then length generalization of A transfers to B!
@siddarthv66
Siddarth Venkatraman
28 days
Israel is committing genocide against the people of Gaza. I hope more people in AI academia have the courage to say this publicly. I believe most know it to be true. I’ve seen one too many awful images of dying babies today and needed to say something. Free Palestine 🇵🇸.
@siddarthv66
Siddarth Venkatraman
28 days
RT @dwarkesh_sp: I filmed a video version of my post 'Why I Don’t Think AGI Is Right Around The Corner' so I could show it to my YouTube au…
@siddarthv66
Siddarth Venkatraman
1 month
RT @makingAGI: 🚀Introducing Hierarchical Reasoning Model🧠🤖. Inspired by brain's hierarchical processing, HRM delivers unprecedented reasoni…
@siddarthv66
Siddarth Venkatraman
1 month
We (and others) encounter only the masks we wear. What we call a “true self” is merely the story our masks tell one another about what lies beneath. But behind each mask is only another mask, and beneath them all - “the void”. Our ego is a stable low-energy local-minimum state.
@kalomaze
kalomaze
1 month
"personality is just the average of everyone's model of you fed back into itself until it stabilizes into something that feels like you" - Opus 4.
@siddarthv66
Siddarth Venkatraman
1 month
RT @siddarthv66: @ajwagenmaker Congratulations on your work Andrew! This is actually highly related to our work (just presented at ICML) Ou…
@siddarthv66
Siddarth Venkatraman
1 month
RT @JainMoksh: As the field moves towards agents doing science, the ability to understand novel environments through interaction becomes cr…
@siddarthv66
Siddarth Venkatraman
2 months
RT @g_k_swamy: Recent work has seemed somewhat magical: how can RL with *random* rewards make LLMs reason? We pull back the curtain on thes…
@siddarthv66
Siddarth Venkatraman
2 months
Come check out our poster this Wednesday at 4:30pm @icmlconf!! Happy to chat about diffusion, GFlowNets, and RL!
@siddarthv66
Siddarth Venkatraman
3 months
Is there a universal strategy to turn any generative model—GANs, VAEs, diffusion models, or flows—into a conditional sampler, or fine-tune it to optimize a reward function? Yes! Outsourced Diffusion Sampling (ODS), accepted to @icmlconf, does exactly that!