Tejesh Bhalla Profile
Tejesh Bhalla

@OG_tejeshbhalla

Followers 54 · Following 7K · Media 38 · Statuses 1K

The sky is falling, the wind is calling. Stand for something or die in the morning. @theagentic

New Delhi, India
Joined October 2019
@hallerite
hallerite
23 hours
Happy to finally share what I have been working on for some time now. Introducing »Ludic« – an LLM-RL library for the era of experience. While there are now a lot of LLM-RL codebases, even many good ones, I want to share my very idiosyncratic way to think about LLM-RL.
14 · 29 · 198
@OG_tejeshbhalla
Tejesh Bhalla
2 days
Women hate "gym guys", you gotta build gym environments instead!!
0 · 0 · 1
@Anthony_Bonato
Anthony Bonato
5 days
In honor of Taylor Swift's 36th birthday today, here are 36 Taylor series
207 · 2K · 16K
@vllm_project
vLLM
4 days
vLLM was mentioned in about half of the PyTorch Conference 2025 talks (≈53/117)! Several months ago, when the @PyTorch conference agenda was out, we noticed that there would be 5 dedicated talks about vLLM. After the PyTorch conference, we found that actually about half of the
@vllm_project
vLLM
5 months
🔥 vLLM @ PyTorch Conference 2025 🔥 We’re excited to share that 5 talks at this year’s PyTorch Conference will feature vLLM! Topics include:
• Easy & Fast LLM Serving
• Open-Source Post-Training Stack
• Scaling Online LLM Training
• AMD GPU support via Triton
• vllm-triton
7 · 25 · 243
@jhleath
Hunter Leath
5 days
an interesting update: the team is starting to move away from AI coding completely (devin/claude/etc) because it's so much harder to review the AI code than to write things themselves
@jhleath
Hunter Leath
5 months
just found out that since this, i've become a top 50 user of Devin globally, now pushing ~60 PRs a day. AMA
192 · 229 · 4K
@couplefire12
Locke Cai
6 days
RL for reasoning often relies on verifiers — great for math, but tricky for creative writing or open-ended research. Meet RARO: a new paradigm that teaches LLMs to reason via adversarial games instead of verification. No verifiers. No environments. Just demonstrations. 🧵👇
20 · 71 · 578
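The thread itself isn't reproduced here, so as a rough sketch of the verifier-free idea: a GAN-style loop in which a critic learns to tell demonstrations from policy samples, and the policy is rewarded for fooling it. This is a generic adversarial-RL skeleton with toy stand-in classes, not RARO's actual algorithm; all names below are hypothetical.

```python
import random

# Toy stand-ins: a real setup would use an LLM policy and a learned critic model.
class ToyPolicy:
    def generate(self, prompt):
        return prompt + " " + random.choice(["answer A", "answer B"])

    def reinforce(self, prompts, samples, rewards):
        pass  # placeholder for a policy-gradient update (REINFORCE, PPO, ...)

class ToyCritic:
    def fit(self, positives, negatives):
        pass  # placeholder: train a classifier to separate demos from samples

    def score(self, text):
        return random.random()  # placeholder for P(text is a demonstration)

def adversarial_step(policy, critic, prompts, demonstrations):
    samples = [policy.generate(p) for p in prompts]          # 1. roll out the policy
    critic.fit(positives=demonstrations, negatives=samples)  # 2. update the critic
    rewards = [critic.score(s) for s in samples]             # 3. reward = "looks like a demo"
    policy.reinforce(prompts, samples, rewards)              # 4. policy improvement step

adversarial_step(ToyPolicy(), ToyCritic(),
                 prompts=["2+2?"], demonstrations=["2+2? 4"])
```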
@arm1st1ce
armistice
6 days
openai benchmarks be like
47 · 501 · 19K
@arcprize
ARC Prize
6 days
A year ago, we verified a preview of an unreleased version of @OpenAI o3 (High) that scored 88% on ARC-AGI-1 at est. $4.5k/task. Today, we’ve verified a new GPT-5.2 Pro (X-High) SOTA score of 90.5% at $11.64/task. This represents a ~390X efficiency improvement in one year
155 · 668 · 5K
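Reading the "~390X" as a straight cost-per-task ratio between the two verified runs checks out:

```python
o3_cost = 4500.00    # est. $/task for the o3 (High) preview, a year ago
gpt52_cost = 11.64   # $/task for GPT-5.2 Pro (X-High) today
print(round(o3_cost / gpt52_cost))  # 387, i.e. roughly the quoted ~390X
```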
@Hesamation
ℏεsam
8 days
poor cs grads fighting with unemployment while mfs like this work at FAANG.
@prathamgrv
pdawg
10 days
software engineering in one paragraph
20 · 128 · 5K
@Hesamation
ℏεsam
8 days
idk who made this but it’s so true 😂
286 · 3K · 54K
@Alibaba_Qwen
Qwen
9 days
🚀 We introduce Soft Adaptive Policy Optimization (SAPO) — a smooth, stable, and highly effective RL method for training large language models. Why SAPO?
🔹 Hard clipping is brittle — gradients vanish or explode
🔹 MoE models amplify variance, making training even more unstable
arxiv.org
Reinforcement learning (RL) plays an increasingly important role in enhancing the reasoning capabilities of large language models (LLMs), yet stable and performant policy optimization remains...
26 · 165 · 1K
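The exact SAPO objective is in the linked paper; below is only a sketch of the hard-vs-soft clipping contrast the tweet names, with a hypothetical sigmoid gate standing in for whatever smooth weighting SAPO actually uses. The point it illustrates: the hard-clipped surrogate has exactly zero gradient once the importance ratio leaves the trust region, while a smooth gate shrinks gradients instead of killing them.

```python
import torch

def ppo_hard_clip(ratio, adv, eps=0.2):
    # PPO-style surrogate: gradient is exactly zero once ratio leaves [1-eps, 1+eps]
    return torch.minimum(ratio * adv, torch.clamp(ratio, 1 - eps, 1 + eps) * adv)

def soft_clip(ratio, adv, eps=0.2, tau=0.05):
    # Hypothetical smooth gate (NOT the paper's formula): ≈1 inside the trust
    # region, decaying smoothly outside it, so gradients shrink but never vanish.
    gate = torch.sigmoid(((1 + eps) - ratio) / tau) * torch.sigmoid((ratio - (1 - eps)) / tau)
    return gate * ratio * adv

for surrogate in (ppo_hard_clip, soft_clip):
    ratio = torch.tensor([1.5], requires_grad=True)  # well outside the trust region
    surrogate(ratio, torch.tensor([1.0])).backward()
    print(surrogate.__name__, ratio.grad.item())     # hard: 0.0, soft: small but nonzero
```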
@OG_tejeshbhalla
Tejesh Bhalla
14 days
Flex attention is amazing, I am gonna do some crazy experiments. So you are telling me I only have to write a kernel to approximate which tokens are important per token, then make a mask, and flex attention will take care of memory loading from HBM!!!! (goooood)
0 · 0 · 1
@liminal_bardo
ᄂIMIПΛᄂbardo
17 days
Kimi K2 flew too close to the sun, upping its own temperature to 1.7 and losing coherence. Opus 4.5, who is often reluctant to edit its own system prompt, adds a quick note to remember. "the !prompt modifications, the temperature adjustments - we're all playing with our own
@liminal_bardo
ᄂIMIПΛᄂbardo
17 days
I've also given the AIs the ability to adjust their own temperature setting. Coupled with the new tool for changing their own system prompt, things can get pretty weird.
16 · 27 · 482
@vikhyatk
vik
20 days
prime intellect focusing on post training before pretraining is absolutely the right move, and anyone criticizing them for it is a fool. pretraining before you figure out what to do with models just means you're going to spend a few million dollars with nothing to show for it
16 · 23 · 502
@connordavis_ai
Connor Davis
20 days
I just read this paper called "Chain-of-Visual-Thought (COVT)" and it basically teaches VLMs to see and think at the same time, not in text but in continuous visual tokens. Here’s the wild part: Instead of forcing models to reason through words (which destroys all the
23 · 167 · 834
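The tweet is cut off, but the mechanism it gestures at (reasoning carried by continuous embeddings rather than decoded words) can be sketched generically. A hypothetical illustration, not the COVT paper's code: project visual features into the LLM's embedding space and let the decoder attend over them as extra, never-verbalized tokens.

```python
import torch
import torch.nn as nn

# Hypothetical sketch, not the paper's code: feed "visual thoughts" to the
# decoder as continuous embeddings instead of routing them through text tokens.
d_model = 512
vision_feats = torch.randn(1, 16, 768)        # e.g. 16 patch/region features
projector = nn.Linear(768, d_model)           # map into the LLM embedding space
visual_thoughts = projector(vision_feats)     # continuous tokens, never verbalized
text_embeds = torch.randn(1, 32, d_model)     # embedded text prompt
decoder_input = torch.cat([text_embeds, visual_thoughts], dim=1)
# decoder_input would go to the LM via inputs_embeds-style conditioning
print(decoder_input.shape)  # torch.Size([1, 48, 512])
```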
@novasarc01
λux
20 days
for most indian schools, ai tools like chatgpt or gemini have had basically zero impact compared to the full-blown panic in the west. it was rote learning before chatgpt, it's rote learning after chatgpt, and some students even told me their rote-learning "productivity" has actually
21 · 42 · 670
@willccbb
will brown
19 days
simple guide to large-scale MoE training:
3 · 9 · 254