Adi Renduchintala
@rendu_a
Followers: 590 · Following: 4K · Media: 20 · Statuses: 461
Applied Research Scientist @NVIDIA; former Research Scientist @MetaAI; PhD @jhuclsp. Also lurking on Mastodon: [email protected]
Redwood City, CA
Joined July 2016
We found a new way to get language models to reason. 🤯 No RL, no training, no verifiers, no prompting. ❌ With better sampling, base models can achieve single-shot reasoning on par with (or better than!) GRPO while avoiding its characteristic loss in generation diversity.
69
251
2K
We are excited to release the Nvidia-Nemotron-Nano-V2 model! This is a 9B hybrid SSM model with an open base model and training data. The model also supports runtime "thinking" budget control. HF collection with base and post-trained models: https://t.co/n3M01d8lSm
10
61
299
We have been hard at work on improving hybrid models! Looking forward to seeing how Mamba hybrid models shape the reasoning LLM space. I'm super excited to be a part of this effort.
We're excited to share leaderboard-topping 🏆 NVIDIA Nemotron Nano 2, a groundbreaking 9B parameter open, multilingual reasoning model that's redefining efficiency in AI and earned the leading spot on the @ArtificialAnlys Intelligence Index leaderboard among open models within
0
1
13
NVIDIA’s Graduate Fellowship Program is now accepting applications for the 2026–2027 academic year. Selected Ph.D. students receive tuition and stipend coverage up to $60K, plus mentorship and technical support from top NVIDIA researchers during an NVIDIA internship. If you’re
7
59
222
AI model post-training is rapidly improving. The plot below (starting from the same base model) illustrates about 10 months of progress in *open* post-training research. I'm not convinced that closed research can move as fast.
1
4
22
Transformers still dominate the LLM scene, but we show that higher-throughput alternatives exist that are just as strong! Grateful to have a part in the Nemotron-H Reasoning effort. 🙏 Technical report will be out soon, stay tuned!
👀 Nemotron-H tackles large-scale reasoning while maintaining speed -- with 4x the throughput of comparable transformer models.⚡ See how #NVIDIAResearch accomplished this using a hybrid Mamba-Transformer architecture, and model fine-tuning ➡️ https://t.co/AuHYANG9gX
1
7
34
Some people have said that OpenAI achieved state-of-the-art results on the SWE-Bench Verified leaderboard with their Codex model, but that's actually not quite correct, no matter how you measure it. A quick 🧵
3
28
175
NSF budgets slashed by 50%, ongoing grants cancelled, NSF staff drastically reduced, all 37 divisions abolished, and grants will now be reviewed by a political kommissar. How will that help technological leadership? https://t.co/f9b4yOXdZk
86
145
834
idk dude, I come here and look at my feed, and literally everyone on my following feed is subscribed to DOGE, yet not a single professional scientific researcher has noted that every division at the NSF was just abolished
2
4
64
Markets down 📉 NVIDIA LLM research 📈 Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models! Better or on-par accuracy compared to similar open-source transformer models while being up to ✨ 3× faster at inference ✨ https://t.co/s9MvKYBHEc
0
5
23
We are excited to release Llama-Nemotron-Ultra! This is a reasoning ON/OFF, dense 253B model, with open weights and post-training data. https://t.co/dCTacylBR8 We started with Llama-405B, modified it via NAS pruning, then applied reasoning-focused post-training: SFT + RL in FP8.
24
124
708
Nemotron-H: A family of Hybrid Mamba-Transformer LLMs.
* Hybrid architecture means up to 3X faster at the same accuracy
* Trained in FP8
* Great for VLMs
* Weights and instruct versions to come soon.
https://t.co/h3dLuDuiUz
18
103
637
This paper provides some really interesting insights: 1. Previously, people found that Qwen base models are particularly good at R1-style training, showing strong exploration skills. This paper shows that there is no magic about Qwen base models; they were likely pre-trained with
🪂 Understanding R1-Zero-Like Training: A Critical Perspective
* DeepSeek-V3-Base already exhibits "Aha moment" before RL-tuning??
* The ever-increasing output length in RL-tuning might be due to a BIAS in GRPO??
* Getting GRPO Done Right, we achieve a 7B AIME sota! 🧵 📜Full
10
87
573
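For context on the GRPO bias discussed above, here is a minimal, simplified sketch (my own illustration, not code from the paper) of the group-relative advantage and the per-response length normalization that the thread calls into question:

```python
import torch

def grpo_advantages(rewards: torch.Tensor) -> torch.Tensor:
    # Group-relative advantages: each sampled response's reward is centred on
    # the group mean and scaled by the group standard deviation (original GRPO).
    return (rewards - rewards.mean()) / (rewards.std() + 1e-8)

def grpo_response_loss(token_logprobs: torch.Tensor, advantage: torch.Tensor,
                       length_normalize: bool = True) -> torch.Tensor:
    # Simplified policy-gradient term for one sampled response.
    # The 1/|o| length normalization below is one of the terms the paper
    # examines as a possible driver of ever-growing response lengths.
    loss = -(advantage * token_logprobs).sum()
    return loss / token_logprobs.numel() if length_normalize else loss
```

The normalization choices (mean/std scaling of rewards, division by response length) are exactly the knobs the thread above questions.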
We are excited to release new Llama-Nemotron models. These models allow you to toggle reasoning ON/OFF at runtime. We also release all the post-training data under CC-BY-4.0! Try it now on https://t.co/Q7jFIMi0po HF collection: https://t.co/E8mNZlUXz0
8
43
194
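The release above mentions a runtime reasoning ON/OFF switch. A minimal sketch of how such a toggle is typically exercised with Hugging Face transformers, assuming the switch is exposed as a system-prompt flag; the model id and flag string below are illustrative, so check the collection's model cards for the exact values:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative model id from the Llama-Nemotron collection (assumption: verify on HF).
model_id = "nvidia/Llama-3_1-Nemotron-Nano-8B-v1"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Assumption: reasoning is toggled via a system-prompt flag such as
# "detailed thinking on" / "detailed thinking off"; the model card has the exact string.
messages = [
    {"role": "system", "content": "detailed thinking on"},
    {"role": "user", "content": "What is 17 * 23? Explain briefly."},
]
input_ids = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=512)
print(tok.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```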
DPO, SimPO, RPO, ... There are just too many **PO**s in the NLP/LLM world! 💥💥💥😲 If you wonder which PO truly works best, how to make them even better, and how they interconnect, read our latest paper at https://t.co/n5i63aoqha 👇 (1/3)
1
6
13
Our team put together a unified mathematical framework to analyze popular model alignment algorithms. “Reward-aware Preference Optimization: A Unified Mathematical Framework for Model Alignment” https://t.co/qz99EZGJx0
2
19
72
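For readers unfamiliar with the "PO" family discussed in the two posts above, here is a minimal sketch of the DPO objective, the baseline most of these variants modify; this is a generic textbook implementation, not code from the RPO paper:

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    # DPO pushes the policy to prefer the chosen response over the rejected one,
    # measured via log-prob ratios against a frozen reference model.
    chosen_logratio = policy_chosen_logps - ref_chosen_logps
    rejected_logratio = policy_rejected_logps - ref_rejected_logps
    return -F.logsigmoid(beta * (chosen_logratio - rejected_logratio)).mean()

# Toy usage with dummy sequence-level log-probabilities.
loss = dpo_loss(torch.tensor([-5.0]), torch.tensor([-9.0]),
                torch.tensor([-6.0]), torch.tensor([-8.0]))
```

Variants tweak this template (e.g., SimPO drops the reference model); the RPO paper linked above analyzes how such objectives relate within one reward-aware framework.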
So collusion rings are still a thing at top conferences. @openreviewnet has all the data across all top-tier ML conferences: what are they currently doing to solve this? Which top-tier conference is willing to release data on bidding patterns so we can analyze the problem at scale?
18
19
289
🤔 Pre-training as Ilya knows it will end, but not for us. At NeurIPS, @ilyasut shared an insightful perspective: "pre-training as we know it will end". I fully agree that agents, synthetic data, and inference-time computing are critical breakthroughs for superintelligence,
26
90
734
🥳 Once again I am looking for PhD students in 💬persuasion, 💬dialogues, and ⛑️AI safety to join our CHATS lab @Northeastern in Fall 2025! 🥳 Apply by 12/15 in both ECE and CS! Let's build and break chatbots together 🥳! https://t.co/uXf2P1FKSG
4
42
158