Adi Renduchintala Profile
Adi Renduchintala

@rendu_a

Followers
590
Following
4K
Media
20
Statuses
461

Applied Research Scientist @NVIDIA; former: Research Scientist @MetaAI, PhD @jhuclsp. Also lurking on Mastodon: [email protected]

Redwood City, CA
Joined July 2016
@aakaran31
Aayush Karan
1 month
We found a new way to get language models to reason. 🤯 No RL, no training, no verifiers, no prompting. ❌ With better sampling, base models can achieve single-shot reasoning on par with (or better than!) GRPO while avoiding its characteristic loss in generation diversity.
69
251
2K
@kuchaev
Oleksii Kuchaiev
3 months
We are excited to release the Nvidia-Nemotron-Nano-V2 model! This is a 9B hybrid SSM model with an open base model and training data. The model also supports runtime "thinking" budget control. HF collection with base and post-trained models: https://t.co/n3M01d8lSm
10
61
299
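The "thinking budget" idea above can be pictured as capping the number of tokens the model may spend inside its reasoning span before it is forced to close that span and answer. The sketch below is a hypothetical Python illustration of that general mechanism, not Nvidia's implementation; fake_decoder_next_token, the <think>/</think> tags, and the budget value are all assumptions made for the example.

```python
# Hypothetical sketch of runtime "thinking budget" control: once the model has
# spent its token budget inside <think>...</think>, force the span closed so
# decoding moves on to the answer. The decoder below is a stand-in, not a real model.

def fake_decoder_next_token(tokens):
    # Stand-in decoder: "thinks" until </think> appears in the context,
    # then emits an answer token and stops.
    if "</think>" not in tokens:
        return "<think>" if "<think>" not in tokens else "think-step"
    if "answer" not in tokens:
        return "answer"
    return None  # end of generation

def generate_with_thinking_budget(prompt_tokens, budget=3):
    tokens, spent = list(prompt_tokens), 0
    while True:
        tok = fake_decoder_next_token(tokens)
        if tok is None:
            break
        if tok == "think-step":
            spent += 1
            if spent > budget:
                # Budget exhausted: inject </think> so the next step
                # conditions on a closed reasoning span.
                tokens.append("</think>")
                continue
        tokens.append(tok)
    return tokens

print(generate_with_thinking_budget(["question?"], budget=3))
# ['question?', '<think>', 'think-step', 'think-step', 'think-step', '</think>', 'answer']
```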
@rendu_a
Adi Renduchintala
3 months
We have been hard at work on improving hybrid models! Looking forward to seeing how Mamba hybrid models shape the reasoning LLM space. I'm super excited to be a part of this effort.
@NVIDIAAIDev
NVIDIA AI Developer
3 months
We're excited to share leaderboard-topping 🏆 NVIDIA Nemotron Nano 2, a groundbreaking 9B parameter open, multilingual reasoning model that's redefining efficiency in AI and earned the leading spot on the @ArtificialAnlys Intelligence Index leaderboard among open models within
0
1
13
@NVIDIAAIDev
NVIDIA AI Developer
3 months
NVIDIA’s Graduate Fellowship Program is now accepting applications for the 2026–2027 academic year. Selected Ph.D. students receive tuition and stipend coverage up to $60K, plus mentorship and technical support from top NVIDIA researchers during an NVIDIA internship. If you’re
7
59
222
@kuchaev
Oleksii Kuchaiev
5 months
AI model post training is rapidly improving. The plot below (starting from the same base model) illustrates about 10 months of progress in the *open* post-training research. I’m not convinced that closed research can move as fast.
1
4
22
@rendu_a
Adi Renduchintala
5 months
Transformers still dominate the LLM scene, but we show that higher-throughput alternatives exist which are just as strong! Grateful to have played a part in the Nemotron-H Reasoning effort. 🙏 Technical report will be out soon, stay tuned!
@NVIDIAAIDev
NVIDIA AI Developer
5 months
👀 Nemotron-H tackles large-scale reasoning while maintaining speed -- with 4x the throughput of comparable transformer models.⚡ See how #NVIDIAResearch accomplished this using a hybrid Mamba-Transformer architecture, and model fine-tuning ➡️ https://t.co/AuHYANG9gX
1
7
34
@gneubig
Graham Neubig
6 months
Some people have said that OpenAI achieved state of the art results on the SWE-Bench Verified leaderboard with their codex model, but that's actually not quite correct, no matter how you measure it. A quick 🧵
3
28
175
@ylecun
Yann LeCun
6 months
NSF budgets slashed by 50%, ongoing grants cancelled, NSF staff drastically reduced, all 37 divisions abolished, and grants will now be reviewed by a political kommissar. How will that help technological leadership? https://t.co/f9b4yOXdZk
86
145
834
@nsaphra
Naomi Saphra
6 months
idk dude I come here and look at my feed, literally everyone on my following feed is subscribed to DOGE and not a single professional scientific researcher has noted that every division at the NSF was just abolished
2
4
64
@AlekFicek
Aleks Ficek 🧪
7 months
Markets down 📉 NVIDIA LLM research 📈 Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models! Better or on-par accuracy compared to other similar open-sourced transformer models while being up to ✨ 3× faster at inference ✨ https://t.co/s9MvKYBHEc
arxiv.org
As inference-time scaling becomes critical for enhanced reasoning capabilities, it is increasingly becoming important to build models that are efficient to infer. We introduce Nemotron-H, a family...
0
5
23
@kuchaev
Oleksii Kuchaiev
7 months
We are excited to release Llama-Nemotron-Ultra! This is a reasoning ON/OFF, dense 253B model. Open weights and post-training data. https://t.co/dCTacylBR8 We started with llama-405B, changed it via NAS pruning then followed by reasoning-focused post-training: SFT + RL in FP8.
24
124
708
@ctnzr
Bryan Catanzaro
8 months
Nemotron-H: A family of Hybrid Mamba-Transformer LLMs. * Hybrid architecture means up to 3X faster at the same accuracy * Trained in FP8 * Great for VLMs * Weights and instruct versions to come soon. https://t.co/h3dLuDuiUz
18
103
637
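The throughput claim comes from the hybrid layout: most layers are Mamba-style SSM blocks whose cost is linear in sequence length, and only a few are full self-attention. Below is a minimal PyTorch sketch of that interleaving idea; ToySSMBlock is a gated causal convolution standing in for a real Mamba-2 layer, and the layer pattern is invented for the example, so treat it as an illustration of the architecture shape rather than Nemotron-H itself.

```python
import torch
import torch.nn as nn

class ToySSMBlock(nn.Module):
    """Stand-in for a Mamba-style layer: a gated causal depthwise conv.
    Cost is linear in sequence length, unlike self-attention."""
    def __init__(self, d_model, kernel=4):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.conv = nn.Conv1d(d_model, d_model, kernel, groups=d_model,
                              padding=kernel - 1)
        self.gate = nn.Linear(d_model, d_model)
        self.proj = nn.Linear(d_model, d_model)

    def forward(self, x):                      # x: (batch, seq, d_model)
        h = self.norm(x)
        # Causal conv: left-pad then keep only the first seq positions.
        c = self.conv(h.transpose(1, 2))[..., : x.size(1)].transpose(1, 2)
        return x + self.proj(c * torch.sigmoid(self.gate(h)))

class AttentionBlock(nn.Module):
    """Standard causal self-attention layer, used only at a few positions."""
    def __init__(self, d_model, n_heads=4):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x):
        h = self.norm(x)
        mask = torch.triu(torch.ones(x.size(1), x.size(1), dtype=torch.bool,
                                     device=x.device), diagonal=1)
        out, _ = self.attn(h, h, h, attn_mask=mask)
        return x + out

class HybridStack(nn.Module):
    """Interleave SSM-style and attention layers: mostly 'S', a few 'A'."""
    def __init__(self, d_model, pattern="SSSASSSA"):
        super().__init__()
        self.layers = nn.ModuleList(
            ToySSMBlock(d_model) if p == "S" else AttentionBlock(d_model)
            for p in pattern
        )

    def forward(self, x):
        for layer in self.layers:
            x = layer(x)
        return x

x = torch.randn(2, 16, 64)
print(HybridStack(64)(x).shape)   # torch.Size([2, 16, 64])
```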
@WenhuChen
Wenhu Chen
8 months
This paper provides some really interesting insights: 1. Previously, people found that Qwen base models are particularly good at R1 training, showing strong exploration skills. - This paper shows that there is no magic about Qwen base models. It's likely pre-trained with
@zzlccc
Zichen Liu
8 months
🪂Understanding R1-Zero-Like Training: A Critical Perspective * DeepSeek-V3-Base already exhibits "Aha moment" before RL-tuning?? * The ever-increasing output length in RL-tuning might be due to a BIAS in GRPO?? * Getting GRPO Done Right, we achieve a 7B AIME sota! 🧵 📜Full
10
87
573
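The "BIAS in GRPO" the thread points at is easy to see numerically: GRPO normalizes group-relative advantages by the group's reward standard deviation and averages each response's token losses over its own length, so a long incorrect response receives a much smaller per-token penalty than a short incorrect one. The snippet below is a simplified numeric illustration based on that description, not the authors' code; the rewards and lengths are made up.

```python
import numpy as np

rewards = np.array([1.0, 0.0, 0.0])   # one correct, two incorrect samples in the group
lengths = np.array([20, 10, 200])     # tokens in each sampled response (made up)

# GRPO-style group-relative advantage: (r - mean) / std over the group.
adv_grpo = (rewards - rewards.mean()) / (rewards.std() + 1e-8)

# Per-token gradient scale when each response's loss is averaged over its own length.
per_token_grpo = adv_grpo / lengths

# Debiased variant in the spirit of the paper's fix: drop the std and
# per-response length normalizations (divide by a single constant instead).
adv_plain = rewards - rewards.mean()
per_token_plain = adv_plain / lengths.max()

print("GRPO per-token scale:    ", per_token_grpo)
print("debiased per-token scale:", per_token_plain)
# Under GRPO, the 200-token wrong answer is pushed down ~20x more weakly than the
# 10-token wrong answer, which is argued to inflate output length during RL-tuning.
```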
@kuchaev
Oleksii Kuchaiev
8 months
We are excited to release new Llama-Nemotron models. These models allow you to set reasoning ON/OFF during runtime. We also release all the post-training data under CC-BY-4! Try it now on https://t.co/Q7jFIMi0po HF collection: https://t.co/E8mNZlUXz0
8
43
194
@zhang_yian
Yian Zhang
9 months
DPO, SimPO, RPO, ... There are just too many **PO**s in the NLP/LLM world! 💥💥💥😲 If you wonder which PO truly works the best, how to make them even better, and their inter-connections, read our latest paper at https://t.co/n5i63aoqha 👇 (1/3)
1
6
13
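As a point of reference for the *PO family the tweet is poking fun at, the best-known member, DPO, optimizes a sigmoid loss on the reward margin implied by the policy and a frozen reference model (this is the standard published DPO objective, not a formula from the linked paper):

$$
\mathcal{L}_{\mathrm{DPO}}(\theta) \;=\; -\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}
\left[\log \sigma\!\left(
\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
\;-\;
\beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
\right)\right]
$$

where $y_w$ and $y_l$ are the preferred and rejected responses and $\beta$ scales the implicit reward; roughly speaking, variants such as SimPO and RPO modify how this margin is constructed or normalized.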
@kuchaev
Oleksii Kuchaiev
10 months
Our team put together a unified mathematical framework to analyze popular model alignment algorithms. “Reward-aware Preference Optimization: A Unified Mathematical Framework for Model Alignment” https://t.co/qz99EZGJx0
2
19
72
@sarahookr
Sara Hooker
11 months
So collusion rings are still a thing at top conferences. @openreviewnet has all the data across all top-tier ML conferences — what are they currently doing to solve this? Which top-tier conference is willing to release data on bidding patterns so we can analyze the problem at scale?
18
19
289
@huybery
Binyuan Hui
11 months
🤔 Pre-training as Ilya knows it will end, but not for us. At NeurIPS, @ilyasut shared an insightful perspective: "pre-training as we know it will end". I fully agree that agents, synthetic data, and inference-time computing are critical breakthroughs for the superintelligence,
26
90
734
@infoxiao
Xiao Ma
11 months
Humans saw this and decided to discuss AI. #neurips24
5
11
231
@shi_weiyan
Weiyan Shi
1 year
🥳Once again I am looking for PhD students in 💬persuasion, 💬dialogues and ⛑️AI safety, to join our CHATS lab @Northeastern in Fall 2025! 🥳 Apply by 12/15 both in ECE and CS! Let’s build and break chatbots together🥳! https://t.co/uXf2P1FKSG
4
42
158