Swaroop Nath

@swaroopnath6

Followers: 502 | Following: 1K | Media: 48 | Statuses: 614

Pre-Doctoral Researcher @GoogleDeepMind India | Ex-AI Researcher @LinkedIn | NLP @cfiltnlp @CSE_IITBombay | Tweets about RL, NLP, RLHF, and general AI-ML

Joined March 2020
@swaroopnath6
Swaroop Nath
2 years
🚀New Paper Alert💥 Gathering human preferences for RLHF is costly (datasets routinely exceed 20K pairs). Preference is contextual: creative writing --> 👍, otherwise --> 👎. How do we achieve #alignment cheaply and quickly in a contextual setup? (paper 🔗 at the end) ❓ Can domain knowledge help? ✅ Yes ⬇️
1
7
52
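[Not the paper's method, just a rough sketch of the general idea: replace human annotators with a domain-knowledge rule when labeling preference pairs. The scoring heuristic and function names below are hypothetical illustrations.]

```python
# Hypothetical sketch: label preference pairs with a domain-knowledge rule
# instead of human annotators. The heuristic below is an illustrative
# stand-in, not the method from the paper.

def domain_score(prompt: str, response: str) -> float:
    """Toy domain rule: creative-writing prompts reward longer, varied
    responses; other prompts reward concise ones."""
    is_creative = any(w in prompt.lower() for w in ("story", "poem", "creative"))
    words = response.split()
    length = len(words)
    diversity = len(set(words)) / max(length, 1)
    if is_creative:
        return 0.7 * min(length / 200, 1.0) + 0.3 * diversity  # favor rich text
    return 1.0 - min(length / 200, 1.0)  # favor brevity elsewhere

def label_pair(prompt: str, resp_a: str, resp_b: str) -> tuple[str, str]:
    """Return (chosen, rejected) using the domain rule as the annotator."""
    if domain_score(prompt, resp_a) >= domain_score(prompt, resp_b):
        return resp_a, resp_b
    return resp_b, resp_a

# Pairs labeled this way could feed a standard RLHF/DPO pipeline in place
# of (or to bootstrap) costly human preference data.
```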
@swaroopnath6
Swaroop Nath
4 days
Please consider applying to the program. Over two years, my research skills and perspective on research have been broadened and sharpened. This is an exceptional group in the way they groom you and give you room to explore wild ideas. Please reach out if you have questions!
@jainprateek_
Prateek Jain
5 days
Thrilled to note that we are keeping the tradition of the awesome AI residency program alive in a new avatar: pre-doc researcher program at GDM-Blr -- with some amazing work done by our recent predocs including @gautham_ga_ @pranamyapk @puranjay1412 @sahilgo6801 @swaroopnath6
0
1
6
@swaroopnath6
Swaroop Nath
15 days
Heading out to @NeurIPSConf in San Diego in a couple of days. Would love to meet researchers working on Reasoning, Post-Training, etc. DM me if you want to meet! If you have some kickass parties, please do send invites. Would love to attend and meet people :) #neurips25
0
0
1
@swaroopnath6
Swaroop Nath
1 month
The following might sound like victim blaming, but part of the problem is people submitting too many papers. Could we impose a fee on submitters, conditioned on the number of resubmissions of the manuscript, the number of submissions in the past k conferences, ..? Let us as a community stop the paper mills #ICLR2026
0
0
0
@tomssilver
Tom Silver
2 months
This was my experience in grad school, and now I've seen some evidence to suggest a trend 🤔
11
46
850
@sagarcasm
Sagar
2 months
Me after turning off TV
71
447
10K
@swaroopnath6
Swaroop Nath
2 months
Seems like people are catching on to some of Transformers' shortcomings. We showed roughly 1.5 years ago that Transformers are bad at approximating smooth functions. Read more at
arxiv.org
Transformers have become pivotal in Natural Language Processing, demonstrating remarkable success in applications like Machine Translation and Summarization. Given their widespread adoption,...
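[Not the paper's experiment, just a toy probe of the claim: fit a tiny Transformer encoder to the smooth function sin(x) and inspect the fit. All hyperparameters are arbitrary assumptions; compare against an equally sized MLP to see the gap.]

```python
import torch
import torch.nn as nn

# Toy probe (not the paper's setup): regress y = sin(x) pointwise over
# random sequences with a tiny Transformer encoder.
torch.manual_seed(0)

embed = nn.Linear(1, 32)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=32, nhead=4, batch_first=True),
    num_layers=2,
)
head = nn.Linear(32, 1)
params = [*embed.parameters(), *encoder.parameters(), *head.parameters()]
opt = torch.optim.Adam(params, lr=1e-3)

for step in range(1000):
    x = torch.empty(64, 16, 1).uniform_(-3.14, 3.14)  # (batch, seq, 1)
    y = torch.sin(x)                                   # smooth target
    pred = head(encoder(embed(x)))
    loss = nn.functional.mse_loss(pred, y)
    opt.zero_grad(); loss.backward(); opt.step()

print(f"final MSE: {loss.item():.5f}")  # baseline: an MLP of similar size
```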
@askalphaxiv
alphaXiv
2 months
Why Can't Transformers Learn Multiplication? This paper finds that plain training never builds the long-range links multiplication requires. By adding a new auxiliary loss that predicts the "running sum", it enables the model to successfully learn multi-digit multiplication!
0
0
0
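[A rough sketch of what such an auxiliary objective could look like in PyTorch. The head, loss weighting, and running-sum targets are assumptions about the general recipe, not the paper's code.]

```python
import torch
import torch.nn as nn

# Hypothetical sketch: next-token cross-entropy plus an auxiliary MSE on
# per-position "running sum" targets. Shapes and the aux head are
# illustrative, not the paper's implementation.

class AuxSumHead(nn.Module):
    def __init__(self, d_model: int):
        super().__init__()
        self.proj = nn.Linear(d_model, 1)  # scalar running-sum prediction per position

    def forward(self, hidden):                 # hidden: (batch, seq, d_model)
        return self.proj(hidden).squeeze(-1)   # (batch, seq)

def combined_loss(logits, hidden, targets, running_sums, aux_head, alpha=0.1):
    """Next-token CE plus a weighted auxiliary running-sum regression."""
    ce = nn.functional.cross_entropy(
        logits.reshape(-1, logits.size(-1)), targets.reshape(-1)
    )
    aux = nn.functional.mse_loss(aux_head(hidden), running_sums)
    return ce + alpha * aux

# Dummy shapes for illustration:
B, T, V, D = 2, 8, 100, 32
head = AuxSumHead(D)
loss = combined_loss(
    torch.randn(B, T, V), torch.randn(B, T, D),
    torch.randint(0, V, (B, T)), torch.randn(B, T), head,
)
```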
@prajdabre
Raj Dabre
2 months
Professor Pushpak Bhattacharyya passed away this morning. This world has lost a great human being and a researcher. May he rest in peace.
39
30
587
@onlygato
El Gato
3 months
Who names a fingerprint reading machine Eklavya?? 😭😭
508
2K
30K
@swaroopnath6
Swaroop Nath
3 months
Man the whole thread 😆
@_philschmid
Philipp Schmid
3 months
Just read this new research paper from Google AI called "Attention is All You Need" and I think my brain is actually broken 🤯 All our best AI models are stuck processing language one word at a time, in order. It's this huge sequential bottleneck. These researchers just... threw
0
0
3
@swaroopnath6
Swaroop Nath
3 months
"There's no language out there in nature"? A bit unbelievable! Language, or rather communication, forms the basis of collective intelligence. What I do agree on that next-token prediction is probably not teaching the model actual procedural knowledge.
@rohanpaul_ai
Rohan Paul
3 months
Fei-Fei Li says, "There's no language out there in nature....There is a 3D world that follows laws of physics." Quite a few papers say similar things. AI models trained on linguistic signals fail when the task requires embodied physical common sense in a world with real
0
0
1
@swaroopnath6
Swaroop Nath
3 months
The lesson of the week is: Slow is smooth, smooth is fast. Probably one of the best productivity tips I have incorporated this year 🤩
0
0
2
@sahilgo6801
Sahil Goyal
4 months
Earlier, I curated this list of resources for the niche field of "Aesthetic Assessment of Graphic Designs". https://t.co/xKgfD2ZyDa I try to update it, as I think this area has good directions for future research and is very underexplored.
github.com
Collection of Aesthetics Assessment Papers for Graphic Designs. - sahilg06/Awesome-Aesthetics-Assessment
1
1
5
@swaroopnath6
Swaroop Nath
4 months
Not citing my work? Fine! Not citing my jokes? Unforgivable 🥷
0
0
3
@corbtt
Kyle Corbitt
9 months
If you're fine-tuning LLMs, Gemma 3 is the new 👑 and it's not close. Gemma 3 trounces Qwen/Llama models at every size!
- Gemma 3 4B beats 7B/8B competition
- Gemma 3 27B matches 70B competition
Vision benchmarks coming soon!
19
55
497
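[For context, a minimal LoRA fine-tuning setup with Hugging Face transformers + peft. The checkpoint name (the text-only Gemma 3 1B instruct release on the Hub) and all hyperparameters are illustrative assumptions, not anything endorsed in the tweet.]

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

# Hypothetical minimal LoRA setup; checkpoint and hyperparameters are
# illustrative assumptions.
model_id = "google/gemma-3-1b-it"  # text-only Gemma 3; swap for a larger size
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

lora = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # only the adapter weights are trainable
```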
@swaroopnath6
Swaroop Nath
4 months
2. I am unsure how pass@k leads to better exploration. Sure, there are possibly more correct answers in the k samples, but it is unclear how that helps the LLM explore more. Were it really exploring, I would expect pass@1 to also increase. Any clarifications hugely appreciated :)
0
0
0
@swaroopnath6
Swaroop Nath
4 months
Great paper! But at the risk of seeming like Reviewer 2, just two questions: 1. The specific instantiation of pass@k (max of rewards): won't it degenerate to having just one out of k samples correct? I am curious why this degeneration doesn't happen!
@scaling01
Lisan al Gaib
4 months
ByteDance-Seed with another banger all my homies do pass@k training now https://t.co/us2iEeN3qU
1
0
0
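[To make the objection in the thread above concrete, here's a rough numeric sketch of a max-of-k-rewards pass@k objective. It is illustrative only, not the paper's code; the probabilities and reward model are made up.]

```python
import numpy as np

# Illustrative sketch of pass@k-style training reward: sample k responses,
# score each, credit the group with the max reward. Not the paper's code.
rng = np.random.default_rng(0)

def rollout_rewards(k: int, p_correct: float) -> np.ndarray:
    """Toy binary rewards: each of k samples is correct with prob p_correct."""
    return (rng.random(k) < p_correct).astype(float)

def passk_reward(rewards: np.ndarray) -> float:
    return rewards.max()  # the group succeeds if ANY sample is correct

# With p_correct = 0.2 and k = 8, the max reward is ~0.83 on average, so the
# objective can be satisfied by a single correct sample per group -- the
# degeneration the thread asks about -- while pass@1 stays near 0.2.
k, p = 8, 0.2
groups = [rollout_rewards(k, p) for _ in range(10_000)]
print("mean pass@k reward:", np.mean([passk_reward(g) for g in groups]))
print("mean pass@1 reward:", np.mean([g[0] for g in groups]))
```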
@swaroopnath6
Swaroop Nath
4 months
I didn't know version number was a measure of evaluation 🫨 Waiting for Reviewer #3 to ask for this plot now 🫠
@arankomatsuzaki
Aran Komatsuzaki
4 months
OpenAI just cannot stop winning
0
0
0
@DimitrisPapail
Dimitris Papailiopoulos
4 months
Excited about our new work: Language models develop computational circuits that are reusable AND TRANSFER across tasks. Over a year ago, I tested GPT-4 on 200 digit addition, and the model managed to do it (without CoT!). Someone from OpenAI even clarified they NEVER trained
24
81
523
@swaroopnath6
Swaroop Nath
5 months
Whoever this reviewer is, please change your profession :) The aura debt on this is unreal
@2prime_PKU
Yiping Lu
5 months
Anyone know Adam?
1
0
4
@robertarail
Roberta Raileanu
5 months
I’m building a new team at @GoogleDeepMind to work on Open-Ended Discovery! We’re looking for strong Research Scientists and Research Engineers to help us push the frontier of autonomously discovering novel artifacts such as new knowledge, capabilities, or algorithms, in an
job-boards.greenhouse.io
92
259
3K