Nilesh Gupta
@nileshgupta2797
Followers
240
Following
1K
Media
13
Statuses
106
CS PhD @UTAustin, SR @GoogleDeepMind, previously RF @MSFTResearch and CS BTech @IITBombay
Joined October 2018
LLMs are becoming the fundamental unit of compute — not just for generation, but also information retrieval. We present "Scalable In-context Ranking with Generative Models" — a step toward retrieval-native LLMs — models that understand & optimize retrieval internally, rather than
1
3
15
Huge thanks to Devvrit Khatri for coming on the Delta Podcast! Check out the podcast episode here: https://t.co/wmsDjqFbPn
2
2
7
Wow - seems like a neat way of capitalizing on 🤗 model-verse!
The main breakthrough of GPT-5 was to route your messages between a couple of different models to give you the best, cheapest & fastest answer possible. This is cool but imagine if you could do this not only for a couple of models but hundreds of them, big and small, fast and
0
0
1
The cleanest RL scaling results I've seen so far🤯. Amazing to see how much valuable insight you can get when the premise is not necessarily to come up with a "new" method and just figure out what works (ofc while also being supercharged with 400K GPU hours). Congratssss
Wish to build scaling laws for RL but not sure how to scale? Or what scales? Or would RL even scale predictably? We introduce: The Art of Scaling Reinforcement Learning Compute for LLMs
0
4
58
New @GoogleDeepMind paper makes ranking inside the prompt faster by changing how attention is wired and using that signal to score documents. It reports 4.7x lower latency at 100 candidates and can handle 500 candidates in about 1 second. The big deal is that it turns
7
25
152
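For intuition, here is a rough sketch (not the paper's implementation; layer/head choice and pooling are assumptions) of how documents could be scored from the attention signal in a single prefill pass: pool the attention that query tokens place on each candidate's span and rank by that mass.

```python
import torch

# Illustrative only: rank candidate documents by the attention mass that
# query tokens place on each document span during one prefill pass.
# The layer/head selection and mean-pooling here are assumptions.
def rank_by_attention(attn, doc_spans, query_span):
    # attn: (seq, seq) attention probabilities from a chosen layer/head
    q_rows = attn[query_span[0]:query_span[1]]               # query-token rows
    doc_scores = [q_rows[:, s:e].sum(dim=-1).mean().item()   # mass per document
                  for (s, e) in doc_spans]
    order = sorted(range(len(doc_spans)), key=lambda i: -doc_scores[i])
    return order, doc_scores

# Toy usage with a random attention map: 3 documents of 5 tokens each,
# followed by a 3-token query.
attn = torch.softmax(torch.randn(18, 18), dim=-1)
order, scores = rank_by_attention(attn, [(0, 5), (5, 10), (10, 15)], (15, 18))
```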
(6/6) Work done at Google with @ChongYou3, Srinadh Bhojanapalli, Sanjiv Kumar, @inderjit_ml, Felix Yu. Paper: https://t.co/bJTJkf2SPB (to appear at Neurips'25) Code: https://t.co/uTrEECS9gd (coming soon)
0
0
3
(5/6) BlockRank (Mistral-7B) matches SOTA in-context rankers while being orders of magnitude more efficient at scale.
1
0
1
(4/6) Building on this, we propose BlockRank — blockwise structured sparse attention + an auxiliary contrastive loss to improve sharpness of attention on the relevant documents — enabling efficient attention-based inference.
1
0
1
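A hedged sketch of what the auxiliary contrastive loss on attention could look like, assuming we pool the attention mass that query tokens place on each document block and treat the pooled masses as logits against the labeled relevant document; the paper's actual layer/head choice, pooling, and temperature may differ.

```python
import torch
import torch.nn.functional as F

# Sketch of an auxiliary contrastive objective that sharpens attention on the
# relevant document. Pooling and the use of raw mass as logits are assumptions.
def attention_contrastive_loss(attn, doc_spans, query_span, relevant_idx):
    # attn: (seq, seq) attention probabilities from a chosen layer/head
    q_rows = attn[query_span[0]:query_span[1]]                # (q_len, seq)
    scores = torch.stack([q_rows[:, s:e].sum(dim=-1).mean()   # mass per doc block
                          for (s, e) in doc_spans])
    # Pull attention mass toward the labeled relevant document and away
    # from the other candidates.
    return F.cross_entropy(scores.unsqueeze(0), torch.tensor([relevant_idx]))
```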
(3/6) We show that for ICR, the full causal attention of LLMs can be replaced with a blockwise structured sparse attention without loss in performance — yielding linear scaling in the number of in-context documents. Moreover, ranking signals can be inferred during the prefill
1
0
1
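A minimal sketch of such a blockwise structured attention mask, assuming each document block attends only to the shared instruction and to itself, while the trailing query block attends to everything before it; the exact block layout in the paper may differ. Because a document's attention cost no longer depends on the other documents, total prefill cost grows linearly with the number of in-context documents.

```python
import torch

# Blockwise structured sparse attention mask (illustrative layout):
# instruction block, then one block per document, then the query block.
def blockwise_mask(instr_len, doc_lens, query_len):
    total = instr_len + sum(doc_lens) + query_len
    allowed = torch.zeros(total, total, dtype=torch.bool)
    # Instruction tokens attend within the instruction block.
    allowed[:instr_len, :instr_len] = True
    # Each document block attends to the instruction and to itself only.
    start = instr_len
    for n in doc_lens:
        allowed[start:start + n, :instr_len] = True
        allowed[start:start + n, start:start + n] = True
        start += n
    # Query tokens attend to the full prefix.
    allowed[start:, :] = True
    # Keep causality inside every allowed block.
    causal = torch.tril(torch.ones(total, total)).bool()
    return allowed & causal

mask = blockwise_mask(instr_len=4, doc_lens=[5, 5, 5], query_len=3)
```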
(2/6) In-context Ranking / Retrieval (ICR) leverages contextual understanding of LLMs for IR by directly incorporating the task description, candidate documents, and the query into the model’s input prompt and tasks the LLM to identify relevant document(s).
1
0
1
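To make the ICR setup concrete, a minimal sketch of how such a prompt could be assembled; the section markers and instruction wording are illustrative assumptions, not the paper's format.

```python
# Illustrative ICR prompt assembly: task description, candidate documents,
# and the query all go into one model input.
def build_icr_prompt(task_description, documents, query):
    parts = [task_description]
    for i, doc in enumerate(documents):
        parts.append(f"[Document {i}] {doc}")
    parts.append(f"[Query] {query}")
    parts.append("Answer with the identifier of the most relevant document.")
    return "\n".join(parts)

prompt = build_icr_prompt(
    "Rank the candidate documents by their relevance to the query.",
    ["The Eiffel Tower is located in Paris.",
     "Sparse attention reduces the cost of long prompts."],
    "Which city is the Eiffel Tower in?",
)
```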
Absolute gold 💛
A fun conversation with the amazing @shripati and his team at @Primevp_in! Thanks a lot for hosting me.
0
0
1
🚀 Ideogram Character is almost here! If you missed our live demo, you can watch it here:
48
67
520
Congrats to @UTAustin students Kulin Shah and Vasilis Kontonis who won an Outstanding Paper Award at #ICML2025! Their work pushes the boundaries of how AI models learn and understand the world. We’re proud to see @utcompsci students leading at the frontier of machine learning.
0
5
32
Today, we're releasing a major upgrade to Ideogram 3.0: enhanced realism, more versatile styles, improved prompt following, and greater diversity. You can now use Magic Fill and Extend with 3.0 in Ideogram Canvas to edit both uploaded and generated images. Ideogram 3.0 is
38
108
910
Congratsss @simi_97k
#EMNLP2024 Best Paper 1/5: An image speaks a thousand words, but can everyone listen? On image transcreation for cultural relevance
1
0
3
Happy to share that our paper has been accepted to #EMNLP2024 Findings! See y'all in Miami! w/ amazing co-authors @convexlull @eunsolc
1/ 🎉 Excited to share our latest paper: "Exploring Design Choices for Building Language-Specific LLMs" 📄. We explore adaptation of monolingual and multilingual large language models for specializing to a particular language 🌐🚀 w/ @convexlull @eunsolc
0
1
14
If I could graduate today, I’d be fighting for this position.
Excited to share that the Machine Learning and Optimization team at @GoogleDeepMind India is hiring Research Scientists and Research Engineers! If you're passionate about cutting-edge AI research and building efficient, elastic, and safe LLMs, we'd love to hear from you. Check
2
0
22
Sure matplotlib is cool, but what if I want to load my loss curves into the 2006 hit Flash game LineRider?
50
823
6K
I was thrilled to learn about this best paper award announced today in COLT 2024, the premier learning theory venue. The paper is "Smoothed Analysis for Learning Concepts with Low Intrinsic Dimension" authored by students Gautam Chandrasekaran, Konstantinos Stavropoulos, IFML
2
35
253