Kumar Shridhar

@JupyterAI

Followers
684
Following
3K
Media
73
Statuses
592

Final year PhD student in ML/NLP @eth_en | Past: @MSFTResearch @AIatMeta @AmazonScience @rptu_kl_ld | He/him. Views are my own.

Zurich
Joined August 2017
@JupyterAI
Kumar Shridhar
2 years
Can Large Language Models (LLMs) accurately judge their own generative output? Introducing ART: Ask, Refine, and Trust. 1. ASK important questions to decide if refinement is needed. 2. execute REFINEMENT. 3. affirm or withhold TRUST in refinement.
@arankomatsuzaki
Aran Komatsuzaki
2 years
The ART of LLM Refinement: Ask, Refine, and Trust. Achieves a performance gain of 5 points over self-refinement baselines, while using a much smaller model as the decision maker.
2
9
42
@JupyterAI
Kumar Shridhar
2 months
A student should confidently exploit what it knows and then explore its limits with a teacher. “Learning is often a mix of confidence and curiosity”. Check out how we can balance the two for knowledge distillation at @aclmeeting in Vienna! #ACL2025
@shivamadarsh99
Shivam Adarsh
2 months
I am happy to share that our work SIKeD has been accepted to ACL 2025 @aclmeeting findings! More details below
0
0
4
@JupyterAI
Kumar Shridhar
2 months
“Reasoning’s like dominoes: nudge that first piece just right and the rest fall perfectly into place.” See you all in Vienna!! #ACL2025
@kushalj001
Kushal
2 months
Happy to share that this paper has now been accepted to ACL 2025 @aclmeeting findings! Paper link: Details in 🧵.
0
0
12
@JupyterAI
Kumar Shridhar
3 months
Gave a simple image to @OpenAI o3 to look into, and it zoomed in and out to make sure it’s “q” and not “9”. Is zooming in and out with corresponding text part of some alignment strategy? Or some form of augmentation that works? Any papers in this direction?
0
0
1
@JupyterAI
Kumar Shridhar
3 months
Thought #OpenAI's deep research would add URLs to BibTeX entries easily. Seemed like a perfect use case given I provide all the sources to look into. But NO, it decided to choose a couple of the entries and ignored all the others.
0
0
1
@JupyterAI
Kumar Shridhar
3 months
Game the system!
@casper_hansen_
Casper Hansen
3 months
Llama 4 quietly dropped from 1417 to 1273 ELO, on par with DeepSeek v2.5
0
0
2
@JupyterAI
Kumar Shridhar
3 months
Claude 3.7 Max Thinking in Cursor is hands down the best for cloning anything 💻✨.
@deedydas
Deedy
3 months
Claude 3.7 Max Thinking is still the best model for Cursor. - Better at creating apps, refactoring and adding features. - Better at figuring out connections between classes. - Main benefit of Gemini 2.5 is the 1M context
0
0
1
@JupyterAI
Kumar Shridhar
3 months
The approach works. Consistent improvements over standard distillation. More here:
0
0
2
@JupyterAI
Kumar Shridhar
3 months
A mismatch between teacher-generated rationales and the student's specific learning requirements is a huge problem in Knowledge Distillation. Solution: the teacher identifies the knowledge gaps in the student's learning and provides tailored rationales that specifically address them.
@kushalj001
Kushal
3 months
Is standard one-step distillation not enough for your smaller agents? 🤖 We present a new iterative distillation approach that enhances teacher🧑‍🏫-student🧑‍🎓 interaction, resulting in improved student performance on math reasoning problems. 🧵
1
1
7
@JupyterAI
Kumar Shridhar
3 months
It’s crazy how the definition of small models is changing so fast. Now it’s a 17B MoE with over 100B parameters. Not sure if this will help the open source community train their own models, which was the main reason Llama was so popular.
@Ahmad_Al_Dahle
Ahmad Al-Dahle
3 months
Introducing our first set of Llama 4 models!. We’ve been hard at work doing a complete re-design of the Llama series. I’m so excited to share it with the world today and mark another major milestone for the Llama herd as we release the *first* open source models in the Llama 4
0
0
3
@JupyterAI
Kumar Shridhar
3 months
RT @hardmaru: 😅
0
145
0
@JupyterAI
Kumar Shridhar
3 months
What’s the equivalent of vibe coding for AI agents? Anything specific people are testing?
0
0
2
@JupyterAI
Kumar Shridhar
3 months
RT @Tongtian_Zhu: ICML 2025's rebuttal process be like🤣: 👨‍💻 Authors: spend a whole week writing a careful rebuttal. ✅ Reviewer: clicks "ack…
0
24
0
@JupyterAI
Kumar Shridhar
3 months
Maybe the person had some interest in selling me those seats. He did it knowingly. An AI system would have done the same but unknowingly, following rules. Or maybe it sometimes hallucinates something that doesn’t exist. I wonder what’s worse.
0
0
0
@JupyterAI
Kumar Shridhar
3 months
I mentioned it’s the same seats. No upgrade. Apparently seat prices are set dynamically, and you can pay to get worse seats. This is what an AI system would have done too: follow rules without applying any common sense.
1
0
0
@JupyterAI
Kumar Shridhar
3 months
I paid and literally got the same seats, nothing better. Just another row. I boarded and talked to the crew members, and they got the same person on a call. He said that I had paid to get the seats that I got.
1
0
0
@JupyterAI
Kumar Shridhar
3 months
Sometimes I wonder what’s worse: a human lying (knowingly) or an LLM hallucinating (unknowingly). Some context: I booked an @airindia flight recently and wanted to upgrade my seats. I called, and a human on the other side asked for some money to upgrade my seats.
2
0
1
@JupyterAI
Kumar Shridhar
7 months
RT @Kasparov63: My congratulations to @DGukesh on his victory today. He has summitted the highest peak of all: making his mother happy!
0
5K
0
@JupyterAI
Kumar Shridhar
7 months
RT @AravSrinivas: If @narendramodi ji is interested, I would be down to figuring out an economic structure where all Indian students, facul…
0
1K
0
@JupyterAI
Kumar Shridhar
7 months
RT @NielsRogge: Feeling the same tbh
0
75
0