Kumar Shridhar @JupyterAI X Profile

Kumar Shridhar

@JupyterAI

Followers

684

Following

3K

Media

73

Statuses

592

Final year PhD student in ML/NLP @eth_en | Past: @MSFTResearch @AIatMeta @AmazonScience @rptu_kl_ld | He/him. Views are my own.

Zurich

Joined August 2017

Kumar Shridhar

@JupyterAI

2 years

Can Large Language Models (LLMs) accurately judge their own generative output? Introducing ART: Ask, Refine, and Trust. 1. ASK important questions to decide if refinement is needed. 2. Execute REFINEMENT. 3. Affirm or withhold TRUST in the refinement.

Aran Komatsuzaki

@arankomatsuzaki

2 years

The ART of LLM Refinement: Ask, Refine, and Trust. Achieves a performance gain of 5 points over self-refinement baselines, while using a much smaller model as the decision maker.

2

9

42

Kumar Shridhar

@JupyterAI

2 months

A student should confidently exploit what it knows and then explore its limits with a teacher. “Learning is often a mix of confidence and curiosity”. Check out how we can balance the two for knowledge distillation at @aclmeeting in Vienna! #ACL2025

Shivam Adarsh

@shivamadarsh99

2 months

I am happy to share that our work SIKeD has been accepted to ACL 2025 @aclmeeting findings! More details below

0

4

Kumar Shridhar

@JupyterAI

2 months

“Reasoning’s like dominoes: nudge that first piece just right and the rest fall perfectly into place.” See you all in Vienna!! #ACL2025

Kushal

@kushalj001

2 months

Happy to share that this paper has now been accepted to ACL 2025 @aclmeeting findings! Paper link: Details in 🧵

0

12

Kumar Shridhar

@JupyterAI

3 months

Gave a simple image to @OpenAI o3 to look into and it zoomed in and out to make sure it’s “q” and not “9”. Is zooming in and out with corresponding text part of some alignment strategy? Or some form of augmentation that works? Any papers in this direction?

0

1

Kumar Shridhar

@JupyterAI

3 months

Thought #OpenAI's deep research would add URLs to BibTeX entries easily. Seemed like a perfect use case given I provide all the sources to look into. But NO, it decided to choose a couple of the entries and ignored all the others.

0

1

Kumar Shridhar

@JupyterAI

3 months

Game the system!

Casper Hansen

@casper_hansen_

3 months

Llama 4 quietly dropped from 1417 to 1273 ELO, on par with DeepSeek v2.5

0

2

Kumar Shridhar

@JupyterAI

3 months

Claude 3.7 Max Thinking in Cursor is hands down the best for cloning anything 💻✨

Deedy

@deedydas

3 months

Claude 3.7 Max Thinking is still the best model for Cursor.
- Better at creating apps, refactoring and adding features.
- Better at figuring out connections between classes.
- Main benefit of Gemini 2.5 is the 1M context

0

1

Kumar Shridhar

@JupyterAI

3 months

The approach works. Consistent improvements over standard distillation. More here:

0

2

Kumar Shridhar

@JupyterAI

3 months

A mismatch between teacher-generated rationales and the student's specific learning requirements is a huge problem in Knowledge Distillation. Solution: the teacher identifies the knowledge gaps in the student's learning and provides tailored rationales that specifically address them.

Kushal

@kushalj001

3 months

Is standard one-step distillation not enough for your smaller agents? 🤖 We present a new iterative distillation approach that enhances teacher🧑‍🏫-student🧑‍🎓 interaction, resulting in improved student performance on math reasoning problems. 🧵

1

7

Kumar Shridhar

@JupyterAI

3 months

It’s crazy how the definition of small models is changing so fast. Now it’s a MoE with 17B active parameters and over 100B parameters in total. Not sure if this will help the open source community train their own models, which was the main reason why Llama was so popular.

Ahmad Al-Dahle

@Ahmad_Al_Dahle

3 months

Introducing our first set of Llama 4 models! We’ve been hard at work doing a complete re-design of the Llama series. I’m so excited to share it with the world today and mark another major milestone for the Llama herd as we release the *first* open source models in the Llama 4

0

3

Kumar Shridhar

@JupyterAI

3 months

RT @hardmaru: 😅

0

145

0

Kumar Shridhar

@JupyterAI

3 months

What’s the equivalent of vibe coding for AI agents? Anything specific people are testing?

0

2

Kumar Shridhar

@JupyterAI

3 months

RT @Tongtian_Zhu: ICML 2025's rebuttal process be like 🤣: 👨‍💻 Authors: spend a whole week writing a careful rebuttal. ✅ Reviewer: clicks "ack…

0

24

0

Kumar Shridhar

@JupyterAI

3 months

Maybe the person had some interest in selling me those seats. He did it knowingly. An AI system would have done the same but unknowingly, following rules. Or maybe it sometimes hallucinates something that doesn’t exist. I wonder what’s worse.

0

Kumar Shridhar

@JupyterAI

3 months

I mentioned it’s the same seats. No upgrade. Apparently seat prices are dynamically set and you can pay to get worse seats. This is what an AI system would have done too. Follow rules without any common sense applied.

1

0

Kumar Shridhar

@JupyterAI

3 months

I paid and literally got the same seats with nothing better. Just another row. I boarded, talked to crew members and they got the same person on the call. He mentioned that I paid to get the seats that I got.

1

0

Kumar Shridhar

@JupyterAI

3 months

Sometimes I wonder what’s worse: a human lying (knowingly) or an LLM hallucinating (unknowingly). Some context: I booked an @airindia flight recently and wanted to upgrade my seats. I called and a human on the other side asked for some money to upgrade my seats.

2

0

1

Kumar Shridhar

@JupyterAI

7 months

RT @Kasparov63: My congratulations to @DGukesh on his victory today. He has summitted the highest peak of all: making his mother happy!

0

5K

0

Kumar Shridhar

@JupyterAI

7 months

RT @AravSrinivas: If @narendramodi ji is interested, I would be down to figuring out an economic structure where all Indian students, facul…

0

1K

0

Kumar Shridhar

@JupyterAI

7 months

RT @NielsRogge: Feeling the same tbh

0

75

0