Avijit Ghosh @evijit X Profile

Avijit Ghosh

@evijit

Followers

3K

Following

27K

Media

585

Statuses

6K

Technical AI Policy Researcher @huggingface 🤗. Current focus: Responsible AI, AI for Science, and @evaluatingevals!

https://t.co/J9Oae4IyZk

Boston, Massachusetts

Joined January 2012

Avijit Ghosh

@evijit

1 month

Today, @evaluatingevals is introducing Every Eval Ever, a unified, open data format and public dataset for AI evaluation results.

4

19

54

Avijit Ghosh

@evijit

2 days

What????? 🤦‍♂️

Adrija Bose

@adrijabose

3 days

If you know me, you’ve heard me talk about this story for months. @HeraRizwan reported from 3 states. We obsessed over every detail. Google's AI, designed for phones, is now rationing food to pregnant women. Read. Get angry. Share https://t.co/1heWRv9Ghj @pulitzercenter

0

Avijit Ghosh

@evijit

2 days

So happy to see Every Eval Ever (@evaluatingevals) take off! This is a big vote of confidence, and we really hope that we, as a community of eval practitioners, can move towards open standards that unlock scientific rigor and reproducibility. Thanks @mercor_ai !

Mercor

@mercor_ai

3 days

We just submitted APEX-Agents, APEX-1 and ACE to @evaluatingevals on @huggingface, an OSS initiative to standardize evals and try to reduce the noise in benchmarking.

1

6

18

David Pfau

@pfau

4 days

The degree to which AI research at the big labs has almost entirely been reduced to hill climbing is actually an aberration and not reflective of the rest of science at all. Ironically this means AI research is probably the easiest branch of research to automate.

Georgia Channing

@cgeorgiaw

4 days

I’ve been at a small conference this week, one where the AI people have been presenting early in the week and the domain science people will be presenting later in the week. At the end of the talks last night, the conversation turned very doomer with all the AI people talking

9

15

234

Leshem (Legend) Choshen 🤖🤗 @NeurIPS

@LChoshen

4 days

Evaluation research? There's no place like evalEval

EvalEval Coalition

@evaluatingevals

4 days

3 days left! 📷 Writing, wrote, or just submitted a paper? Commit it to the EvalEval workshop at ACL 2026 in San Diego! https://t.co/JRSr50UA8y (including ARR Submissions, non-archival, positions, and extended abstracts!) Submission Deadline: March 19th, 2026 AoE

2

14

Avijit Ghosh

@evijit

4 days

This is a good time to mention that the latest versions of both Claude and ChatGPT detect the hidden phrases and warn you of prompt injection, so I’m curious as to how this happened anyway/which LLMs were still susceptible

Yu-Xiang Wang

@yuxiangw_cs

4 days

AI watermarking in action at #ICML's avant garde peer-review experiments this year! Quite a few casualties in my SAC batch (an example below --- appropriately redacted hopefully)

0

2

0

Avijit Ghosh

@evijit

4 days

https://t.co/SPQpfZfV2W

huggingface.co

A Blog post by Hugging Face on Hugging Face

0

2

Avijit Ghosh

@evijit

4 days

We at @huggingface are fortunate to have a unique vantage point on the state of open source AI development. We finally wrote down our observations, from both our own research and that of our peers who have done excellent work investigating the open ecosystem with Hugging Face hub

2

5

8

Avijit Ghosh

@evijit

4 days

https://t.co/id4R2m5ks6

0

3

Avijit Ghosh

@evijit

4 days

Always a hoot reading Georgia’s takes! Case in point: While Rosie the dog’s cancer-treating mRNA vaccine made with LLMs+AlphaFold went viral, several domain scientists on here have pointed out both novelty issues and the structural problems with generalizing this to large scale

Georgia Channing

@cgeorgiaw

4 days

I’ve been at a small conference this week, one where the AI people have been presenting early in the week and the domain science people will be presenting later in the week. At the end of the talks last night, the conversation turned very doomer with all the AI people talking

2

0

9

Avijit Ghosh

@evijit

6 days

Imagine if my little AGI robot knew how to put my “I have worn it once but it’s not yet dirty enough to launder” clothes on this purgatory chair 😍

Millennial Marketer 📣

@1nefortunate

7 days

@simonegiertz made a chair you can dedicate to your laundry, and I love the design. She saw a common problem, then brought a solution. What do you think?

1

Patrick Heizer

@PatrickHeizer

6 days

Long post, because apparently many neither understand nor appreciate the intricacies of cancer research and think that pharmaceutical companies and regulators are holding back cures. Your immune system is constantly surveilling your body for both self and non-self recognition.

Eddy Lazzarin 🟠🔭

@eddylazzarin

7 days

“You guys are overhyping this” “Yes we can cure cancer and do regularly this way” “Yes the primary obstacles are regulatory/liability” uh

98

123

964

Patrick Heizer

@PatrickHeizer

7 days

Sorry to be the downer because this is an impressive story in some senses. But it is ~trivially easy to make a single mRNA vaccine. It's not hard. I cure mice of various cancers with various therapeutics all the time. I've made mice lose more weight in a month than tirzepatide

Séb Krier

@sebkrier

8 days

This is wild. https://t.co/fA4oTX8fB9

943

420

6K

Alexander Doria

@Dorialexander

7 days

Unusual open data move by a major AI lab: StepFun releases the general SFT training set of Stepfun-Flash.

testtm

@test_tm7873

7 days

Eyy @StepFun_ai released the dataset https://t.co/V8KxKh4EyY :)

6

17

171

Séb Krier

@sebkrier

8 days

This is wild. https://t.co/fA4oTX8fB9

225

2K

13K

Hugging Face

@huggingface

8 days

Seeing the worldwide demand we are kicking off global applications for Hugging Face Builders! If you're passionate about open AI and love bringing people together, this is your invitation to lead ✉️ Learn more about the program and apply to become a Builder ➡️

12

28

245

Avijit Ghosh

@evijit

9 days

👀

Michael Livs

@micLivs

9 days

Anthropic shipped generative UI for Claude. I reverse-engineered how it works and rebuilt it for PI. Extracted the full design system from a conversation export. Live streaming HTML into native macOS windows via morphdom DOM diffing. Article: https://t.co/C3FLF3JB8Z Repo:

0

Avijit Ghosh

@evijit

9 days

Next step: Open sourcing this UX stack 😈 who’s building a nice wrapper that does responsive UX where we can swap out the models in the back end?

2

0

1

Avijit Ghosh

@evijit

9 days

A welcome development! (I’ve been on an anti chatbot rant lately)

Adam Feldman

@feldman

9 days

Starting today, Claude no longer defaults to text. Claude is learning to choose the best medium for each response — based on the task, the data, and what's most useful for the person. Give it a try!

2

Parimal

@Fintech03

12 days

A massive moment for Sovereign AI in India. Keep an eye out for GGUF quants that will soon allow this to run on 64-128GB Macs. If you are building a tool for the Indian market, this is your base model. It handles Hinglish & 22 official languages with a fertility rate (token

Sarvam for Developers

@SarvamForDevs

12 days

Sarvam-105B is trending on Hugging Face 🚀

8

68

409

Jesse Dodge

@JesseDodge

12 days

This looks cool! It would be great if we had a unified way to report eval results. This is basically a reproducibility problem. Two keys to making this work: reporting info about the models (e.g., num params, tokens trained), and eval settings (e.g., num shots). 1/3

EvalEval Coalition

@evaluatingevals

12 days

🧪 Your LLM evaluation results could help the whole field 🚀 🧑‍🔬 Our ACL Shared task is out! We’re building a unified, crowdsourced database to create a common language for AI evaluation reporting. And we need your data. (1/2) https://t.co/SQhEVsqEWg

2

4

18