Debangan Mishra
@DebanganM10375
Followers 35 · Following 14 · Media 4 · Statuses 12
Student @iiit_hyderabad, interested in artificial intelligence, AI safety and computer vision.
Hyderabad
Joined April 2024
Our work might be of interest to: @linguist_cat @nabla_theta @singhshiviii @zhaofeng_wu @monojitchou @aidangomez @cohere @nickfrosst @1vnzh @LChoshen
8/8 Work with: @rastogiarihant1 @NegiAgyeya @ShashwatGoel7 @ponguru. Preprint: https://t.co/kDBTbiTSkP. You can find more details about the functional similarity metric, CAPA, along with proofs of its robustness, in this paper:
arxiv.org
As Language Model (LM) capabilities advance, evaluating and supervising them at scale is getting harder for humans. There is hope that other language models can automate both these tasks, which we...
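The chance-adjusted style of agreement behind a metric like CAPA can be illustrated with a minimal Cohen's-kappa-style sketch. This is not the paper's actual CAPA formula (which, per the preprint, also accounts for output probabilities); the function and the answer lists below are illustrative assumptions only.

```python
# Minimal sketch: agreement between two models' multiple-choice answers,
# corrected for the agreement expected by chance alone.
from collections import Counter

def chance_adjusted_agreement(answers_a, answers_b):
    n = len(answers_a)
    # Fraction of items where the two models pick the same option.
    observed = sum(a == b for a, b in zip(answers_a, answers_b)) / n
    # Agreement expected if each model answered independently,
    # following its own marginal option frequencies.
    freq_a, freq_b = Counter(answers_a), Counter(answers_b)
    expected = sum(freq_a[o] * freq_b.get(o, 0) for o in freq_a) / n**2
    return (observed - expected) / (1 - expected)

# Hypothetical answer sheets for two models on six questions.
model_x = ["A", "B", "C", "A", "D", "B"]
model_y = ["A", "B", "C", "D", "D", "A"]
print(round(chance_adjusted_agreement(model_x, model_y), 3))  # → 0.556
```

A score of 1 means perfect agreement, 0 means no more agreement than chance; correcting for chance is what separates this family of metrics from raw percent-match.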
7/8 Our exploratory analysis shows that adopting a functional-similarity-oriented approach to multilinguality yields several exciting observations and insights. It is time to take a step beyond accuracy on multilingual benchmarks!
6/8 Lastly, we also find that on GlobalMMLU, each model tends to be more consistent with itself across languages than different models are with each other. This can have exciting downstream consequences for multilingual multi-agent systems!
5/8 We explore the hypothesis: do models become more consistent for languages with more data? It turns out they do! Using the number of Wikipedia articles per language as a proxy, we find that models are, on average, more consistent for high-resource languages.
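The resource-level analysis above amounts to a rank correlation between a data-availability proxy and average cross-model consistency. Here is a hedged sketch; the article counts and consistency values are invented for illustration, not the paper's measurements.

```python
# Spearman rank correlation, implemented from scratch for a toy example
# with no tied values.

def ranks(xs):
    # Rank of each element (0 = smallest).
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    for rank, i in enumerate(order):
        r[i] = float(rank)
    return r

def spearman(x, y):
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx) ** 0.5
    vy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (vx * vy)

# Hypothetical: Wikipedia article counts vs. average consistency.
wiki_articles = [6_800_000, 2_900_000, 160_000, 90_000]
consistency = [0.82, 0.78, 0.61, 0.55]
print(round(spearman(wiki_articles, consistency), 2))  # → 1.0
```

In this toy data the rankings match exactly, so the correlation is 1.0; real measurements would be noisier.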
4/8 LLMs are more consistent on subjects with a weaker cultural prior, such as mathematics, and less consistent on the humanities and social sciences. Cross-language consistency also scales with model performance and parameter count!
3/8 We need something more: functional similarity! We use it as a new lens to study multilinguality in LLMs and uncover insights on GlobalMMLU, a large parallel multilingual benchmark testing models on STEM, law, and more. We evaluate top models like Qwen3 and Gemma3.
2/8 What if a model has the same accuracy in Hindi and English, yet changing the language makes it do math differently? What if questions it answered correctly in Korean become wrong in Telugu, all while accuracy stays the same?
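The scenario above can be made concrete with a toy example (all answers invented): two evaluation runs with identical accuracy that are correct on entirely different questions.

```python
# Two hypothetical runs of the same model, e.g. in different languages.
gold   = ["A", "B", "C", "D", "A", "B"]
run_hi = ["A", "B", "C", "B", "C", "D"]  # correct on items 0, 1, 2
run_te = ["C", "D", "A", "D", "A", "B"]  # correct on items 3, 4, 5

def acc(run):
    return sum(g == r for g, r in zip(gold, run)) / len(gold)

print(acc(run_hi), acc(run_te))  # → 0.5 0.5  (identical accuracy)

# Yet the two runs never give the same answer on any item.
overlap = sum(a == b for a, b in zip(run_hi, run_te)) / len(gold)
print(overlap)  # → 0.0
```

Accuracy alone cannot distinguish these runs, which is exactly the gap a functional similarity metric is meant to close.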
1/8 Our paper "What if I ask alia lingua? Measuring Functional Similarity Across Languages" has been accepted at the 5th MRL Workshop at #EMNLP2025. Accuracy isn't enough: multilingual LLMs may score equally yet act very differently across languages!
How do we remove the effect of incorrect training data from a trained model? Is retraining from scratch the best we can do? We show it is not! Cognac is 8x faster than retraining from scratch and works even when you discover as little as 5% of such data. How?