Shivalika Singh
@singhshiviii
Followers
2K
Following
8K
Media
74
Statuses
2K
Research Engineer @Cohere_Labs @cohere | @huggingface fellow 🤗 | “Research means that you don't know, but are willing to find out” ✨
Lucknow, India
Joined January 2021
LMArena is widely used for model evaluation, but is it measuring true progress? 🔮 In our work, "The Leaderboard Illusion", we reveal: 🔒 Private testing 📊 Data access asymmetries ⚠️ Overfitting risks 🚫 Silent deprecations Despite best intentions, arena policies favor a few!
9
41
202
Why is collaboration key to research? 💡 From communicating effectively across remote teams, to bridging research and real-world impact, to tackling complex interdisciplinary challenges, these insights show how working together drives progress. Join us and connect with a global
0
2
18
From multilingual models to diverse benchmarks and multimodal learning, Day 1 of Connect brings together researchers expanding what’s possible in global AI. 🖇️ Our lightning talks spotlight collaborative work that makes AI more representative of the world’s languages. ⚡
1
8
16
At Connect, our keynotes bring collaboration and open science to the forefront 🖇️, exploring how working together accelerates progress and deepens understanding in AI research.
1
3
12
Cohere co-founder @1vnzh reminds us why expanding access matters. 🌌 The Connect Conference brings together researchers, leaders, and collaborators shaping a more open and inclusive future for AI. Learn more & register now: https://t.co/UvZGFGljau
2
2
10
💬 Meet researchers, builders, and collaborators driving open science forward 🔎 Explore new work in multilingual modeling and benchmarking ⚡ Get inspired by lightning talks and real-world projects Join us November 18–20 at the Connect Conference. Save your spot today:
1
7
20
Joelle Pineau reminds us how collaboration moves science forward. 🖇️ At the Connect Conference, we’re celebrating the power of working together to accelerate discovery. Be part of the conversation: https://t.co/RURo9z5pqW
0
3
14
A great opportunity to start your collaborations on open ML research
3 days. Worldwide. Inspiring & starting new research collaborations. Introducing the Connect conference. 🖇️ Join for incredible speakers, including @1vnzh @jpineau1 @mziizm & @ShayneRedford + >20 researchers discussing how collaboration and open science are driving progress. 🚀
0
4
33
Cohere Labs x EMNLP 2025 "When Personalization Meets Reality: A Multi-Faceted Analysis of Personalized Preference Learning" This paper introduces a multi-faceted evaluation framework for personalized preference learning in LLMs, revealing significant performance disparities
2
2
7
Cohere Labs x EMNLP 2025: "Nexus: Adaptive Upcycling to Efficiently Pretrain Mixture of Experts" Nexus is an enhanced Mixture of Experts (MoE) architecture that enables flexible specialization and adaptation to new tasks by upcycling dense expert models, achieving significant
1
3
11
We’re thrilled to announce that some of our research will be presented at @emnlpmeeting next week! 🥳 If you’re attending the conference, don’t miss the chance to explore our work and connect with our team.
2
3
12
If you’re at #EMNLP2025, don’t miss our papers and come chat with our team on the ground (Julia Kreutzer, @alexrs95, @ammar__khairi) 🌍💬✨
0
7
42
Excited to collaborate with Kaggle to bring GlobalMMLU to the platform empowering the community to explore multilingual research! https://t.co/z3kV83chsH
kaggle.com
Lite version of Global-MMLU with human translated samples in 16 languages.
📣 AI researchers & labs - we’ve recently launched Kaggle Benchmarks, a new way to host rigorous, reproducible model evaluations. With Benchmarks, you can: - Reach 27M+ AI/ML developers - Leave infra, maintenance, compute to us - Ensure neutral & transparent evals More in 🧵…
1
3
27
📢Thrilled to introduce ATLAS 🗺️: scaling laws beyond English, for pretraining, finetuning, and the curse of multilinguality. The largest public, multilingual scaling study to-date—we ran 774 exps (10M-8B params, 400+ languages) to answer: 🌍Are scaling laws different by
6
38
132
Every LLM learns from human prompts; but what if those prompts are biased, unnatural, or lost in translation? In our recent paper we show that rethinking how we ask can change what models learn.
Can we synthetically generate data that truly captures a language’s richness instead of just translating English datasets? That’s the focus of our most recent work on prompt space optimization for multilingual synthetic data generation: The Art of Asking 🗣️
2
6
27
Curious about how collaboration begins and ends up shaping breakthroughs in AI? Join us (online) for 3 days of talks, keynotes & stories on how research thrives when done together. 🌍
“Individually, we are one drop. Together, we are an ocean.” - Ryunosuke Satoro ✨ Cohere Labs is excited to announce Connect - a 3-day virtual conference celebrating the power of collaboration in open science!
1
5
30
Learning is the goal. 📚 Community lead, @Sree_Harsha_N reminds us that impactful research isn’t about chasing the most complex ideas— it’s about following what excites you and growing through the process. Join our Open Science Community: https://t.co/QBRWGaI4Io
0
1
11
🌍Most multilingual instruction data starts as English - translation can’t capture cultural nuance or linguistic richness. What if we optimized prompts instead of completions? Our recent work on prompt space optimization for multilingual synthetic data addresses this.🗣️
1
5
17