Daniel D'souza
@mrdanieldsouza
Followers: 877 · Following: 5K · Media: 71 · Statuses: 2K
Research Engineer @Cohere_Labs💙 | @UMichECE Alum 〽️ | 🇮🇳✖️🇺🇸 💫"The Universe Works in Mysterious Ways"💫
Ann Arbor, MI
Joined November 2016
“Best Paper Award” @ ACL 2024 🪄What an incredible culmination of perseverance to connect and represent languages around the 🗺️! 🪄 🤗 Huge thanks to the @aclmeeting committee for recognizing the massive effort behind Project Aya @CohereForAI 💙 #ACL2024
I'm incredibly proud that Aya received the #ACL2024 Best Paper Award 🥹. Huge congratulations to the Aya team and the @CohereForAI community who made this possible by extending the frontiers of LLMs to multilingual settings, building the Aya Model and Aya Dataset 🌿🌏
From multilingual adaptation to long-tail control, exploring new approaches in multilingual LLMs, and rethinking leaderboards — Day 3 of Connect lightning talks highlights the future directions in AI research 🚀
Featuring: @dianaabagyan @mrdanieldsouza @viraataryabumi @singhshiviii Julia Kreutzer and Yiyang (Oliver) Nan. Explore the sessions and register for free today:
events.zoom.us
I'm building a new team in Toronto! We're going to work on agentic capabilities for cybersecurity. This is a unique opportunity to bridge cutting-edge AI research with real-world impact, directly enhancing digital security for citizens in Canada and globally.
jobs.ashbyhq.com
Design and implement novel research ideas, ship state-of-the-art models to production, and maintain deep connections to academia and the government.
✨ We’re thrilled to host 20+ speakers for Connect 2025, Cohere Labs’ 3-day virtual conference celebrating the power of collaboration in open science. Meet the researchers, innovators, and community builders who will share their insights 🧵 Register now: https://t.co/RURo9z4RBo
Huge congrats to @ammar__khairi, Julia Kreutzer, @mrdanieldsouza, Ye Shen! Understanding inference compute is a very important gradient-free research direction. An important paper for its rigor, understated results (the writing is very humble), and extensive treatment.
Cohere Labs x EMNLP 2025: "When Life Gives You Samples: The Benefits of Scaling up Inference Compute for Multilingual LLMs" This study explores scalable inference-time compute techniques for multilingual, multi-task generative tasks, proposing novel sampling and selection
3 days. Worldwide. Inspiring & starting new research collaborations. Introducing the Connect conference. 🖇️ Join for incredible speakers, including @1vnzh @jpineau1 @mziizm & @ShayneRedford + >20 researchers discussing how collaboration and open science are driving progress. 🚀
There are so many cheap optimizations 🪄 in the synthetic data space that can drive real downstream gains 📈 in multilingual models 🗺️ The talented @DavidM4302 🌟 explores this very topic in his Research Scholar Program @Cohere_Labs 💙 🔥Check out his work! 👇
Can we synthetically generate data that truly captures a language’s richness instead of just translating English datasets? That’s the focus of our most recent work on prompt space optimization for multilingual synthetic data generation: The Art of Asking 🗣️
I am hiring highly skilled performance engineers for my team! You will be working on optimising pretraining for models >100B params on O(1000s) of GPUs, and hardware-aligned architecture design. We are cooking a lot of very exciting projects and I can safely say you will have a
Today at COLM, Cohere Labs Sr Research Scientist, Julia Kreutzer will be presenting at 2 workshops. First, the Multilingual Data Quality Signals workshop, bringing together researchers across disciplines to discuss & present research on data quality signals in multilingual data.
Today at COLM, we are excited to share our work Déjà Vu: Multilingual LLM Evaluation through the Lens of Machine Translation Evaluation, during Poster Session 4, 4:30 - 6:30pm. Come connect with paper authors Julia Kreutzer and Tom Kocmi.
What happens when your verifier decides what your model can (and can't) learn? We've been digging into this for a while, and we're excited to finally share our findings 🧵
LLMs already generate far more knowledge than we use and most of it is thrown away. What if we could use it all? To scale smarter, generate richer data, and train models that actually learn from diversity?
Happy to share our work: Making, not taking, the Best of N ✨ Best-of-N (BoN) is the go-to for inference scaling and a cornerstone of building synth data for SOTA models. But can we do better than the Best? Introducing Fusion-of-N: a simple and powerful way to go beyond BoN 🧵
Is Best-of-N really the best use of your inference compute? Introducing Fusion-of-N: a simple and powerful way to advance inference and distillation beyond Best-of-N.
Why do all this work to sample N generations and then throw away N-1 of them? 🤦‍♂️ Don’t just Take, but Make the Best of your N! 🧙‍♂️Fusion-of-N🔥 Strong work from @ammar__khairi ⭐️ as part of the Research Scholars program at @Cohere_Labs 💙
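The contrast the Fusion-of-N tweets draw can be sketched in a few lines. This is a toy illustration, not the paper's method: the scorer and the fuser below (`toy_score`, `toy_fuse`) are stand-in assumptions, where the real work would use a reward model or LLM judge for scoring and an LLM prompted to synthesize the candidates for fusion.

```python
# Minimal sketch: Best-of-N keeps one sample and discards N-1;
# Fusion-of-N lets every sample contribute to the final output.
from typing import Callable, List

def best_of_n(candidates: List[str], score: Callable[[str], float]) -> str:
    """Best-of-N: keep the single highest-scoring sample; the other N-1 are discarded."""
    return max(candidates, key=score)

def fusion_of_n(candidates: List[str], fuse: Callable[[List[str]], str]) -> str:
    """Fusion-of-N: combine information from all N samples into one output."""
    return fuse(candidates)

samples = [
    "Paris is the capital of France.",
    "The capital of France is Paris, on the Seine.",
    "Paris.",
]
# Toy scorer (assumption): length, standing in for a reward model / LLM judge.
toy_score = len
# Toy fuser (assumption): de-duplicate and concatenate, standing in for an
# LLM prompted to synthesize one answer from all candidates.
toy_fuse = lambda cs: " ".join(dict.fromkeys(cs))

picked = best_of_n(samples, toy_score)   # one sample survives
fused = fusion_of_n(samples, toy_fuse)   # every sample contributes
```

The point of the sketch is the signature difference: `best_of_n` maps N candidates to one of them, while `fusion_of_n` can produce an output none of the candidates contained on its own.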
Looking forward to sharing this work at @NeurIPSConf 2025. 🎉 Let's chat about how training optimization boosts long-tail performance. 🚀 Congrats to the team: @mrdanieldsouza, Julia Kreutzer, @adrien_morisot, @ahmetustun89, and @sarahookr.
🤹 How do we move away from complicated and brittle prompt engineering at inference for under-represented tasks?🤔 🧠 Our latest work finds that optimizing training protocols improves controllability and boosts performance on underrepresented use cases at inference time 📈
Very proud of this work which has been accepted to @NeurIPSConf 2025🔥 Very impactful work which enables more flexibility at test-time for adaptation to long-tail, highly infrequent features. Congrats to @mrdanieldsouza w @adrien_morisot, Julia Kreutzer, @ahmetustun89 ✨
🚨 Wait, adding simple markers 📌during training unlocks outsized gains at inference time?! 🤔 🚨 Thrilled to share our latest work at @Cohere_Labs: “Treasure Hunt: Real-time Targeting of the Long Tail using Training-Time Markers“ that explores this phenomenon! Details in 🧵 ⤵️
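The "Treasure Hunt" idea of training-time markers, as described in the tweet above, can be sketched as tagging training examples with simple metadata so the same tag can target an under-represented slice at inference. The `<key:value>` marker format and the marker names here are illustrative assumptions, not the paper's actual scheme.

```python
# Sketch: attach a simple marker to each training example so that supplying
# the same marker at inference steers the model toward that slice.

def add_marker(example: str, marker: str) -> str:
    """Prefix a training example with a marker token (hypothetical '<key:value>' format)."""
    return f"<{marker}> {example}"

train = [
    ("Translate to French: hello", "lang:fr"),
    ("Summarize: a long article ...", "task:summarize"),
]
marked = [add_marker(x, m) for x, m in train]
# At inference time, prepending "<lang:fr>" to a prompt would target that
# slice without any prompt engineering beyond the marker itself.
```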
We struck 🪙!! "Treasure Hunt" is accepted at @NeurIPSConf 2025! 🥳 🤗Extra special, this being my 1st first-author main-track paper w/ Julia Kreutzer, @ahmetustun89, @adrien_morisot & @sarahookr 💙 Sometimes the stars write the storyline for you and it’s brighter than you imagined 🪄
Stoked that @mziizm leads us into the next chapter at @Cohere_Labs ! Expect Magic ✨💙
I'm excited to share that I'll be stepping into the role of Head of Cohere Labs. It's an honor and a responsibility to lead such an extraordinary group of researchers pushing the boundaries of AI research.