Soham

@sohamg121

Followers
421
Following
2K
Media
94
Statuses
605

Research Scientist @MistralAI. Previously: Google DeepMind

Mountain View, CA
Joined August 2013
@sohamg121
Soham
9 days
**some conversation**
Me: Let me check with the AI Gods.
Wife: Which one? It's like Hinduism, there are so many.
0
0
1
@sohamg121
Soham
1 month
am I the only one who remembers this banger, looking at DeepSeek-OCR's DeepEncoder?
0
0
2
@sohamg121
Soham
1 month
There needs to be a voice agent that listens to people yapping and emails/pings others only if needed, thereby solving "This meeting should have been an email"
0
0
0
@sohamg121
Soham
3 months
Grateful to have done so much with the most thoughtful and smart teammates I could have asked for. I'm sure we will continue to do great work - just keep shipping, no hype. Also personally wild to be part of a picture with a caption that has so many big beautiful numbers in it.
@AnjneyMidha
Anjney Midha
3 months
20 months ago, we @a16z led @MistralAI's $500M Series A at $2B. Some laughed. Scientists? Serving giant enterprises? Open source AI? From Europe? Good luck! Today: we closed $2B at $13.7B+ led by ASML, and $1.6B+ TCV, deploying RL at scale for critical industries. LE WARMUP
0
0
4
@sohamg121
Soham
3 months
late night thoughts:
> generational trauma is SFT
> forging your own path is RL
0
0
4
@sohamg121
Soham
3 months
Baffles me that in a city as beautiful as this, people still choose to be obsessed about SaaS and shit.
0
0
1
@sohamg121
Soham
3 months
Read "Story of Your Life" and re-watched Arrival immediately - and this question blew my mind: Why did we not evolve a written medium to be complementary to speech (and not the same)? Math expresses some aspects non-linearly and in fewer bits than natural language, but is still
2
0
2
@GuillaumeLample
Guillaume Lample @ NeurIPS 2024
3 months
Mistral Medium 3.1 is 2nd on LMArena without style control. Very proud of the @MistralAI team!
@sophiamyang
Sophia Yang, Ph.D.
3 months
🔥@MistralAI Mistral Medium 3.1: Our 'minor' update just landed 8th on the @lmarena leaderboard - competitive with much larger models. 🚀 Smaller, but mightier!
6
20
164
@MistralAI
Mistral AI
4 months
Introducing Mistral Medium 3.1. Overall performance boost, tone improvement, smarter web searches. Try it now in Le Chat (default model) or via our API (`mistral-medium-2508`).
112
264
2K
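A minimal sketch of calling the model through the API, assuming the v1 `mistralai` Python SDK; the model identifier is the one given in the announcement, everything else (API key handling, prompt) is illustrative:

```python
from mistralai import Mistral

# Assumes the v1 `mistralai` Python SDK is installed.
client = Mistral(api_key="YOUR_API_KEY")

response = client.chat.complete(
    model="mistral-medium-2508",  # identifier from the announcement above
    messages=[{"role": "user", "content": "Summarize this release in two sentences."}],
)
print(response.choices[0].message.content)
```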
@MistralAI
Mistral AI
4 months
In our continued commitment to open-science, we are releasing the Voxtral Technical Report: https://t.co/fIH9uW8qdZ The report covers details on pre-training, post-training, alignment and evaluations. We also present analysis on selecting the optimal model architecture, which
37
202
1K
@sohamg121
Soham
4 months
It's a great open-source model for transcription, but also most importantly:
* You can use it as a single model for text-only applications (Voxtral-Small is just as good as Small 3.1)
* Also supports goodies like native function calling, long-form audio summarization,
0
0
0
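A hedged sketch of the native function calling mentioned above, again assuming the `mistralai` SDK; the `voxtral-small-latest` identifier and the `get_weather` tool are assumptions for illustration, not details from the thread:

```python
from mistralai import Mistral

client = Mistral(api_key="YOUR_API_KEY")

# Hypothetical tool definition; "voxtral-small-latest" is an assumed model name.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.complete(
    model="voxtral-small-latest",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
)
# If the model decides a tool is needed, tool_calls is populated.
print(response.choices[0].message.tool_calls)
```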
@sohamg121
Soham
4 months
We also release a bunch of evals (e.g. GSM8K-Speech, TriviaQA-Speech, and MMLU-Speech) that we used internally to test the trivia/math/general-knowledge reasoning capabilities of our audio LLMs.
1
0
0
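To make the setup concrete, a toy sketch of the loop such speech benchmarks imply; the dataset format and the `model.answer` call are hypothetical, not a real API:

```python
def exact_match_accuracy(model, dataset):
    """Score an audio LLM on spoken QA items (e.g. GSM8K-Speech).

    Each item is assumed to pair an audio clip of the question with a
    reference answer string; `model.answer` is a hypothetical inference call.
    """
    correct = 0
    for item in dataset:
        prediction = model.answer(item["audio"]).strip().lower()
        correct += prediction == item["answer"].strip().lower()
    return correct / len(dataset)
```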
@sohamg121
Soham
4 months
We did on-policy DPO to further improve the quality of model responses, leveraging infrastructure built for Magistral, and found that being on-policy helps a lot - especially to avoid regressions on ASR.
1
0
0
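For reference, a minimal sketch of the standard DPO objective being described; tensor names are illustrative, and this is not the Voxtral/Magistral training code:

```python
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Standard DPO loss (Rafailov et al., 2023) over summed per-sequence log-probs.

    "On-policy" here means the chosen/rejected pairs were sampled from the
    current policy before labeling, keeping them close to the model's own
    distribution (which, per the tweet, helps avoid ASR regressions).
    """
    chosen_logratio = policy_chosen_logps - ref_chosen_logps
    rejected_logratio = policy_rejected_logps - ref_rejected_logps
    # Push the chosen log-ratio above the rejected one
    logits = beta * (chosen_logratio - rejected_logratio)
    return -F.logsigmoid(logits).mean()
```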
@sohamg121
Soham
4 months
We do continued pretraining with two patterns: one to improve conversational/question-answering capabilities, and one to straight-up maximize transcription performance. Balancing both is important to create an all-round amazing model.
1
0
0
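A hypothetical sketch of balancing the two patterns during continued pretraining; the mixing ratio and stream names are assumptions, not numbers from the report:

```python
import random

def mixed_stream(qa_stream, asr_stream, qa_weight=0.5):
    """Interleave two continued-pretraining data patterns.

    qa_stream:  examples tuned for conversational / QA capabilities
    asr_stream: examples that maximize raw transcription performance
    qa_weight:  hypothetical mixing ratio controlling the balance
    """
    while True:
        stream = qa_stream if random.random() < qa_weight else asr_stream
        yield next(stream)
```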
@sohamg121
Soham
4 months
The Voxtral tech report is up! https://t.co/1qzV3ShvDE We release these models under a permissive Apache 2.0 license. Feedback is welcome! We have a lot more cooking; this is just the beginning.
arxiv.org
We present Voxtral Mini and Voxtral Small, two multimodal audio chat models. Voxtral is trained to comprehend both spoken audio and text documents, achieving state-of-the-art performance across a...
1
3
14
@sohamg121
Soham
4 months
Just a few months ago, I had no audio expertise beyond using it as a feature for video classification. Super proud of the first @MistralAI release in the audio space - it's been an amazing learning journey. There's of course more coming very soon!
@MistralAI
Mistral AI
4 months
Introducing the world's best (and open) speech recognition models!
0
1
16
@sohamg121
Soham
5 months
Trying to get ahead of soham-gate:
* I only work at @MistralAI
* I'm very proud of the team I work with and the amazing models we ship. Reach out if you would like to do the same! https://t.co/6ZsB0xot1R
@madiator
Mahesh Sathiamoorthy
5 months
Looks like Soham applied to Bespoke as well (via a Google form we had -- and his CV was uploaded). This is the new badge to carry: if Soham didn't apply, you are not a serious startup. :D
1
0
11
@sohamg121
Soham
6 months
Tons of compute is not all you need - you need smart people who know how to use it efficiently. Come join us at @MistralAI and train the most cost-efficient models!
@rohanpaul_ai
Rohan Paul
6 months
Meta's GPU count compared to others
0
0
4
@MistralAI
Mistral AI
6 months
Announcing Magistral, our first reasoning model designed to excel in domain-specific, transparent, and multilingual reasoning.
106
453
3K
@arena
lmarena.ai
6 months
📰 News in Arena: Mistral Medium 3 makes a strong debut with the community! Highlights:
💠 #11 overall in chat: a +90 point leap from Mistral Large
💠 Top-tier in technical domains (#5 in Math, #7 in Hard Prompts & Coding)
💠 #9 in WebDev Arena
Congrats to @MistralAI on the
@MistralAI
Mistral AI
7 months
Introducing Mistral Medium 3: our new multimodal model offering SOTA performance at 8X lower cost.
- A new class of models that balances performance, cost, and deployability.
- High performance in coding and function-calling.
- Full enterprise capabilities, including hybrid or
6
47
302