Eustache Le Bihan
@eustachelb
Followers: 701
Following: 313
Media: 16
Statuses: 176
Introducing the first iteration of our Speech-to-Speech pipeline 🗣️. Choose the LLM you want and converse with it with a latency as low as 500 ms! Who said we need Speech-to-Speech models to be fast?
16
82
632
Here is a tutorial on training LLaSA (LLaMA-based TTS) using GRPO to improve prosody, rhythm, and expressiveness in synthesized speech with TRL!
10
30
171
Excited to release our new open-source collaboration with Meta: OpenEnv. Pushing for better research/open-source usage practices on agents (LLM/VLM/code). We want to bring reproducible practices in frontier agentic research (like the recent Code World Model) with a
Excited to share OpenEnv: frontier-grade RL environments for the open-source community 🔥! https://t.co/KVeBMsxohL
🧩 Modular interfaces: a clean Gymnasium-style API (reset(), step(), state()) that plugs into any RL framework
🐳 Built for scale: run environments in containers
5
37
216
Cool release by @LiquidAI_: LFM2-Audio-1.5B It’s a pretty cool omni-architecture that enables prediction of both text and audio tokens, meaning it can handle multi-turn S2S, ASR, and TTS (with voice description) within a single model. Great to see, once again this year, a model
2
32
161
🚀 Big news: we’re moving towards the v5 release of transformers! After months of teasing, it’s finally happening 🎉 What to expect in v5:
✨ Cutting-edge stack: fast models, with fast kernels
✨ Smarter defaults: better out-of-the-box experience
✨ Cleaner codebase:
22
53
447
Here is a minimal Voxtral finetuning repo (with 🤗 Transformers and PEFT).
3
9
89
Landing (really) soon in Transformers!
1
1
34
I wanted to experiment with our CSM port and see how easy it would be to fine-tune using 🤗 Transformers, and it truly is! Just grabbed an anime voice dataset from the Hub and fine-tuned the model to learn a predefined set of voices. Almost effortless. Here’s a detailed
1
2
10
The biggest dataset of human-written GPU code, all open-source? 👀 YES Please! We at @GPU_MODE have released around 40k 🚀 human-written code samples spanning Triton, HIP and PyTorch and it's all open on the @huggingface Hub. Train the new GPT to make GPTs faster ⚡️ Link below ⬇️
3
52
322
Dia TTS from Nari Labs just dropped in the latest 🤗 Transformers release! Here's a simple training script—LoRA and Flash-Attn compatible!
2
27
237
Holy... `transformers` reached 1B downloads 😭 thanks everyone for making this possible, what an amazing community
9
29
230
Kyutai TTS and Unmute are now open source! The text-to-speech is natural, customizable, and fast: it can serve 32 users with a 350ms latency on a single L40S. Try it out and get started on the project page: https://t.co/B4P9FuOrQc
51
179
1K
We've open-sourced the Seamless Interaction Dataset on @huggingface -- the world's largest in-person conversation dataset with over 4,000 hours of two-person interactions and over 4,000 unique participants Download:
huggingface.co
7
51
230
Transformers v4.53.0 is out! 🚀 Excited to welcome new members to the Audio family:
1. Kyutai STT (1B & 2.6B): A strong Whisper contender with a novel streaming approach for real-time transcription. More on its architecture and why it's cool in upcoming tweets.
2. Dia (1.6B): A TTS
1
0
4