Alexandre Ramé @ramealexandre X Profile

Alexandre Ramé

@ramealexandre

Followers

2K

Following

3K

Media

26

Statuses

749

Research scientist @GoogleDeepMind. Previously PhD @Sorbonne_Univ_. Post-training Gemma LLMs: distillation, RL and merging.

Joined May 2011

Alexandre Ramé

@ramealexandre

5 months

Welcome Gemma 3, our new open-weight LLM from @GoogleDeepMind. All sizes (1B, 4B, 12B and 27B) excel on benchmarks, but the key result may be the 27B reaching 1338 on LMSYS. For this, we scaled post-training, with our novel distillation, RL and merging strategies. Happy building!

4

24

204

Alexandre Ramé

@ramealexandre

6 hours

RT @chargoddard: another entry on the list of things model merging can replace: learning rate scheduling

0

13

0

Alexandre Ramé

@ramealexandre

3 days

RT @GoogleDeepMind: Our new AI model AlphaEarth Foundations is mapping the planet in astonishing detail. 🌏🔍 Scientists will now be able to…

0

656

0

Alexandre Ramé

@ramealexandre

8 days

RT @2prime_PKU: Anyone knows adam?

0

463

0

Alexandre Ramé

@ramealexandre

12 days

RT @GoogleDeepMind: An advanced version of Gemini with Deep Think has officially achieved gold medal-level performance at the International…

0

781

0

Alexandre Ramé

@ramealexandre

16 days

RT @natolambert: hahahahahahaha the top US open models are gemma 3 27b and @nvidia's finetune of llama 3.1

0

40

0

Alexandre Ramé

@ramealexandre

16 days

RT @NeurIPSConf: NeurIPS is pleased to officially endorse EurIPS, an independently-organized meeting taking place in Copenhagen this year,…

0

114

0

Alexandre Ramé

@ramealexandre

18 days

RT @balesni: A simple AGI safety technique: AI’s thoughts are in plain English, just read them. We know it works, with OK (not perfect) tra…

0

103

0

Alexandre Ramé

@ramealexandre

18 days

RT @caglarml: I am proud that our latest work on a novel RL method for foundation models/LLMs is finally out! 1️⃣ Why does QRPO matter? Al…

0

6

0

Alexandre Ramé

@ramealexandre

18 days

RT @cdancette: Happy to share two publications by our team! First, RadSAM: Segmenting 3D radiological images with a 2D promptable model by…

arxiv.org

Medical image segmentation is a crucial and time-consuming task in clinical care, where mask precision is extremely important. The Segment Anything Model (SAM) offers a promising approach, as it...

0

4

0

Alexandre Ramé

@ramealexandre

18 days

RT @MustafaShukor1: We propose new scaling laws that predict the optimal data mixture, for pretraining LLMs, native multimodal models and l…

0

48

0

Alexandre Ramé

@ramealexandre

19 days

RT @SkanderMoalla: 🚀 Big time! We can finally do LLM RL fine-tuning with rewards and leverage offline/off-policy data! ❌ You want rewards,…

0

36

0

Alexandre Ramé

@ramealexandre

23 days

RT @vitrupo: Demis Hassabis wants to create a third pole in AI. A counterweight to US and China dominance. A coalition of like-minded coun…

0

83

0

Alexandre Ramé

@ramealexandre

23 days

RT @demishassabis: Great conversation today with President @EmmanuelMacron, @ArthurMensch, & @AmandaWolt at @ImperialCollege on how interna…

0

65

0

Alexandre Ramé

@ramealexandre

1 month

RT @y0b1byte: This is an amazingly well written paper, give it a read. Kind of papers you are jealous you are not on the authors' list. @ma…

0

28

0

Alexandre Ramé

@ramealexandre

1 month

RT @karpathy: The race for LLM "cognitive core" - a few billion param model that maximally sacrifices encyclopedic knowledge for capability…

0

1K

0

Alexandre Ramé

@ramealexandre

1 month

RT @demishassabis: Our open source Gemma models are the most powerful single GPU/TPU models out there! Our latest model Gemma 3n has amazin…

aistudio.google.com

The fastest path from prompt to production with Gemini

0

412

0

Alexandre Ramé

@ramealexandre

1 month

RT @googleaidevs: Announcing the full release of Gemma 3n, bringing powerful multimodal capabilities to edge devices for developers 🙌 ↓ htt…

developers.googleblog.com

0

172

0

Alexandre Ramé

@ramealexandre

1 month

RT @osanseviero: I’m so excited to announce Gemma 3n is here! 🎉 🔊Multimodal (text/audio/image/video) understanding. 🤯Runs with as little as…

0

330

0

Alexandre Ramé

@ramealexandre

1 month

RT @swyx: whoa so @thinkymachines is doing model merging + customized RL. quite a come-up for merging in the past couple weeks, with @arcee…

0

50

0

Alexandre Ramé

@ramealexandre

1 month

RT @fly51fly: [LG] Less is More: Undertraining Experts Improves Model Upcycling. S Horoi, G Wolf, E Belilovsky, G K Dziugaite [Université de…

0

4

0