Benjamin Muller
@ben_mlr
Followers: 968 · Following: 2K · Media: 14 · Statuses: 190
Research in AI. Focusing on scaling language models multi-modally & multilingually @AIatMeta
NYC
Joined April 2016
So many exciting releases from FAIR @AIatMeta. Super happy to see Spirit LM now open-sourced. Spirit LM unlocks expressive speech generation through interleaved speech-text training and phonetic (HuBERT) + pitch + style-specific tokenization. Available here: Weights:
Open science is how we continue to push technology forward and today at Meta FAIR we're sharing eight new AI research artifacts including new models, datasets and code to inspire innovation in the community. More in the video from @jpineau1. This work is another important step
Introducing the Latent Speech-Text Transformer (LST): a speech-text model that organizes speech tokens into latent patches for better text-speech transfer, enabling steeper scaling laws and more efficient multimodal training. Paper: https://t.co/4nUsbC1YKF
Introducing @CodeWordsAI, the fastest way to go from idea to automation, simply by chatting with AI. No more drag-and-drop and configuration. Save time by doing less. Available today for free, for everyone. The Cursor moment for automation is here.
Thrilled to share that our Byte Latent Transformer won an Outstanding Paper Award at ACL 2025!
Introducing the Byte Latent Transformer (BLT): an LLM architecture that scales better than Llama 3 using byte-patches instead of tokens. Paper: https://t.co/5QGrlJdK0y Code: https://t.co/jCdDI5BXwe
We ran Llama 4 Maverick through some HELM benchmarks. It is 1st on HELM capabilities (MMLU-Pro, GPQA, IFEval, WildBench, Omni-MATH), but… https://t.co/uKMHRe7xKF
Today is the start of a new era of natively multimodal AI innovation. Today, we're introducing the first Llama 4 models: Llama 4 Scout and Llama 4 Maverick, our most advanced models yet and the best in their class for multimodality. Llama 4 Scout • 17B-active-parameter model
Diverse Preference Optimization (DivPO): SOTA LLMs suffer from model collapse: they can't generate diverse creative writing or synthetic data. DivPO trains for both high reward & diversity, vastly improving variety with similar quality. Paper: https://t.co/bRwq3d3wJq Thread below
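The DivPO tweet above describes training for both high reward and diversity. A toy sketch of one way such preference pairs could be selected, based only on the tweet's description (the function names, the threshold rule, and the toy reward/diversity measures below are illustrative assumptions, not the paper's exact criterion):

```python
# Hypothetical DivPO-style pair selection: among sampled responses, take as
# "chosen" the most diverse response whose reward clears a threshold, and as
# "rejected" the least diverse response below it.
def divpo_pair(responses, reward_fn, diversity_fn, tau):
    high = [r for r in responses if reward_fn(r) >= tau]
    low = [r for r in responses if reward_fn(r) < tau]
    if not high or not low:
        return None  # cannot form a preference pair from this batch
    chosen = max(high, key=diversity_fn)      # diverse AND high-reward
    rejected = min(low, key=diversity_fn)     # repetitive AND low-reward
    return chosen, rejected

# Toy example: reward = length, diversity = number of distinct characters.
pair = divpo_pair(["a", "bb", "ccc", "dddd"], len, lambda r: len(set(r)), tau=3)
```

The resulting (chosen, rejected) pairs would then feed a standard preference-optimization loss such as DPO.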
We released new research: the Byte Latent Transformer (BLT). BLT encodes bytes into dynamic patches using lightweight local models and processes them with a large latent transformer. Think of it as a transformer sandwich!
New from Meta FAIR: Byte Latent Transformer: Patches Scale Better Than Tokens introduces BLT, which, for the first time, matches tokenization-based LLM performance at scale with significant improvements in inference efficiency & robustness. Paper: https://t.co/0iamZCRnMN
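The "dynamic patches" idea above can be sketched with a deliberately tiny stand-in for BLT's local model: split a byte stream at positions where the next byte is surprising under some small model. Here a unigram byte model plays that role; this is an illustrative sketch of the patching concept, not the released BLT implementation.

```python
# Illustrative entropy-based dynamic byte patching (toy version of the BLT idea).
import math
from collections import Counter

def byte_surprisal(data: bytes):
    """Per-byte surprisal under a unigram model fit on the data itself."""
    counts = Counter(data)
    total = len(data)
    return [-math.log2(counts[b] / total) for b in data]

def dynamic_patches(data: bytes, threshold: float = 3.0):
    """Start a new patch whenever the next byte is 'surprising'.
    Predictable runs are grouped into long patches; rare bytes open new ones."""
    surprisal = byte_surprisal(data)
    patches, current = [], bytearray()
    for b, s in zip(data, surprisal):
        if current and s > threshold:
            patches.append(bytes(current))
            current = bytearray()
        current.append(b)
    if current:
        patches.append(bytes(current))
    return patches

patches = dynamic_patches(b"aaaaaaaaXaaaaaaaaY")
# The rare bytes X and Y trigger patch boundaries; runs of 'a' stay grouped.
```

In BLT proper the local model is a learned byte-level transformer and the patches feed a large latent transformer, but the boundary-by-surprisal intuition is the same.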
Groundbreaking scaling trends for byte-level language modeling with the new BLT architecture. More insights in the thread.
Congrats @aymericzzz and team on being live! Very exciting vision to build entire software products with just a prompt
Excited to share more about our background, vision and where we're headed at @agemoai with @r1ddhi at @BusinessInsider. Our vision is to enable anyone to create software, from an idea to fully deployed software. The critical path to
I've always had a dream of making AI accessible to everyone, regardless of location or language. However, current open MLLMs often respond in English, even to non-English queries! Introducing Pangea: a fully open multilingual multimodal LLM supporting 39 languages!
Meta Spirit LM: an open-source language model that mixes text and speech.
Today we released Meta Spirit LM, our first open-source multimodal language model that freely mixes text and speech. Many existing AI voice experiences today use ASR techniques to process speech before synthesizing with an LLM to generate text, but these approaches
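The interleaved speech-text training mentioned in the Spirit LM tweets can be pictured as a single token stream where text spans and speech-unit spans alternate under modality markers. A minimal sketch, with entirely made-up token names (the real model uses phonetic HuBERT units plus pitch and style tokens, and its own special tokens):

```python
# Hypothetical interleaved speech-text sequence construction.
def interleave(spans):
    """spans: list of (modality, tokens) where modality is 'text' or 'speech'.
    Returns one flat token sequence with a modality marker before each span."""
    seq = []
    for modality, tokens in spans:
        seq.append(f"<{modality}>")  # marker telling the LM which modality follows
        seq.extend(tokens)
    return seq

seq = interleave([
    ("text", ["the", "cat"]),
    ("speech", ["hu_41", "hu_7", "pitch_3"]),  # illustrative phonetic + pitch units
    ("text", ["sat"]),
])
```

Training a single language model on such mixed streams is what lets it continue a prompt in either modality.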
Recent LLMs (e.g. Llama 3) are increasingly good at math. However, this progress is reserved for languages with large amounts of task-specific instruction-tuning data. In this work @AIatMeta (led by @LucasBandarkar), we introduce a new model merging technique called Layer
Cross-lingual transfer can be as easy as swapping model layers between LLMs! Our model merging method can compose math and language skills by swapping top & bottom layers from an SFT'd target-language expert into a math expert without retraining https://t.co/IN5JPdTYU4 [1/3]
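The layer-swapping merge described above can be sketched on toy "models" represented as lists of layers: keep the math expert's middle layers and take the bottom and top layers from the target-language expert. The list representation and the symmetric choice of k layers at each end are simplifying assumptions for illustration; real models would operate on transformer-block state dicts.

```python
# Toy sketch of the layer-swapping merge (no retraining involved).
def swap_layers(math_expert, lang_expert, k: int):
    """Replace the bottom k and top k layers of the math expert with the
    corresponding layers of the target-language expert; keep the middle."""
    n = len(math_expert)
    assert len(lang_expert) == n and 2 * k < n
    return lang_expert[:k] + math_expert[k:n - k] + lang_expert[n - k:]

math_expert = [f"math_L{i}" for i in range(8)]
lang_expert = [f"lang_L{i}" for i in range(8)]
merged = swap_layers(math_expert, lang_expert, k=2)
```

The intuition is that the outer layers carry more language-specific processing while the middle layers carry the transferable task skill.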
OK, here goes the "excited to share ..." post. Want to know how to train a T2V model (with other amazing capabilities) that beats ALL prior work? Well, we released a 90-page tech report with every detail: https://t.co/FU2PzloDhr… Thanks to the amazing team!
ai.meta.com
Meta Movie Gen is our latest research breakthrough that allows you to use simple text inputs to create videos and sounds, edit existing videos or transform your personal image into a unique video.
Today we're premiering Meta Movie Gen: the most advanced media foundation models to date. Developed by AI research teams at Meta, Movie Gen delivers state-of-the-art results across a range of capabilities. We're excited for the potential of this line of research to usher in
Introducing *Transfusion*, a unified approach for training models that can generate both text and images. https://t.co/h9PyPl1zNc Transfusion combines language modeling (next-token prediction) with diffusion to train a single transformer over mixed-modality sequences. This
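The combined objective described in the Transfusion tweet can be sketched as a sum of two familiar losses over one mixed-modality sequence: cross-entropy at text positions and a diffusion-style noise-prediction MSE at image positions. This is an illustrative simplification of the idea, not the paper's exact formulation; the balancing weight `lam` is an assumption.

```python
# Minimal sketch of a Transfusion-style combined training objective.
import math

def text_loss(logits, target):
    """Next-token cross-entropy for one text position (natural log)."""
    z = max(logits)  # subtract max for numerical stability
    log_norm = z + math.log(sum(math.exp(x - z) for x in logits))
    return -(logits[target] - log_norm)

def diffusion_loss(pred_noise, true_noise):
    """MSE between predicted and true noise on image latents."""
    return sum((p - t) ** 2 for p, t in zip(pred_noise, true_noise)) / len(pred_noise)

def transfusion_loss(text_terms, image_terms, lam=1.0):
    """Sum LM loss over text positions and weighted diffusion loss over
    image positions of the same mixed-modality sequence."""
    lm = sum(text_loss(logits, t) for logits, t in text_terms)
    dif = sum(diffusion_loss(p, n) for p, n in image_terms)
    return lm + lam * dif
```

A single transformer produces both the logits and the noise predictions, which is what makes the two objectives trainable jointly.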
LLM evaluations are an important area of work. Today we're announcing a new LLM Evaluation Research Grant to foster further innovation in this area. Recipients will get $200K in funding to support this work. We're accepting proposals until September 6: https://t.co/0tJcAFq4RO
Starting today, open source is leading the way. Introducing Llama 3.1: our most capable models yet. Today we're releasing a collection of new Llama 3.1 models, including our long-awaited 405B. These models deliver improved reasoning capabilities, a larger 128K-token context