Benjamin Muller

@ben_mlr

968 Followers · 2K Following · 14 Media · 190 Statuses

Research in AI. Focusing on scaling language models multi-modally & multilingually @AIatMeta

NYC
Joined April 2016
@ben_mlr
Benjamin Muller
1 year
So many exciting releases from FAIR @AIatMeta Super happy to see Spirit LM now open-sourced. Spirit LM unlocks expressive speech generation through interleaving speech-text training and phonetic (HuBERT) + pitch + style-specific tokenization. Available here: Weights:
@AIatMeta
AI at Meta
1 year
Open science is how we continue to push technology forward and today at Meta FAIR we're sharing eight new AI research artifacts including new models, datasets and code to inspire innovation in the community. More in the video from @jpineau1. This work is another important step
1 reply · 1 repost · 14 likes
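As a concrete illustration of the interleaving idea in the tweet above, here is a minimal sketch of how text tokens and phonetic/pitch/style speech tokens might be flattened into one training stream. The `[TEXT]`/`[SPEECH]` markers, the `[Hu*]`/`[Pi*]`/`[St*]` naming, and the stride at which pitch/style tokens appear are illustrative assumptions, not Spirit LM's released format.

```python
# Minimal sketch of Spirit LM-style interleaved speech-text sequences.
# All token names, markers, and the pitch/style stride are assumptions.
from typing import Dict, List


def speech_segment(hubert: List[int], pitch: List[int], style: List[int],
                   stride: int = 4) -> List[str]:
    """Render phonetic (HuBERT) tokens, with sparser pitch/style tokens
    interleaved every `stride` positions (the expressive variant)."""
    out = []
    for i, h in enumerate(hubert):
        out.append(f"[Hu{h}]")
        if i % stride == 0 and i // stride < min(len(pitch), len(style)):
            out.append(f"[Pi{pitch[i // stride]}]")
            out.append(f"[St{style[i // stride]}]")
    return out


def interleave(segments: List[Dict]) -> List[str]:
    """Flatten alternating text/speech segments into one token stream,
    marking each modality switch so a single LM can model both."""
    stream: List[str] = []
    for seg in segments:
        if seg["modality"] == "text":
            stream += ["[TEXT]"] + seg["tokens"]  # ordinary subword tokens
        else:
            stream += ["[SPEECH]"] + speech_segment(
                seg["tokens"], seg["pitch"], seg["style"])
    return stream


print(interleave([
    {"modality": "text", "tokens": ["hel", "lo"]},
    {"modality": "speech", "tokens": [12, 7, 7, 99], "pitch": [3], "style": [1]},
]))
# ['[TEXT]', 'hel', 'lo', '[SPEECH]', '[Hu12]', '[Pi3]', '[St1]', '[Hu7]', '[Hu7]', '[Hu99]']
```

Because a single token stream carries both modalities, the same next-token objective trains text continuation, speech continuation, and cross-modal switching.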
@Yen_Ju_Lu
Yen-Ju Lu
2 months
🚀 Introducing the Latent Speech-Text Transformer (LST) – a speech-text model that organizes speech tokens into latent patches for better text→speech transfer, enabling steeper scaling laws and more efficient multimodal training ⚡️ Paper 📄 https://t.co/4nUsbC1YKF
7 replies · 17 reposts · 34 likes
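The patching idea in the LST tweet can be sketched in a few lines: group consecutive speech tokens into patches and pool each patch into a single latent position, so speech sequences shrink toward text-like lengths before the large transformer sees them. The fixed patch size, linear pooling, and dimensions below are assumptions, not the paper's architecture.

```python
# Sketch of grouping speech tokens into latent patches (LST-style).
# Patch size, pooling scheme, and dimensions are illustrative assumptions.
import torch
import torch.nn as nn


class SpeechPatcher(nn.Module):
    def __init__(self, vocab: int = 1024, d_model: int = 512, patch: int = 4):
        super().__init__()
        self.patch = patch
        self.embed = nn.Embedding(vocab, d_model)
        self.pool = nn.Linear(patch * d_model, d_model)  # one latent per patch

    def forward(self, speech_ids: torch.Tensor) -> torch.Tensor:
        # speech_ids: (batch, seq), seq a multiple of `patch`
        x = self.embed(speech_ids)                       # (B, S, D)
        b, s, d = x.shape
        x = x.reshape(b, s // self.patch, self.patch * d)
        return self.pool(x)                              # (B, S/patch, D)


latents = SpeechPatcher()(torch.randint(0, 1024, (2, 32)))
print(latents.shape)  # torch.Size([2, 8, 512])
```

Text tokens would keep one position each, so relatively more of the sequence budget goes to text-aligned structure, which is the lever the tweet credits for better text→speech transfer.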
@aymericzzz
Aymeric Zhuo
2 months
Introducing @CodeWordsAI, the fastest way to go from idea to automation, simply by chatting with AI. No more drag-and-drop and configuration. Save time by doing less. Available today for free, for everyone. The Cursor moment for automation is here.
7 replies · 14 reposts · 47 likes
@ArtidoroPagnoni
Artidoro Pagnoni
4 months
Thrilled to share that our Byte Latent Transformer won an Outstanding Paper Award at ACL 2025! 🏆
@ArtidoroPagnoni
Artidoro Pagnoni
1 year
🚀 Introducing the Byte Latent Transformer (BLT) – An LLM architecture that scales better than Llama 3 using byte-patches instead of tokens 🤯 Paper 📄 https://t.co/5QGrlJdK0y Code 🛠️ https://t.co/jCdDI5BXwe
16 replies · 31 reposts · 282 likes
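The "byte-patches instead of tokens" idea can be made concrete with a small sketch: BLT segments bytes into variable-length patches, starting a new patch where the next byte is hard to predict, so compute concentrates on the unpredictable spans. The threshold and the toy entropy function below are assumptions; in the paper a small byte-level LM supplies the next-byte entropies.

```python
# Sketch of entropy-based dynamic patching in the spirit of BLT. The
# threshold and the toy entropy model are assumptions for illustration.
import math
from typing import Callable, List


def patch_boundaries(data: bytes,
                     next_byte_entropy: Callable[[bytes], float],
                     threshold: float = 2.0) -> List[bytes]:
    """Split `data` into variable-length patches; a new patch begins
    wherever the model's next-byte entropy crosses `threshold`."""
    patches, start = [], 0
    for i in range(1, len(data)):
        if next_byte_entropy(data[:i]) > threshold:
            patches.append(data[start:i])
            start = i
    patches.append(data[start:])
    return patches


# Toy stand-in for the small byte LM: high uncertainty after whitespace.
def toy_entropy(prefix: bytes) -> float:
    return math.log2(256) if prefix.endswith(b" ") else 0.5


print(patch_boundaries(b"the cat sat", toy_entropy))
# [b'the ', b'cat ', b'sat']
```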
@gargighosh
Gargi Ghosh
4 months
Outstanding paper award! @aclmeeting - BLT: https://t.co/QATnyus5Xb
5 replies · 8 reposts · 133 likes
@percyliang
Percy Liang
8 months
We ran Llama 4 Maverick through some HELM benchmarks. It is 1st on HELM capabilities (MMLU-Pro, GPQA, IFEval, WildBench, Omni-MATH), but… https://t.co/uKMHRe7xKF
5 replies · 18 reposts · 140 likes
@AIatMeta
AI at Meta
8 months
Today is the start of a new era of natively multimodal AI innovation. Today, we're introducing the first Llama 4 models: Llama 4 Scout and Llama 4 Maverick – our most advanced models yet and the best in their class for multimodality. Llama 4 Scout • 17B-active-parameter model
835 replies · 2K reposts · 13K likes
@jaseweston
Jason Weston
10 months
🚨 Diverse Preference Optimization (DivPO) 🚨 SOTA LLMs have model collapse 🫠: they can't generate diverse creative writing or synthetic data 🎨 DivPO trains for both high reward & diversity, vastly improving variety with similar quality. Paper 📝: https://t.co/bRwq3d3wJq 🧵 below
1 reply · 78 reposts · 343 likes
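A rough sketch of the pair-selection idea the DivPO tweet describes: sample a pool of responses, then take the most diverse response among the high-reward ones as "chosen" and the least diverse among the low-reward ones as "rejected", before applying a standard DPO-style update. The reward floor and the toy diversity measure below are assumptions for illustration.

```python
# Sketch of DivPO-style preference-pair selection. The reward floor and
# the diversity measure are illustrative assumptions; the resulting pair
# would feed a standard DPO-style loss.
from typing import Callable, List, Tuple


def divpo_pair(candidates: List[str],
               reward: Callable[[str], float],
               diversity: Callable[[str, List[str]], float],
               reward_floor: float) -> Tuple[str, str]:
    scored = [(reward(c), diversity(c, candidates), c) for c in candidates]
    high = [t for t in scored if t[0] >= reward_floor]  # quality-preserving pool
    low = [t for t in scored if t[0] < reward_floor]
    assert high and low, "need both pools; resample otherwise"
    chosen = max(high, key=lambda t: t[1])[2]    # most diverse, still good
    rejected = min(low, key=lambda t: t[1])[2]   # least diverse and weak
    return chosen, rejected


# Toy diversity: negated average word overlap with the rest of the pool.
def toy_diversity(c: str, pool: List[str]) -> float:
    others = [p for p in pool if p is not c]
    overlap = lambda a, b: len(set(a.split()) & set(b.split()))
    return -sum(overlap(c, o) for o in others) / max(len(others), 1)
```

The point of the split is that diversity is only rewarded among responses that already clear the quality bar, which is how the method claims to improve variety "with similar quality".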
@gargighosh
Gargi Ghosh
11 months
We released new research – Byte Latent Transformer (BLT). BLT encodes bytes into dynamic patches using lightweight local models and processes them with a large latent transformer. Think of it as a transformer sandwich!
@AIatMeta
AI at Meta
11 months
New from Meta FAIR – Byte Latent Transformer: Patches Scale Better Than Tokens introduces BLT, which, for the first time, matches tokenization-based LLM performance at scale with significant improvements in inference efficiency & robustness. Paper ➡️ https://t.co/0iamZCRnMN
11 replies · 85 reposts · 661 likes
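The "transformer sandwich" can be sketched schematically: a lightweight local encoder maps bytes to patch representations, a large latent transformer models the patch sequence, and a lightweight local decoder maps back to byte predictions. Fixed-size patches, mean pooling, and all dimensions below are simplifications; BLT itself uses dynamic, entropy-based patch boundaries (as in the earlier patching sketch) and cross-attention between the byte and patch streams.

```python
# Schematic "transformer sandwich" in the spirit of BLT. Fixed-size
# patches, mean pooling, and the dimensions are assumptions.
import torch
import torch.nn as nn


def enc(d_model: int, n_head: int, n_layers: int) -> nn.TransformerEncoder:
    layer = nn.TransformerEncoderLayer(d_model, n_head, batch_first=True)
    return nn.TransformerEncoder(layer, n_layers)


class BLTSketch(nn.Module):
    def __init__(self, d_local=256, d_latent=1024, patch=4):
        super().__init__()
        self.patch = patch
        self.byte_embed = nn.Embedding(256, d_local)
        self.local_encoder = enc(d_local, 4, 2)    # lightweight byte model
        self.to_latent = nn.Linear(d_local, d_latent)
        self.latent = enc(d_latent, 16, 24)        # large patch-level model
        self.from_latent = nn.Linear(d_latent, d_local)
        self.local_decoder = enc(d_local, 4, 2)    # lightweight byte model
        self.head = nn.Linear(d_local, 256)        # next-byte logits

    def forward(self, byte_ids: torch.Tensor) -> torch.Tensor:
        x = self.local_encoder(self.byte_embed(byte_ids))     # (B, S, dl)
        b, s, d = x.shape
        pooled = x.view(b, s // self.patch, self.patch, d).mean(2)
        z = self.latent(self.to_latent(pooled))               # (B, P, dL)
        z = self.from_latent(z).repeat_interleave(self.patch, 1)
        return self.head(self.local_decoder(x + z))           # (B, S, 256)
```

The efficiency claim follows from the shape bookkeeping: the expensive latent transformer runs over S/patch positions instead of S, while only the small local models touch every byte.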
@ben_mlr
Benjamin Muller
1 year
Groundbreaking scaling trends for Byte-level Language Modeling with the new BLT architecture 🚀 More insights in the thread 🧵
@ArtidoroPagnoni
Artidoro Pagnoni
1 year
🚀 Introducing the Byte Latent Transformer (BLT) – An LLM architecture that scales better than Llama 3 using byte-patches instead of tokens 🤯 Paper 📄 https://t.co/5QGrlJdK0y Code 🛠️ https://t.co/jCdDI5BXwe
0 replies · 4 reposts · 21 likes
@ben_mlr
Benjamin Muller
1 year
Congrats @aymericzzz and team on being live! Very exciting vision to build entire software products with just a prompt
@aymericzzz
Aymeric Zhuo
1 year
Excited to share more about our background, vision and where we're headed at @agemoai with @r1ddhi at @BusinessInsider. Our vision is to enable anyone to create software – from an idea to fully deployed software. The critical path to
0 replies · 2 reposts · 4 likes
@xiangyue96
Xiang Yue
1 year
๐ŸŒ Iโ€™ve always had a dream of making AI accessible to everyone, regardless of location or language. However, current open MLLMs often respond in English, even to non-English queries! ๐Ÿš€ Introducing Pangea: A Fully Open Multilingual Multimodal LLM supporting 39 languages! ๐ŸŒโœจ
7 replies · 81 reposts · 378 likes
@ylecun
Yann LeCun
1 year
Meta Spirit LM: open source language model that mixes text and speech.
@AIatMeta
AI at Meta
1 year
Today we released Meta Spirit LM – our first open source multimodal language model that freely mixes text and speech. Many existing AI voice experiences today use ASR techniques to process speech before synthesizing with an LLM to generate text – but these approaches
19 replies · 72 reposts · 331 likes
@ben_mlr
Benjamin Muller
1 year
Recent LLMs (e.g. Llama 3 🦙) are increasingly good at math. However, this progress is reserved for languages with large amounts of task-specific instruct-tuning data. In this work @AIatMeta (led by @LucasBandarkar), we introduce a new model merging technique called **Layer
@LucasBandarkar
Lucas Bandarkar
1 year
Cross-lingual transfer can be as easy as swapping model layers between LLMs! 🔀 Our model merging method can compose math and language skills by swapping top & bottom layers from an SFT'd target-language expert into a math expert without retraining https://t.co/IN5JPdTYU4 🧵: [1/3]
2 replies · 7 reposts · 28 likes
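A minimal sketch of the layer-swapping merge described above, assuming both experts were fine-tuned from the same base model and use Llama-style checkpoint naming (`model.layers.N....`); the number of layers swapped at each end, `k`, is a free parameter in this illustration.

```python
# Sketch of layer swapping as a merge: replace the math expert's bottom-k
# and top-k transformer layers with the language expert's. Llama-style
# parameter names and the choice of k are assumptions for illustration.
def swap_layers(math_sd: dict, lang_sd: dict, n_layers: int, k: int = 4) -> dict:
    merged = dict(math_sd)
    swap = set(range(k)) | set(range(n_layers - k, n_layers))
    for name, tensor in lang_sd.items():
        parts = name.split(".")        # e.g. model.layers.31.self_attn...
        if "layers" in parts:
            idx = int(parts[parts.index("layers") + 1])
            if idx in swap:
                merged[name] = tensor  # take this layer from the language expert
    return merged
```

No retraining is involved: because both experts descend from the same base model, the swapped layers stay weight-compatible, and the merge composes the language expert's input/output handling with the math expert's middle layers.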
@Andrew__Brown__
Andrew Brown
1 year
OK, here goes the "excited to share ..." post. Want to know how to train a T2V model (with other amazing capabilities) that beats ALL prior work? Well, we released a 90-page tech report with every detail 😊 https://t.co/FU2PzloDhr… Thanks to the amazing team!
ai.meta.com
Meta Movie Gen is our latest research breakthrough that allows you to use simple text inputs to create videos and sounds, edit existing videos or transform your personal image into a unique video.
@AIatMeta
AI at Meta
1 year
🎥 Today we're premiering Meta Movie Gen: the most advanced media foundation models to date. Developed by AI research teams at Meta, Movie Gen delivers state-of-the-art results across a range of capabilities. We're excited for the potential of this line of research to usher in
10 replies · 17 reposts · 178 likes
@violet_zct
Chunting Zhou
1 year
Introducing *Transfusion* - a unified approach for training models that can generate both text and images. https://t.co/h9PyPl1zNc Transfusion combines language modeling (next token prediction) with diffusion to train a single transformer over mixed-modality sequences. This
24 replies · 215 reposts · 1K likes
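The single-model objective the Transfusion tweet describes can be written down compactly: a next-token cross-entropy loss over text positions plus a diffusion-style noise-prediction loss over image-latent positions, combined with a weighting factor. The simple MSE noise target and the value of `lam` below are assumptions for illustration.

```python
# Sketch of a Transfusion-style combined objective: next-token prediction
# on text positions plus a diffusion (noise-prediction) loss on image
# latents, from one transformer's outputs. The MSE target and `lam` are
# illustrative assumptions.
import torch
import torch.nn.functional as F


def transfusion_loss(text_logits: torch.Tensor,   # (B, T, V)
                     text_targets: torch.Tensor,  # (B, T)
                     eps_pred: torch.Tensor,      # (B, I, D) predicted noise
                     eps_true: torch.Tensor,      # (B, I, D) added noise
                     lam: float = 5.0) -> torch.Tensor:
    lm = F.cross_entropy(text_logits.flatten(0, 1), text_targets.flatten())
    diffusion = F.mse_loss(eps_pred, eps_true)
    return lm + lam * diffusion
```

Because both terms backpropagate through the same transformer over one mixed-modality sequence, the model learns text and image generation jointly rather than via separate towers.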
@AIatMeta
AI at Meta
1 year
LLM Evaluations are an important area of work – today we're announcing a new LLM Evaluation Research Grant to foster further innovation in this area. Recipients will get $200K in funding to support this work. We're accepting proposals until September 6 ➡️ https://t.co/0tJcAFq4RO
15 replies · 90 reposts · 456 likes
@AIatMeta
AI at Meta
1 year
Starting today, open source is leading the way. Introducing Llama 3.1: our most capable models yet. Today we're releasing a collection of new Llama 3.1 models including our long-awaited 405B. These models deliver improved reasoning capabilities, a larger 128K-token context
264 replies · 1K reposts · 6K likes