Benjamin Muller Profile
Benjamin Muller

@ben_mlr

Followers
939
Following
2K
Media
14
Statuses
186

Research in AI. Focusing on scaling language models multi-modally & multilingually. Llama pretraining team @AIatMeta

NYC
Joined April 2016
@ben_mlr
Benjamin Muller
9 months
So many exciting releases from FAIR @AIatMeta. Super happy to see Spirit LM now open-sourced. Spirit LM unlocks expressive speech generation through interleaved speech-text training and phonetic (HuBERT) + pitch + style-specific tokenization. Available here: . Weights:
@AIatMeta
AI at Meta
9 months
Open science is how we continue to push technology forward and today at Meta FAIR we're sharing eight new AI research artifacts including new models, datasets and code to inspire innovation in the community. More in the video from @jpineau1. This work is another important step
1
0
14
@ben_mlr
Benjamin Muller
3 months
RT @percyliang: We ran Llama 4 Maverick through some HELM benchmarks. It is 1st on HELM capabilities (MMLU-Pro, GPQA, IFEval, WildBench, Om…
0
17
0
@ben_mlr
Benjamin Muller
3 months
RT @AIatMeta: Today is the start of a new era of natively multimodal AI innovation. Today, we're introducing the first Llama 4 models: Lla…
0
2K
0
@ben_mlr
Benjamin Muller
5 months
RT @jaseweston: 🚨 Diverse Preference Optimization (DivPO) 🚨. SOTA LLMs have model collapse 🫠: they can't generate diverse creative writing or…
0
78
0
@ben_mlr
Benjamin Muller
7 months
RT @gargighosh: We released new research - Byte Latent Transformer (BLT). BLT encodes bytes into dynamic patches using light-weight local mod…
0
83
0
@ben_mlr
Benjamin Muller
7 months
RT @AIatMeta: New from Meta FAIR: Byte Latent Transformer: Patches Scale Better Than Tokens introduces BLT, which for the first time, matc…
0
191
0
@ben_mlr
Benjamin Muller
7 months
Groundbreaking scaling trends for byte-level language modeling with the new BLT architecture 🚀. More insights in the thread 🧵.
@ArtidoroPagnoni
Artidoro Pagnoni
7 months
🚀 Introducing the Byte Latent Transformer (BLT) – an LLM architecture that scales better than Llama 3 using byte patches instead of tokens 🤯. Paper 📄 Code 🛠️
0
3
21
@ben_mlr
Benjamin Muller
8 months
Congrats @aymericzzz and team on being live! Very exciting vision to build entire software with just a prompt.
@aymericzzz
Aymeric Zhuo
8 months
Excited to share more about our background, vision and where we're headed at @agemoai with @r1ddhi at @BusinessInsider. Our vision is to enable anyone to create software – from an idea to fully deployed software. The critical path to…
0
1
3
@ben_mlr
Benjamin Muller
9 months
RT @xiangyue96: 🌍 I've always had a dream of making AI accessible to everyone, regardless of location or language. However, current open ML…
0
77
0
@ben_mlr
Benjamin Muller
9 months
RT @ylecun: Meta Spirit LM: open source language model that mixes text and speech.
0
71
0
@ben_mlr
Benjamin Muller
9 months
Recent LLMs (e.g. Llama 3 🦙) are increasingly good at math. However, this progress is reserved for languages with large amounts of task-specific instruction-tuning data. In this work at @AIatMeta (led by @LucasBandarkar), we introduce a new model merging technique called **Layer
@LucasBandarkar
Lucas Bandarkar
9 months
Cross-lingual transfer can be as easy as swapping model layers between LLMs! 🔀 Our model merging method can compose math and language skills by swapping top & bottom layers from an SFT'd target-language expert into a math expert without retraining 🧵: [1/3]
2
6
28
@ben_mlr
Benjamin Muller
9 months
RT @Andrew__Brown__: OK here goes the "excited to share…" post. Want to know how to train a T2V model (with other amazing capabilitie…
0
16
0
@ben_mlr
Benjamin Muller
11 months
RT @violet_zct: Introducing *Transfusion* - a unified approach for training models that can generate both text and images.
0
209
0
@ben_mlr
Benjamin Muller
11 months
RT @AIatMeta: LLM Evaluations are an important area of work. Today we're announcing a new LLM Evaluation Research Grant to foster further…
0
91
0
@ben_mlr
Benjamin Muller
1 year
RT @AIatMeta: Starting today, open source is leading the way. Introducing Llama 3.1: our most capable models yet. Today we're releasing a…
0
1K
0
@ben_mlr
Benjamin Muller
1 year
RT @lvdmaaten: So… we trained a model and we wrote a paper about it. Have fun y'all!
0
52
0
@ben_mlr
Benjamin Muller
1 year
RT @soumithchintala: I'm giving the opening keynote at ICML 2024 on Tuesday the 23rd @ 9:30am CEST. I'll try to empower folks to get Open Scie…
0
66
0
@ben_mlr
Benjamin Muller
1 year
It was great to present the Spirit-LM model with @tuanh208. Spirit-LM is a foundation model that jointly learns text and expressive speech, based on Llama 2. Thanks @twelve_labs for organizing the webinar. arXiv link available here for more details:
@twelve_labs
TwelveLabs (twelvelabs.io)
1 year
The recording of this webinar with @ben_mlr and @tuanh208 of @metaai is up! Watch here: 📺. They discussed:
- Challenges of expressive speech generation
- SpiRit-LM combines TextLM and SpeechLM
- Training recipe and generation samples
- Can we observe the…
1
2
9
@ben_mlr
Benjamin Muller
1 year
RT @ArmenAgha: A restricted, safety-aligned (no-image-out) version of Chameleon (7B/34B) is now open-weight! The…
0
50
0