
Benjamin Muller
@ben_mlr
Followers
939
Following
2K
Media
14
Statuses
186
Research in AI. Focusing on scaling language models multi-modally & multilingually. Llama pretraining team @AIatMeta
NYC
Joined April 2016
So many exciting releases from FAIR @AIatMeta. Super happy to see Spirit LM now open-sourced. Spirit LM unlocks expressive speech generation through interleaved speech-text training and phonetic (HuBERT) + pitch + style-specific tokenization. Available here: . Weights:
Open science is how we continue to push technology forward and today at Meta FAIR we're sharing eight new AI research artifacts including new models, datasets and code to inspire innovation in the community. More in the video from @jpineau1. This work is another important step
1
0
14
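To make the interleaving idea above concrete, here is a minimal Python sketch of how a mixed speech-text training sequence could be assembled. The tokenizer interfaces and token names ([TEXT], [SPEECH], hu_42, pitch_3, style_1) are hypothetical stand-ins, not Spirit LM's actual vocabulary; the tweet only says the real model combines HuBERT phonetic units with pitch and style tokens.

```python
def interleave_speech_text(text_spans, speech_spans, text_tokenizer, speech_tokenizer):
    """Build one training sequence that alternates text and speech spans,
    marking each modality switch with a special token."""
    sequence = []
    for text, speech in zip(text_spans, speech_spans):
        sequence.append("[TEXT]")
        sequence.extend(text_tokenizer(text))  # subword text tokens
        sequence.append("[SPEECH]")
        # Expressive speech stream: phonetic (HuBERT) units plus pitch and
        # style tokens, e.g. ["hu_42", "pitch_3", "style_1", ...]
        sequence.extend(speech_tokenizer(speech))
    return sequence

# Toy usage with stand-in tokenizers:
seq = interleave_speech_text(
    ["hello there"], ["<audio clip>"],
    text_tokenizer=lambda t: t.split(),
    speech_tokenizer=lambda s: ["hu_42", "pitch_3", "style_1"],
)
```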
RT @percyliang: We ran Llama 4 Maverick through some HELM benchmarks. It is 1st on HELM capabilities (MMLU-Pro, GPQA, IFEval, WildBench, Om…
0
17
0
RT @jaseweston: Diverse Preference Optimization (DivPO). SOTA LLMs have model collapse: they can't generate diverse creative writing or…
0
78
0
RT @gargighosh: We released new research - Byte Latent Transformer (BLT). BLT encodes bytes into dynamic patches using lightweight local mod…
0
83
0
Groundbreaking scaling trends for byte-level language modeling with the new BLT architecture. More insights in the thread.
Introducing the Byte Latent Transformer (BLT): an LLM architecture that scales better than Llama 3 using byte patches instead of tokens. Paper: . Code:
0
3
21
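To illustrate the dynamic-patching idea from the BLT thread above: a minimal sketch of entropy-based patch boundaries, where a small byte-level model's uncertainty about the next byte decides where a new patch starts. The threshold value and the next_byte_probs interface are assumptions for illustration; in the actual architecture the local models are lightweight transformers and the patches feed a larger latent transformer.

```python
import math

def entropy(probs):
    """Shannon entropy (nats) of a predicted next-byte distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def dynamic_patches(byte_seq, next_byte_probs, threshold=2.0):
    """Group bytes into variable-length patches, opening a new patch
    wherever the byte-level model is uncertain (high entropy).
    next_byte_probs[i] is the model's distribution over byte i given
    bytes 0..i-1."""
    patches, current = [], []
    for i, b in enumerate(byte_seq):
        if current and entropy(next_byte_probs[i]) > threshold:
            patches.append(bytes(current))  # close the patch at an uncertain point
            current = []
        current.append(b)
    if current:
        patches.append(bytes(current))
    return patches
```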
Congrats @aymericzzz and team on being live! Very exciting vision to build entire software with just a prompt.
Excited to share more about our background, vision and where we're headed at @agemoai with @r1ddhi at @BusinessInsider. Our vision is to enable anyone to create software, from an idea to fully deployed software. The critical path to…
0
1
3
RT @xiangyue96: I've always had a dream of making AI accessible to everyone, regardless of location or language. However, current open ML…
0
77
0
Recent LLMs (e.g. Llama 3) are increasingly good at math. However, this progress is reserved for languages with large amounts of task-specific instruction-tuning data. In this work @AIatMeta (led by @LucasBandarkar), we introduce a new model merging technique called **Layer…
Cross-lingual transfer can be as easy as swapping model layers between LLMs! Our model merging method can compose math and language skills by swapping the top and bottom layers from an SFT'd target-language expert into a math expert, without retraining: [1/3]
2
6
28
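A minimal sketch of the layer-swapping idea from the thread above, assuming PyTorch-style state dicts and Llama-style parameter names ("model.layers.<i>."); the number of swapped layers k is illustrative, not the paper's setting.

```python
def swap_layers(math_expert_sd, lang_expert_sd, num_layers, k=4):
    """Merge two fine-tuned variants of the same base model: keep the
    math expert's weights, but take the bottom k and top k transformer
    layers from the target-language expert, with no retraining."""
    merged = dict(math_expert_sd)
    swap_ids = set(range(k)) | set(range(num_layers - k, num_layers))
    for name, weight in lang_expert_sd.items():
        for i in swap_ids:
            if name.startswith(f"model.layers.{i}."):
                merged[name] = weight  # take this layer from the language expert
                break
    return merged
```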
RT @Andrew__Brown__: OK here goes the "excited to share…" post. Want to know how to train a T2V model (with other amazing capabilitie…
0
16
0
RT @violet_zct: Introducing *Transfusion* - a unified approach for training models that can generate both text and images.
0
209
0
RT @soumithchintala: I'm giving the opening keynote at ICML 2024 on Tuesday the 23rd @ 9:30am CEST. I'll try to empower folks to get Open Scie…
0
66
0
It was great to present the Spirit-LM model with @tuanh208. Spirit-LM is a foundation model that jointly learns text and expressive speech, based on Llama 2. Thanks @twelve_labs for organizing the webinar. arXiv paper available here for more details:
1
2
9
RT @ArmenAgha: A restricted, safety-aligned (no-image-out) version of Chameleon (7B/34B) is now open-weight! The…
0
50
0