Sachin (@sacmehtauw)
Staff Research Scientist, GenAI @ Meta and Affiliate Assistant Professor @UW. Opinions are my own.
Seattle, WA · Joined June 2019
Followers: 733 · Following: 209 · Media: 24 · Statuses: 155
We've just published CoreNet. A few highlights:
⚡️OpenELM, a new efficient language model that optimizes parameters for accuracy with fewer tokens using layer-wise scaling.
⚡️MLX model conversion and inference.
⚡️A wide array of vision and language models with SOTA training recipes.
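The layer-wise scaling mentioned above can be sketched in a few lines. This is a toy illustration, not OpenELM's actual configuration: the function names, the linear interpolation rule, and every constant below are assumptions. The point is only that per-layer widths grow with depth instead of repeating one identical block size.

```python
def layerwise_scaling(num_layers, min_mult=0.5, max_mult=4.0):
    """Return an FFN width multiplier per layer, linearly interpolated
    from min_mult (first layer) to max_mult (last layer)."""
    if num_layers == 1:
        return [min_mult]
    step = (max_mult - min_mult) / (num_layers - 1)
    return [min_mult + i * step for i in range(num_layers)]

def ffn_dims(d_model, num_layers, **kw):
    # Round each layer's FFN hidden size to a multiple of 16, as
    # hardware-friendly implementations typically do.
    return [16 * round(m * d_model / 16) for m in layerwise_scaling(num_layers, **kw)]

print(ffn_dims(1280, 4))  # → [640, 2128, 3632, 5120]
```

Under this toy rule, early layers get narrow FFNs and later layers wide ones, so a fixed parameter budget is spent unevenly across depth rather than uniformly.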
RL with verifiable reward has shown impressive results in improving LLM reasoning, but what can we do when we do not have ground truth answers? Introducing Self-Rewarding Training (SRT): where language models provide their own reward for RL training! 🧵 1/n
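The idea of a model providing its own reward can be sketched with a majority-vote (self-consistency) signal. A minimal sketch under simplified assumptions: `self_reward` and its exact rule are illustrative stand-ins, not the paper's method; real SRT operates on sampled model generations during RL training.

```python
from collections import Counter

def self_reward(sampled_answers):
    """Reward each sampled answer by its agreement with the majority vote
    over all samples, so no ground-truth label is needed. Illustrative
    stand-in for a self-consistency-style reward."""
    majority, _ = Counter(sampled_answers).most_common(1)[0]
    return [1.0 if a == majority else 0.0 for a in sampled_answers]

# With no labels, answers agreeing with the consensus get reward 1.0:
print(self_reward(["42", "42", "17", "42"]))  # → [1.0, 1.0, 0.0, 1.0]
```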
Ready, set, innovate! #Llama4 is out now! 👇
Introducing our first set of Llama 4 models! We’ve been hard at work doing a complete re-design of the Llama series. I’m so excited to share it with the world today and mark another major milestone for the Llama herd as we release the *first* open source models in the Llama 4…
We're hiring PhD interns for Summer 2025 in Seattle to work with us on improving BLT even further! If this excites you, reach out to me via DM or email ASAP!
New from Meta FAIR — Byte Latent Transformer: Patches Scale Better Than Tokens introduces BLT, which for the first time, matches tokenization-based LLM performance at scale with significant improvements in inference efficiency & robustness. Paper ➡️ https://t.co/0iamZCRnMN
Join our team! Applications are open until December 16. Submit your application through the portal below, and feel free to send me a message afterward. This position is also available in Cupertino!
Our Machine Learning Research (MLR) team at #Apple is seeking a passionate AI resident to conduct research on multi-modal generative models (vision, 3D, language, audio) and to explore effective control mechanisms for these models. Application details: https://t.co/NwwTeLYoGX
I am recruiting PhD students for Fall '25 at Cornell! I plan to admit multiple students interested in building more controllable generative models, music technologies (or both!). 🎶 Please apply to @Cornell_CS.
To our collaborators & community: We’ve seen questions about AutoGen forks/clones vs. the official project. Here's a summary of the latest. Please share with others.
- The official repo is https://t.co/9OI3SK56zE.
- We're actively working on AutoGen v0.2, with v0.4 innovations…
github.com: microsoft/autogen, a programming framework for agentic AI.
Excited to finally release Magentic-One! The thing I love about this multi-agent team is that the same implementation achieves very strong performance across three challenging agentic benchmarks. If you are someone working on agentic systems, you know how challenging this can…
📢Introducing Magentic-One, a generalist 5-agent multi-agent system for solving open-ended web- and file-based tasks. 🤖🤖🤖🤖🤖 Magentic-One represents a significant step towards agents that can complete tasks that people encounter in their daily lives and can achieve strong…
We’ve released QUANTIZED Llama 3.2 1B/3B models.
⚡️FAST and EFFICIENT: 1B decodes at ~50 tok/s on a MOBILE PHONE CPU.
⚡️As ACCURATE as full-precision models.
⚡️Ready to CONSUME on mobile devices.
Looking forward to the on-device experiences these models will enable! Read more👇
We want to make it easier for more people to build with Llama, so today we’re releasing new quantized versions of Llama 3.2 1B & 3B that deliver up to 2-4x increases in inference speed, an average 56% reduction in model size, and a 41% reduction in memory footprint. Details…
Hey guys, I'm going to present LLM in a Flash at ACL 2024. Hit me up if you are in Bangkok. https://t.co/t67MbvpPOO
Updates from the previous version:
- Llama 2 results
- Some results on Apple GPUs (Metal)
- Speculative decoding
- Memory-latency tradeoff
- Impact of longer generation
arxiv.org: Large language models (LLMs) are central to modern natural language processing, delivering exceptional performance in various tasks. However, their substantial computational and memory…
Today is my last day @Apple. It's been a fantastic journey working on projects like OpenELM, CoreNet, and MobileViT with such a talented team. I’m excited for what’s next but will always remember the incredible experiences here. 🍎 #GoodbyeApple #NextChapter
Apple presents LazyLLM: Dynamic Token Pruning for Efficient Long Context LLM Inference. The inference of transformer-based large language models consists of two sequential stages: 1) a prefilling stage to compute the KV cache of prompts and generate the first token, and 2) a…
🤔 Wondering how to leverage large foundation models to train small on-device task-specific models? Check out our ICML paper or stop by the poster next Thursday at 11:30am in Vienna. paper: https://t.co/VRqk6jXNYL dataset: https://t.co/F6wR9VkppT
#Apple @ #ICML:
Excited to release our work from last year showcasing a stable training recipe for fully token-based multi-modal early-fusion auto-regressive models! https://t.co/H0wOurpeuC Huge shout out to @ArmenAgha @ramakanth1729 @LukeZettlemoyer @gargighosh and other co-authors. (1/n)
arxiv.org: We present Chameleon, a family of early-fusion token-based mixed-modal models capable of understanding and generating images and text in any arbitrary sequence. We outline a stable training…
Friendly reminder: if you are looking to get into mixed-modal early-fusion foundation models and want training and inference code, open model weights, and benchmarking across a suite of vision and NLP benchmarks, please take a look at our work from @allen_ai 2 years…
Newly published work from FAIR, Chameleon: Mixed-Modal Early-Fusion Foundation Models. This research presents a family of early-fusion token-based mixed-modal models capable of understanding & generating images & text in any arbitrary sequence. Paper ➡️ https://t.co/JQZHig977O
Newly published work from FAIR, Chameleon: Mixed-Modal Early-Fusion Foundation Models. This research presents a family of early-fusion token-based mixed-modal models capable of understanding & generating images & text in any arbitrary sequence. Paper ➡️ https://t.co/JQZHig977O
OpenELM (small 270M version) converted to Core ML, running on my M1 at 56 tok/s.
Interesting!
Run Apple's new OpenELM models in MLX LM thanks to @Prince_Canuma:
pip install -U mlx-lm
The 270M model in 16-bit runs quite fast on an 8GB M2 Mini (512 tokens at 115 toks/sec). Also pretty good quality for the size:
Like OpenELM, CatLIP is also "Open" https://t.co/66apun79Xm
github.com: CoreNet, a library for training deep neural networks (apple/corenet).
Apple presents CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data. Contrastive learning has emerged as a transformative method for learning effective visual representations through the alignment of image and text…