Roshan Sumbaly (@rsumbaly) · Joined January 2009
Senior Director of AI, @AIatMeta - Llama & Movie Gen. Prior life @coursera, @linkedIn, @stanford
2K followers · 4K following · 30 media · 789 posts
17 years ago, I took courses from @chrmanning and later TA’d a course he co-taught with @WittedNote at Stanford. The lessons in humility and patience I learned from him in those brief interactions as a young grad student still stay with me. His influence on the field of NLP and…
Llama 4 is here with 4 models!🦙🦙🦙🦙 I'm back to share with the world what the team has been cooking. Today we're open-sourcing 2 state-of-the-art omni models (Scout, Maverick - including pre-trained weights), previewing a 3rd one (Behemoth) and will drop a reasoning one soon.
Early Christmas gift that we wanted to share with the community - the last one for the year and a nice way to say goodbye to Llama 3. 2025 will be the year of Llama 4!
Introducing Llama 3.3 – a new 70B model that delivers the performance of our 405B model but is easier & more cost-efficient to run. By leveraging the latest advancements in post-training techniques including online preference optimization, this model improves core performance at…
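The tweet credits "online preference optimization" but doesn't spell out the recipe. Purely as a hypothetical sketch (function names and numbers are mine, not Meta's), a DPO-style preference loss - one common form of preference optimization - works like this: push the policy to prefer the chosen response more strongly than a frozen reference model does.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO-style preference loss: reward the policy for widening the
    chosen-vs-rejected log-probability gap relative to the reference."""
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_ratio - rejected_ratio)
    # -log(sigmoid(margin)): small when the policy clearly prefers "chosen"
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# No improvement over the reference -> margin 0 -> loss = -log(0.5) ~ 0.693
tied = dpo_loss(-10.0, -12.0, -10.0, -12.0)
# Policy widened the preference margin -> lower loss
better = dpo_loss(-8.0, -14.0, -10.0, -12.0)
```

In online variants the (chosen, rejected) pairs are sampled and scored on the fly rather than drawn from a fixed dataset; the loss itself is unchanged.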
When we wrote the first research proposal for Movie Gen 6 months back we had two goals in mind: 🎬 First, accelerate creative expression for everyone - from production studios to the 100s of millions of content creators on Meta 📜Second, accelerate research in media generation
github.com
Movie Gen Bench - two media generation evaluation benchmarks released with Meta Movie Gen - facebookresearch/MovieGenBench
A few weeks ago, I was given early access to Movie Gen from @Meta. My only instructions were to make something with it. As always, a very special thank you to my mom, who digitized over 100 hours of home video footage to make this story possible. https://t.co/4uF3dieZEO
And not just the paper, early next week we'll be releasing our full evaluation sets - the field of media generation would really benefit from having canonical benchmarks. Stay tuned!
Yes, Meta released a full scientific paper on Movie Gen, with a lot of details that'll help the field move forward.
Back to the lab to create the next Llama and Movie Gen foundation models...
Lights, camera, action - introducing Meta's Movie Gen! Our latest breakthrough in AI-powered media generation, setting a new standard for immersive AI content creation. We're also releasing a 92-page detailed report of what we learned, along with evaluation prompts that we hope…
We're still baking the next herd of Llamas - stay tuned and keep sharing your feedback! 🦙🦙🦙🦙
🟡 Llama Stack A set of interfaces that we believe would make the developer experience around Llama better. We're partnering with several companies today to provide canonical implementations of these interfaces, along with packaged "distributions" to decrease guesswork for…
🔵 Llama 3.2 11B and 90B image reasoning models These are focused on visual recognition tasks and trained via adapters, making them good drop-in replacements for their corresponding text-model equivalents. Further, we're open-sourcing both base and instruct-aligned models - so much…
🟢 Llama 3.2 1B and 3B text-only models These 128K-context-length models were pruned from the 8B model and distilled from 8B/70B logits, making them state-of-the-art on tasks like summarization, instruction following, and rewriting. They can also run on-device using @PyTorch…
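"Distilled from 8B/70B logits" refers to knowledge distillation: training the small model to match the teacher's full output distribution rather than just hard labels. As a minimal, dependency-free sketch (the toy logits and temperature are illustrative assumptions, not Meta's training setup), the core objective is a KL divergence over temperature-softened softmaxes:

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-softened softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distill_kl(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student): zero when the student reproduces the
    teacher's distribution, positive otherwise. Minimizing this is the
    distillation objective."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

teacher = [2.0, 1.0, 0.1]
perfect = distill_kl(teacher, [2.0, 1.0, 0.1])  # matching logits -> loss 0
off = distill_kl(teacher, [0.1, 1.0, 2.0])      # mismatched logits -> loss > 0
```

The temperature softens both distributions so the student also learns from the teacher's relative preferences among non-top tokens, which is where most of the distillation signal lives.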
Llama 3.2 - release #3 for 2024! 💪🔥🦙 3 months ago we released our flagship 405B model that leveled the playing field and gave everyone open access to a world-class foundation model. We've been overwhelmed by the community's response and seen a wide range of fine-tuned…
This is an exciting moment not just for the Llama community, but for the wider AI space. Having a competitive open-weight model is important for moving the field forward. Thank you @lmsysorg for bringing this up so fast and @anyscalecompute for hosting!
Exciting news! @metaai's Llama-3.1 results are here🔥 The Llama-3.1 series, extensively tested over the past week, has gathered over 10K community votes. Now, Llama-3.1-405B has climbed to #3 on the Overall Arena leaderboard, marking the first time an open model has ranked in…
Great to see the community moving fast to adapt Llama 3.1 to their needs. This is the beauty of open source and a key part of why we're going to share more of our system-level thinking with Llama Stack. Great work @vllm_project and @neuralmagic folks - let's find more ways to work…
vLLM now supports deploying Llama-3.1-405B on a single 8xH100 or 8xA100 node, making inference much easier and cheaper! This is a huge feat by Neural Magic’s engineers who contributed 3 crucial features to enable immediate, FP8 deployments of the 405B model in vLLM: (1/5)
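The quoted thread credits FP8 quantization for fitting the 405B model on a single node; the actual kernels live in vLLM and Neural Magic's code. Purely as an illustration of the underlying idea (a simulation I wrote, not vLLM's implementation), per-tensor FP8 e4m3 quantization scales weights into the e4m3 range and rounds the mantissa to 3 bits, halving memory versus FP16 at the cost of a small error:

```python
import math

FP8_E4M3_MAX = 448.0  # largest finite value in the e4m3 format

def to_e4m3(x):
    """Simulate rounding x to e4m3: 3 mantissa bits, range clamped to +-448."""
    if x == 0.0:
        return 0.0
    sign = math.copysign(1.0, x)
    x = min(abs(x), FP8_E4M3_MAX)
    exp = math.floor(math.log2(x))
    mantissa = x / 2**exp                # normalized into [1, 2)
    mantissa = round(mantissa * 8) / 8   # keep 3 fractional mantissa bits
    return sign * mantissa * 2**exp

def quantize_tensor(weights):
    """Per-tensor scaling into the e4m3 range, then elementwise rounding."""
    scale = max(abs(w) for w in weights) / FP8_E4M3_MAX
    return [to_e4m3(w / scale) for w in weights], scale

weights = [0.8125, -0.25, 0.03, -1.0]
q, scale = quantize_tensor(weights)
restored = [v * scale for v in q]  # dequantized values, close to the originals
```

With 3 mantissa bits the worst-case relative error is about 6%, which is why real FP8 inference pairs the format with careful per-tensor (or per-channel) scaling.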
We’re just getting started. The Llama herd of models and systems has more exciting updates coming - stay tuned! 🦙🦙🦙🦙🦙🦙🦙
Finally, Llama thrives because of its existing open developer ecosystem. We’re partnering with 25+ partners to integrate our models (and eventually parts of Llama Stack) so you can get started on day 0. So if you want low-latency inference, reach out to @groq. Or…
We also want to lower the barrier to entry for folks to leverage / train Llama variants. So we’re starting an RFC to define Llama Stack - a set of interfaces for standard model development (think evals, reward modeling, synthetic data generation) and agentic applications. Our…
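The actual Llama Stack RFC defines its own APIs; purely as a hypothetical illustration (these class and method names are mine, not the RFC's) of what "a set of interfaces for standard model development" might look like, the idea is that any backend - a local model or a hosted service - can plug in behind a shared contract:

```python
from abc import ABC, abstractmethod

class RewardModel(ABC):
    """Hypothetical interface: score a candidate response for a prompt."""
    @abstractmethod
    def score(self, prompt: str, response: str) -> float: ...

class SyntheticDataGenerator(ABC):
    """Hypothetical interface: produce (prompt, response) training pairs."""
    @abstractmethod
    def generate(self, n: int) -> list: ...

# A toy implementation: any provider satisfying the interface is a drop-in.
class LengthReward(RewardModel):
    def score(self, prompt, response):
        return min(len(response) / 100.0, 1.0)  # toy heuristic scorer

rm = LengthReward()
```

Standardizing the interface (rather than any one implementation) is what lets "distributions" swap providers without changing application code.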
Moving up the stack, Llama was always built to be part of an overall system that can orchestrate several components, including calling external tools or running system-level safety. Today we’re excited to open-source a full reference system that parallels our internal…
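The "system-level safety" framing means the model call is wrapped by screening steps rather than relying on the model alone. As a minimal sketch of that orchestration shape (the keyword check stands in for a dedicated safety model such as Llama Guard; none of this is the reference system's actual code):

```python
def moderate(text):
    """Stand-in for a safety classifier. The real system would call a
    dedicated model; a keyword check keeps this sketch self-contained."""
    blocked = {"make a weapon"}
    return not any(phrase in text.lower() for phrase in blocked)

def call_model(prompt):
    """Placeholder for actual Llama inference."""
    return f"model answer to: {prompt}"

def safe_chat(prompt):
    """Orchestration wrapper: screen the user input, call the model,
    then screen the model's output before returning it."""
    if not moderate(prompt):
        return "[input refused by safety layer]"
    output = call_model(prompt)
    if not moderate(output):
        return "[output withheld by safety layer]"
    return output
```

Screening both directions matters: input checks catch unsafe requests, while output checks catch unsafe completions of benign-looking prompts.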