Roshan Sumbaly

@rsumbaly

Followers 2K · Following 4K · Media 30 · Statuses 789

Senior Director of AI, @AIatMeta - Llama & Movie Gen. Prior life @coursera, @linkedIn, @stanford

Joined January 2009
@rsumbaly
Roshan Sumbaly
24 days
17 years ago, I took courses from @chrmanning and later TA’d a course he co-taught with @WittedNote at Stanford. The lessons in humility and patience I learned from him in those brief interactions as a young grad student still stay with me. His influence on the field of NLP and
@Diyi_Yang
Diyi Yang
24 days
Stanford NLP 25th Anniversary🤩🤩🤩
4
8
80
@rsumbaly
Roshan Sumbaly
8 months
Llama 4 is here with 4 models!🦙🦙🦙🦙 I'm back to share with the world what the team has been cooking. Today we're open-sourcing 2 state-of-the-art omni models (Scout, Maverick - including pre-trained weights), previewing a 3rd one (Behemoth) and will drop a reasoning one soon.
6
8
75
@rsumbaly
Roshan Sumbaly
1 year
Early Christmas gift that we wanted to share with the community - last one for the year and a nice way to say goodbye to Llama 3. 2025 will be the year of Llama 4!
@Ahmad_Al_Dahle
Ahmad Al-Dahle
1 year
Introducing Llama 3.3 – a new 70B model that delivers the performance of our 405B model but is easier & more cost-efficient to run. By leveraging the latest advancements in post-training techniques including online preference optimization, this model improves core performance at
0
1
28
@rsumbaly
Roshan Sumbaly
1 year
When we wrote the first research proposal for Movie Gen 6 months back we had two goals in mind: 🎬 First, accelerate creative expression for everyone - from production studios to the 100s of millions of content creators on Meta 📜Second, accelerate research in media generation
github.com
Movie Gen Bench - two media generation evaluation benchmarks released with Meta Movie Gen - facebookresearch/MovieGenBench
@aneeshchaganty
Aneesh Chaganty
1 year
A few weeks ago, I was given early access to Movie Gen from @Meta. My only instructions were to make something with it. As always, a very special thank you to my mom, who digitized over 100 hours of home video footage to make this story possible. https://t.co/4uF3dieZEO
0
2
18
@rsumbaly
Roshan Sumbaly
1 year
And not just the paper, early next week we'll be releasing our full evaluation sets - the field of media generation would really benefit from having canonical benchmarks. Stay tuned!
@soumithchintala
Soumith Chintala
1 year
yes, Meta released a full scientific paper on MovieGen, with a lot of details that'll help the field move forward.
1
7
63
@rsumbaly
Roshan Sumbaly
1 year
Back to the lab to create the next Llama and Movie Gen foundation models...
1
0
5
@rsumbaly
Roshan Sumbaly
1 year
Summary reel with examples...
1
1
6
@rsumbaly
Roshan Sumbaly
1 year
Lights, camera, action - introducing Meta's Movie Gen! Our latest breakthrough in AI-powered media generation, setting a new standard for immersive AI content creation. We're also releasing a 92-page detailed report of what we learned, along with evaluation prompts that we hope
2
10
86
@rsumbaly
Roshan Sumbaly
1 year
We're still baking the next herd of Llamas - stay tuned and keep sharing feedback with us! 🦙🦙🦙🦙
0
0
3
@rsumbaly
Roshan Sumbaly
1 year
🟡 Llama Stack A set of interfaces that we believe would make the developer experience around Llama better. We're partnering with several companies today to provide canonical implementations of these interfaces, along with packaged "distributions" to decrease guesswork for
1
0
2
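The "set of interfaces with canonical implementations" idea above can be pictured as an abstract interface that multiple providers implement interchangeably. The class and method names below are invented for this sketch, not the actual Llama Stack API:

```python
# Hypothetical illustration of a canonical interface in the spirit of
# Llama Stack: one abstract contract, many swappable provider
# implementations packaged into "distributions". All names here are
# assumptions made for this sketch.
from abc import ABC, abstractmethod

class Inference(ABC):
    """One interface; a distribution wires in a concrete provider."""
    @abstractmethod
    def completion(self, prompt: str) -> str: ...

class LocalEchoProvider(Inference):
    """A stand-in provider; a real one would call a model server."""
    def completion(self, prompt: str) -> str:
        return f"echo: {prompt}"

provider: Inference = LocalEchoProvider()
print(provider.completion("hello"))  # → echo: hello
```

Because callers depend only on the interface, swapping a local provider for a hosted one should not require application changes.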
@rsumbaly
Roshan Sumbaly
1 year
🔵 Llama 3.2 11B and 90B image reasoning models These are focused on visual recognition tasks and trained via adapters, making them drop-in replacements for their corresponding text models. Further, we're open-sourcing both base and instruct-aligned models - so much
1
0
1
@rsumbaly
Roshan Sumbaly
1 year
🟢 Llama 3.2 1B and 3B text-only models These 128K context length models were pruned from 8B and distilled from 8B/70B logits, making them state-of-the-art on tasks like summarization, instruction following, and rewriting. Also they can run on-device using @PyTorch
1
0
1
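The logit distillation mentioned above trains a small "student" model to match the softened output distribution of a larger "teacher" (here, the 8B/70B logits). A minimal, dependency-free sketch of the standard knowledge-distillation objective, with illustrative numbers rather than Meta's actual training code:

```python
# Sketch of logit distillation: the student minimizes
# KL(teacher || student) over temperature-softened distributions.
# Logits and temperature values are illustrative assumptions.
import math

def softmax(logits, temperature=1.0):
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence from teacher to student over softened outputs."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# A student whose logits already match the teacher's incurs zero loss.
print(kd_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1]))  # → 0.0
```

In practice this term is averaged over the training corpus and usually mixed with the standard next-token cross-entropy loss.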
@rsumbaly
Roshan Sumbaly
1 year
Llama 3.2 - release #3 for 2024! 💪🔥🦙 3 months ago we released our flagship 405B model that leveled the playing field and gave everyone open access to a world-class foundation model. We've been overwhelmed by the community's response and seen a wide range of fine-tuned
1
1
14
@rsumbaly
Roshan Sumbaly
1 year
This is an exciting moment not just for the Llama community, but also for the wider AI space. Having an open weight model that is competitive is important for us to move the field forward. Thank you @lmsysorg for bringing this up so fast and @anyscalecompute for hosting!
@arena
lmarena.ai
1 year
Exciting news! @metaai's Llama-3.1 results are here🔥 The Llama-3.1 series, extensively tested over the past week, has gathered over 10K community votes. Now, Llama-3.1-405B has climbed to #3 on the Overall Arena leaderboard, marking the first time an open model has ranked in
1
4
18
@rsumbaly
Roshan Sumbaly
1 year
Great to see the community moving fast to adapt Llama 3.1 to their needs. This is the beauty of open-source and a key part of why we're going to share more of our system-level thinking with Llama Stack. Great work @vllm_project and @neuralmagic folks - let's find more ways to work
@RedHat_AI
Red Hat AI
1 year
vLLM now supports deploying Llama-3.1-405B on a single 8xH100 or 8xA100 node, making inference much easier and cheaper! This is a huge feat by Neural Magic’s engineers who contributed 3 crucial features to enable immediate, FP8 deployments of the 405B model in vLLM: (1/5)
1
5
17
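The single-node FP8 deployment described in the quoted thread maps onto vLLM's OpenAI-compatible server. The command below is a sketch of how recent vLLM releases are invoked; the model ID and flag values are assumptions, not taken from the thread:

```shell
# Sketch: serving an FP8-quantized Llama-3.1-405B checkpoint on one
# 8-GPU node with vLLM. Model ID and flag values are illustrative.
vllm serve meta-llama/Llama-3.1-405B-Instruct-FP8 \
    --tensor-parallel-size 8 \
    --max-model-len 8192
```

`--tensor-parallel-size 8` shards the model across the node's eight GPUs, which is what makes a 405B-parameter model fit once its weights are quantized to FP8.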
@rsumbaly
Roshan Sumbaly
1 year
We’re just getting started. The Llama herd of models and systems has more exciting updates coming - stay tuned! 🦙🦙🦙🦙🦙🦙🦙
0
0
0
@rsumbaly
Roshan Sumbaly
1 year
Finally, Llama thrives because of its existing open developer ecosystem integration. We’re partnering with 25+ partners to integrate our models (and eventually parts of Llama Stack) to get you started on day 0. So if you want to get low-latency inference, reach out to @groq. Or
1
0
0
@rsumbaly
Roshan Sumbaly
1 year
We also want to lower the barrier to entry for folks to leverage / train Llama variants. So we’re starting an RFC to define Llama Stack - a set of interfaces for standard model development (think evals, reward modeling, synthetic data generation) and agentic applications. Our
1
0
1
@rsumbaly
Roshan Sumbaly
1 year
Moving up the stack, Llama was always built to be part of an overall system that can orchestrate several components, including calling external tools or running system-level safety. Today we’re excited to open-source a full reference system that parallels our internal
1
0
2