Roshan Sumbaly

@rsumbaly

Followers 2K · Following 4K · Media 30 · Statuses 789

Senior Director of AI, @AIatMeta - Llama & Movie Gen. Prior life @coursera, @linkedIn, @stanford

Joined January 2009
@rsumbaly
Roshan Sumbaly
24 days
17 years ago, I took courses from @chrmanning and later TA’d a course he co-taught with @WittedNote at Stanford. The lessons in humility and patience I learned from him in those brief interactions as a young grad student still stay with me. His influence on the field of NLP and
@Diyi_Yang
Diyi Yang
24 days
Stanford NLP 25th Anniversary🤩🤩🤩
4
8
80
@rsumbaly
Roshan Sumbaly
8 months
Llama 4 is here with 4 models!🦙🦙🦙🦙 I'm back to share with the world what the team has been cooking. Today we're open-sourcing 2 state-of-the-art omni models (Scout, Maverick - including pre-trained weights), previewing a 3rd one (Behemoth) and will drop a reasoning one soon.
6
8
75
@rsumbaly
Roshan Sumbaly
1 year
Early Christmas gift that we wanted to share with the community - last one for the year and a nice way to say goodbye to Llama 3. 2025 will be the year of Llama 4!
@Ahmad_Al_Dahle
Ahmad Al-Dahle
1 year
Introducing Llama 3.3 – a new 70B model that delivers the performance of our 405B model but is easier & more cost-efficient to run. By leveraging the latest advancements in post-training techniques including online preference optimization, this model improves core performance at
0
1
28
@rsumbaly
Roshan Sumbaly
1 year
When we wrote the first research proposal for Movie Gen 6 months back we had two goals in mind: 🎬 First, accelerate creative expression for everyone - from production studios to the 100s of millions of content creators on Meta 📜Second, accelerate research in media generation
github.com
Movie Gen Bench - two media generation evaluation benchmarks released with Meta Movie Gen - facebookresearch/MovieGenBench
@aneeshchaganty
Aneesh Chaganty
1 year
A few weeks ago, I was given early access to Movie Gen from @Meta. My only instructions were to make something with it. As always, a very special thank you to my mom, who digitized over 100 hours of home video footage to make this story possible. https://t.co/4uF3dieZEO
0
2
18
@rsumbaly
Roshan Sumbaly
1 year
And not just the paper, early next week we'll be releasing our full evaluation sets - the field of media generation would really benefit from having canonical benchmarks. Stay tuned!
@soumithchintala
Soumith Chintala
1 year
yes, Meta released a full scientific paper on MovieGen, with a lot of details that'll help the field move forward.
1
7
63
@rsumbaly
Roshan Sumbaly
1 year
Back to the lab to create the next Llama and Movie Gen foundation models...
1
0
5
@rsumbaly
Roshan Sumbaly
1 year
Summary reel with examples...
1
1
6
@rsumbaly
Roshan Sumbaly
1 year
Lights, camera, action - introducing Meta's Movie Gen! Our latest breakthrough in AI-powered media generation, setting a new standard for immersive AI content creation. We're also releasing a 92-page detailed report of what we learned, along with evaluation prompts that we hope
2
10
86
@rsumbaly
Roshan Sumbaly
1 year
We're still baking the next herd of Llamas - stay tuned and keep sharing feedback with us! 🦙🦙🦙🦙
0
0
3
@rsumbaly
Roshan Sumbaly
1 year
🟡 Llama Stack A set of interfaces that we believe would make the developer experience around Llama better. We're partnering with several companies today to provide canonical implementations of these interfaces, along with packaged "distributions" to decrease guesswork for
1
0
2
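The "set of interfaces with canonical implementations" idea above can be pictured as an abstract interface that multiple providers implement interchangeably. The class and method names below are invented for this sketch, not the actual Llama Stack API:

```python
# Hypothetical illustration of a canonical interface in the spirit of
# Llama Stack: one abstract contract, many swappable provider
# implementations packaged into "distributions". All names here are
# assumptions made for this sketch.
from abc import ABC, abstractmethod

class Inference(ABC):
    """One interface; a distribution wires in a concrete provider."""
    @abstractmethod
    def completion(self, prompt: str) -> str: ...

class LocalEchoProvider(Inference):
    """A stand-in provider; a real one would call a model server."""
    def completion(self, prompt: str) -> str:
        return f"echo: {prompt}"

provider: Inference = LocalEchoProvider()
print(provider.completion("hello"))  # → echo: hello
```

Because callers depend only on the interface, swapping a local provider for a hosted one should not require application changes.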
@rsumbaly
Roshan Sumbaly
1 year
🔵 Llama 3.2 11B and 90B image reasoning models These are focused on visual recognition tasks and trained via adapters, making them drop-in replacements for their corresponding text models. Further, we're open-sourcing both base and instruct-aligned models - so much
1
0
1
@rsumbaly
Roshan Sumbaly
1 year
🟢 Llama 3.2 1B and 3B text-only models These 128K context length models were pruned from 8B and distilled from 8B/70B logits, making them state-of-the-art on tasks like summarization, instruction following, and rewriting. Also they can run on-device using @PyTorch
1
0
1
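The logit distillation mentioned above trains a small "student" model to match the softened output distribution of a larger "teacher" (here, the 8B/70B logits). A minimal, dependency-free sketch of the standard knowledge-distillation objective, with illustrative numbers rather than Meta's actual training code:

```python
# Sketch of logit distillation: the student minimizes
# KL(teacher || student) over temperature-softened distributions.
# Logits and temperature values are illustrative assumptions.
import math

def softmax(logits, temperature=1.0):
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence from teacher to student over softened outputs."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# A student whose logits already match the teacher's incurs zero loss.
print(kd_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1]))  # → 0.0
```

In practice this term is averaged over the training corpus and usually mixed with the standard next-token cross-entropy loss.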
@rsumbaly
Roshan Sumbaly
1 year
Llama 3.2 - release #3 for 2024! 💪🔥🦙 3 months ago we released our flagship 405B model that leveled the playing field and gave everyone open access to a world-class foundation model. We've been overwhelmed by the community's response and seen a wide range of fine-tuned
1
1
14
@rsumbaly
Roshan Sumbaly
1 year
This is an exciting moment not just for the Llama community, but also for the wider AI space. Having an open weight model that is competitive is important for us to move the field forward. Thank you @lmsysorg for bringing this up so fast and @anyscalecompute for hosting!
@arena
lmarena.ai
1 year
Exciting news! @metaai's Llama-3.1 results are here🔥 The Llama-3.1 series, extensively tested over the past week, has gathered over 10K community votes. Now, Llama-3.1-405B has climbed to #3 on the Overall Arena leaderboard, marking the first time an open model has ranked in
1
4
18
@rsumbaly
Roshan Sumbaly
1 year
Great to see the community moving fast to adapt Llama 3.1 to their needs. This is the beauty of open-source and a key part of why we're going to share more of our system-level thinking with Llama Stack. Great work @vllm_project and @neuralmagic folks - let's find more ways to work
@RedHat_AI
Red Hat AI
1 year
vLLM now supports deploying Llama-3.1-405B on a single 8xH100 or 8xA100 node, making inference much easier and cheaper! This is a huge feat by Neural Magic’s engineers who contributed 3 crucial features to enable immediate, FP8 deployments of the 405B model in vLLM: (1/5)
1
5
17
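The single-node FP8 deployment described in the quoted thread maps onto vLLM's OpenAI-compatible server. The command below is a sketch of how recent vLLM releases are invoked; the model ID and flag values are assumptions, not taken from the thread:

```shell
# Sketch: serving an FP8-quantized Llama-3.1-405B checkpoint on one
# 8-GPU node with vLLM. Model ID and flag values are illustrative.
vllm serve meta-llama/Llama-3.1-405B-Instruct-FP8 \
    --tensor-parallel-size 8 \
    --max-model-len 8192
```

`--tensor-parallel-size 8` shards the model across the node's eight GPUs, which is what makes a 405B-parameter model fit once its weights are quantized to FP8.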
@rsumbaly
Roshan Sumbaly
1 year
We’re just getting started. The Llama herd of models and systems has more exciting updates coming - stay tuned! 🦙🦙🦙🦙🦙🦙🦙
0
0
0
@rsumbaly
Roshan Sumbaly
1 year
Finally, Llama thrives because of its existing open developer ecosystem integration. We’re partnering with 25+ partners to integrate our models (and eventually parts of Llama Stack) to get you started on day 0. So if you want to get low-latency inference, reach out to @groq. Or
1
0
0
@rsumbaly
Roshan Sumbaly
1 year
We also want to lower the barrier to entry for folks to leverage / train Llama variants. So we’re starting an RFC to define Llama Stack - a set of interfaces for standard model development (think evals, reward modeling, synthetic data generation) and agentic applications. Our
1
0
1
@rsumbaly
Roshan Sumbaly
1 year
Moving up the stack, Llama was always built to be part of an overall system that can orchestrate several components, including calling external tools or running system-level safety. Today we’re excited to open-source a full reference system that parallels our internal
1
0
2