kourosh hakhamaneshi
@CyrusHakha
Followers 1K · Following 2K · Media 42 · Statuses 817
LLMs + Ray @anyscalecompute | prev PhD, EECS, @UCBerkeley
California, USA
Joined September 2010
Exploring Llama-2's Quality: Can we replace generalist GPT-4 endpoints with specialized OSS models? Dive deep with our technical blog post to understand the nuances and insights of fine-tuning OSS models. https://t.co/zVStDCoG4y Thread 1/N
anyscale.com
We examine the Llama-2 models under 3 real-world use cases and show that fine-tuning yields significant accuracy improvements.
It has always been insightful to talk with Ray developers about how they are solving their infrastructure problems with Ray. The last meetup of the year is happening today:
Join us for the final Ray Meetup of the year, where we will deep dive with technical talks on core advancements in Ray, as well as discuss what's coming in 2026. Ray Meetup: A Year of Distributed Systems Innovation (End-of-Year Celebration) · December 18 · 5:30 - 7:30 PM
vLLM delivers even more inference performance with the same GPU platform. In just 1 month, we've worked with NVIDIA to increase @nvidia Blackwell maximum throughput per GPU by up to 33% -- significantly reducing cost per token -- while also enabling even higher peak speed for
Watch @richliaw (@anyscalecompute) explain why Ray joined the PyTorch Foundation, citing the ecosystem forming around PyTorch, DeepSpeed, and vLLM, and what this move signals about Ray's role in the AI infrastructure stack. https://t.co/1cKtUtDmvm
#PyTorch #Ray #AIInfrastructure
vLLM + Ray Serve LLM APIs! It was an honor to collaborate with the vLLM team to put this together.
Scaling MoE inference is often communication + KV-cache bound: once you push expert parallelism, decode can become dominated by collectives and imbalance, and prefill stragglers can stall an entire EP group. New community benchmark results for vLLM wide-EP on multi-node H200
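As context for the quoted benchmark, here is a minimal sketch of what turning on expert parallelism looks like in vLLM's offline Python API. The model name is only an example, and exact flag names can shift between vLLM versions; wide-EP deployments typically add data parallelism across nodes on top of this.

```python
from vllm import LLM, SamplingParams

# Minimal sketch: shard a sparse MoE model's experts across GPUs.
# `enable_expert_parallel` distributes expert weights across the parallel
# group instead of replicating them, which is what makes decode sensitive
# to collectives and load imbalance at scale.
llm = LLM(
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",  # example MoE model
    tensor_parallel_size=8,
    enable_expert_parallel=True,
)

out = llm.generate(
    ["Explain expert parallelism in one sentence."],
    SamplingParams(max_tokens=64),
)
print(out[0].outputs[0].text)
```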
We are happy to announce SkyRL tx 0.2; see our blog post: https://t.co/bwn5kBtCf8. It comes with lots of performance improvements: all parts of the execution are now jax-jitted, so there is very little overhead. Now is probably the best time to try it out if you haven't already.
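To illustrate the claim about jit overhead (a toy pattern, not SkyRL tx's actual code): once an entire training step is wrapped in jax.jit, Python-side overhead is paid once at trace time and later calls dispatch a single compiled XLA program.

```python
import jax
import jax.numpy as jnp

# Toy illustration only: the whole update step is one jitted function,
# so repeated calls avoid per-op Python dispatch entirely.
@jax.jit
def train_step(params, x, y):
    def loss_fn(p):
        pred = x @ p["w"] + p["b"]
        return jnp.mean((pred - y) ** 2)
    loss, grads = jax.value_and_grad(loss_fn)(params)
    # Plain SGD update applied across the whole parameter pytree.
    params = jax.tree_util.tree_map(lambda p, g: p - 0.1 * g, params, grads)
    return params, loss

key = jax.random.PRNGKey(0)
params = {"w": jax.random.normal(key, (4, 1)), "b": jnp.zeros((1,))}
x, y = jnp.ones((8, 4)), jnp.ones((8, 1))
params, loss = train_step(params, x, y)  # compiled on first call, cached after
```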
We recently released SkyRL-Train v0.3.0! Highlights include:
- Experimental support for Pipeline-RL style Async-RL
- Updated E2E Recipes page with Math, Search, SQL runs
- Migration from mbridge -> Megatron-Bridge
- 14 new OSS contributors! (1/n)
The team cooks! Iteration velocity on RL is key to achieving good results. SkyRL is built to modularize RL on LLMs so that researchers can focus on improving model quality.
Announcing OpenThoughts-Agent with an incredible team: a data-centric effort on TerminalBench-style tasks, built with SkyRL+Harbor. Co-leading the RL team over the past month has been a blast, and we're just getting started! (1/n)
How can we make a better TerminalBench agent? Today, we are announcing the OpenThoughts-Agent project. OpenThoughts-Agent v1 is the first TerminalBench agent trained on fully open curated SFT and RL environments. OpenThinker-Agent-v1 is the strongest model of its size on
We are starting a recurring office-hours session for our LLM APIs on Ray + vLLM. This week's agenda: a wide-EP demo for online inference, and distributing batched embedding computation using Ray Data. Stop by if you are curious about these topics.
Ray Serve/Data LLM office hours tomorrow 12/2, 9:30-10:30a PT. Come through to chat distributed LLM inference. @nikhil_r_ghosh giving away free alpha on batch embeddings workloads; I'll demo the new wide-EP and disaggregated serving APIs for Ray Serve
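For readers curious about the batch-embeddings topic, here is a hedged sketch of how such a workload is typically expressed with Ray Data's map_batches. The embedding model, batch size, and output path are placeholders, not the office-hours demo code.

```python
import ray

# Hedged sketch of distributed batch embeddings with Ray Data.
class Embedder:
    def __init__(self):
        # Heavy model load happens once per replica actor, not per batch.
        from sentence_transformers import SentenceTransformer
        self.model = SentenceTransformer("all-MiniLM-L6-v2")  # example model

    def __call__(self, batch: dict) -> dict:
        batch["embedding"] = self.model.encode(list(batch["text"]))
        return batch

ds = ray.data.from_items([{"text": f"document {i}"} for i in range(1_000)])
ds = ds.map_batches(
    Embedder,
    concurrency=4,   # four replica actors pulling batches in parallel
    num_gpus=1,      # one GPU per replica; drop for CPU-only runs
    batch_size=64,
)
ds.write_parquet("/tmp/embeddings")  # placeholder output location
```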
Join Anyscale at #NeurIPS2025 in San Diego. We'll be gathering a group of researchers, founders, & engineers over food and drinks. We'll be discussing Ray and the frontier of large-scale RL, multimodal model training, and multi-node LLM inference.
Thursday, December 4
luma.com
You're invited to the Anyscale Happy Hour at NeurIPS! Join us for an evening hosted by Anyscale co-founder Robert Nishihara: a relaxed, high-energy gathering…
Wise words. Just adding to this: I also think the training infra cost will still be severely dominated by inference cost (rather than pure training), due to 1) data curation and synthesis and 2) RL rollouts. So it's still inference infrastructure that is dominating the foundation.
Is there an AI bubble? With the massive number of dollars going into AI infrastructure such as OpenAI's $1.4 trillion plan and Nvidia briefly reaching a $5 trillion market cap, many have asked if speculation and hype have driven the values of AI investments above sustainable
Google TPU v6e vs AMD MI300X vs NVIDIA H100/B200: Artificial Analysis' Hardware Benchmarking shows NVIDIA achieving a ~5x tokens-per-dollar advantage over TPU v6e (Trillium), and a ~2x advantage over MI300X, in our key inference cost metric
1/n Introducing SkyRL-Agent, a framework for efficient RL agent training.
- 1.55x faster async rollout dispatch
- Lightweight tool + task integration
- Backend-agnostic (SkyRL-train / VeRL / Tinker)
- Used to train SA-SWE-32B, improving Qwen3-32B from 24.4% -> 39.4%
We've constantly been asked how to do DeepSeek-style deployments with Ray Serve. Ideas like prefill/decode disaggregation, wide-EP, custom request routing for prefill/decode, etc. require a fair amount of work in the orchestration layer that can be non-trivial. In Ray 2.52
Wide-EP and prefill/decode disaggregation APIs for vLLM are now available in Ray 2.52. Validated at 2.4k tokens/H200 on Anyscale Runtime, these patterns maximize sparse MoE model inference efficiency, but often require non-trivial orchestration logic. Here's how they
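For orientation, here is a minimal sketch of a Ray Serve LLM deployment using the ray.serve.llm APIs that these patterns build on; the wide-EP and prefill/decode disaggregation builders layer further configuration on top, and exact arguments may differ by Ray version. The model name is an example.

```python
from ray import serve
from ray.serve.llm import LLMConfig, build_openai_app

# Hedged sketch of a basic Ray Serve LLM deployment; PD-disaggregation and
# wide-EP setups in Ray 2.52 extend configs like this one.
llm_config = LLMConfig(
    model_loading_config=dict(
        model_id="qwen-0.5b",                       # name exposed to clients
        model_source="Qwen/Qwen2.5-0.5B-Instruct",  # example HF model
    ),
    deployment_config=dict(
        autoscaling_config=dict(min_replicas=1, max_replicas=2),
    ),
    engine_kwargs=dict(tensor_parallel_size=1),  # vLLM engine arguments
)

# Builds an OpenAI-compatible app (/v1/chat/completions) and deploys it.
app = build_openai_app({"llm_configs": [llm_config]})
serve.run(app)
```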
We're open-sourcing a set of high-quality speculator models for Llamas, Qwens, and gpt-oss on Hugging Face. In real workloads, you can expect 1.5 to 2.5x speedups, and sometimes more than 4x. Here's how this fits into the bigger story for speculative decoding. A thread:
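A hedged sketch of how a draft speculator is typically paired with a target model in vLLM: the small draft model proposes several tokens per step and the target model verifies them in one forward pass. The speculator path below is a placeholder rather than one of the released models, and the speculative_config schema has changed across vLLM releases.

```python
from vllm import LLM, SamplingParams

# Hedged sketch, assuming a recent vLLM that accepts `speculative_config`;
# older releases used separate `speculative_model` / `num_speculative_tokens`
# arguments instead.
llm = LLM(
    model="meta-llama/Llama-3.1-8B-Instruct",  # example target model
    speculative_config={
        "model": "path/to/open-speculator",    # hypothetical draft model path
        "num_speculative_tokens": 4,           # draft tokens verified per step
    },
)

out = llm.generate(["Hello"], SamplingParams(max_tokens=32))
print(out[0].outputs[0].text)
```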
Need to customize vLLM? Don't fork it. vLLM's plugin system lets you inject surgical modifications without maintaining a fork or monkey-patching entire modules. Blog by Dhruvil Bhatt from AWS SageMaker. Why plugins > forks:
- vLLM releases every 2 weeks with 100s of PRs
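A minimal sketch of the entry-point mechanism the tweet refers to, modeled on vLLM's documented dummy-model plugin example; the package and model class names here are hypothetical.

```python
# vLLM discovers plugins through the "vllm.general_plugins" entry-point
# group and calls each registered function at startup, so custom models
# ship as a normal pip package instead of a fork. In the plugin package's
# pyproject.toml:
#
# [project.entry-points."vllm.general_plugins"]
# register_dummy_model = "vllm_add_dummy_model:register"

def register():
    from vllm import ModelRegistry

    # Lazy import keeps plugin load cheap; this model class is hypothetical.
    from vllm_add_dummy_model.my_llava import MyLlavaForConditionalGeneration

    # Guard against double registration when the plugin loads twice.
    if "MyLlavaForConditionalGeneration" not in ModelRegistry.get_supported_archs():
        ModelRegistry.register_model(
            "MyLlavaForConditionalGeneration",
            MyLlavaForConditionalGeneration,
        )
```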
New Anyscale releases announced at Ray Summit, from Developer Central to Anyscale Runtime to Cluster Controller. Read the roll-up blog: https://t.co/iMdQWtp5w5