kourosh hakhamaneshi Profile
kourosh hakhamaneshi

@CyrusHakha

Followers
971
Following
2K
Media
42
Statuses
747

LLMs + Ray @anyscalecompute 💻 prev PhD, EECS, @UCBerkeley 👨‍🎓

California, USA
Joined September 2010
@CyrusHakha
kourosh hakhamaneshi
2 years
🚀 Exploring Llama-2’s Quality: Can we replace generalist GPT-4 endpoints with specialized OSS models? Dive deep with our technical blog post to understand the nuances and insights of fine-tuning OSS models. 🔗🧵 Thread 1/N👇
16
116
529
@CyrusHakha
kourosh hakhamaneshi
6 days
RT @pcmoritz: There have been a lot of open source RL libraries for training LLMs popping up recently. We took a stab at describing some of….
0
10
0
@CyrusHakha
kourosh hakhamaneshi
10 days
RT @erictang000: Check out our updates to SkyRL! In this release we also reproduced our prior results outperforming GPT-4o for multi-turn….
0
4
0
@CyrusHakha
kourosh hakhamaneshi
10 days
RT @istoica05: Taking a step towards building a modular RL framework with our SkyRL project.
0
9
0
@CyrusHakha
kourosh hakhamaneshi
11 days
RL for LLMs is here to stay. With SkyRL, you get both modularity and performance:
• Clean trainer/generator separation (colocated or disaggregated)
• Sync + async RL (multi-turn)
• Remote inference (OpenAI-compatible)
No forking needed to customize. Easy to use and.
@NovaSkyAI
NovaSky
11 days
✨ Release: We upgraded SkyRL into a highly modular, performant RL framework for training LLMs. We prioritized modularity: easily prototype new algorithms, environments, and training logic with minimal overhead. 🧵👇 Blog: Code:
0
0
10
@CyrusHakha
kourosh hakhamaneshi
13 days
RT @PyTorch: An #OpenSource Stack for #AI Compute: @kubernetesio + @raydistributed + @pytorch + @vllm_project ➡️ This Anyscale blog post by….
0
28
0
@CyrusHakha
kourosh hakhamaneshi
21 days
RT @robertnishihara: Impressive work! Agentic workflows have tons and tons of design and architectural decisions that affect performance and….
0
4
0
@CyrusHakha
kourosh hakhamaneshi
25 days
I get a lot of questions about the role of each layer of the AI compute stack: vLLM, Ray, k8s, etc. What does Ray do in vLLM, and what does Ray do around vLLM? Why is Ray Core part of post-training frameworks like vERL? In this blog @robertnishihara depicts what a.
@robertnishihara
Robert Nishihara
25 days
The AI compute software stack consists of 3 specialized layers:
🔧🔧🔧 Layer 1: Training & Inference Framework (PyTorch + vLLM)
• Runs models efficiently on GPUs
• Handles model optimization and model parallelism strategies
• Manages accelerator memory and automatic
0
2
8
@CyrusHakha
kourosh hakhamaneshi
2 months
RT @sumanthrh: Some of our interesting observations from working on multi-turn text2SQL:
- Data-efficient RL works pretty well: We did ver….
0
4
0
@CyrusHakha
kourosh hakhamaneshi
2 months
SkyRL-SQL is another illustration of applying RL to an agentic workflow that beats the state-of-the-art frontier reasoning and non-reasoning models with a sample-efficient training recipe. The improvements are still incremental, but the direction is very promising. We also.
@NovaSkyAI
NovaSky
2 months
1/N Introducing SkyRL-SQL, a simple, data-efficient RL pipeline for Text-to-SQL that trains LLMs to interactively probe, refine, and verify SQL queries with a real database. 🚀 Early Result: trained on just ~600 samples, SkyRL-SQL-7B outperforms GPT-4o, o4-mini, and SFT model
1
4
14
@CyrusHakha
kourosh hakhamaneshi
2 months
RT @NovaSkyAI: 1/N Introducing SkyRL-SQL, a simple, data-efficient RL pipeline for Text-to-SQL that trains LLMs to interactively probe, ref….
0
32
0
@CyrusHakha
kourosh hakhamaneshi
2 months
RayTurbo Data >> OSS Ray Data
0
0
3
@CyrusHakha
kourosh hakhamaneshi
2 months
RT @anyscalecompute: “We realized our ML Engineers were spending too much time waiting before they could iterate.” – Wenyue Liu, ML Platfor….
0
7
0
@CyrusHakha
kourosh hakhamaneshi
2 months
RT @haoailab: Announcing FastVideo V1, a unified framework for accelerating video generation. FastVideo V1 offers:
- A simple, consistent….
0
43
0
@CyrusHakha
kourosh hakhamaneshi
2 months
We are seeing incremental progress in OSS on improving true long-term decision-making for AI agents. SkyRL-v0 is a snapshot of the progress we have made in collaboration with @NovaSkyAI. We will have more releases in the upcoming weeks.
@NovaSkyAI
NovaSky
2 months
1/N Introducing SkyRL-v0, our RL training pipeline enabling efficient RL training for long-horizon, real-environment tasks like SWE-Bench. We also open-source a series of our early trained models to showcase the potential of end-to-end online RL training on long-horizon (20-50
1
0
8
@CyrusHakha
kourosh hakhamaneshi
2 months
RT @richliaw: Today we’re introducing SkyRL, an RL training pipeline optimized for long-horizon tasks like SWE-Bench, built on top of VeRL….
0
97
0
@CyrusHakha
kourosh hakhamaneshi
2 months
RT @NovaSkyAI: 1/N Introducing SkyRL-v0, our RL training pipeline enabling efficient RL training for long-horizon, real-environment tasks l….
0
70
0
@CyrusHakha
kourosh hakhamaneshi
2 months
Omg, why is this so accurate 😂😂?
@garrytan
Garry Tan
2 months
The sweet sweet moment you find product market fit (wait for it)
0
0
3
@CyrusHakha
kourosh hakhamaneshi
2 months
OpenRLHF’s post-training stack: Ray + vLLM + ZeRO-3 (DeepSpeed). We have made sure vLLM has native support for Ray, allowing granular control over where vLLM workers are physically placed. This is crucial for post-training frameworks like verl or OpenRLHF.
@vllm_project
vLLM
2 months
OpenRLHF is a pioneering framework to use vLLM for RLHF, driving the design and implementation of many of vLLM's features for RLHF and making vLLM a popular choice for many RLHF frameworks. Learn more about the story at
0
1
17
@CyrusHakha
kourosh hakhamaneshi
2 months
RT @robertnishihara: If you're curious why vLLM, which is an inference engine, is being used in the post-training tech stack, the answer is….
0
25
0
@CyrusHakha
kourosh hakhamaneshi
2 months
RT @vllm_project: OpenRLHF is a pioneering framework to use vLLM for RLHF, driving the design and implementation of many of vLLM's features for RL….
0
40
0