Alpay Ariyak

@AlpayAriyak

Followers
3K
Following
4K
Media
40
Statuses
246

LLM Post-Training Lead @ Together AI | OpenChat Project Lead (2M+ downloads, #1 7B LLM on Arena for 2+ months) | DeepCoder, DeepSWE

San Francisco, CA
Joined July 2023
@AlpayAriyak
Alpay Ariyak
1 month
🤖 Introducing DeepSWE-Preview, our latest model trained in collaboration with @Agentica_. Using only RL, we increase the performance of Qwen 3 32B from 23% to 42.2% pass@1 (and 59.0% with TTS) on SWE-Bench Verified!
@togethercompute
Together AI
1 month
Announcing DeepSWE 🤖: our fully open-sourced, SOTA software engineering agent trained purely with RL on top of Qwen3-32B. DeepSWE achieves 59% on SWEBench-Verified with test-time scaling (and 42.2% Pass@1), topping the SWEBench leaderboard for open-weight models. Built in
4
8
59
@AlpayAriyak
Alpay Ariyak
1 month
RT @ChongZitaZhang: Only after labelling a dataset by your own you know how dirty it is.
0
18
0
@AlpayAriyak
Alpay Ariyak
1 month
RT @teortaxesTex: important correction on DeepSWE-Preview. on SWE-Bench-Verified: Pass@1 = 42.2%. "Best@8" = 59%, trajectory selection achie…
0
4
0
@AlpayAriyak
Alpay Ariyak
1 month
Let’s normalize reading through and actually understanding something before attempting to criticize it publicly :)
@Agentica_
Agentica Project
1 month
It's easy to confuse Best@K vs Pass@K—and we've seen some misconceptions about our results. Our 59% on SWEBench-Verified is Pass@1 with Best@16, not Pass@8/16. Our Pass@8/16 is 67%/71%. So how did we achieve this? DeepSWE generates N candidate solutions. Then, another LLM
2
1
31
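The Best@K vs Pass@K distinction in the post above can be made concrete with a small sketch. The post is cut off before it describes the selection step, so everything about the selector here is an assumption: `scores` stands in for whatever ranking the second LLM produces, and the function names and toy data are hypothetical, not taken from the DeepSWE code. Pass@K counts a task as solved if any of the K candidates passes; Best@K counts it only if the single candidate the selector picks passes.

```python
from typing import Sequence


def pass_at_k(outcomes: Sequence[Sequence[bool]]) -> float:
    """Fraction of tasks where at least one of the K candidates passes."""
    return sum(any(task) for task in outcomes) / len(outcomes)


def best_at_k(
    outcomes: Sequence[Sequence[bool]],
    scores: Sequence[Sequence[float]],
) -> float:
    """Fraction of tasks where the candidate ranked highest by the selector passes.

    `scores` is a hypothetical stand-in for the selector's ranking signal.
    """
    solved = 0
    for task_outcomes, task_scores in zip(outcomes, scores):
        # Pick the index of the highest-scored candidate for this task.
        chosen = max(range(len(task_scores)), key=task_scores.__getitem__)
        solved += task_outcomes[chosen]
    return solved / len(outcomes)


# Toy data (made up for illustration): 4 tasks, 3 candidates each.
outcomes = [
    [True, False, False],
    [False, False, False],
    [False, True, False],   # a passing candidate exists...
    [False, False, True],
]
scores = [
    [0.9, 0.2, 0.1],
    [0.5, 0.4, 0.3],
    [0.8, 0.6, 0.1],        # ...but the selector prefers a failing one
    [0.1, 0.2, 0.7],
]

print(pass_at_k(outcomes))          # 0.75
print(best_at_k(outcomes, scores))  # 0.5
```

Because Best@K submits only one candidate per task, it can never exceed Pass@K on the same samples, which is consistent with the 59% vs 67%/71% figures quoted in the post.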
@AlpayAriyak
Alpay Ariyak
1 month
RT @Agentica_: It's easy to confuse Best@K vs Pass@K—and we've seen some misconceptions about our results. Our 59% on SWEBench-Verified…
0
15
0
@AlpayAriyak
Alpay Ariyak
1 month
Soham Parekh was a DeepSWE checkpoint sorry.
0
0
15
@AlpayAriyak
Alpay Ariyak
1 month
Excited to introduce DeepSWE-Preview, our latest model trained in collaboration with @Agentica_. Using only RL, we increase the performance of Qwen 3 32B from 23% to 42.2% on SWE-Bench Verified!
@Agentica_
Agentica Project
1 month
🚀 Introducing DeepSWE 🤖: our fully open-sourced, SOTA software engineering agent trained purely with RL on top of Qwen3-32B. DeepSWE achieves 59% on SWEBench-Verified with test-time scaling (and 42.2% Pass@1), topping the SWEBench leaderboard for open-weight models. 💪DeepSWE
1
4
38
@AlpayAriyak
Alpay Ariyak
1 month
RT @Agentica_: 🚀 Introducing DeepSWE 🤖: our fully open-sourced, SOTA software engineering agent trained purely with RL on top of Qwen3-32B.…
0
67
0
@AlpayAriyak
Alpay Ariyak
3 months
Excited to join some great friends at Nous as one of the judges for their first hackathon! It will be focused on RL environments. Pull up, it will be fun :)
@NousResearch
Nous Research
3 months
Announcing the Nous RL Environments Hackathon in SF! Create with Atropos, Nous' RL environments framework, and claim your stake of a $50,000 prize pool. Partners - @xai @nvidia @nebiusai @SHACK15sf @akashnet_ @LambdaAPI @tensorstax and @runpod_io. May 18th. Sign up below 👇👇
4
1
43
@AlpayAriyak
Alpay Ariyak
4 months
DeepCoder has reached the top of HuggingFace trending models 🥳
7
3
48
@AlpayAriyak
Alpay Ariyak
4 months
Our DeepCoder-14B LiveCodeBench v5 scores have been validated and put on the official leaderboard!
13
37
320
@AlpayAriyak
Alpay Ariyak
4 months
I’ve seen some people using DeepCoder-1.5B as speculator for DeepCoder-14B. Because the final stage for both was “self-play” RL, it won’t be a good speculator, as they diverge a lot. If there’s enough interest, we can train a good speculator (w/ logit distillation) & release.
1
1
8
@AlpayAriyak
Alpay Ariyak
4 months
New article from @VentureBeat about our model!
@VentureBeat
VentureBeat
4 months
DeepCoder delivers top coding performance in efficient 14B open model
1
0
10
@AlpayAriyak
Alpay Ariyak
4 months
RT @Teknium1: Nice!
0
1
0
@AlpayAriyak
Alpay Ariyak
4 months
RT @ollama: ollama run deepcoder 🫡 let’s go open source!!
0
141
0
@AlpayAriyak
Alpay Ariyak
4 months
RT @Agentica_: Introducing DeepCoder-14B-Preview - our fully open-sourced reasoning model reaching o1 and o3-mini level on coding and math.…
0
210
0
@AlpayAriyak
Alpay Ariyak
4 months
Excited to present our project in collaboration with Agentica: 14B LLM trained with Code RL that reaches OpenAI's o3-mini-low performance on coding benchmarks like LiveCodeBench, Codeforces and HumanEval! We open source the code, data, weights and full recipe
@togethercompute
Together AI
4 months
Announcing DeepCoder-14B – an o1 & o3-mini level coding reasoning model fully open-sourced! We’re releasing everything: dataset, code, and training recipe. 🔥 Built in collaboration with the @Agentica_ team. See how we created it. 🧵
20
28
196