
KwaiAICoder
@KwaiAICoder
Followers: 132
Following: 11
Media: 26
Statuses: 30
Kwaipilot team, illuminating the world with AI and building dreams with code. The official account of the Kwaipilot team (Kuaishou's Large Model for coding).
Joined August 2024
⚡ SeamlessFlow’s secret weapon #2: Tag-Driven Scheduling Paradigm. We propose a tag-driven scheduling paradigm that abstracts hardware into capability-tagged resources, unifying colocated and disaggregated architectures. Based on this, SeamlessFlow introduces a spatiotemporal
💡 SeamlessFlow’s secret weapon #1: Introducing a Data Plane. The data plane decouples the RL trainer from diverse, complex agent implementations while sustaining high throughput. A central trajectory manager maintains complete interaction histories and supports partial
🔥 Introducing 「SeamlessFlow」, a server-based reinforcement learning (RL) framework that eliminates pipeline bubbles through spatiotemporal multiplexing, achieving a 100% improvement in token throughput and a 62% reduction in overall training time. 🚀 Paper:
arxiv.org
We introduce SeamlessFlow, a server based reinforcement learning (RL) framework that addresses two core challenges in industrial scale RL: (1) decoupling RL training from the complex execution...
🧠 Why it matters. SRPO is the first pure RL algorithm that fully reproduces and surpasses DeepSeek’s performance on both math and code tasks, whereas previous works (ORZ, DAPO, VAPO, etc.) focused only on math. Performance of SRPO:
• AIME24: 50.0 pass@1
• LiveCodeBench: