
Justus Mattern
@MatternJustus
Followers
4K
Following
2K
Media
121
Statuses
821
Research Engineer @PrimeIntellect | prev. co-founder https://t.co/fozfCukVT4 (YC S23), research @MPI_IS, physics @RWTH
San Francisco, CA
Joined March 2021
RT @designarena_ai: Kimi K2 by @Kimi_Moonshot and Mistral Small 3.2 by @MistralAI just added to the leaderboard. Crown your winner at https://…
0
2
0
SYNTHETIC-2 datasets are now on Hugging Face! We're releasing an SFT dataset collected from the new R1-0528, as well as an RL dataset with difficulty annotations from various smaller models. Go train some models 🫡
Releasing SYNTHETIC-2: our open dataset of 4M verified reasoning traces spanning a comprehensive set of complex RL tasks and verifiers. Created by hundreds of compute contributors across the globe via our pipeline-parallel decentralized inference stack.
3
4
80
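The "difficulty annotations from various smaller models" mentioned above could, as a rough sketch, be computed as the failure rate of weaker models' samples against the task's verifier. Everything below (the exact-match verifier, the failure-rate definition) is an illustrative assumption, not the released pipeline:

```python
from dataclasses import dataclass


@dataclass
class Task:
    prompt: str
    answer: str  # ground-truth answer used by the verifier


def verify(task: Task, completion: str) -> bool:
    # Toy verifier: exact match on the final answer.
    return completion.strip() == task.answer


def difficulty(task: Task, samples: list[str]) -> float:
    """Difficulty = fraction of small-model samples that FAIL the verifier.

    1.0 means no sample passed (hardest); 0.0 means all passed (easiest).
    An unsampled task defaults to maximum difficulty.
    """
    if not samples:
        return 1.0
    passed = sum(verify(task, s) for s in samples)
    return 1.0 - passed / len(samples)


task = Task(prompt="What is 6 * 7?", answer="42")
print(difficulty(task, ["42", "41", "42", "40"]))  # 0.5
```

Such annotations are useful for RL data curation, e.g. filtering out tasks a small model already solves every time, since those yield no learning signal.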
Great thread; it summarizes well why we're particularly excited about RL: higher inference-to-training compute ratio -> less inter-node communication -> better suited for globally distributed training infra with slow connection speeds.
So I think something else that doesn't get discussed much is the extrapolation of this inference:training trend.
- 2015: back in the day, we would train one model per dataset, and inference it once (to obtain the eval result for our paper).
- 2020: with chatgpt, multi-task
0
0
18
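The compute-ratio argument above can be made concrete with toy numbers (all figures are assumed, purely for illustration): RL rollout generation is inference and needs no gradient synchronization, so as the inference:training ratio grows, the share of total compute that actually depends on fast inter-node links shrinks.

```python
def comm_sensitive_fraction(inference_flops: float, training_flops: float) -> float:
    """Fraction of total compute spent on gradient-bearing training steps,
    i.e. the only part that needs fast inter-node communication.
    Rollout/inference compute can run on loosely connected nodes."""
    total = inference_flops + training_flops
    return training_flops / total


# Pretraining-like regime: essentially all compute is training.
print(comm_sensitive_fraction(inference_flops=0.0, training_flops=1.0))   # 1.0

# RL regime with an assumed 10:1 inference:training ratio:
print(comm_sensitive_fraction(inference_flops=10.0, training_flops=1.0))  # ~0.09
```

Under this toy model, a 10:1 ratio means roughly 91% of the workload tolerates slow links, which is the intuition behind running it on globally distributed hardware.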
I'm wondering to what degree a VLM judge matches human preferences here.
🚨 3D Design Battle: DeepSeek-V3 vs Sonnet 4. @deepseek_ai ranks #1 for 3D design. One-shot prompt: Build a model of a globe
1
0
16
RT @PrimeIntellect: We did it — SYNTHETIC‑2 is complete. A planetary-scale decentralized inference run generating 4M verified reasoning sa…
0
104
0
Prime Intellect is nothing without its child laborers.
Our 16yo intern @raphzyyyy contributing our DGX A100s to the SYNTHETIC-2 run, with support from @jackminong - just in time to help finish the run.
11
6
236
slides sneak peek (actually)
I’ll be at the Democratize Intelligence Summit today to speak about why reinforcement learning is ideally suited for a decentralized training setup + share a bit about our work at @PrimeIntellect. Come say hi!
0
0
8
I’ll be at the Democratize Intelligence Summit today to speak about why reinforcement learning is ideally suited for a decentralized training setup + share a bit about our work at @PrimeIntellect. Come say hi!
5 days until the SF Democratize Intelligence summit, backed by @cyberFund_! Join hundreds of the world's most brilliant researchers like @edwardjhu, founders like @eshear, engineers like @trevormccrt1, and policymakers like @IzzieHahn in the same room to create an open and…
0
2
21
I'm especially looking forward to DeepSeek-R1-0528 distilled models that can be trained using SYNTHETIC-2. The already-released distill based on Qwen3-8B outperforms Qwen3-32B without any RL (at least on benchmarks), which is quite impressive.
Launching SYNTHETIC-2: our next-gen open reasoning dataset and planetary-scale synthetic data generation run. Powered by our P2P inference stack and DeepSeek-R1-0528, it verifies traces for the hardest RL tasks. Contribute towards AGI via open, permissionless compute.
0
1
17
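The "verifies traces" step described above can be sketched as a filter: keep only generated records whose final answer passes the task's programmatic check, so unverifiable generations never enter the dataset. The record schema and the toy arithmetic verifier below are assumptions for illustration, not the actual SYNTHETIC-2 format:

```python
def filter_verified(traces: list[dict], verify) -> list[dict]:
    """Keep only records whose answer passes the task verifier."""
    return [t for t in traces if verify(t["prompt"], t["answer"])]


def toy_verify(prompt: str, answer: str) -> bool:
    # Toy verifier for arithmetic prompts of the form "a+b".
    a, b = prompt.split("+")
    return int(answer) == int(a) + int(b)


traces = [
    {"prompt": "2+2", "trace": "…", "answer": "4"},
    {"prompt": "2+3", "trace": "…", "answer": "6"},  # wrong answer, dropped
]
print(len(filter_verified(traces, toy_verify)))  # 1
```

Because verification is a pure function of the trace, it can run anywhere, which is what makes the generation step amenable to permissionless, untrusted compute contributors.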
RT @khoomeik: there is no excuse for any gpu on earth to be idling right now. every idle gpu can and should contribute to generating high q…
0
8
0