
Alexander Doria
@Dorialexander
Followers
19K
Following
136K
Media
3K
Statuses
42K
Reasoning models to come. Co-founder @pleiasfr
Joined April 2011
if you can stand to listen to me about model economics and synthetic environments for two hours, @himanshustwts got you covered
9
8
117
Just earlier, an NYT piece was amazed that so many AI researchers are like 23 years old. But that’s exactly what happens in a boom (or here, pre-boom): things move so fast you climb to the corporate top in a few months.
0
0
24
Not completely useless: we already had BERTology, and Qwen might even be the first decoder to attain a ubiquitous use comparable to BERT. The actual issue is that people do not realize they’re doing it.
0
0
4
I’m afraid a large chunk of AI research in 2025 will just be demoted to Qwenology.
More generally: if all of your experiments are "RL on math with Qwen", I'm not interested in any outlandish claims you want to make. Qwen's base models have been (appropriately) aggressively mid-trained for math for a long time. Stop drawing conclusions purely from this.
1
0
32
Weird how all the progressive grand causes have a way of dying without a bang lately.
0
0
6
My guess: it's not emergence but dramatic recall of the relevant synth data games among 5T synth tokens.
🧵 As AI labs race to scale RL, one question matters: when should you stop pre-training and start RL? We trained 5 Qwen models (0.6B→14B) with RL on GSM8K and found something wild: Small models see EMERGENCE-LIKE jumps. Large models see diminishing returns. The scaling law?
1
0
25
Looking forward to good GPU installs for the two things that matter: open source AGI and B2B RAG.
2
0
25
Very corporate announcement: likely to the surprise of many, @pleiasfr is now a long-time customer of @PrimeIntellect
6
3
121
i can do "you’re absolutely right" with a deep french accent
0
0
7
this but i’d like to apply as cheerleader.
WE'RE HIRING FOUNDING ENGINEERS TC $150k + Housing - in person, in sf - the office is the housing - no equity but you get asian cheerleaders in office - no shoes policy (feet guys stay away) (bonus points for anime pfp, green github garden, can ship hw + sw)
4
0
20
The real reason it happens: tourists are mostly doing it with waiters, sellers, all kinds of people that really don’t want to spend time trying to figure out what you want.
Being in Paris is hilarious, because I ask a question in fluent French, but because of my American accent, everyone replies to me in English in a pitiful way. 🤣🤣
0
0
18
project/global distinction is *very* fuzzy, not much friction.
0
0
13
actually one of the leading causes was uv, so joining @vikhyatk side for the time being.
7
0
54
More seriously, good opportunity here for an explainability lab with some popularization skills aimed at b2b normies. Dealing straight with tokenizers, maybe logits for open source solutions, is just corporate savings.
I thought infinite context was nearly solved. And yet we find all these strategies to fit tokens better, as if we were constrained or something. Curious.
1
1
10
Ok never going to have a smooth codex session again. Time to light up CC again I guess.
OpenAI's efforts to catch up with Anthropic's code-writing AI seem to be working: OpenAI's Codex has pulled ahead of Anthropic's Claude Code assistant by some measures, and its popularity with developers is catching up too, based on new data from Modu: https://t.co/i5ZPXDQc98
2
0
13