Alexander Doria

@Dorialexander

Followers
19K
Following
136K
Media
3K
Statuses
42K

Reasoning models to come. Co-founder @pleiasfr

Joined April 2011
@Dorialexander
Alexander Doria
6 days
if you can stand to listen to me about model economics and synthetic environments for two hours, @himanshustwts got you covered
9
8
117
@Dorialexander
Alexander Doria
7 hours
Just earlier, a NYT article was amazed that so many AI researchers are like 23 years old. But that’s exactly what happens in a boom (or here pre-boom): things move so fast you climb to the corporate top in a few months.
0
0
24
@Dorialexander
Alexander Doria
7 hours
i'll go even a bit further: not enough people are emotionally prepared for an actual boom. didn't happen in the west for more than half a century. no wonder China navigates the possibility with more ease.
@tszzl
roon
8 hours
not enough people are emotionally prepared for if it’s not a bubble
2
4
77
@Dorialexander
Alexander Doria
7 hours
Not completely useless: we already had Bertology and Qwen might even be the first decoder to attain a ubiquitous use comparable to Bert. Actual issue is that people do not realize they’re doing it.
0
0
4
@Dorialexander
Alexander Doria
7 hours
I’m afraid a large chunk of AI research in 2025 will just be demoted to Qwenology.
@lateinteraction
Omar Khattab
8 hours
More generally: if all of your experiments are "RL on math with Qwen", I'm not interested in any outlandish claims you want to make. Qwen's base models have been (appropriately) aggressively mid-trained for math for a long time. Stop drawing conclusions purely from this.
1
0
32
@Dorialexander
Alexander Doria
7 hours
Weird how all the progressive grand causes have a way of dying without a bang lately.
0
0
6
@Dorialexander
Alexander Doria
10 hours
My guess: it's not emergence but dramatic recall of the relevant synth data games among 5T synth tokens.
@josancamon19
Joan Cabezas
12 hours
🧵 As AI labs race to scale RL, one question matters: when should you stop pre-training and start RL? We trained 5 Qwen models (0.6B→14B) with RL on GSM8K and found something wild: Small models see EMERGENCE-LIKE jumps. Large models see diminishing returns. The scaling law?
1
0
25
@Dorialexander
Alexander Doria
12 hours
Congrats everyone, we finally improved the regex.
@code_star
Cody Blakeney
14 hours
If gradstudents knew what actually worked in training SOTA LLMs they would be so mad
0
0
44
@Dorialexander
Alexander Doria
16 hours
Looking forward to good GPU installs for the two things that matter: open source agi and b2b RAG.
2
0
25
@Dorialexander
Alexander Doria
16 hours
Very corporate announcement: likely to the surprise of many, @pleiasfr is now a long time customer of @PrimeIntellect
6
3
121
@Dorialexander
Alexander Doria
16 hours
story of my life
@antoine_chaffin
Antoine Chaffin
16 hours
being bounded by RAM and storage is wild
0
0
14
@Dorialexander
Alexander Doria
18 hours
i can do "you’re absolutely right" with a deep french accent
0
0
7
@Dorialexander
Alexander Doria
18 hours
this but i’d like to apply as cheerleader.
@dejavucoder
sankalp
21 hours
WE'RE HIRING FOUNDING ENGINEERS TC $150k + Housing - in person, in sf - the office is the housing - no equity but you get asian cheerleaders in office - no shoes policy (feet guys stay away) (bonus points for anime pfp, green github garden, can ship hw + sw)
4
0
20
@Dorialexander
Alexander Doria
18 hours
The real reason it happens: tourists are mostly doing it with waiters, sellers, all kinds of people that really don’t want to spend time trying to figure out what you want.
@SpencerHakimian
Spencer Hakimian
1 day
Being in Paris is hilarious, because I ask a question in fluent French, but because of my American accent, everyone replies to me in English in a pitiful way. 🤣🤣
0
0
18
@Dorialexander
Alexander Doria
21 hours
project/global distinction is *very* fuzzy, not much friction.
0
0
13
@Dorialexander
Alexander Doria
21 hours
actually one of the lead causes was uv, so joining @vikhyatk side for the time being.
@Dorialexander
Alexander Doria
2 days
learning the hard way to backup everything
7
0
54
@Dorialexander
Alexander Doria
21 hours
More seriously, good opportunity here for an explainability lab with some popularization skills on b2b normies. Dealing straight with tokenizers, maybe logit for open source solution is just corporate savings.
@Dorialexander
Alexander Doria
22 hours
I thought infinite context was nearly solved. And yet we find all these strategies to fit tokens better, as if we were constrained or something. Curious.
1
1
10
@Dorialexander
Alexander Doria
22 hours
I thought infinite context was nearly solved. And yet we find all these strategies to fit tokens better, as if we were constrained or something. Curious.
@doomslide
doomslide
1 day
Maybe just MAYBE when "context engineering" becomes the hottest topic you are not on the way to exploding gdp. Maybe.
8
0
76
@Dorialexander
Alexander Doria
22 hours
So relieved to see Europe sheltered from another bubble.
@teortaxesTex
Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)
23 hours
yep, there's a bubble in Chinese robotics this is NOT a real use case, come on now
4
1
99
@Dorialexander
Alexander Doria
1 day
Ok never going to have a smooth codex session again. Time to light up CC again I guess.
@steph_palazzolo
Stephanie Palazzolo
2 days
OpenAI's efforts to catch up with Anthropic's code-writing AI seem to be working: OpenAI's Codex has pulled ahead of Anthropic's Claude Code assistant by some measures, and its popularity with developers is catching up too, based on new data from Modu: https://t.co/i5ZPXDQc98
2
0
13