Alexander Doria

@Dorialexander

Followers
21K
Following
141K
Media
3K
Statuses
43K

Artisanal baker of reasoning models @pleiasfr

Joined April 2011
@Dorialexander
Alexander Doria
16 days
Breaking: we release SYNTH, a fully synthetic generalist dataset for pretraining, and two new SOTA reasoning models trained exclusively on it. Despite having seen only 200 billion tokens, Baguettotron is currently best-in-class in its size range.
80
151
1K
@Dorialexander
Alexander Doria
4 hours
The based alternative.
1
0
16
@Dorialexander
Alexander Doria
4 hours
Not a fan so far of "sovereign" displacing "open" in all things AI/tech in the EU.
3
2
23
@Dorialexander
Alexander Doria
4 hours
Another undersold moondream release.
@EthanReidMorro
Ethan Reid
6 hours
@JulienBlanchon @moondreamai Raw SVG path. Since our tokenizer has tokens for 0-1000, M, C, L, and Z, we can represent each action or position as a single token (a negative sign adds an extra token).
0
0
10
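The quoted tweet describes encoding raw SVG paths so that each command or coordinate becomes a single token. A minimal sketch of that idea, with a hypothetical vocabulary layout (the ids and the `tokenize_path` helper are illustrative assumptions, not the actual moondream tokenizer):

```python
# Hypothetical sketch of the single-token SVG-path encoding described
# above: one token per integer 0-1000 plus M, C, L, Z, and '-' as an
# extra token for negative coordinates. Token ids here are assumptions.
COMMANDS = ["M", "C", "L", "Z"]
NEG = "-"

VOCAB = {str(i): i for i in range(1001)}                     # ids 0..1000 -> numbers
VOCAB.update({c: 1001 + k for k, c in enumerate(COMMANDS)})  # then the commands
VOCAB[NEG] = 1001 + len(COMMANDS)                            # then the negative sign

def tokenize_path(path: str) -> list[int]:
    """Map each SVG path command or coordinate to a single token id."""
    tokens = []
    for part in path.replace(",", " ").split():
        if part in VOCAB:                  # command or non-negative number
            tokens.append(VOCAB[part])
        elif part.startswith("-"):         # negative sign adds one extra token
            tokens.append(VOCAB[NEG])
            tokens.append(VOCAB[part[1:]])
        else:
            raise ValueError(f"out of vocabulary: {part!r}")
    return tokens

print(tokenize_path("M 10 20 L -5 300 Z"))
```

So a path like `M 10 20 L -5 300 Z` costs seven tokens, with the `-5` contributing two of them, exactly the one-extra-token-per-negative-sign behavior the tweet mentions.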
@Dorialexander
Alexander Doria
5 hours
(Though people might be legitimately skeptical after so many fine-tunes disguised as whole new models)
0
0
14
@Dorialexander
Alexander Doria
5 hours
And another social event on repeat:
>What are you doing?
>So we train from scratch.
>Ok but which models are you fine-tuning?
>From **scratch**. Zero, nihil, zilch.
6
2
65
@Dorialexander
Alexander Doria
7 hours
There’s only one way to know.
@_vatsadev
V
7 hours
bet this can be pushed further with lin-attn and looped layers
4
1
13
@Dorialexander
Alexander Doria
17 hours
The threshold for consistent English/query understanding is now 3M parameters.
@mkurman88
Mariusz Kurman
24 hours
3.3M parameters. It's funny; I'm going to train it until the end, roughly 75 hours total on a single RTX 3090 (batch size 256 x sequence length 512).
5
21
290
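For scale, a quick back-of-envelope on the quoted config. The step count and total token budget aren't given in the tweet, so only per-step throughput can be derived; the variable names are illustrative:

```python
# Back-of-envelope on the quoted training config:
# batch size 256 x sequence length 512 gives the tokens seen per step.
batch_size = 256
seq_len = 512
tokens_per_step = batch_size * seq_len
print(tokens_per_step)  # tokens per optimizer step
```

That works out to 131,072 tokens per optimizer step, which is what makes a multi-day single-GPU run plausible at the 3.3M-parameter scale being discussed.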
@Dorialexander
Alexander Doria
1 day
Synthetic environments are just expanding in all directions: more data, better models, more latencies/frictions (the actual "pipeline" part). Which is why I'm growing more preoccupied with compute.
1
0
11
@Dorialexander
Alexander Doria
1 day
Though importantly, focusing more on research does not mean scaling is over at all. We just need to pause, regroup, optimize, and then scale *much* better.
2
1
28
@Dorialexander
Alexander Doria
1 day
Unfortunately, it's not just that investors don't get research; they are negatively polarized against it. So we get the Lovable-delaware-c-corp era instead.
0
0
12
@Dorialexander
Alexander Doria
1 day
Since we're talking about the age of research: so far the best path I see to create something in the EU is private research. We still have good researchers, an actual deep-tech ecosystem and support, and big demand in the years to come.
@ric0seq
Ricardo Sequerra Amram
1 day
The biggest hoax in euro tech right now is that you need to move to the US to make it. Yes, the Bay Area is great and defo a place to learn and, over time, build a team there as you scale. No doubt the 50 years of tech expertise and talent density need to be leveraged. No place like it.
1
1
28
@Dorialexander
Alexander Doria
1 day
You have to read between the lines and tactical absences (s____h), but this was more interesting than the Karpathy one.
dwarkesh.com
“These models somehow just generalize dramatically worse than people. It's a very fundamental thing.”
1
0
9
@Dorialexander
Alexander Doria
1 day
Remains to be seen what he actually builds, but he's really getting it.
2
0
15
@MechanizeWork
Mechanize
6 days
NYU seniors: automate software engineering before someone else does. $250k/yr + competitive equity, SF.
1
4
14
@Dorialexander
Alexander Doria
1 day
Pre-training as we know it will end, but you definitely want pre-training (or is it training?).
2
0
13
@Dorialexander
Alexander Doria
1 day
YES. The main reason classic pretraining dominated for so long is just that you don't have to think so much about the data or what elicits reasoning. It's "here". (Re: the new Sutskever/Patel podcast)
2
2
47
@bclavie
Ben Clavié
1 day
Do you love data? Is the most exciting release of the last 2 weeks @pleiasfr’s SYNTH? Then we should talk. We’re looking for our synthetic data person. Full leeway to build the pipeline of your dreams to generate the data to solve multimodal retrieval.
8
5
58
@Dorialexander
Alexander Doria
3 days
i like tokenizers, in the same way i like pure unmitigated base models on human data. but sometimes you see the direction the sun is setting and know in your heart this won't stay.
2
1
39
@Dorialexander
Alexander Doria
3 days
based
@kalomaze
kalomaze
3 days
THE REVOLUTION WILL NOT BE TOKENIZED
3
0
48
@Dorialexander
Alexander Doria
3 days
Same reason they struggle on ARC-AGI and sudoku, and why you need >200M synth exercises to perform okayish on geometry: sequential models can't into space.
@viditchess
Vidit Gujrathi
3 days
Why are LLMs good at logic but bad at UI?
4
6
119
@Dorialexander
Alexander Doria
3 days
Having Lovable as the leading EU AI co is an apt reminder we are in the bad timeline.
8
4
100