Federico Cassano

@ellev3n11

Followers: 2K · Following: 579 · Media: 18 · Statuses: 395

big model trainer @cursor_ai. licensed fisherman @californiadfw. gpu connoisseur. emergent opera singer.

San Francisco - Milan
Joined September 2020
@ellev3n11
Federico Cassano
27 days
so excited to share Composer with the world! Composer is Cursor's own agentic coding model. it plans, edits, and builds software alongside you with precision, keeping you in flow with incredible speed. i started this project on the side while working on a bug-finder prototype
@cursor_ai
Cursor
27 days
Composer is a frontier coding model that completes tasks in under 30 seconds.
32
11
466
@ellev3n11
Federico Cassano
11 days
Interesting to hear this six-month-old podcast where we discuss ideas that later evolved into what's now Online Tab RL and Composer.
@cursor_ai
Cursor
6 months
A conversation on the optimal reward for coding agents, infinite context models, and real-time RL
6
8
162
@ellev3n11
Federico Cassano
16 days
the thing i dislike about all the cli coding agents out today is that they all break the https://t.co/wNUYOG1Q8n philosophy. cli tools should be minimal, snappy, and hackable.
4
0
31
@ellev3n11
Federico Cassano
21 days
@zzlccc
Zichen Liu
25 days
BF16 -> FP16 is such a simple (one configuration change in Oat) yet fundamental fix for inference-training mismatch. With FP16, the most basic importance sampling PG outperforms all algorithmic fixes in BF16. Let's rethink RL stability from the precision perspective.🔎
2
0
50
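For context on the quoted result, here is a minimal sketch of the plain importance-sampling policy-gradient surrogate the tweet refers to, written as generic PyTorch rather than the actual one-line Oat configuration change (which is not shown here):

```python
import torch

def importance_sampling_pg_loss(logp_new: torch.Tensor,
                                logp_old: torch.Tensor,
                                advantages: torch.Tensor) -> torch.Tensor:
    # Plain importance-sampling policy gradient (no clipping, no extra fixes):
    #   L = -E[(pi_theta / pi_behavior) * A]
    # logp_new: token log-probs from the training forward pass (requires grad)
    # logp_old: token log-probs recorded by the inference engine at sampling time
    # advantages: per-token advantage estimates
    ratio = torch.exp(logp_new - logp_old)   # importance weight pi_theta / pi_behavior
    return -(ratio * advantages).mean()
```

The precision point is that if the training forward pass runs in fp16 (e.g. under `torch.autocast(device_type="cuda", dtype=torch.float16)`), matching the inference engine's precision, then logp_new stays close to logp_old, the ratios stay near 1, and this plain surrogate trains stably without the algorithmic patches needed under bf16.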
@ellev3n11
Federico Cassano
22 days
next ship: orchestra
@kr0der
Anthony
23 days
how it feels using cursor's composer model
1
0
22
@ellev3n11
Federico Cassano
24 days
2026/2027 will be the age of fault tolerance, health checks, and CP+DP->EP
@StasBekman
Stas Bekman
25 days
Remember how we were stuck with 80GB HBM for a really long time? This pattern is breaking - copious GPU memory will be the new norm in high-end GPUs in 26/27:
- Nvidia Ultra Rubin - 1024GB HBM
- Qualcomm AI200 - 768GB LPDDR
- AMD MI400x - 432GB HBM
So ML performance
3
2
44
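Rough weight-memory arithmetic behind why per-GPU capacity reshapes parallelism choices (the model size and precision below are illustrative assumptions, not anyone's actual configuration):

```python
import math

def min_gpus_for_weights(params_billion: float, bytes_per_param: float, hbm_gb: float) -> int:
    # Minimum GPUs needed just to hold the weights, ignoring activations,
    # optimizer state, and KV caches (illustrative lower bound only).
    weight_gb = params_billion * bytes_per_param   # 1B params at 1 byte/param = 1 GB
    return math.ceil(weight_gb / hbm_gb)

# A hypothetical 2T-parameter MoE stored in FP8 (1 byte/param) is ~2 TB of weights:
for hbm in (80, 192, 1024):
    print(f"{hbm:>5} GB HBM -> {min_gpus_for_weights(2000, 1.0, hbm)} GPUs just for weights")
# 80 GB -> 25, 192 GB -> 11, 1024 GB -> 2
```

Once the weights fit on a couple of devices, expert parallelism across a handful of GPUs can stand in for deeper sharding stacks, which is one way to read the CP+DP->EP remark.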
@ellev3n11
Federico Cassano
27 days
CURSOR-BIG
6
1
88
@ellev3n11
Federico Cassano
27 days
More on our blog post:
cursor.com
Built to make you extraordinarily productive, Cursor is the best way to code with AI.
@ellev3n11
Federico Cassano
27 days
so excited to share Composer with the world! Composer is Cursor's own agentic coding model. it plans, edits, and builds software alongside you with precision, keeping you in flow with incredible speed. i started this project on the side while working on a bug-finder prototype
0
0
45
@ellev3n11
Federico Cassano
1 month
we got flash attention 4 varlen backward before gta 6
2
1
90
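For readers outside the kernel world: "varlen" kernels attend over packed variable-length sequences addressed by cumulative sequence lengths instead of padded batches. A minimal sketch of that bookkeeping, independent of any particular FlashAttention version:

```python
import torch
import torch.nn.functional as F

# Three sequences of lengths 5, 3, and 7 packed into one (15, d) tensor.
seqlens = torch.tensor([5, 3, 7], dtype=torch.int32)

# cu_seqlens marks where each sequence starts and ends inside the packed tensor,
# so the kernel never spends compute on padding tokens.
cu_seqlens = F.pad(torch.cumsum(seqlens, dim=0, dtype=torch.int32), (1, 0))
print(cu_seqlens)  # tensor([ 0,  5,  8, 15], dtype=torch.int32)
```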
@ellev3n11
Federico Cassano
2 months
i have not updated my arch laptop in months; should i do it chat?
1
0
3
@ellev3n11
Federico Cassano
2 months
The world expert in SDC issues for LLM training is, of course, AWS:
arxiv.org
As the scale of training large language models (LLMs) increases, one emergent failure is silent data corruption (SDC), where hardware produces incorrect computations without explicit failure...
@typedfemale
typedfemale
4 months
presenting: big jeff's trainium hell
0
1
7
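A minimal illustration of what "silent" means here: the hardware returns wrong numbers without any exception, NaN, or error code, so the only way to catch it is redundant recomputation and comparison. This is a hypothetical spot-check for illustration, not the paper's detection method:

```python
import torch

def matmul_spot_check(a: torch.Tensor, b: torch.Tensor,
                      device_a: str = "cuda:0", device_b: str = "cuda:1",
                      atol: float = 1e-2) -> bool:
    # Run the same matmul on two devices and compare the results on CPU.
    # A chip that silently corrupts computations shows up as a mismatch here,
    # even though the training loop itself never raises or produces NaN/inf.
    out_a = (a.to(device_a) @ b.to(device_a)).float().cpu()
    out_b = (a.to(device_b) @ b.to(device_b)).float().cpu()
    return torch.allclose(out_a, out_b, atol=atol)
```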
@ellev3n11
Federico Cassano
3 months
incredibly grateful. thank you to everyone that helped me get here. especially to @cursor_ai for truly making it happen and to @ArjunGuha for being such a great inspiration.
19
3
213
@ellev3n11
Federico Cassano
3 months
"aight, im compiling flash attention. see you tomorrow"
2
0
8
@ellev3n11
Federico Cassano
3 months
lots of puppies in cages
0
0
9
@ellev3n11
Federico Cassano
3 months
checked out the puppies at @VoltagePark IRL. H100s are pretty blocky, unlike gaming graphics cards
2
3
55
@stuart_sul
Stuart Sul
3 months
MoE layers can be really slow. When training our coding models at @cursor_ai, they ate up 27–53% of training time. So we completely rebuilt them at the kernel level and transitioned to MXFP8. The result: a 3.5x faster MoE layer and a 1.5x end-to-end training speedup. We believe our
29
105
884
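The quoted end-to-end figure is consistent with simple Amdahl's-law arithmetic over the stated MoE share of step time (a sanity check, not Cursor's measurement):

```python
def end_to_end_speedup(moe_fraction: float, moe_speedup: float) -> float:
    # If MoE layers take `moe_fraction` of step time and become `moe_speedup`x
    # faster while the rest of the step is unchanged, the whole step speeds up by:
    return 1.0 / ((1.0 - moe_fraction) + moe_fraction / moe_speedup)

for frac in (0.27, 0.40, 0.53):
    print(f"MoE at {frac:.0%} of step time, 3.5x kernel -> {end_to_end_speedup(frac, 3.5):.2f}x end-to-end")
# 27% -> ~1.24x, 40% -> ~1.40x, 53% -> ~1.61x; the reported 1.5x sits inside this range.
```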
@cursor_ai
Cursor
4 months
GPT-5 is now available in Cursor. It’s the most intelligent coding model our team has tested. We're launching it for free for the time being. Enjoy!
247
535
6K
@cursor_ai
Cursor
4 months
Cursor is now in your terminal! It’s an early beta. Access all models. Move easily between your CLI and editor.
304
546
7K
@ellev3n11
Federico Cassano
5 months
TIL Llama 4 2T is training with FP8 at 390 TFLOPS
0
0
7
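For scale, a back-of-the-envelope MFU estimate; the tweet does not say which accelerator is used, so the H100-class dense FP8 peak below is an assumption:

```python
# Assumes an H100 SXM-class dense FP8 peak of ~989 TFLOPS (assumption, not from the tweet).
achieved_tflops = 390.0
peak_fp8_dense_tflops = 989.0
print(f"MFU ≈ {achieved_tflops / peak_fp8_dense_tflops:.1%}")   # ≈ 39.4%
```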