Federico Cassano

@ellev3n11

Followers: 2K · Following: 579 · Media: 18 · Statuses: 395

big model trainer @cursor_ai. licensed fisherman @californiadfw. gpu connoisseur. emergent opera singer.

San Francisco - Milan
Joined September 2020
@ellev3n11
Federico Cassano
27 days
so excited to share Composer with the world! Composer is Cursor's own agentic coding model. it plans, edits, and builds software alongside you with precision, keeping you in flow with incredible speed. i started this project on the side while working on a bug-finder prototype
@cursor_ai
Cursor
27 days
Composer is a frontier coding model that completes tasks in under 30 seconds.
32
11
466
@ellev3n11
Federico Cassano
11 days
Interesting to hear this six-month-old podcast where we discuss ideas that later evolved into what's now Online Tab RL and Composer.
@cursor_ai
Cursor
6 months
A conversation on the optimal reward for coding agents, infinite context models, and real-time RL
6
8
162
@ellev3n11
Federico Cassano
16 days
the thing i dislike about all the cli coding agents out today is that they all break the https://t.co/wNUYOG1Q8n philosophy. cli tools should be minimal, snappy, and hackable.
4
0
31
@ellev3n11
Federico Cassano
21 days
@zzlccc
Zichen Liu
25 days
BF16 -> FP16 is such a simple (one configuration change in Oat) yet fundamental fix for inference-training mismatch. With FP16, the most basic importance sampling PG outperforms all algorithmic fixes in BF16. Let's rethink RL stability from the precision perspective.🔎
2
0
50
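For context on the quoted result, here is a minimal sketch of the plain importance-sampling policy-gradient surrogate the tweet refers to, written as generic PyTorch rather than the actual one-line Oat configuration change (which is not shown here):

```python
import torch

def importance_sampling_pg_loss(logp_new: torch.Tensor,
                                logp_old: torch.Tensor,
                                advantages: torch.Tensor) -> torch.Tensor:
    # Plain importance-sampling policy gradient (no clipping, no extra fixes):
    #   L = -E[(pi_theta / pi_behavior) * A]
    # logp_new: token log-probs from the training forward pass (requires grad)
    # logp_old: token log-probs recorded by the inference engine at sampling time
    # advantages: per-token advantage estimates
    ratio = torch.exp(logp_new - logp_old)   # importance weight pi_theta / pi_behavior
    return -(ratio * advantages).mean()
```

The precision point is that if the training forward pass runs in fp16 (e.g. under `torch.autocast(device_type="cuda", dtype=torch.float16)`), matching the inference engine's precision, then logp_new stays close to logp_old, the ratios stay near 1, and this plain surrogate trains stably without the algorithmic patches needed under bf16.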
@ellev3n11
Federico Cassano
22 days
next ship: orchestra
@kr0der
Anthony
23 days
how it feels using cursor's composer model
1
0
22
@ellev3n11
Federico Cassano
24 days
2026/2027 will be the age of fault tolerance, health checks, and CP+DP->EP
@StasBekman
Stas Bekman
25 days
Remember how we were stuck with 80GB HBM for a really long time? This pattern is breaking - copious GPU memory will be the new norm in high-end GPUs in 26/27:
- Nvidia Ultra Rubin - 1024GB HBM
- Qualcomm AI200 - 768GB LPDDR
- AMD MI400x - 432GB HBM
So ML performance
3
2
44
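Rough weight-memory arithmetic behind why per-GPU capacity reshapes parallelism choices (the model size and precision below are illustrative assumptions, not anyone's actual configuration):

```python
import math

def min_gpus_for_weights(params_billion: float, bytes_per_param: float, hbm_gb: float) -> int:
    # Minimum GPUs needed just to hold the weights, ignoring activations,
    # optimizer state, and KV caches (illustrative lower bound only).
    weight_gb = params_billion * bytes_per_param   # 1B params at 1 byte/param = 1 GB
    return math.ceil(weight_gb / hbm_gb)

# A hypothetical 2T-parameter MoE stored in FP8 (1 byte/param) is ~2 TB of weights:
for hbm in (80, 192, 1024):
    print(f"{hbm:>5} GB HBM -> {min_gpus_for_weights(2000, 1.0, hbm)} GPUs just for weights")
# 80 GB -> 25, 192 GB -> 11, 1024 GB -> 2
```

Once the weights fit on a couple of devices, expert parallelism across a handful of GPUs can stand in for deeper sharding stacks, which is one way to read the CP+DP->EP remark.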
@ellev3n11
Federico Cassano
27 days
CURSOR-BIG
6
1
88
@ellev3n11
Federico Cassano
27 days
More on our blog post:
cursor.com
Built to make you extraordinarily productive, Cursor is the best way to code with AI.
@ellev3n11
Federico Cassano
27 days
so excited to share Composer with the world! Composer is Cursor's own agentic coding model. it plans, edits, and builds software alongside you with precision, keeping you in flow with incredible speed. i started this project on the side while working on a bug-finder prototype
0
0
45
@ellev3n11
Federico Cassano
1 month
we got flash attention 4 varlen backward before gta 6
2
1
90
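For readers outside the kernel world: "varlen" kernels attend over packed variable-length sequences addressed by cumulative sequence lengths instead of padded batches. A minimal sketch of that bookkeeping, independent of any particular FlashAttention version:

```python
import torch
import torch.nn.functional as F

# Three sequences of lengths 5, 3, and 7 packed into one (15, d) tensor.
seqlens = torch.tensor([5, 3, 7], dtype=torch.int32)

# cu_seqlens marks where each sequence starts and ends inside the packed tensor,
# so the kernel never spends compute on padding tokens.
cu_seqlens = F.pad(torch.cumsum(seqlens, dim=0, dtype=torch.int32), (1, 0))
print(cu_seqlens)  # tensor([ 0,  5,  8, 15], dtype=torch.int32)
```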
@ellev3n11
Federico Cassano
2 months
i have not updated my arch laptop in months; should i do it chat?
1
0
3
@ellev3n11
Federico Cassano
2 months
The world expert in SDC issues for LLM training is, of course, AWS:
arxiv.org
As the scale of training large language models (LLMs) increases, one emergent failure is silent data corruption (SDC), where hardware produces incorrect computations without explicit failure...
@typedfemale
typedfemale
4 months
presenting: big jeff's trainium hell
0
1
7
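A minimal illustration of what "silent" means here: the hardware returns wrong numbers without any exception, NaN, or error code, so the only way to catch it is redundant recomputation and comparison. This is a hypothetical spot-check for illustration, not the paper's detection method:

```python
import torch

def matmul_spot_check(a: torch.Tensor, b: torch.Tensor,
                      device_a: str = "cuda:0", device_b: str = "cuda:1",
                      atol: float = 1e-2) -> bool:
    # Run the same matmul on two devices and compare the results on CPU.
    # A chip that silently corrupts computations shows up as a mismatch here,
    # even though the training loop itself never raises or produces NaN/inf.
    out_a = (a.to(device_a) @ b.to(device_a)).float().cpu()
    out_b = (a.to(device_b) @ b.to(device_b)).float().cpu()
    return torch.allclose(out_a, out_b, atol=atol)
```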
@ellev3n11
Federico Cassano
3 months
incredibly grateful. thank you to everyone that helped me get here. especially to @cursor_ai for truly making it happen and to @ArjunGuha for being such a great inspiration.
19
3
213
@ellev3n11
Federico Cassano
3 months
"aight, im compiling flash attention. see you tomorrow"
2
0
8
@ellev3n11
Federico Cassano
3 months
lots of puppies in cages
0
0
9
@ellev3n11
Federico Cassano
3 months
checked out the puppies at @VoltagePark IRL. H100s are pretty blocky, unlike gaming graphics cards
2
3
55
@stuart_sul
Stuart Sul
3 months
MoE layers can be really slow. When training our coding models at @cursor_ai, they ate up 27–53% of training time. So we completely rebuilt them at the kernel level and transitioned to MXFP8. The result: a 3.5x faster MoE layer and a 1.5x end-to-end training speedup. We believe our
29
105
884
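The quoted end-to-end figure is consistent with simple Amdahl's-law arithmetic over the stated MoE share of step time (a sanity check, not Cursor's measurement):

```python
def end_to_end_speedup(moe_fraction: float, moe_speedup: float) -> float:
    # If MoE layers take `moe_fraction` of step time and become `moe_speedup`x
    # faster while the rest of the step is unchanged, the whole step speeds up by:
    return 1.0 / ((1.0 - moe_fraction) + moe_fraction / moe_speedup)

for frac in (0.27, 0.40, 0.53):
    print(f"MoE at {frac:.0%} of step time, 3.5x kernel -> {end_to_end_speedup(frac, 3.5):.2f}x end-to-end")
# 27% -> ~1.24x, 40% -> ~1.40x, 53% -> ~1.61x; the reported 1.5x sits inside this range.
```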
@cursor_ai
Cursor
4 months
GPT-5 is now available in Cursor. It’s the most intelligent coding model our team has tested. We're launching it for free for the time being. Enjoy!
247
535
6K
@cursor_ai
Cursor
4 months
Cursor is now in your terminal! It’s an early beta. Access all models. Move easily between your CLI and editor.
304
546
7K
@ellev3n11
Federico Cassano
5 months
TIL Llama 4 2T is training with FP8 at 390 TFLOPS
0
0
7
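For scale, a back-of-the-envelope MFU estimate; the tweet does not say which accelerator is used, so the H100-class dense FP8 peak below is an assumption:

```python
# Assumes an H100 SXM-class dense FP8 peak of ~989 TFLOPS (assumption, not from the tweet).
achieved_tflops = 390.0
peak_fp8_dense_tflops = 989.0
print(f"MFU ≈ {achieved_tflops / peak_fp8_dense_tflops:.1%}")   # ≈ 39.4%
```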