
Diego Calanzone
@diegocalanzone
Followers: 247
Following: 4K
Media: 69
Statuses: 660
« artificia docuit fames » ("hunger taught the arts") // PhD at @Mila_Quebec, intelligence by agency + deep learning for science // AI grad @UniTrento
127.0.0.1
Joined April 2015
RT @RL_Conference: Ending with our last RLC oral, @RichardSSutton with "The Oak Architecture: A Vision of SuperIntelligence from Experienc…
RT @giffmana: Amazing! Truly open review, through which we all gained more insights, I love it! Result: in multi-epoch setting, making AR…
RT @sporadicalia: just remembered that time Noam Shazeer dropped the hardest line ever written in an ML paper
RT @lavoiems: 🧵 Everyone is chasing new diffusion models, but what about the representations they model from? We introduce Discrete Latent C…
RT @PontiEdoardo: We blend imitation (SFT) and exploration (RLVR) in post-training with a simple idea: Sample a prefix of an SFT demonstra…
RT @steveazzolin: This is an issue on multiple levels, and authors using those "shortcuts"👀 are equally responsible for this unethical beha…
A comprehensive article on approaches to Hierarchical RL!
As AI agents face increasingly long and complex tasks, decomposing them into subtasks becomes increasingly appealing. But how do we discover such temporal structure? Hierarchical RL provides a natural formalism, yet many questions remain open. Here's our overview of the field 🧵
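The decomposition the tweet above alludes to is often formalized with the options framework: a high-level policy picks a temporally extended "option" (a sub-policy plus a termination condition), and the low-level policy acts until the option terminates. Below is a toy sketch of that control loop on a 1-D chain; all class names, the greedy option-selection rule, and the subgoals are hypothetical illustrations, not the method from the linked article.

```python
# Toy sketch of temporal abstraction in hierarchical RL (options framework).
# All names and the greedy selection rule are hypothetical illustrations.

class Option:
    """An option bundles a sub-policy with a termination condition."""

    def __init__(self, name, subgoal):
        self.name = name
        self.subgoal = subgoal  # state this option tries to reach

    def policy(self, state):
        # Low-level policy: step one unit toward the subgoal on a 1-D line.
        return 1 if state < self.subgoal else -1

    def terminates(self, state):
        return state == self.subgoal


def run_episode(start, goal, options, max_steps=50):
    """High-level policy: greedily pick the option whose subgoal is nearest the goal,
    then let its sub-policy run until the option terminates."""
    state, trace, steps = start, [], 0
    while state != goal and steps < max_steps:
        option = min(options, key=lambda o: abs(o.subgoal - goal))
        while not option.terminates(state) and steps < max_steps:
            state += option.policy(state)
            steps += 1
        trace.append(option.name)
        if option.subgoal == goal:
            break
    return state, trace


options = [Option("go_to_3", 3), Option("go_to_7", 7)]
final_state, trace = run_episode(start=0, goal=7, options=options)
print(final_state, trace)  # reaches 7 via the "go_to_7" option
```

The point of the sketch is the two-level loop: the outer loop makes decisions over options (coarse time scale), the inner loop makes decisions over primitive actions (fine time scale). Discovering *which* options to define is the open question the thread surveys.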
RT @BlancheMinerva: A good warning lesson on using AIs to write papers: this alleged response to the (dubious) "Illusion of Thinking" paper…
arxiv.org
Shojaee et al. (2025) report that Large Reasoning Models (LRMs) exhibit "accuracy collapse" on planning puzzles beyond certain complexity thresholds. We demonstrate that their findings primarily...
RT @BlancheMinerva: Two years in the making, we finally have 8 TB of openly licensed data with document-level metadata for authorship attri…