
Matej Sirovatka
@m_sirovatka
Followers
714
Following
550
Media
38
Statuses
398
I think hyper-params won't help me with this loss curve (I hate gradient accumulation).
What you can't build, you don't understand. Well, apparently I can't build a toy pre-training framework? I need a refresher on current pre-training trends, any good papers? (looking at you @eliebakouch)
1
1
9
What you can't build, you don't understand. Well, apparently I can't build a toy pre-training framework? I need a refresher on current pre-training trends, any good papers? (looking at you @eliebakouch)
1
0
15
learning CuTe DSL before losing your mind over complement in layout algebra.
4
0
27
A question to my hw-rich friends: I'm currently GPU rich-ish, how can I become TPU rich-ish (I hate ssh-ing into a Colab instance)? I just need a little, 4/8 TPUs to run funny stuff on.
7
0
42
You have to wait for the best. Incredibly honoured to be a part of this course, together with this bunch of cool people.
Day 14 of 14 Days of Distributed! We've still got a number of cool people to talk about since we started this list, so today we're going to rapid-fire them all (in no particular order)! Let's buckle up and go! @winglian @FerdinandMom @m_sirovatka @mervenoyann @charles_irl
1
0
7
Holidays going great, exactly 1 day without work. Btw, tune in to @GPU_MODE for a talk about PCCL from @PrimeIntellect in 30 min.
1
1
82
RT @AIatAMD: Calling all GPU & AI developers, it’s go time! Join the AMD Developer Challenge 2025! Optimize multi-GPU kernels, win prizes…
0
10
0
RT @_marcsun: Happy to participate in the online course by my mentor @TheZachMueller! The topic of my talk will be efficient distributed i…
0
5
0
something’s cooking.
Oct 17 at Toronto School of Foundation Modelling: @m_sirovatka will talk about model sharding, network topologies of large-scale clusters, and how these pieces connect.
0
2
15
The competition runs for 6 weeks, starting August 30th, after which AMD will fly the winners out for a celebration in the US! As per usual, the grand prize is $100k 💰, with smaller prizes for other top contestants 👀. Register here rn!
amdchallenge2025.datamonsters.com
In this challenge sponsored by Advanced Micro Devices, Inc. (“AMD”), participants are invited to form up to a 3-member team to develop and optimize low-level kernels and deliver significant perform...
0
1
4
After 2 weeks, we're taking a detour to tensor parallelism, optimising GEMM + Reduce Scatter, and finishing up with All-Gather + GEMM, covering the most common parallelisms in large model training and inference 📈. All on a full node of MI300s 🐳.
1
1
5
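The two kernel patterns named in the post above are standard tensor-parallel building blocks. Here is a minimal sketch of the communication structure using plain torch.distributed collectives, not the challenge's reference kernels; the function names, tensor shapes, and the file name in the launch comment are my own illustrative choices.

```python
# Minimal sketch of All-Gather + GEMM and GEMM + Reduce-Scatter with plain
# torch.distributed. Launch with: torchrun --nproc_per_node=8 tp_sketch.py
import os
import torch
import torch.distributed as dist

def all_gather_gemm(x_shard, w_col):
    """All-Gather + GEMM: gather the sequence shards from every rank, then run
    the column-parallel matmul on the full activation."""
    world = dist.get_world_size()
    gathered = [torch.empty_like(x_shard) for _ in range(world)]
    dist.all_gather(gathered, x_shard)
    x = torch.cat(gathered, dim=0)   # (world * seq_shard, hidden)
    return x @ w_col                 # each rank holds a column slice of the weight

def gemm_reduce_scatter(x, w_row):
    """GEMM + Reduce-Scatter: the row-parallel matmul produces a partial sum that
    is reduced across ranks and scattered back along the sequence dimension."""
    partial = x @ w_row
    out_rows = partial.shape[0] // dist.get_world_size()
    out = torch.empty(out_rows, partial.shape[1],
                      dtype=partial.dtype, device=partial.device)
    dist.reduce_scatter_tensor(out, partial)
    return out

if __name__ == "__main__":
    dist.init_process_group("nccl")
    torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))
    dev = torch.cuda.current_device()
    x_shard = torch.randn(128, 1024, device=dev)   # this rank's sequence shard
    w_col = torch.randn(1024, 512, device=dev)     # column slice of W1
    w_row = torch.randn(512, 1024, device=dev)     # row slice of W2
    y = gemm_reduce_scatter(all_gather_gemm(x_shard, w_col), w_row)
    dist.destroy_process_group()
```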
We are gonna give you a FULL 8xMI300 node, all for free, to write the fastest kernels! The competition is gonna last for 6 weeks, with a problem being released every 2 weeks. We're starting August 30th with All2All Dispatch + Combine across 8 GPUs, to make MoEs go brrr ⚡️.
1
0
3
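For a feel of the All2All Dispatch + Combine pattern mentioned above, here is a rough sketch with plain torch.distributed rather than the challenge's actual reference kernel. It assumes every rank sends an equal number of tokens to every expert rank and skips gating and top-k routing entirely; the `expert` callable is a placeholder for the local expert FFN.

```python
# Minimal sketch of MoE dispatch/combine across ranks with two all_to_all calls.
import torch
import torch.distributed as dist

def moe_dispatch_combine(tokens, expert):
    """tokens: (num_tokens, hidden) on this rank, pre-sorted so chunk i is
    destined for expert rank i (num_tokens must be divisible by world size)."""
    world = dist.get_world_size()
    hidden = tokens.shape[-1]

    # Dispatch: chunk i of every rank's buffer is sent to rank i.
    send = tokens.reshape(world, -1, hidden).contiguous()
    recv = torch.empty_like(send)
    dist.all_to_all_single(recv, send)

    # Every token now sits on the rank that hosts its expert.
    out = expert(recv.reshape(-1, hidden))

    # Combine: the reverse all_to_all routes expert outputs back to the owner ranks.
    back_send = out.reshape(world, -1, hidden).contiguous()
    back_recv = torch.empty_like(back_send)
    dist.all_to_all_single(back_recv, back_send)
    return back_recv.reshape_as(tokens)
```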
To learn more about context parallelism, you can also read our doc 📖.
github.com
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed...
0
0
5
We fully integrated N-D Parallelism into Trainer, supporting any configuration you might like, including FSDP, tensor parallel and so on 📈. You can find a full example of how to use this in the accelerate repository.
github.com
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed...
1
0
10
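To see what an N-D parallelism configuration composes, here is a minimal sketch using PyTorch's DeviceMesh directly. This is not accelerate's own configuration surface (the linked repository has the real Trainer example); the 2 x 4 x 2 mesh shape and the dimension names are illustrative only.

```python
# Minimal sketch of composing parallel dimensions on a device mesh.
# Launch under torchrun with 16 processes so the mesh shape below fits.
from torch.distributed.device_mesh import init_device_mesh

# 16 GPUs: 2-way FSDP sharding x 4-way tensor parallel x 2-way context parallel.
mesh = init_device_mesh("cuda", (2, 4, 2), mesh_dim_names=("dp_shard", "tp", "cp"))

fsdp_mesh = mesh["dp_shard"]   # hand to fully_shard / FSDP for parameter sharding
tp_mesh = mesh["tp"]           # hand to parallelize_module for tensor parallelism
cp_mesh = mesh["cp"]           # the ranks that will split the sequence dimension
```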
Context parallelism in 🤗 transformers Trainer? Training models at 100k+ sequence lengths has never been easier 🚀
3
17
129
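To make that concrete: context parallelism shards the sequence dimension across a group of ranks, so each GPU only materializes its slice of a very long example. Below is a minimal, illustrative sketch of the input-sharding step; the `shard_sequence` helper and the dict-of-tensors batch format are my own simplification, not the Trainer's internal code, and the attention exchange itself (ring attention or all-gathered KV) lives inside the framework and is not shown.

```python
# Minimal sketch of the input sharding behind context parallelism.
import torch
import torch.distributed as dist

def shard_sequence(batch: dict, cp_group) -> dict:
    """Keep only this rank's contiguous chunk of the sequence dimension (dim=1)."""
    cp_size = dist.get_world_size(cp_group)
    cp_rank = dist.get_rank(cp_group)
    sharded = {}
    for name, t in batch.items():
        chunk = t.shape[1] // cp_size   # assumes seq_len is divisible by cp_size
        sharded[name] = t[:, cp_rank * chunk:(cp_rank + 1) * chunk]
    return sharded

# e.g. with an 8-way CP group, input_ids of shape (1, 131072) becomes (1, 16384)
# per GPU, and full-context attention is reconstructed by the framework without
# any single rank ever holding the activations for the whole 131072-token sequence.
```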