
xjdr (@_xjdr)
Followers: 23K · Following: 26K · Media: 741 · Statuses: 6K
I don't think people understand how funding works and how crippling a seed / series A valuation of $500M is. I'm wishing them the best of luck, but they are now doing a very hard thing on very hard mode.
$56M seed for Stripe's former CTO of 7 yrs sounds about right. Will 100% raise $100M in 6 months at a $1B+ val - I don’t make the rules
63
65
3K
this is the first potential "stop what i am doing and investigate everything about this" thing i've seen in a while.
Everything you love about generative models — now powered by real physics! Announcing the Genesis project — after a 24-month large-scale research collaboration involving over 20 research labs — a generative physics engine able to generate 4D dynamical worlds powered by a physics
28
40
980
This has been and will continue to be my recommendation for anyone in this position. Learn jax and sign up for it; it's one of the best things Google has ever done. You can do meaningful research for free, but the learning curve is steep. Strap in.
To get money, you need a job in AI. To get a job in AI, you need to understand Cuda, cloud computing, distributed systems, Pytorch/Jax, and Triton. To learn Cuda, cloud computing, distributed systems, Pytorch/Jax, and Triton, you need money. Where is the on-ramp here?
15
49
917
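As a rough illustration of what that jax learning curve is about (my sketch, not something from the original thread): the core of the library is a small set of composable transforms like grad, jit, and vmap over plain functions.

```python
# Minimal jax sketch (illustrative): differentiate and compile a tiny loss.
import jax
import jax.numpy as jnp

def loss(params, x, y):
    # simple linear model: params is (w, b)
    w, b = params
    pred = x @ w + b
    return jnp.mean((pred - y) ** 2)

# grad differentiates, jit compiles via XLA (same code runs on CPU/GPU/TPU)
grad_fn = jax.jit(jax.grad(loss))

key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (32, 4))
y = jnp.ones((32,))
params = (jnp.zeros((4,)), 0.0)

grads = grad_fn(params, x, y)
print(jax.tree_util.tree_map(jnp.shape, grads))  # gradient shapes mirror params
```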
I was on a hiring committee at google and it was pretty easy to just ask "tell me in detail how something you worked on and were proud of works".
People who complain about leetcode questions during the interview process need to put themselves in the shoes of the company. You have 1000 resumes to get through, at least 50% of whom can’t actually code. How would you filter through this stack in a reasonable amount of time?
32
19
907
If they didn't at least pick up R1 and V3 and make it better with their fucking rockstar team and 150,000 GPUs, everyone should be fired on the spot and I will hear no more about it.
Elon Musk says Grok 3 will be released in "a week or two" and it is "scary smart", displaying reasoning skills that outperform any other AI model that has been released
16
10
774
whalebros cooked here. Not only does it seem to replicate the o1-preview results, it seems to pretty effectively replicate (at least parts of) the process. My guess is it uses something very similar to the Let's Verify Step by Step ORMs / PRMs to train and reward the CoT.
🚀 DeepSeek-R1-Lite-Preview is now live: unleashing supercharged reasoning power! 🔍 o1-preview-level performance on AIME & MATH benchmarks. 💡 Transparent thought process in real-time. 🛠️ Open-source models & API coming soon! 🌐 Try it now at #DeepSeek
16
25
769
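To unpack the ORM/PRM guess above: a process reward model scores each chain-of-thought step, and the per-step scores get aggregated into a reward. The sketch below is a hedged illustration of that idea only; the step_reward callable and the product aggregation are assumptions, not anything DeepSeek has published.

```python
# Hypothetical sketch of a process-reward-model (PRM) style scorer, in the
# spirit of "Let's Verify Step by Step"; not DeepSeek's training code.
from typing import Callable, List

def score_chain_of_thought(
    steps: List[str],
    step_reward: Callable[[str, List[str]], float],
) -> float:
    """Score a chain of thought step by step.

    step_reward(step, context) is assumed to be a learned PRM returning the
    probability that the step is correct given the steps before it.
    A common aggregation is the product (or min) of per-step scores.
    """
    total = 1.0
    context: List[str] = []
    for step in steps:
        total *= step_reward(step, context)
        context.append(step)
    return total

# Usage with a stand-in reward (a real PRM would be a trained model):
dummy_prm = lambda step, ctx: 0.95 if "therefore" in step.lower() else 0.9
cot = ["Let x be the number of apples.", "x + 3 = 7", "Therefore x = 4."]
print(score_chain_of_thought(cot, dummy_prm))
```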
This is how i've been doing my cuda / ptx work for the last few weeks and i can both attest to R1 being particularly cracked at it AND that if you actually run a benchmark / compiler in the loop it does much better than you could possibly imagine. is this fast takeoff? almost.
uh it might be over. they put r1 in a loop for 15 minutes and it generated: "better than the optimized kernels developed by skilled engineers in some cases"
17
34
762
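A hedged sketch of the "benchmark / compiler in the loop" workflow described above: generate a candidate kernel, compile it, run it, and feed the compiler errors or timings back into the next prompt. The generate_kernel callable stands in for a call to R1 (or any model API) and is hypothetical; nvcc is just invoked as a subprocess.

```python
# Hypothetical compiler/benchmark-in-the-loop refinement of a CUDA kernel.
import subprocess, tempfile, pathlib

def compile_and_time(cuda_src: str) -> tuple[bool, str]:
    """Compile a candidate kernel with nvcc and run its built-in benchmark.
    Returns (ok, feedback) where feedback is compiler errors or timing output."""
    with tempfile.TemporaryDirectory() as tmp:
        src = pathlib.Path(tmp) / "kernel.cu"
        binary = pathlib.Path(tmp) / "kernel"
        src.write_text(cuda_src)
        build = subprocess.run(["nvcc", "-O3", str(src), "-o", str(binary)],
                               capture_output=True, text=True)
        if build.returncode != 0:
            return False, build.stderr          # feed errors back to the model
        run = subprocess.run([str(binary)], capture_output=True, text=True)
        return run.returncode == 0, run.stdout  # feed timings back to the model

def refine(generate_kernel, prompt: str, rounds: int = 10) -> str:
    """Loop the model against the compiler/benchmark; keep the latest working kernel."""
    feedback, best = "", ""
    for _ in range(rounds):
        candidate = generate_kernel(prompt, feedback)  # model call (assumed API)
        ok, feedback = compile_and_time(candidate)
        if ok:
            best = candidate
    return best
```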
Llamas. Tokenizer free?! USING ENTROPY STEERING?!?!! sometimes the universe conspires to make a paper just for you and it feels wonderful when it happens.
🚀 Introducing the Byte Latent Transformer (BLT) – An LLM architecture that scales better than Llama 3 using byte-patches instead of tokens 🤯 Paper 📄 Code 🛠️
12
38
718
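For intuition, here is a minimal sketch of entropy-driven byte patching in the spirit of BLT (my reconstruction, not the paper's code): a small byte-level model supplies next-byte probabilities, and a new patch starts wherever that model's entropy spikes. The next_byte_probs callable and the threshold are assumptions for illustration.

```python
# Entropy-based byte patching sketch (illustrative, not the BLT implementation).
import math
from typing import Callable, List, Sequence

def entropy(probs: Sequence[float]) -> float:
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def patch_bytes(
    data: bytes,
    next_byte_probs: Callable[[bytes], Sequence[float]],
    threshold: float = 3.0,
) -> List[bytes]:
    """Group bytes into patches; boundaries fall where the model is uncertain."""
    patches, start = [], 0
    for i in range(1, len(data)):
        h = entropy(next_byte_probs(data[:i]))  # uncertainty about byte i
        if h > threshold:
            patches.append(data[start:i])
            start = i
    patches.append(data[start:])
    return patches

# Usage with a uniform stand-in model (a real system would use a learned one):
uniform = lambda prefix: [1.0 / 256] * 256
print(patch_bytes(b"hello world", uniform, threshold=6.0))
```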
This is potentially a very significant discovery for a lot of reasons. For now, it's safe to say that entropy-based sampling and training techniques are shaping up to be unreasonably effective at combating entropy collapse and hallucinations in current models.
a few days ago @_xjdr and i discovered that each llm (the ones we tested at least) has a unique, stable entropy/varentropy characteristic which is reproducible from *entirely random* hidden state prompts.
24
64
706
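The entropy/varentropy characteristic mentioned above can be computed directly from a model's next-token logits. The sketch below is illustrative and is not the entropix source.

```python
# Entropy and varentropy of a next-token distribution, from raw logits.
import numpy as np

def entropy_varentropy(logits: np.ndarray) -> tuple[float, float]:
    """Return (entropy, varentropy) of the softmax distribution over logits.

    Varentropy is the variance of the surprisal -log p under that same
    distribution; together the pair characterizes how uncertain and how
    'spiky' the model's prediction is.
    """
    logits = logits - logits.max()                   # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    surprisal = -np.log(np.clip(probs, 1e-12, 1.0))
    h = float((probs * surprisal).sum())             # entropy
    v = float((probs * (surprisal - h) ** 2).sum())  # varentropy
    return h, v

# Example on a made-up logit vector:
print(entropy_varentropy(np.array([2.0, 1.0, 0.5, -1.0])))
```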
watching @yacineMTB discover terence tao, zig and nvim out loud in real time is like watching a baby deer learn to walk. nature really is beautiful.
17
5
626
It would take a long-ass article to articulate this properly but this is not a vaguepost. I have spent the last few months working on some very hard problems (more on that soon). I've been using a combination of R1 and DeepResearch to build and formalize the ideas and proofs.
as much as i am paying attention to AI each and every day, the future snuck up on me last night and this is Day 0 of a brand new world. i can confidently say that now.
33
22
628
Sorry for the sorry state of the entropix repo, i unexpectedly had to be heads down on some last min lab closure mop up work and was AFK. Now that i have some compute again (HUGE shout outs to @0xishand, @Yuchenj_UW and @evanjconrad) we're in the amazing position that we need.
32
27
605
if scale were really all you needed, amazon and microsoft wouldn't need to use other people's models and google would be winning in every way.
Claude will help power Amazon's next-generation AI assistant, Alexa+. Amazon and Anthropic have worked closely together over the past year, with @mikeyk leading a team that helped Amazon get the full benefits of Claude's capabilities.
28
19
568
to double down on this, the specific original goal was to see what we could accomplish with a vanilla OSS model without touching the weights or the architecture at all. This is a series of inference-time compute experiments that essentially use the model outputs as read-only.
@_xjdr is measuring the total variation in all the token choices per individual prediction and using that as a heuristic. you can actually visualize this measurement. this is not an architectural tweak, this is doing fancy state modification based off that
14
18
529
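A hypothetical sketch of what "inference-time compute with the model as read-only" can look like: the weights are never touched, only the sampling policy changes based on the uncertainty of the current prediction. The thresholds and fallback behaviors below are made up for illustration and are not the entropix heuristics.

```python
# Uncertainty-aware sampling sketch: the model is read-only, only the
# decision of *how* to pick the next token changes.
import numpy as np

def entropy_of(probs: np.ndarray) -> float:
    return float(-(probs * np.log(np.clip(probs, 1e-12, 1.0))).sum())

def adaptive_sample(logits: np.ndarray, rng: np.random.Generator,
                    low: float = 0.5, high: float = 3.0) -> int:
    """Pick a next token based on how uncertain the current prediction is."""
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    h = entropy_of(probs)
    if h < low:
        return int(probs.argmax())               # confident: take the argmax
    if h > high:
        # very uncertain: a real system might branch or inject a "think" step;
        # here we just flatten the distribution (higher temperature) and sample
        hot = probs ** 0.5
        hot /= hot.sum()
        return int(rng.choice(len(hot), p=hot))
    return int(rng.choice(len(probs), p=probs))  # ordinary sampling otherwise

rng = np.random.default_rng(0)
print(adaptive_sample(np.array([4.0, 1.0, 0.5, 0.1]), rng))
```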
One of Jeff Dean's superpowers is being able to come up with reasonable approximations for very complex problems quickly. He also has the "latency numbers every engineer should know" that helped him reason about MapReduce, search indexes, etc. for this reason as well. Incredibly
It's really great to see the impact that TPUs have had and continue to have on Google's ability to do machine learning training and inference at scale, and to provide that same capability to @googlecloud customers via Cloud TPUs. Here's a bit of backstory on how they came to be.
5
33
516
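To make that back-of-envelope style concrete, here is a small worked example using the classic "latency numbers" figures (circa-2010 values from Jeff Dean's talks); the 10 GB index workload below is my own made-up example, not one from the original tweet.

```python
# Back-of-envelope latency arithmetic with the classic figures.
NS = 1e-9
latency = {
    "main_memory_ref":        100 * NS,
    "read_1mb_from_memory":   250_000 * NS,
    "read_1mb_from_ssd":    1_000_000 * NS,
    "disk_seek":           10_000_000 * NS,
    "dc_round_trip":          500_000 * NS,
}

# Example estimate: scan a 10 GB index from local memory vs. pull it from a
# remote node's SSD over the datacenter network.
scan_memory = 10_000 * latency["read_1mb_from_memory"]                  # 10,000 MB
fetch_remote = latency["dc_round_trip"] + 10_000 * latency["read_1mb_from_ssd"]
print(f"scan from local memory : {scan_memory:.2f} s")   # ~2.5 s
print(f"fetch from remote SSD  : {fetch_remote:.2f} s")  # ~10 s
```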
i think it's worth taking a moment to put into perspective how cool this work is. GPT-2 is really what the entire OpenAI empire was built on / was deemed too dangerous to release a few short years ago, and it is now reproducible in less than 8 min on a single (large) machine.
New NanoGPT training speed record: 3.28 FineWeb val loss in 7.23 minutes on 8xH100. Previous record: 7.8 minutes. Changelog: added U-net-like connectivity pattern; doubled learning rate. This record is by @brendanh0gan
13
34
514
hahahaha what?!?! "The test cluster comprised 25 storage nodes (2 NUMA domains/node, 1 storage service/NUMA, 2×400Gbps NICs/node) and 50 compute nodes (2 NUMA domains, 192 physical cores, 2.2 TiB RAM, and 1×200 Gbps NIC/node). Sorting 110.5 TiB of data across 8,192 partitions.
🚀 Day 5 of #OpenSourceWeek: 3FS, Thruster for All DeepSeek Data Access. Fire-Flyer File System (3FS) - a parallel file system that utilizes the full bandwidth of modern SSDs and RDMA networks. ⚡ 6.6 TiB/s aggregate read throughput in a 180-node cluster. ⚡ 3.66 TiB/min.
12
15
494
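Quick sanity arithmetic on the quoted throughput, assuming (my assumption, not stated in the thread) that the 180-node read cluster's nodes look like the 2×400 Gbps storage nodes from the sort test:

```python
# Per-node throughput implied by "6.6 TiB/s aggregate read across 180 nodes".
TiB = 2**40
aggregate = 6.6 * TiB            # bytes per second across the cluster
per_node = aggregate / 180       # bytes per second per node
per_node_gbps = per_node * 8 / 1e9
print(f"per-node read throughput: {per_node / 2**30:.1f} GiB/s "
      f"(~{per_node_gbps:.0f} Gbps, vs 800 Gbps of NIC per storage node)")
```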