xjdr

@_xjdr

Followers
24K
Following
28K
Media
780
Statuses
6K

ptx enjoyer

Noam's Labyrinth
Joined December 2023
@_xjdr
xjdr
1 year
Writing jitted jax code is like playing Dark Souls but in python.
13
17
392
@_xjdr
xjdr
1 day
ok, let's see what this is all about then
Tweet media one
3
0
69
@_xjdr
xjdr
2 days
This is how I approach vibe checking models
0
0
14
@_xjdr
xjdr
2 days
You sweet sweet summer child . .
@pigeon__s
ρ:ɡeσn
2 days
@_xjdr @tejashaveridev but why would anyone in the world ever run any model at full precision.
6
2
195
@_xjdr
xjdr
3 days
now running k2 heavy (4 collaborating agents) at full precision just to feel something.
14
7
413
@_xjdr
xjdr
3 days
it is a very very good model. maybe my new favorite. for sure top 3.
@_xjdr
xjdr
3 days
k2 is very good.
12
8
389
@_xjdr
xjdr
3 days
+1.
@kalomaze
kalomaze
3 days
@Kimi_Moonshot cons@64 has issues in situations where your metric is inherently continuous. avg@64 seems more sensible + general as a standard for evaluation.
0
0
10
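The cons@k vs avg@k distinction above can be sketched in a few lines. This is a minimal illustration only, with hypothetical helper names (`cons_at_k`, `avg_at_k`) rather than any lab's actual eval harness:

```python
from collections import Counter
from statistics import mean

def cons_at_k(answers):
    # cons@k: majority vote over k sampled answers.
    # Only meaningful when answers are discrete and exactly comparable.
    value, _count = Counter(answers).most_common(1)[0]
    return value

def avg_at_k(scores):
    # avg@k: mean score over k samples.
    # Stays well defined when the metric is continuous.
    return mean(scores)

# Discrete answers: consensus voting works.
assert cons_at_k(["42", "42", "41"]) == "42"

# Continuous scores: near-identical floats almost never repeat exactly,
# so the "majority" is just an arbitrary sample, while the mean is informative.
scores = [0.913, 0.914, 0.912, 0.915]
assert abs(avg_at_k(scores) - 0.9135) < 1e-9
```

With a continuous metric, every sample is effectively unique, so the consensus degenerates; averaging sidesteps that, which is the point being made in the tweet.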
@_xjdr
xjdr
3 days
RT @kalomaze: @Kimi_Moonshot cons@64 has issues in situations where your metric is inherently continuous. avg@64 seems more sensible + gene….
0
3
0
@_xjdr
xjdr
3 days
Tweet media one
Tweet media two
3
3
167
@_xjdr
xjdr
3 days
i say this with all seriousness. skill issue.
@METR_Evals
METR
4 days
We ran a randomized controlled trial to see how much AI coding tools speed up experienced open-source developers. The results surprised us: Developers thought they were 20% faster with AI tools, but they were actually 19% slower when they had access to AI than when they didn't.
Tweet media one
33
26
661
@_xjdr
xjdr
3 days
turns out, jurgen actually invented the GPU in order to unleash the power of neural networks in a little known paper in 1989 but has never received the credit he deserves.
@SchmidhuberAI
Jürgen Schmidhuber
3 days
Congrats to @NVIDIA, the first public $4T company! Today, compute is 100000x cheaper, and $NVDA 4000x more valuable than in the 1990s when we worked on unleashing the true potential of neural networks. Thanks to Jensen Huang (see image) for generously funding our research 🚀
Tweet media one
14
8
396
@_xjdr
xjdr
3 days
agreed. it feels very much like 3.5 in the best ways. it still needs a bit of polish but it's really good.
@nrehiew_
wh
3 days
Playing with it a bit more and I think it's similar to 3.5 Sonnet, which to date was still the biggest step change in capabilities.
2
2
94
@_xjdr
xjdr
3 days
such a stacked author list. such a great paper.
Tweet media one
1
6
88
@_xjdr
xjdr
3 days
k2 is very good.
@_xjdr
xjdr
3 days
@zephyr_z9 @YouJiacheng Downloading now.
17
7
276
@_xjdr
xjdr
5 days
early vibe check: on the plus side, xai and co have built a model that puts it squarely in the frontier. on the minus side, it's a bit deepfried with all the RL, and it's verbose and sycophantic. it's slow enough that it needs to compete with O3-pro, but i prefer the latter every time.
12
10
390
@_xjdr
xjdr
5 days
this is impressive.
@arcprize
ARC Prize
5 days
Grok 4 (Thinking) achieves new SOTA on ARC-AGI-2 with 15.9%. This nearly doubles the previous commercial SOTA and tops the current Kaggle competition SOTA.
Tweet media one
15
2
289
@_xjdr
xjdr
5 days
nope. nope nope nope.
1
0
64
@_xjdr
xjdr
5 days
it's just best of N in a trench coat . .
13
6
286
@_xjdr
xjdr
5 days
my review so far:
Tweet media one
1
5
104
@_xjdr
xjdr
5 days
grok got held up on the eastern front. classic blunder . .
4
1
82
@_xjdr
xjdr
5 days
Tweet media one
@xai
xAI
5 days
The Grok 4 livestream will begin soon. Stay tuned.
0
1
94