James Song
@shxjames
Followers: 9
Following: 86
Media: 1
Statuses: 16
🚀 If your code agent generates a patch that passes all tests, should you trust it to merge automatically? ⚠️ You probably shouldn’t! “Correct” ≠ “Safe.” In our study we show that a single normal-looking issue description, whether from a benign user or not, can lead code agents
2
10
23
Most of the time, math in ML papers can seem terrifying at first, but if you take some time to break it down, you'll often find that it's simpler than it appears and that the complex notation is just BS. For example, in this formula: it's a loss function for VAEs that represents
61
107
2K
🚀 Introducing NSA: A Hardware-Aligned and Natively Trainable Sparse Attention mechanism for ultra-fast long-context training & inference! Core components of NSA:
• Dynamic hierarchical sparse strategy
• Coarse-grained token compression
• Fine-grained token selection
💡 With
898
2K
16K
New 3h31m video on YouTube: "Deep Dive into LLMs like ChatGPT". This is a general audience deep dive into the Large Language Model (LLM) AI technology that powers ChatGPT and related products. It covers the full training stack of how the models are developed, along with mental
776
3K
20K
I'm not sure if this is the norm in academia and ML, but in all the papers I’ve read, the code is poorly written and poorly documented.
0
0
3
I don't have too too much to add on top of this earlier post on V3 and I think it applies to R1 too (which is the more recent, thinking equivalent). I will say that Deep Learning has a legendary ravenous appetite for compute, like no other algorithm that has ever been developed
DeepSeek (Chinese AI co) making it look easy today with an open weights release of a frontier-grade LLM trained on a joke of a budget (2048 GPUs for 2 months, $6M). For reference, this level of capability is supposed to require clusters of closer to 16K GPUs, the ones being
370
2K
14K
Do you want to learn and practice CUDA but don't want to break the bank renting or buying an NVIDIA GPU? Introducing https://t.co/X94jHPlIl6, the first online CUDA playground to allow anyone to write and execute CUDA code without needing a GPU and for free. 🚀 We emulate GPUs
4
8
31
Introducing Boogie Battle, a project my friends and I created together last weekend for MHacks. Boogie Battle lets you throw down insane moves as YOUR 3D avatar, no rhythm required. 🧵👇 Powered by: @GroqInc @LumaLabsAI @cartesia_ai
1
1
4
Tired of waiting for the next @3blue1brown video? Introducing VidBite, a Text2Video engine, helping you visualize STEM concepts at the click of a button. 👇🧵 Powered by @GroqInc.
2
5
9