James Song Profile

James Song (@shxjames)
@umich cs
Joined April 2024
Followers: 9 · Following: 86 · Media: 1 · Statuses: 16
@InfiniAILab
Infini-AI-Lab
2 months
🚀 If your code agent generates a patch that passes all tests, should you trust it to merge automatically? ⚠️ You probably shouldn’t! “Correct” ≠ “Safe.” In our study we show that a single normal-looking issue description, whether from a benign user or not, can lead code agents
2
10
23
@hatookov
hatoo💛
9 months
Not many people are doing LeetGPU yet, so just applying the well-known optimizations gets you surprisingly far. Everyone should give it a try?? https://t.co/100WgbsyTk
0
5
13
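The tweet above argues that the well-known kernel optimizations alone go a long way on LeetGPU. As a minimal sketch of what that usually means (not tied to any particular LeetGPU problem; sizes and launch parameters are illustrative assumptions), here is a block-level sum reduction that keeps global loads coalesced and does the tree reduction in shared memory:

```cuda
// Sketch of two textbook CUDA optimizations: coalesced global loads with each
// thread accumulating two elements, and a sequential-addressing tree reduction
// in shared memory. Problem size and launch configuration are illustrative.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void block_sum(const float* in, float* out, int n) {
    extern __shared__ float sdata[];
    unsigned tid = threadIdx.x;
    unsigned i   = blockIdx.x * blockDim.x * 2 + threadIdx.x;

    // Each thread loads two elements (coalesced) and adds them on the fly.
    float v = 0.0f;
    if (i < n)              v += in[i];
    if (i + blockDim.x < n) v += in[i + blockDim.x];
    sdata[tid] = v;
    __syncthreads();

    // Tree reduction in shared memory with sequential addressing.
    for (unsigned s = blockDim.x / 2; s > 0; s >>= 1) {
        if (tid < s) sdata[tid] += sdata[tid + s];
        __syncthreads();
    }
    if (tid == 0) out[blockIdx.x] = sdata[0];
}

int main() {
    const int n = 1 << 20, threads = 256;
    const int blocks = (n + threads * 2 - 1) / (threads * 2);
    float *in, *out;
    cudaMallocManaged(&in,  n * sizeof(float));
    cudaMallocManaged(&out, blocks * sizeof(float));
    for (int i = 0; i < n; ++i) in[i] = 1.0f;

    block_sum<<<blocks, threads, threads * sizeof(float)>>>(in, out, n);
    cudaDeviceSynchronize();

    float total = 0.0f;
    for (int i = 0; i < blocks; ++i) total += out[i];  // finish on host
    printf("sum = %.0f (expected %d)\n", total, n);
    cudaFree(in); cudaFree(out);
}
```

The two standard moves here, pairwise coalesced loads and shared-memory reduction, are the kind of textbook optimization the tweet likely has in mind.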
@Gauri_the_great
Gauri Tripathi
9 months
Most of the time, math in ML papers can seem terrifying at first, but if you take some time to break it down, you'll often find that it's simpler than it appears and that the complex notation is often just BS. For example, in this formula: it's a loss function for VAEs that represents
61
107
2K
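The formula in the screenshot isn't reproduced here, but a VAE loss of the kind the tweet describes is typically the negative evidence lower bound (ELBO): a reconstruction term plus a KL regularizer toward the prior. In standard notation (not necessarily the exact form in the image):

$$
\mathcal{L}(\theta, \phi; x) \;=\; -\,\mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right] \;+\; D_{\mathrm{KL}}\!\left(q_\phi(z \mid x)\,\Vert\, p(z)\right)
$$

The first term rewards faithful reconstruction of $x$ from the latent code $z$; the second keeps the encoder's posterior close to the prior. That two-part reading is usually all the intimidating notation amounts to.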
@deepseek_ai
DeepSeek
10 months
🚀 Introducing NSA: A Hardware-Aligned and Natively Trainable Sparse Attention mechanism for ultra-fast long-context training & inference! Core components of NSA: • Dynamic hierarchical sparse strategy • Coarse-grained token compression • Fine-grained token selection 💡 With
898
2K
16K
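The tweet is cut off, but going by the components it names, the per-query output of such a scheme can be written schematically as a gated combination of attention over a small compressed (coarse) token set and a small selected (fine) token set. The branch set and gate notation below are a sketch inferred from the tweet's description, not the paper's exact formulation:

$$
o_t \;=\; \sum_{b \in \mathcal{B}} g_t^{\,b}\,\operatorname{Attn}\!\left(q_t,\ \tilde{K}^{\,b}_{\le t},\ \tilde{V}^{\,b}_{\le t}\right),
\qquad \mathcal{B} \supseteq \{\text{compressed},\ \text{selected}\}
$$

Here $\tilde{K}^{\,b}, \tilde{V}^{\,b}$ are the branch-specific reduced key/value sets and $g_t^{\,b} \in [0,1]$ are per-query gates; since each branch attends to far fewer than $t$ tokens, the cost stays well below that of dense attention over the full context.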
@karpathy
Andrej Karpathy
10 months
New 3h31m video on YouTube: "Deep Dive into LLMs like ChatGPT" This is a general-audience deep dive into the Large Language Model (LLM) AI technology that powers ChatGPT and related products. It covers the full training stack of how the models are developed, along with mental
776
3K
20K
@shxjames
James Song
10 months
Not too sure if this is the norm in academia and ML, but in all the papers I've read, the code is actually so poorly written and documented.
0
0
3
@karpathy
Andrej Karpathy
11 months
I don't have too too much to add on top of this earlier post on V3 and I think it applies to R1 too (which is the more recent, thinking equivalent). I will say that Deep Learning has a legendary ravenous appetite for compute, like no other algorithm that has ever been developed
@karpathy
Andrej Karpathy
1 year
DeepSeek (Chinese AI co) making it look easy today with an open weights release of a frontier-grade LLM trained on a joke of a budget (2048 GPUs for 2 months, $6M). For reference, this level of capability is supposed to require clusters of closer to 16K GPUs, the ones being
370
2K
14K
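Taking the tweet's round numbers at face value (2048 GPUs for roughly 2 months ≈ 60 days), the implied compute and unit cost work out to about

$$
2048 \times 60 \times 24 \;\approx\; 2.95\text{M GPU-hours},
\qquad
\frac{\$6\text{M}}{2.95\text{M GPU-hours}} \;\approx\; \$2\ \text{per GPU-hour},
$$

i.e. the quoted budget is consistent with a ballpark rental rate of about $2 per GPU-hour, which is what makes it look so small next to the 16K-GPU clusters mentioned.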
@therealkmans
Kunal
11 months
Do you want to learn and practice CUDA but don't want to break the bank renting or buying an NVIDIA GPU? Introducing https://t.co/X94jHPlIl6, the first online CUDA playground to allow anyone to write and execute CUDA code without needing a GPU and for free. 🚀 We emulate GPUs
4
8
31
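The thread doesn't spell out the emulator's feature set, so the snippet below is only a self-contained example of the sort of kernel such a playground would need to run: an element-wise vector add with a tiny host-side check. Nothing in it is specific to the linked service:

```cuda
// A "hello world" kernel of the kind an online CUDA playground could execute:
// element-wise vector addition plus a small correctness check on the host.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void vec_add(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1024;
    float *a, *b, *c;
    cudaMallocManaged(&a, n * sizeof(float));
    cudaMallocManaged(&b, n * sizeof(float));
    cudaMallocManaged(&c, n * sizeof(float));
    for (int i = 0; i < n; ++i) { a[i] = i; b[i] = 2.0f * i; }

    vec_add<<<(n + 255) / 256, 256>>>(a, b, c, n);
    cudaDeviceSynchronize();

    printf("c[42] = %.1f (expected %.1f)\n", c[42], 3.0f * 42);
    cudaFree(a); cudaFree(b); cudaFree(c);
}
```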
@therealkmans
Kunal
1 year
Introducing Boogie Battle, a project my friends and I created together last weekend for MHacks. Boogie Battle lets you throw down insane moves as YOUR 3D avatar, no rhythm required. 🧵👇 Powered by: @GroqInc @LumaLabsAI @cartesia_ai
1
1
4
@therealkmans
Kunal
1 year
Tired of waiting for the next @3blue1brown video? Introducing VidBite, a Text2Video engine, helping you visualize STEM concepts at the click of a button. 👇🧵 Powered by @GroqInc.
2
5
9