James Song Profile

James Song (@shxjames)
@umich cs
Joined April 2024
Followers: 9 · Following: 86 · Media: 1 · Statuses: 16
@InfiniAILab
Infini-AI-Lab
2 months
🚀 If your code agent generates a patch that passes all tests, should you trust it to merge automatically? ⚠️ You probably shouldn’t! “Correct” ≠ “Safe.” In our study we show that a single normal-looking issue description, whether from a benign user or not, can lead code agents
2
10
23
@hatookov
hatoo💛
9 months
Not many people are doing LeetGPU yet, so just applying the well-known optimizations gets you surprisingly far. Everyone should give it a try?? https://t.co/100WgbsyTk
0
5
13
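The tweet above argues that the well-known kernel optimizations alone go a long way on LeetGPU. As a minimal sketch of what that usually means (not tied to any particular LeetGPU problem; sizes and launch parameters are illustrative assumptions), here is a block-level sum reduction that keeps global loads coalesced and does the tree reduction in shared memory:

```cuda
// Sketch of two textbook CUDA optimizations: coalesced global loads with each
// thread accumulating two elements, and a sequential-addressing tree reduction
// in shared memory. Problem size and launch configuration are illustrative.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void block_sum(const float* in, float* out, int n) {
    extern __shared__ float sdata[];
    unsigned tid = threadIdx.x;
    unsigned i   = blockIdx.x * blockDim.x * 2 + threadIdx.x;

    // Each thread loads two elements (coalesced) and adds them on the fly.
    float v = 0.0f;
    if (i < n)              v += in[i];
    if (i + blockDim.x < n) v += in[i + blockDim.x];
    sdata[tid] = v;
    __syncthreads();

    // Tree reduction in shared memory with sequential addressing.
    for (unsigned s = blockDim.x / 2; s > 0; s >>= 1) {
        if (tid < s) sdata[tid] += sdata[tid + s];
        __syncthreads();
    }
    if (tid == 0) out[blockIdx.x] = sdata[0];
}

int main() {
    const int n = 1 << 20, threads = 256;
    const int blocks = (n + threads * 2 - 1) / (threads * 2);
    float *in, *out;
    cudaMallocManaged(&in,  n * sizeof(float));
    cudaMallocManaged(&out, blocks * sizeof(float));
    for (int i = 0; i < n; ++i) in[i] = 1.0f;

    block_sum<<<blocks, threads, threads * sizeof(float)>>>(in, out, n);
    cudaDeviceSynchronize();

    float total = 0.0f;
    for (int i = 0; i < blocks; ++i) total += out[i];  // finish on host
    printf("sum = %.0f (expected %d)\n", total, n);
    cudaFree(in); cudaFree(out);
}
```

The two standard moves here, pairwise coalesced loads and shared-memory reduction, are the kind of textbook optimization the tweet likely has in mind.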
@Gauri_the_great
Gauri Tripathi
9 months
Most of the time, math in ML papers can seem terrifying at first, but if you take some time to break it down, you'll often find that it's simpler than it appears and that the complex notation is often just BS. For example, in this formula: it's a loss function for VAEs that represents
61
107
2K
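The formula in the screenshot isn't reproduced here, but a VAE loss of the kind the tweet describes is typically the negative evidence lower bound (ELBO): a reconstruction term plus a KL regularizer toward the prior. In standard notation (not necessarily the exact form in the image):

$$
\mathcal{L}(\theta, \phi; x) \;=\; -\,\mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right] \;+\; D_{\mathrm{KL}}\!\left(q_\phi(z \mid x)\,\Vert\, p(z)\right)
$$

The first term rewards faithful reconstruction of $x$ from the latent code $z$; the second keeps the encoder's posterior close to the prior. That two-part reading is usually all the intimidating notation amounts to.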
@deepseek_ai
DeepSeek
10 months
🚀 Introducing NSA: A Hardware-Aligned and Natively Trainable Sparse Attention mechanism for ultra-fast long-context training & inference! Core components of NSA: • Dynamic hierarchical sparse strategy • Coarse-grained token compression • Fine-grained token selection 💡 With
898
2K
16K
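The tweet is cut off, but going by the components it names, the per-query output of such a scheme can be written schematically as a gated combination of attention over a small compressed (coarse) token set and a small selected (fine) token set. The branch set and gate notation below are a sketch inferred from the tweet's description, not the paper's exact formulation:

$$
o_t \;=\; \sum_{b \in \mathcal{B}} g_t^{\,b}\,\operatorname{Attn}\!\left(q_t,\ \tilde{K}^{\,b}_{\le t},\ \tilde{V}^{\,b}_{\le t}\right),
\qquad \mathcal{B} \supseteq \{\text{compressed},\ \text{selected}\}
$$

Here $\tilde{K}^{\,b}, \tilde{V}^{\,b}$ are the branch-specific reduced key/value sets and $g_t^{\,b} \in [0,1]$ are per-query gates; since each branch attends to far fewer than $t$ tokens, the cost stays well below that of dense attention over the full context.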
@karpathy
Andrej Karpathy
10 months
New 3h31m video on YouTube: "Deep Dive into LLMs like ChatGPT" This is a general-audience deep dive into the Large Language Model (LLM) AI technology that powers ChatGPT and related products. It covers the full training stack of how the models are developed, along with mental
776
3K
20K
@shxjames
James Song
10 months
Not too sure if this is the norm in academia and ML, but in all the papers I've read, the code is actually so poorly written and documented.
0
0
3
@karpathy
Andrej Karpathy
11 months
I don't have too too much to add on top of this earlier post on V3 and I think it applies to R1 too (which is the more recent, thinking equivalent). I will say that Deep Learning has a legendary ravenous appetite for compute, like no other algorithm that has ever been developed
@karpathy
Andrej Karpathy
1 year
DeepSeek (Chinese AI co) making it look easy today with an open weights release of a frontier-grade LLM trained on a joke of a budget (2048 GPUs for 2 months, $6M). For reference, this level of capability is supposed to require clusters of closer to 16K GPUs, the ones being
370
2K
14K
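Taking the tweet's round numbers at face value (2048 GPUs for roughly 2 months ≈ 60 days), the implied compute and unit cost work out to about

$$
2048 \times 60 \times 24 \;\approx\; 2.95\text{M GPU-hours},
\qquad
\frac{\$6\text{M}}{2.95\text{M GPU-hours}} \;\approx\; \$2\ \text{per GPU-hour},
$$

i.e. the quoted budget is consistent with a ballpark rental rate of about $2 per GPU-hour, which is what makes it look so small next to the 16K-GPU clusters mentioned.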
@therealkmans
Kunal
11 months
Do you want to learn and practice CUDA but don't want to break the bank renting or buying an NVIDIA GPU? Introducing https://t.co/X94jHPlIl6, the first online CUDA playground to allow anyone to write and execute CUDA code without needing a GPU and for free. 🚀 We emulate GPUs
4
8
31
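The thread doesn't spell out the emulator's feature set, so the snippet below is only a self-contained example of the sort of kernel such a playground would need to run: an element-wise vector add with a tiny host-side check. Nothing in it is specific to the linked service:

```cuda
// A "hello world" kernel of the kind an online CUDA playground could execute:
// element-wise vector addition plus a small correctness check on the host.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void vec_add(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1024;
    float *a, *b, *c;
    cudaMallocManaged(&a, n * sizeof(float));
    cudaMallocManaged(&b, n * sizeof(float));
    cudaMallocManaged(&c, n * sizeof(float));
    for (int i = 0; i < n; ++i) { a[i] = i; b[i] = 2.0f * i; }

    vec_add<<<(n + 255) / 256, 256>>>(a, b, c, n);
    cudaDeviceSynchronize();

    printf("c[42] = %.1f (expected %.1f)\n", c[42], 3.0f * 42);
    cudaFree(a); cudaFree(b); cudaFree(c);
}
```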
@therealkmans
Kunal
1 year
Introducing Boogie Battle, a project my friends and I created together last weekend for MHacks. Boogie Battle lets you throw down insane moves as YOUR 3D avatar, no rhythm required. 🧵👇 Powered by: @GroqInc @LumaLabsAI @cartesia_ai
1
1
4
@therealkmans
Kunal
1 year
Tired of waiting for the next @3blue1brown video? Introducing VidBite, a Text2Video engine, helping you visualize STEM concepts at the click of a button. 👇🧵 Powered by @GroqInc.
2
5
9