
Aleksa Gordić (水平问题, "skill issue")
@gordic_aleksa
Followers: 27K · Following: 8K · Media: 877 · Statuses: 5K
getting us to singularity with friends · computers can be understood: https://t.co/doHE1Qv2Sj · x @GoogleDeepMind @Microsoft · tensor core maximalist
San Francisco, CA
Joined September 2017
llm.c gang w/ @karpathy, Erik, and Arun: aka the "avengers" and, since Andrej's talk yesterday, the "4 pandas team" lol 😂
26 replies · 27 reposts · 1K likes
New in-depth blog post time: "Inside NVIDIA GPUs: Anatomy of high-performance matmul kernels". If you want to deeply understand how one writes state-of-the-art matmul kernels in CUDA, read along. (Remember: matmul is the single most important operation that transformers execute.)
48 replies · 395 reposts · 3K likes
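For a flavor of the ground such a post starts from, here's a minimal shared-memory tiled SGEMM in CUDA. This is an illustrative sketch only, not code from the blog; a state-of-the-art kernel would use tensor cores, vectorized loads, and double-buffered pipelines far beyond this.

```cuda
// toy_sgemm.cu -- illustrative tiled matmul; nowhere near state of the art.
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

#define TILE 16

// C = A @ B with A: MxK, B: KxN, C: MxN, all row-major fp32.
__global__ void sgemm_tiled(const float* A, const float* B, float* C,
                            int M, int N, int K) {
    __shared__ float As[TILE][TILE];
    __shared__ float Bs[TILE][TILE];
    int row = blockIdx.y * TILE + threadIdx.y;
    int col = blockIdx.x * TILE + threadIdx.x;
    float acc = 0.0f;
    // Walk the K dimension tile by tile, staging tiles of A and B in shared
    // memory so each global-memory element is loaded once per thread block.
    for (int t = 0; t < (K + TILE - 1) / TILE; ++t) {
        int aCol = t * TILE + threadIdx.x;
        int bRow = t * TILE + threadIdx.y;
        As[threadIdx.y][threadIdx.x] = (row < M && aCol < K) ? A[row * K + aCol] : 0.0f;
        Bs[threadIdx.y][threadIdx.x] = (bRow < K && col < N) ? B[bRow * N + col] : 0.0f;
        __syncthreads();
        for (int k = 0; k < TILE; ++k)
            acc += As[threadIdx.y][k] * Bs[k][threadIdx.x];
        __syncthreads();
    }
    if (row < M && col < N) C[row * N + col] = acc;
}

int main() {
    const int M = 128, N = 128, K = 128;           // square for simplicity
    size_t bytes = (size_t)M * K * sizeof(float);
    float *hA = (float*)malloc(bytes), *hB = (float*)malloc(bytes), *hC = (float*)malloc(bytes);
    for (int i = 0; i < M * K; ++i) { hA[i] = 1.0f; hB[i] = 2.0f; }
    float *dA, *dB, *dC;
    cudaMalloc(&dA, bytes); cudaMalloc(&dB, bytes); cudaMalloc(&dC, bytes);
    cudaMemcpy(dA, hA, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hB, bytes, cudaMemcpyHostToDevice);
    dim3 block(TILE, TILE), grid((N + TILE - 1) / TILE, (M + TILE - 1) / TILE);
    sgemm_tiled<<<grid, block>>>(dA, dB, dC, M, N, K);
    cudaMemcpy(hC, dC, bytes, cudaMemcpyDeviceToHost);
    printf("C[0] = %.1f (expected %.1f)\n", hC[0], 2.0f * K);  // 1*2 summed over K
    return 0;
}
```

Each thread block computes one TILE×TILE patch of C; the shared-memory staging cuts global-memory traffic by a factor of TILE, which is the first and smallest of the tricks a real high-performance kernel stacks.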
Amazing blog post from @gordic_aleksa explaining the internals of vLLM 😍
3 replies · 49 reposts · 386 likes
New in-depth blog post - "Inside vLLM: Anatomy of a High-Throughput LLM Inference System". Probably the most in-depth explanation of how LLM inference engines, and vLLM in particular, work! Took me a while to get to this level of understanding of the codebase and then to write it all up.
63 replies · 406 reposts · 3K likes
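One core idea behind such engines is the paged KV cache: keys and values live in fixed-size blocks, and a per-sequence block table maps logical token positions to physical blocks, so memory can be allocated on demand without fragmentation. Below is a toy CUDA sketch of just that indirection; every name and shape here is made up for illustration and does not mirror vLLM's actual kernels.

```cuda
// toy_paged_kv.cu -- sketch of the block-table indirection behind paged KV caches.
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

#define BLOCK_TOKENS 16   // tokens per physical cache block (made-up size)
#define HEAD_DIM     64

// One thread per past token: resolve that token's key through the block table,
// then dot it with the query. A real engine fuses softmax and the value
// reduction into the same kernel; this shows only the paging lookup.
__global__ void paged_qk(const float* q,            // [HEAD_DIM]
                         const float* k_cache,      // [num_blocks][BLOCK_TOKENS][HEAD_DIM]
                         const int* block_table,    // logical block -> physical block
                         int seq_len,
                         float* scores) {           // [seq_len]
    int pos = blockIdx.x * blockDim.x + threadIdx.x;
    if (pos >= seq_len) return;
    int phys = block_table[pos / BLOCK_TOKENS];     // the paging step
    const float* k = k_cache + ((size_t)phys * BLOCK_TOKENS + pos % BLOCK_TOKENS) * HEAD_DIM;
    float dot = 0.0f;
    for (int d = 0; d < HEAD_DIM; ++d) dot += q[d] * k[d];
    scores[pos] = dot;
}

int main() {
    const int seq_len = 40, num_blocks = 8;         // 40 tokens span 3 logical blocks
    int h_table[3] = {5, 2, 7};                     // logical blocks scattered physically
    float h_q[HEAD_DIM];
    float* h_k = (float*)calloc((size_t)num_blocks * BLOCK_TOKENS * HEAD_DIM, sizeof(float));
    for (int d = 0; d < HEAD_DIM; ++d) h_q[d] = 1.0f;
    for (int p = 0; p < seq_len; ++p)               // write keys where the table points
        for (int d = 0; d < HEAD_DIM; ++d)
            h_k[((size_t)h_table[p / BLOCK_TOKENS] * BLOCK_TOKENS + p % BLOCK_TOKENS) * HEAD_DIM + d] = (float)p / HEAD_DIM;
    float *d_q, *d_k, *d_s; int* d_table;
    size_t kbytes = (size_t)num_blocks * BLOCK_TOKENS * HEAD_DIM * sizeof(float);
    cudaMalloc(&d_q, sizeof(h_q)); cudaMalloc(&d_k, kbytes);
    cudaMalloc(&d_s, seq_len * sizeof(float)); cudaMalloc(&d_table, sizeof(h_table));
    cudaMemcpy(d_q, h_q, sizeof(h_q), cudaMemcpyHostToDevice);
    cudaMemcpy(d_k, h_k, kbytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_table, h_table, sizeof(h_table), cudaMemcpyHostToDevice);
    paged_qk<<<1, 128>>>(d_q, d_k, d_table, seq_len, d_s);
    float h_s[40];
    cudaMemcpy(h_s, d_s, sizeof(h_s), cudaMemcpyDeviceToHost);
    printf("score[17] = %.2f (expected 17.00)\n", h_s[17]);  // all-ones q dotted with k = p/64
    return 0;
}
```

The indirection is what lets a scheduler admit new sequences whenever any physical block frees up, which, together with continuous batching, is a large part of where the throughput comes from.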
this is how you position yourself as a research authority
2 replies · 2 reposts · 57 likes
A few weeks ago, I was thrilled for a buddy at @windsurf_ai when the @OpenAI acquisition was announced. I joked in our group chat that he'd be picking up the tab on the next boys' trip. Now, with Google's acquihire, the news is devastating. Here's what I've gathered: - The
27 replies · 45 reposts · 903 likes
really cool product launch by @MishaLaskin and the team at @reflection_ai! we're currently more bottlenecked by code understanding than code generation
Engineers spend 70% of their time understanding code, not writing it. That's why we built Asimov at @reflection_ai: the best-in-class code research agent, built for teams and organizations.
0 replies · 1 repost · 14 likes
This is a really great point and something I totally forgot about. I've often been thinking there may be future optimization algorithms that deviate from SGD or take some hybrid approach. However, if you keep stacking parameters, local minima become less of an enemy.
4 replies · 8 reposts · 112 likes
Blog here: https://t.co/nPPNw62lmy (or on Medium, where my older blogs live)
1 reply · 1 repost · 2 likes
started a new blog on my new website, the first one since the flash attn blog i wrote back in 2023! title --> "neocambria: humans in the post-ASI world". i've been thinking about these ideas for many years and finally found time to sit down and write them all down in a cohesive way.
3 replies · 2 reposts · 28 likes
There are two kinds of people in this world: those who count 1, 2, 3 - and those who count 1, 3, 4
1 reply · 0 reposts · 8 likes