Explore tweets tagged as #kernels
Inference optimizations I'd study if I wanted sub-second LLM responses: Bookmark this. 1. KV-Caching 2. Speculative Decoding 3. FlashAttention 4. PagedAttention 5. Batch Inference 6. Early Exit Decoding 7. Parallel Decoding 8. Mixed Precision Inference 9. Quantized Kernels 10. Tensor
18
261
2K
We are releasing state-of-the-art post-training quantization (PTQ) algorithms for Microscaling FP4, together with kernels: - First study focused on MXFP4/NVFP4 PTQ for LLMs - New Micro-Rotated (MR) format and GPTQ algorithm - QuTLASS GPU kernels with up to 3.6x speedups.
1
28
150
I'm building my own OpenCV from scratch - fastcv. fastcv is a C++ CUDA rewrite with PyTorch bindings of the image filters in the OpenCV library. I have already written two optimized kernels and will keep studying and implementing more. I have also added current benchmarks.
38
58
1K
good morning >first day of unemployment >gpu programming seems so cool >wrote my first kernels
112
90
5K
Celebrating 500 downloads 🤩 The earliest preprints are now available on Zenodo. Spiral kernels: https://t.co/UPHQ4YdoU3 Sound localization: https://t.co/T1bAzaUsC8 Ultra lightweight QCNN: https://t.co/IKVDFxwCR9 SKCNN for DNA classification: https://t.co/8qTYs4W77j
6
2
34
Morning grind > Read a research paper (ML + open quantum systems) > Did the theory part of SVM, QSVM > Implemented SVM on the Iris dataset using different kernels > Implemented QSVM from scratch using NumPy on a synthetic dataset > Implemented QSVM using PennyLane
5
5
194
Cured from stage 4 metastatic cancer in the breast, lungs, stomach, ovaries & bones with soursop along with apricot kernels and black seed oil. Told she was going to die but is now cancer free. /1
27
343
1K
This is a super cool topic! I did a class project with a friend for a course on Kernels during senior year in college: https://t.co/FWmUOdbJjc Lots of fun connections between kernels and self-attention, especially when learning periodic functions. The attention patterns
0
55
446
FlashInfer redefines how attention kernels, kv-cache layouts and dynamic runtimes are compiled and scheduled for efficient LLM serving. Check out my latest blog "Dissecting FlashInfer - A Systems Perspective on High-Performance LLM Inference".
9
43
399
in the lectures below, i hold your hand through low-level LLM systems engineering. it includes everything up to TODAY! 1) pytorch tensors 2) large matmul on cpu vs gpu 3) JAX (and why xAI uses it instead of pytorch) 4) raw cuda kernels and global threading indexing 5) triton
30
76
800
For a product like popcorn where people are going to ask WHY would I pay way more...you gotta show them how it's actually different (unique flavors, wrapped kernels) & that it's SO worth it (aka, it's freaking delicious). Throw in an innuendo hook and you're good to roll
#ugc
8
0
17
fastcv sobel edge detector kernel is 1200x faster than opencv for 4k images. more benchmarks (RTX 4060 Ti and 4k images) 1. blur kernel is 4x faster than opencv 2. grayscale kernel is 12x faster than opencv for 4k images. Writing more kernels every single day!
18
32
386
Optimizations I'd study if I wanted real-time Gen AI. Bookmark this. 1. Streaming Generation 2. Token Parallelism 3. Prefetch Pipelines 4. CUDA Graphs 5. Speculative Decoding 6. PagedAttention 7. KV Cache Quantization 8. Dynamic Batching 9. FP8 Kernels 10. Asynchronous Prefill 11. Memory
8
90
589
If anyone needs a video guide to Karpathy's nanochat, check out Stanford's CS336! It covers: - Tokenization - Resource Accounting - Pretraining - Finetuning (SFT/RLHF) - Overview of Key Architectures - Working with GPUs - Kernels and Tritons - Parallelism - Scaling Laws -
12
281
2K
Excited to attend the Aptos Experience with @rockyntheblock as an Ambassador for Resort Maker on @Aptos. Here for RWAs, data, and some good time in NYC? Let's connect :) Catch you at the conference! @KernelsAI
@exponentlabshq
1
2
10
We're also hiring aggressively. Reach out if you're interested in building automated research environments and agents. (AI researchers and SWEs, pre/mid/post training, architecture design, kernels, lots of backend system design, and automation) Our team is behind the tech for
16
5
110
Inside info on Elegoo! Introducing the Pop-n-Print 3000! Multi-color 3D printing is now as easy as movie night. Load the spools & kernels, hit start, & let this high-tech marvel handle the print AND the popcorn. Kick back, game on, & enjoy simultaneous 3D printing & snacking!
0
0
3