
Kirill Solodskikh
@GarchFather
Followers
370
Following
342
Media
34
Statuses
178
Almost PhD, Almost Founder, Almost Team Lead, Almost Successful, married. @TheStageAI Co-founder, CEO, ex Huawei P50 AI cameras
Joined October 2022
Replace static scale estimation with dynamic activation quantization. Accuracy jumps to 66.5% – almost original performance. This progress is only possible thanks to open source. Special thanks to SmoothQuant authors (@Guangxuan_Xiao, @jilin_14, @songhan_mit), their method is
1
0
9
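A minimal sketch of the difference between static and dynamic activation scales, for readers who want to see the idea in code. This is not QLIP or TheStage AI code: the helper names, the symmetric per-tensor int8 scheme, and the toy numbers are all assumptions made for illustration.

```python
# Illustrative only: symmetric per-tensor int8 quantization in plain Python.
# All function names and sample values here are hypothetical.

def quantize_int8(values, scale):
    """Map floats to integers in [-127, 127] using a single scale."""
    return [max(-127, min(127, round(v / scale))) for v in values]

def dequantize(quantized, scale):
    """Map int8 values back to floats."""
    return [q * scale for q in quantized]

def absmax_scale(values):
    """Scale that maps the largest magnitude in `values` to 127."""
    largest = max(abs(v) for v in values)
    return largest / 127 if largest > 0 else 1.0

def max_error(values, scale):
    """Worst-case round-trip error for a given scale."""
    restored = dequantize(quantize_int8(values, scale), scale)
    return max(abs(a - b) for a, b in zip(values, restored))

# Static: the scale is estimated once from calibration data and then frozen.
calibration = [0.5, -1.0, 0.8, 2.0]
static_scale = absmax_scale(calibration)

# Dynamic: the scale is recomputed from the activations seen at inference time.
runtime = [0.3, -6.0, 1.2, 4.5]  # contains values outside the calibration range
dynamic_scale = absmax_scale(runtime)

static_err = max_error(runtime, static_scale)    # outliers get clipped
dynamic_err = max_error(runtime, dynamic_scale)  # only rounding error remains

print(f"static error:  {static_err:.4f}")
print(f"dynamic error: {dynamic_err:.4f}")
```

The frozen scale clips any activation larger than what calibration saw, while the dynamic scale pays a small runtime cost to avoid that clipping.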
Our research team took @AIatMeta LLaMA-8B, quantized it with QLIP using post-training int8, applied SmoothQuant, and used pre-defined compiler-compatible NVIDIA configs. Why do this? Up to 2× fewer weights and 3.6× faster on one GPU. Try it with our simple Jupyter Notebook.
6
14
202
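For context, a rough sketch of the smoothing step described in the SmoothQuant paper: per-input-channel factors move activation outliers into the weights while leaving the layer output unchanged. This is a toy illustration in plain Python, not the QLIP pipeline; the function names, the `alpha=0.5` default, and the sample matrices are assumptions.

```python
# Illustrative only: SmoothQuant-style channel smoothing on a tiny linear layer.
# X has shape (tokens, in_channels); W has shape (in_channels, out_channels).

def smooth_scales(X, W, alpha=0.5):
    """s_j = max|X[:, j]|**alpha / max|W[j, :]|**(1 - alpha).

    Assumes every channel has a nonzero activation and weight maximum.
    """
    scales = []
    for j in range(len(W)):
        act_max = max(abs(row[j]) for row in X)
        w_max = max(abs(w) for w in W[j])
        scales.append(act_max ** alpha / w_max ** (1 - alpha))
    return scales

def matmul(A, B):
    """Plain-Python matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

X = [[0.1, 20.0], [-0.2, -30.0]]  # input channel 1 carries activation outliers
W = [[0.5, -0.4], [0.02, 0.03]]   # W[j] holds the weights for input channel j

s = smooth_scales(X, W)
X_smooth = [[x / s[j] for j, x in enumerate(row)] for row in X]  # divide activations
W_smooth = [[w * s[j] for w in W[j]] for j in range(len(W))]     # multiply weights

original = matmul(X, W)
smoothed = matmul(X_smooth, W_smooth)  # mathematically the same output

print("activation max before:", max(abs(row[1]) for row in X))
print("activation max after: ", max(abs(row[1]) for row in X_smooth))
```

After smoothing, the outlier channel has a much smaller activation range, which is what makes a single int8 scale workable for both activations and weights.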
Our @TheStageAI team was happy to gain early access to the @nvidia B200 from @nebiusai and establish benchmarking for our optimized diffusion models. We now fully support inference of optimized models on B200 across various AI applications - LLMs, VLMs, Text-to-Image,.
NVIDIA HGX B200 instances are now available as self-service AI clusters on Nebius AI Cloud: 🔥. This means anyone can access @NVIDIA Blackwell — the latest generation of NVIDIA’s accelerated computing platform — with just a few clicks and a credit card.
0
1
8
RT @TheStageAI: AI engineers and researchers can now use our Quantization API to run accelerated LLMs, VLMs, and diffusion on NVIDIA and ed….
0
7
0
Been cooking up some audio tools. Made a quick playground on Hugging Face Spaces for easy testing. It’s Elastic MusicGen, our fork of Meta’s MusicGen Large by @TheStageAI. Drop prompts, get tracks in seconds, right in your browser. 🚀 11× faster than.
huggingface.co
12
12
141
Meet Elastic MusicGen Large, our optimized fork of @metaai's MusicGen, powered by ANNA (@TheStageAI’s Automated Neural Network Accelerator). Ye @kanyewest used AI for vocals on "Bully," calling it the "next Auto-Tune." He switched up later, but tracks.
huggingface.co
6
17
186
⌁ EUROPE SIGNAL: ACTIVE ⌁. ↳ Want to accelerate your model’s inference? ↳ These guys sure do. ✦ Berlin: mapped next steps with our investors Christophe Maire and Lukas Erbguth of Atlantic Labs. ✦ Paris: @NVIDIAGTC showed us what’s possible. ✦ Germany: more investor talks
0
2
7
▚▞▚▞ DATA LOG: AI EUROPE ▚▞▚▞. For years, AI talk was all Silicon Valley. After @NVIDIA #GTCParis, one thing became clear: Europe’s AI ecosystem has already kicked into high gear. 🇫🇷 @MistralAI’s dropping open weights that actually run. 🇩🇪 @Aleph__Alpha building native
0
4
8
RT @TheStageAI: Bonjour, Paris 🇫🇷. Just wrapped 2 amazing days at @NVIDIA #GTCParis at @VivaTech — AI infra, agentic systems, and robots wa….
0
3
0
RT @TheStageAI: 🥐 Bon appétit, developers. New @MistralAI models for self-hosting accelerated by TheStage AI:. - New LLM: Mistral Small 24B….
0
2
0
RT @TheStageAI: Wrong model = slow app. We help you pick the right one for your GPU. You can now explore a new Models section on our platf….
0
4
0
RT @TheStageAI: 💻Hey devs - real-time speech transcription inference, zero cost to run! Starting from MacOS 14, M1-M4. Check the crash test….
0
13
0
Yes! We are already supporting @NVIDIAAI B200, and it's giving us the fastest performance in the world!
🚀 TheStage AI x Nebius = fastest diffusion model inference on NVIDIA Blackwell - and it’s already live. Huge thanks to @TFNBreakingNews for the sharp story. We don’t chase trends. We set benchmarks. Big thanks to the @Nebius team! 🔗 #AI #NVIDIA
0
2
2