Avian.io
@avian_io
Followers
263
Following
33
Media
2
Statuses
41
World's Fastest Inference for DeepSeek R1 at 351 tokens per second. Dedicated endpoints + Serverless APIs available.
New York, USA
Joined July 2022
4X DeepSeek R1 671B Inference: As we prepare for DeepSeek R2, Avian presents 351 output tokens per second on DeepSeek R1 per user on NVIDIA B200 at FP4 precision. Presented to you in collaboration with @Vultr Cloud GPU, who provided the NVIDIA HGX B200 hardware. Keep your
2
4
13
Learn how https://t.co/hZeVXLVPLP achieved a world-record 303 tokens per second on DeepSeek R1 using TensorRT-LLM and NVIDIA Blackwell B200 in our technical blog: https://t.co/Vz6IOQxKi1
1
22
71
NVIDIA Blackwell sets a new benchmark with 303 Tokens/s for DeepSeek R1 in FP4 precision. 👀 🎉 Huge congrats to NVIDIA Inception partner @avian_io on this impressive achievement, which showcases a major ecosystem breakthrough, leveraging NVIDIA Blackwell and our open accelerated
NVIDIA Blackwell can achieve 303 output tokens/s for DeepSeek R1 in FP4 precision, per our benchmarking of an Avian API endpoint. Artificial Analysis benchmarked DeepSeek R1 on an @avian_io private API endpoint. Running DeepSeek R1 in FP4 precision on NVIDIA Blackwell, their
5
15
105