
Aurick Qiao
@AurickQ
Followers
478
Following
187
Media
20
Statuses
141
@Snowflake AI Research | @LLM360 | Previously @PetuumInc | PhD @SCSatCMU | CS @UWaterloo
Pittsburgh, PA
Joined November 2016
Arctic Inference helps @allhands_ai complete real-world coding tasks 2x faster through faster LLM inference. Check it out!
Imagine coding agents finishing your requests and sending a pull request in 30 seconds 🤯. Check out this new video of OpenHands + DevStral + @Snowflake’s new inference method ArcticInference. It speeds up coding agents by as much as 2x over vLLM (which is already fast).
0
8
23
RT @StasBekman: My first project at @Snowflake AI Research is complete! I present to you Arctic Long Sequence Training (ALST). Paper: ht…
0
63
0
RT @JiaZhihao: One of the best ways to reduce LLM latency is by fusing all computation and communication into a single GPU megakernel. But…
0
120
0
RT @vllm_project: vLLM has just reached 50K github stars! Huge thanks to the community! 🚀 Together let's bring easy, fast, and cheap LLM ser…
0
21
0
RT @haoailab: 🚀 Dynasor is now production-ready in open-source stacks! @NVIDIA TensorRT-LLM. @Snowflake ArcticInference. Try it today ↓ Ten…
0
17
0
RT @HongyiWang10: Super excited to see GenBio featured in the @googlecloud blog on bio startups!
0
3
0
RT @jeffra45: 🧵1/ New release from @Snowflake AI Research: Shift Parallelism is a new LLM inference technique built on top of vLLM, relea…
0
18
0
This is the combined work of the amazing inference systems team at @Snowflake AI Research: @samyamrb @MertHidayetoglu @YeWang6626 @1a1a11a @jeffra45 Mike Wyatt @_charlesxu @JerryL411 @spacemanidol @yuxionghe and many others!
0
0
8
Excited to open-source Shift Parallelism, developed at @Snowflake AI Research for LLM inference! With it, Arctic Inference + @vllm_project delivers:
🚀 3.4x faster e2e latency & 1.06x higher throughput
🚀 1.7x faster generation & 2.25x lower response time
🚀 16x higher throughput
2
39
165
Very proud of our recent work at @Snowflake AI Research, which spans from the systems layer to the application layer. Check out this article from @VentureBeat which highlights two of our major initiatives: LLM inference performance and Text-to-SQL!
How Snowflake's open-source text-to-SQL and Arctic inference models solve enterprise AI's two biggest deployment headaches
0
2
14
RT @probablybots: AIDO.ModelGenerator v0.1.2 is now on PyPI. Use the mgen CLI for no-code inference, embedding, and finetuning for the new…
0
6
0
RT @NVIDIAAIDev: 🎉 Congratulations to the FlashInfer team – their technical paper, "FlashInfer: Efficient and Customizable Attention Engine…
0
5
0