
Red Hat AI (@RedHat_AI) · Joined May 2018
Deliver AI value with the resources you have, the insights you own, and the freedom you need.
Followers: 7K · Following: 1K · Media: 373 · Statuses: 1K
LLM inference is too slow, too expensive, and too hard to scale. 🚨 Introducing llm-d, a Kubernetes-native distributed inference framework built to change that, using vLLM (@vllm_project), smart scheduling, and disaggregated compute. Here's how it works, and how you can use it today:
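To make the "smart scheduling and disaggregated compute" idea concrete, here is a minimal, hypothetical sketch (all names invented; this is not llm-d's actual code) of KV-cache-aware routing across separate prefill and decode worker pools:

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    """A hypothetical inference worker in a disaggregated pool."""
    name: str
    role: str                                   # "prefill" or "decode"
    cached_prefixes: set = field(default_factory=set)
    load: int = 0


def route(prefix: str, workers: list, role: str) -> Worker:
    # Prefer a worker of the right role that already holds this prompt
    # prefix in its KV cache; otherwise fall back to the least-loaded one.
    pool = [w for w in workers if w.role == role]
    hits = [w for w in pool if prefix in w.cached_prefixes]
    chosen = min(hits or pool, key=lambda w: w.load)
    chosen.load += 1
    chosen.cached_prefixes.add(prefix)
    return chosen


workers = [Worker("prefill-0", "prefill"),
           Worker("prefill-1", "prefill"),
           Worker("decode-0", "decode")]

first = route("You are a helpful assistant.", workers, "prefill")
second = route("You are a helpful assistant.", workers, "prefill")
```

The second request lands on the same worker as the first because of the KV-cache hit. The real project makes this decision against vLLM's paged KV cache and Kubernetes scheduling primitives, which this toy ignores.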
RT @Virginia__MM: Red Hat + @NVIDIA = a new wave of agentic AI innovation 💡 See how we're supporting NVIDIA Blackwell AI factories across…
RT @_llm_d_: Are you serving LLMs in production? We need your input for the llm-d project! Take our 5-min anonymous survey to help guide o…
RT @Tandemn_labs: LLM inference still crawling? 🚨 Meet llm-d, a K8s-native, @vllm_project-powered framework from Red Hat @RedHat_AI that sl…
RT @charles_irl: spotted in the latest @RedHat_AI office hours for @vllm_project -- the LLM Engine Advisor we built on their benchmarking f…
RT @osanseviero: We've taken community feedback very seriously, and that's why for the Gemma 3n launch we're so proud to partner with so many i…
llm-compressor v0.6.0 is out, with big improvements for anyone optimizing models for inference with @vllm_project. 1⃣ AWQ now works better for MoEs, with major runtime gains. 2⃣ Calibration is faster and smoother with sequential on-loading, which cuts runtime and reduces hardware requirements.
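llm-compressor drives quantization through declarative recipes. The fragment below sketches roughly what a 4-bit AWQ recipe can look like; the field names and values here are illustrative assumptions, not copied from the v0.6.0 release notes:

```yaml
# Illustrative only: check the llm-compressor docs for the exact
# AWQ recipe schema before using.
quant_stage:
  quant_modifiers:
    AWQModifier:
      ignore: ["lm_head"]
      config_groups:
        group_0:
          targets: ["Linear"]
          weights:
            num_bits: 4
            type: int
            symmetric: false
            group_size: 128
```

A recipe like this is typically passed to a one-shot compression run, and the resulting checkpoint is then served with vLLM.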
RT @NVIDIAAIDev: The llm-d project is a major step forward for the #opensource AI ecosystem, and we are proud to be one of the founding con…
RT @_EldarKurtic: Our flagship paper on how far careful quantization can really go in practice got accepted as an oral at ACL 2025 (top 8%)…
vLLM Office Hours continue this Thursday. Special topic: GuideLLM: Evaluate Your LLM Deployments for Real-World Inference (with Jenny Yi and @markurtz_), plus our bi-weekly vLLM update (with @mgoin_). Register to get a calendar invite with a GMeet link:
RT @_EldarKurtic: Want to learn more about GuideLLM, the tool used by @charles_irl and @modal_labs' LLM Engine Advisor to easily benchmark…
RT @charles_irl: GuideLLM is a great tool -- we run it massively in parallel on @modal_labs to benchmark inference engines for the LLM Engi…