Red Hat AI

@RedHat_AI

Followers: 8K · Following: 1K · Media: 455 · Statuses: 2K

Accelerating AI innovation with open platforms and community. The future of AI is open.

Joined May 2018
@RedHat_AI
Red Hat AI
10 days
Red Hat AI gives teams a place to experiment with newly released models, including Mistral 3, on the day they arrive. Our Day Zero guide shows how to run Mistral 3 today using the Red Hat AI Inference Server and Red Hat OpenShift AI: https://t.co/Sr79WfbrPw Happy experimenting!
developers.redhat.com
Key takeaways
@MistralAI
Mistral AI
10 days
Introducing the Mistral 3 family of models: Frontier intelligence at all sizes. Apache 2.0. Details in 🧵
1
6
30
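A minimal sketch of what the Day Zero guide above walks through, using vLLM's offline Python API. The model ID is a placeholder, not a real checkpoint name; use the one named in the guide, and note that the tokenizer setting depends on the checkpoint format:

```python
# Hedged sketch: serving a Mistral 3 checkpoint with vLLM's offline API.
# The model ID below is a placeholder; substitute the checkpoint from the Day Zero guide.
from vllm import LLM, SamplingParams

llm = LLM(
    model="mistralai/<mistral-3-checkpoint>",  # placeholder, not a real model ID
    tokenizer_mode="mistral",  # often needed for Mistral-format checkpoints; HF-format ones may not need it
)

params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.chat(
    [{"role": "user", "content": "Summarize what Day-0 model support means."}],
    params,
)
print(outputs[0].outputs[0].text)
```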
@RedHat_AI
Red Hat AI
2 days
The workflow is simple and fast. Load your model, apply the AutoRound modifier, compress the model, and serve it in @vllm_project. Support for Llama, Qwen, and other open-weight LLMs means you can start experimenting right away.
1
1
1
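A hedged sketch of that load / apply / compress / serve workflow. The AutoRoundModifier import path, model ID, calibration settings, and output directory below are assumptions for illustration; check the LLM Compressor docs for the exact recipe:

```python
# Sketch of the AutoRound workflow in LLM Compressor (import path for the
# AutoRound modifier is assumed; verify against the project docs).
from transformers import AutoModelForCausalLM, AutoTokenizer
from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import AutoRoundModifier  # assumed location

MODEL_ID = "meta-llama/Llama-3.1-8B-Instruct"  # any supported open-weight LLM
OUTPUT_DIR = "Llama-3.1-8B-Instruct-W4A16-autoround"

model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# One-shot compression: the recipe applies AutoRound to produce a W4A16 checkpoint.
oneshot(
    model=model,
    recipe=AutoRoundModifier(targets="Linear", scheme="W4A16", ignore=["lm_head"]),
    dataset="open_platypus",        # small calibration set
    max_seq_length=2048,
    num_calibration_samples=256,
)

# Save in compressed-tensors format so vLLM can load the directory directly.
model.save_pretrained(OUTPUT_DIR, save_compressed=True)
tokenizer.save_pretrained(OUTPUT_DIR)
```

The resulting directory can then be pointed at vLLM directly, as in the serving sketch further down the thread.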
@RedHat_AI
Red Hat AI
2 days
AutoRound learns how each tensor should be rounded and clipped for the best possible quality. This delivers standout low-bit performance in formats like W4A16.
1
0
1
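A rough illustration of the idea (not the actual AutoRound implementation): each weight's rounding decision gets a small learned offset, the result is clipped to the 4-bit range, and the offsets and clip bounds are tuned against the layer's output error:

```python
# Illustration only: learned rounding for simulated W4A16 weight quantization.
import torch

def fake_quant_w4(w: torch.Tensor, scale: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    """Quantize weights to 4-bit integers with a learnable rounding offset v in [-0.5, 0.5]."""
    q = torch.clamp(torch.round(w / scale + v), -8, 7)  # 4-bit signed integer range
    return q * scale                                     # dequantize back to fp16/bf16

# In AutoRound-style training, v (and the clip range via the scale) would be
# optimized with signed gradient descent to minimize the difference between the
# original and quantized layer outputs on a small calibration set.
```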
@RedHat_AI
Red Hat AI
2 days
Red Hat AI and @intel have teamed up to bring a major upgrade to low-bit LLMs. AutoRound is now integrated directly into LLM Compressor, giving developers a powerful way to shrink models, boost speed, and keep accuracy. And it runs smoothly with @vllm_project. A quick 🧵:
1
3
7
@vllm_project
vLLM
3 days
Low-bit LLM quantization doesn't have to mean painful accuracy trade-offs or massive tuning runs. Intel's AutoRound PTQ algorithm is now integrated into LLM Compressor, producing W4A16 compressed-tensor checkpoints you can serve directly with vLLM across Intel Xeon, Gaudi, Arc…
1
36
232
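Once a W4A16 compressed-tensors checkpoint exists, serving it looks the same as serving any other model directory. The path below is the hypothetical output of the compression sketch earlier in this thread, not a published model:

```python
# Minimal sketch: load a compressed-tensors W4A16 checkpoint with vLLM.
from vllm import LLM, SamplingParams

llm = LLM(model="Llama-3.1-8B-Instruct-W4A16-autoround")  # hypothetical local directory
out = llm.generate(
    ["Explain W4A16 quantization in one sentence."],
    SamplingParams(max_tokens=64),
)
print(out[0].outputs[0].text)
```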
@HaihaoShen
Haihao Shen
3 days
Thanks to the great collaboration with @vllm_project, the LLM Compressor team, and the @RedHat_AI team for making this happen! If you want a model with smaller size and high accuracy, deployed with vLLM, AutoRound is your best choice. 🌟 Give it a try:
github.com
Advanced quantization toolkit for LLMs and VLMs. Native support for WOQ, MXFP4, NVFP4, GGUF, Adaptive Schemes and seamless integration with Transformers, vLLM, SGLang, and llm-compressor - intel/au...
@vllm_project
vLLM
3 days
Low-bit LLM quantization doesn't have to mean painful accuracy trade-offs or massive tuning runs. Intel's AutoRound PTQ algorithm is now integrated into LLM Compressor, producing W4A16 compressed-tensor checkpoints you can serve directly with vLLM across Intel Xeon, Gaudi, Arc…
1
5
18
@RedHat_AI
Red Hat AI
3 days
If you built agents with Llama Stack's original Agent APIs, you've probably seen that they are being deprecated in favor of the OpenAI-compatible Responses API. Migrating does not require starting over. There are two practical paths you can take. Approach 1 is a…
github.com
Contribute to opendatahub-io/agents development by creating an account on GitHub.
0
3
10
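For context, the target of that migration is the OpenAI-compatible Responses API, which the standard openai client can call against a Llama Stack (or vLLM) endpoint. The base URL and model ID below are placeholders, not values from the linked repo:

```python
# Hedged sketch: calling an OpenAI-compatible Responses API endpoint with the
# standard openai client, instead of the deprecated Llama Stack Agent APIs.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8321/v1/openai/v1",  # placeholder Llama Stack endpoint
    api_key="not-needed-locally",
)

response = client.responses.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # placeholder model ID
    input="List three risks to check before deploying an LLM agent.",
)
print(response.output_text)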
@TerryTangYuan
Yuan (Terry) Tang
4 days
🎉 Milestone unlocked! The InferenceOps: State of the Model Serving Communities newsletter from @RedHat_AI just reached 1,000 subscribers in only 5 months! 🚀 A huge thank-you to everyone who's…
1
2
13
@RedHat_AI
Red Hat AI
4 days
Are you running vLLM on Kubernetes and tired of guessing concurrency thresholds? This new Red Hat article walks through how to autoscale vLLM on OpenShift AI using real service metrics instead of generic request counts. KServe and KEDA work together to scale GPU model servers…
developers.redhat.com
In my previous blog, How to set up KServe autoscaling for vLLM with KEDA, we explored the foundational setup of vLLM autoscaling in Open Data Hub (ODH) using KEDA and the custom metrics autoscaler
0
7
20
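As a rough illustration of what "real service metrics" means here, this sketch reads vLLM's Prometheus endpoint and checks the queue-depth metric a KEDA scaler might act on. The threshold is a made-up example, not a recommendation from the article:

```python
# Illustrative only: inspect the vLLM service metrics an autoscaler could use.
import urllib.request

METRICS_URL = "http://localhost:8000/metrics"  # vLLM's Prometheus endpoint
SCALE_UP_THRESHOLD = 4                         # example value: queued requests per replica

def read_metric(text: str, name: str) -> float:
    """Return the first sample value for a metric name from Prometheus text output."""
    for line in text.splitlines():
        if line.startswith(name):
            return float(line.rsplit(" ", 1)[-1])
    return 0.0

body = urllib.request.urlopen(METRICS_URL).read().decode()
waiting = read_metric(body, "vllm:num_requests_waiting")
print(f"queued requests: {waiting} -> scale up: {waiting > SCALE_UP_THRESHOLD}")
```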
@RedHat_AI
Red Hat AI
4 days
Our colleague @dougbtv built a very cool integration that shows what is possible with vLLM-Omni, a new multimodal framework introduced last week. He connected vLLM-Omni to @ComfyUI, a community favorite for designing advanced diffusion workflows. The project includes a new…
1
2
7
@vllm_project
vLLM
7 days
📢 vLLM v0.12.0 is now available. For inference teams running vLLM at the center of their stack, this release refreshes the engine, extends long-context and speculative decoding capabilities, and moves us to a PyTorch 2.9.0 / CUDA 12.9 baseline for future work.
4
20
148
@cedricclyburn
Cedric Clyburn
8 days
It may be my first time at #AWSreInvent but it sure isn't for @RedHat! We're out here in Vegas running live demos all week, including my favorite Blackjack + AI game powered by @vllm_project for model inference ⚡️ and #ModelContextProtocol Agents 🤖
1
10
16
@RedHat_AI
Red Hat AI
9 days
It's @vllm_project party time at NeurIPS!
1
2
36
@michentr
Michael Hentrich
9 days
Red Hat's expanded collaboration with @awscloud is empowering IT decision-makers to run high-performance, efficient #AI inference at scale with @RedHat_AI. Check it out. #AWSreInvent
redhat.com
Red Hat today announced an expanded collaboration with Amazon Web Services (AWS) to power enterprise-grade generative AI (gen AI) on AWS with Red Hat AI and AWS AI silicon.
0
1
1
@PyTorch
PyTorch
10 days
Our latest PyTorch Foundation Spotlight features @RedHat's Joseph Groenenboom and Stephen Watt on the importance of optionality, open collaboration, and strong governance in building healthy and scalable AI ecosystems. In this Spotlight filmed during PyTorch Conference 2025,…
2
7
58
@vllm_project
vLLM
10 days
🎉 Congratulations to the Mistral team on launching the Mistral 3 family! We're proud to share that @MistralAI, @NVIDIAAIDev, @RedHat_AI, and vLLM worked closely together to deliver full Day-0 support for the entire Mistral 3 lineup. This collaboration enabled: • NVFP4…
@MistralAI
Mistral AI
10 days
Introducing the Mistral 3 family of models: Frontier intelligence at all sizes. Apache 2.0. Details in 🧵
8
42
493
@RedHat_AI
Red Hat AI
10 days
Congrats to @MistralAI on launching the Mistral 3 family under the Apache 2.0 license. We worked together to enable upstream @vllm_project support and collaborated on creating the FP8 and NVFP4 Mistral Large 3 checkpoints through llm-compressor for efficient deployment. 🚀
@MistralAI
Mistral AI
10 days
Introducing the Mistral 3 family of models: Frontier intelligence at all sizes. Apache 2.0. Details in 🧵
0
3
14
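A rough sketch of how an FP8 checkpoint like the ones mentioned above can be produced with llm-compressor's data-free FP8_DYNAMIC scheme. The small stand-in model and output path below are illustrative, not the actual Mistral Large 3 recipe:

```python
# Hedged sketch: data-free FP8 dynamic quantization with llm-compressor.
from transformers import AutoModelForCausalLM, AutoTokenizer
from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import QuantizationModifier

MODEL_ID = "mistralai/Mistral-7B-Instruct-v0.3"  # placeholder stand-in model
OUTPUT_DIR = "Mistral-7B-Instruct-FP8-dynamic"

model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# FP8_DYNAMIC needs no calibration data, so oneshot runs without a dataset.
oneshot(
    model=model,
    recipe=QuantizationModifier(targets="Linear", scheme="FP8_DYNAMIC", ignore=["lm_head"]),
)

# Save as compressed-tensors so vLLM can serve the directory directly.
model.save_pretrained(OUTPUT_DIR, save_compressed=True)
tokenizer.save_pretrained(OUTPUT_DIR)
```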
@_EldarKurtic
Eldar Kurtić
10 days
We @RedHat_AI have partnered with Mistral to make Mistral Large 3 more accessible to the open-source community. High-quality FP8 and NVFP4 models, built with our llm-compressor! Expect models that are 2–3.5x smaller with competitive accuracy across a wide range of evals.
@MistralAI
Mistral AI
10 days
Introducing the Mistral 3 family of models: Frontier intelligence at all sizes. Apache 2.0. Details in 🧵
1
1
9
@rhdevelopers
Red Hat Developer
13 days
Explore some of OpenShift #AI's capabilities for scaling LLM model servers with #KServe and #vLLM. While autoscaling has its limitations, it can be a valuable tool for an IT team trying to optimize the costs of the models they are serving. https://t.co/k57HhY2Iyp
developers.redhat.com
vLLM lets you serve nearly any LLM on a wide variety of hardware. However, that hardware can be quite expensive, and you don't want to be burning money with idle GPU resources. Instead, you can
0
9
26