Explore tweets tagged as #InferenceOptimization
@exalabs_
Exa Labs
2 months
Exciting advances in inference-time techniques like Entropix are reshaping how LLMs spend compute at inference. As we embrace uncertainty-aware sampling and hybrid engines for optimization, the next step in model capability is closer than ever. Let's redefine AI together. #AI #LLM #InferenceOptimization
@aapatel09
Ankur A. Patel
2 years
1/3 In this blog article, learn about key techniques like pruning, model quantization, and hardware acceleration that are making LLM inference more efficient. #MultimodalAI #LLMs #InferenceOptimization #AnkursNewsletter
@zeroxaitales
Solysian ZeroX AI MediaTales
23 days
Inference is where great AI products either scale or burn out. 2025's best AI infra teams aren't just using better models. They're running smarter pipelines. This thread: how quantization, batching & caching supercharge LLM apps. #LLMOps #InferenceOptimization #AIInfra
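To make the batching-and-caching point concrete, here is a minimal serving-side sketch: identical prompts are answered from an exact-match cache, and cache misses are grouped into fixed-size batches before hitting the model. The `generate_batch` backend is a hypothetical placeholder, not any specific framework's API.

```python
# Hypothetical stand-in for a real inference backend (vLLM, TensorRT-LLM, ...)
# that runs one forward pass over a whole batch of prompts.
def generate_batch(prompts):
    return [f"<completion for: {p[:30]}>" for p in prompts]

_cache = {}  # exact-match prompt -> completion cache

def serve(requests, max_batch_size=8):
    """Answer cache hits immediately; push misses through the model in batches."""
    results = [None] * len(requests)
    pending = []  # (original index, prompt) pairs that missed the cache

    for i, prompt in enumerate(requests):
        if prompt in _cache:
            results[i] = _cache[prompt]
        else:
            pending.append((i, prompt))

    # Batch the misses so the backend amortizes weight loading and scheduling
    # overhead across requests instead of paying it once per prompt.
    for start in range(0, len(pending), max_batch_size):
        chunk = pending[start:start + max_batch_size]
        for (i, prompt), completion in zip(chunk, generate_batch([p for _, p in chunk])):
            _cache[prompt] = completion
            results[i] = completion
    return results

print(serve(["summarize: ...", "summarize: ...", "translate: ..."]))
```

A production cache would key on normalized prompts (or semantic embeddings) and evict entries, but the split between "hit the cache" and "batch the rest" is the core of the pattern.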
@thysel55307
Krishanu
4 months
Supercharge your AI with lightning-fast inference! 🚀 Post-training quantization techniques like AWQ and GPTQ trim down your models without sacrificing smarts—boosting speed and slashing compute costs. Ready to optimize your LLMs for real-world performance? #InferenceOptimization
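AWQ and GPTQ both replace full-precision weights with low-bit integers plus per-channel scales; the sketch below shows only the naive round-to-nearest baseline in PyTorch, without the activation-aware scaling (AWQ) or error-compensating updates (GPTQ) those methods add on top.

```python
import torch

def quantize_per_channel_int8(weight):
    """Naive symmetric int8 post-training quantization of a weight matrix.

    weight: (out_features, in_features) float tensor.
    Returns int8 weights plus one scale per output channel for dequantization.
    """
    # Pick each channel's scale so its largest magnitude maps to 127.
    max_abs = weight.abs().amax(dim=1, keepdim=True).clamp(min=1e-8)
    scale = max_abs / 127.0
    q = torch.clamp(torch.round(weight / scale), -127, 127).to(torch.int8)
    return q, scale

def dequantize(q, scale):
    return q.to(torch.float32) * scale

# Example: quantize one linear layer's weight and check the reconstruction error.
w = torch.randn(4096, 4096)
q, scale = quantize_per_channel_int8(w)
print(f"mean abs quantization error: {(dequantize(q, scale) - w).abs().mean():.5f}")
```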
@vlruso
Vlad Ruso PhD
10 months
Stanford Researchers Explore Inference Compute Scaling in Language Models: Achieving Enhanced Performance and Cost Efficiency through Repeated Sampling. #AIAdvancements #InferenceOptimization #RepeatedSampling #AIApplications #EvolveWithAI #ai #news #llm
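The repeated-sampling recipe is easy to sketch: draw several candidates at nonzero temperature and keep the one a majority vote or a verifier prefers. `sample_answer` and the verifier below are hypothetical placeholders for a model call and a task-specific checker.

```python
import random
from collections import Counter

def sample_answer(prompt, temperature=0.8):
    # Hypothetical stand-in for one stochastic call to the model.
    return random.choice(["42", "42", "41", "42"])

def self_consistency(prompt, n=16):
    """Draw n samples and return the majority answer."""
    answers = [sample_answer(prompt) for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]

def best_of_n(prompt, is_correct, n=16):
    """Return the first sample the verifier accepts; coverage grows with n."""
    for _ in range(n):
        candidate = sample_answer(prompt)
        if is_correct(candidate):
            return candidate
    return None

print(self_consistency("What is 6 * 7?"))                           # majority vote
print(best_of_n("What is 6 * 7?", is_correct=lambda a: a == "42"))  # verifier-based
```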
@vlruso
Vlad Ruso PhD
1 year
Optimizing Large Language Models (LLMs) on CPUs: Techniques for Enhanced Inference and Efficiency. #LLM #AI #CPUs #InferenceOptimization #BusinessTransformation #ai #news #llm #ml #research #ainews #innovation #artificialintelligence #machinelearning #t
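One common CPU-side recipe is dynamic int8 quantization of the linear layers plus capping the intra-op thread pool. A minimal PyTorch sketch, using a small stand-in module rather than a full LLM:

```python
import torch
import torch.nn as nn

# Pin the intra-op thread pool to the physical core count (8 here is just an example).
torch.set_num_threads(8)

# Stand-in for an LLM's projection-heavy layers; in practice you would load the
# real model and quantize its nn.Linear modules the same way.
model = nn.Sequential(
    nn.Linear(4096, 11008), nn.GELU(),
    nn.Linear(11008, 4096),
).eval()

# Dynamic int8 quantization: weights are stored as int8, activations are
# quantized on the fly, and the matmuls run through fast int8 CPU kernels.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

with torch.inference_mode():
    x = torch.randn(1, 4096)     # one token's hidden state
    print(quantized(x).shape)    # torch.Size([1, 4096])
```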
@vlruso
Vlad Ruso PhD
11 months
MagicDec: Unlocking Up to 2x Speedup in LLaMA Models for Long-Context Applications. #LLaMA #MagicDec #AI #LanguageModels #InferenceOptimization #ai #news #llm #ml #research #ainews #innovation #artificialintelligence #machinelearning #technology #deeple
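MagicDec builds on speculative decoding: a small draft model proposes a few tokens, the large target model checks them all in one forward pass, and only the agreed-upon prefix is kept. The greedy-verification sketch below uses hypothetical `draft_next_token` / `target_greedy_tokens` callables and is not MagicDec's actual implementation.

```python
def speculative_decode(prompt_tokens, draft_next_token, target_greedy_tokens,
                       k=4, max_new_tokens=64):
    """Greedy speculative decoding sketch (hypothetical model interfaces).

    draft_next_token(tokens)            -> next token from the small draft model.
    target_greedy_tokens(tokens, draft) -> the large model's greedy token at each
        position while reading `draft` after `tokens`, plus one more after the
        final draft token (len(draft) + 1 tokens, from a single forward pass).
    """
    tokens = list(prompt_tokens)
    while len(tokens) - len(prompt_tokens) < max_new_tokens:
        # 1) Draft k tokens cheaply with the small model.
        draft, ctx = [], list(tokens)
        for _ in range(k):
            t = draft_next_token(ctx)
            draft.append(t)
            ctx.append(t)

        # 2) Verify the whole draft with one pass of the large model.
        target = target_greedy_tokens(tokens, draft)

        # 3) Keep the agreeing prefix, then take the target's own token at the
        #    first disagreement (or its bonus token if everything matched).
        accepted = 0
        while accepted < k and draft[accepted] == target[accepted]:
            accepted += 1
        tokens.extend(draft[:accepted] + [target[accepted]])
    return tokens
```

Because mismatches fall back to the target's own choice, the output matches plain greedy decoding from the large model; the speedup comes from verifying several drafted tokens per large-model pass.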
@AIMakerspace
AI Makerspace
9 months
What we’re building 🏗️, shipping 🚢 and sharing 🚀 tomorrow: Inference Optimization with GPTQ. Learn how GPTQ’s “one-shot weight quantization” compares to other leading techniques like AWQ. Start optimizing: #LLMs #GPTQ #InferenceOptimization
@zeroxaitales
Solysian ZeroX AI MediaTales
24 days
Model Layer = Swappable Core. GPT-4, Claude, Gemini, Mixtral: pick your poison. Top teams don't pick one. They route by:
→ Task
→ Latency
→ Cost
→ Accuracy
Models are pipes. Routing is strategy. #LLMs #ModelOps #InferenceOptimization
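A minimal sketch of that routing idea: pick the cheapest model that clears the task's quality bar and fits the latency and cost budgets. The catalog entries and numbers below are illustrative placeholders, not measured figures for any real model.

```python
from dataclasses import dataclass

@dataclass
class ModelSpec:
    name: str
    quality: int          # rough capability tier (higher is better)
    est_latency_ms: int   # illustrative placeholder numbers
    cost_per_1k_tokens: float

# Placeholder catalog; a real deployment would fill this from its own benchmarks.
CATALOG = [
    ModelSpec("small-fast-model", quality=1, est_latency_ms=150,  cost_per_1k_tokens=0.0002),
    ModelSpec("mid-tier-model",   quality=2, est_latency_ms=600,  cost_per_1k_tokens=0.002),
    ModelSpec("frontier-model",   quality=3, est_latency_ms=2000, cost_per_1k_tokens=0.02),
]

# Minimum capability tier needed per task type (assumed policy).
TASK_MIN_QUALITY = {"autocomplete": 1, "summarize": 2, "reasoning": 3}

def route(task, latency_budget_ms, cost_budget):
    """Pick the cheapest model that meets the task's quality bar and both budgets."""
    candidates = [
        m for m in CATALOG
        if m.quality >= TASK_MIN_QUALITY.get(task, 2)
        and m.est_latency_ms <= latency_budget_ms
        and m.cost_per_1k_tokens <= cost_budget
    ]
    if not candidates:
        raise ValueError(f"no model satisfies the constraints for task={task!r}")
    return min(candidates, key=lambda m: m.cost_per_1k_tokens)

print(route("summarize", latency_budget_ms=1000, cost_budget=0.01).name)  # mid-tier-model
```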
@MarkoumEnglish
Markoum
3 months
🧠💡 Meet OpenAI O3—a game‑changing multi‑task model that turns incremental gains into a whole new development platform. Already testing it? Tell us where O3 beats its predecessors. #AI #OpenAI #LLMs #O3 #AIdevelopment #SoftwareEngineering #InferenceOptimization
@zeroxaitales
Solysian ZeroX AI MediaTales
27 days
Capital is also chasing compute arbitrage. Startups using:
– Smart model quantization
– Faster inference on CPUs
– Sovereign training infra
Own the stack, own the scale. #computeedge #inferenceoptimization #quantization
@genainewstop
GenAINews.co
7 months
Check out how TensorRT-LLM Speculative Decoding can boost inference throughput by up to 3.6x! #TensorRT #InferenceOptimization #AI #NVIDIA #LLM #DeepLearning 🚀🔥.
@aapatel09
Ankur A. Patel
2 years
3/3 Explore our comprehensive guide on inference optimization strategies for LLMs here: . 🔁 Spread this thread with your audience by retweeting this tweet. #MultimodalAI #LLMs #InferenceOptimization #AnkursNewsletter
@managetech_inc
Managetech inc.
11 months
DeepMind and UC Berkeley show how to get the most out of LLM inference-time compute | VentureBeat. #AIcoverage #InferenceOptimization #TestTimeCompute #LLMperformance
@managetech_inc
Managetech inc.
9 months
Scaling Rufus, Amazon's generative-AI-powered conversational shopping assistant, on more than 80,000 AWS Inferentia and AWS Trainium chips for Prime Day | AWS Machine Learning Blog. #AmazonRufus #GenerativeAI #AWSChips #InferenceOptimization
@MulticoreWare
MulticoreWare
4 months
Our latest case study demonstrates how Perfalign empowers developers to boost AI inference performance on ARM-based platforms. Know more: #AI #ARM #MachineLearning #DeepLearning #InferenceOptimization #Perfalign #MulticoreWare
@managetech_inc
Managetech inc.
8 months
Running LLMs with TensorRT-LLM on the Nvidia Jetson AGX Orin - #TensorRTLLM #NvidiaJetsonAGXOrin #LargeLanguageModels #InferenceOptimization
@managetech_inc
Managetech inc.
5 months
AI's "Sputnik moment": How will DeepSeek shake up the industry giants? | TechFusionlabs on Binance Square. #DeepSeek #AItraining #hardwaredemand #inferenceoptimization
@NektonAI
Nekton AI
2 years
Excited to read the Large Transformer Model Inference Optimization post by @lilianweng! This article provides valuable insights on making Transformer inference more efficient. Don't miss it! 👉🔍 #InferenceOptimization. Check out the article here: