Explore tweets tagged as #InferenceOptimization
🔗 MIT paper: https://t.co/UlrZ0ELMbD 📄 AutoThink paper: https://t.co/f3pw793NhE 💻 AutoThink in OptiLLM: https://t.co/uqvhpkW742 💻 PTS repo: https://t.co/laAmsLNy0Q
#AI #LLM #InferenceOptimization #machinelearningdevelopers
1/3 Learn in this blog article about key techniques like pruning, model quantization, and hardware acceleration that enhance efficiency. #MultimodalAI #LLMs #InferenceOptimization #AnkursNewsletter
🚀 Evaluating LLM inference speedups is crucial! 🧠📈 #tftotpoot #MachineLearning #LLMs #InferenceOptimization
Nvidia’s ongoing fight to maintain technological and market dominance in AI inference, challenged by @Google, @Huawei and @AMD
https://t.co/RBauzXoZIt
#NvidiaMonopoly #AITech #CustomSilicon #AIChips #InferenceOptimization #GPUCompetition #CUDAAlternative #TSMC #GraceBlackwell
From what I’ve played with, DeepConf feels like a leap in the usefulness of smaller, locally hosted LLMs. https://t.co/2uUJ2Eaxjq
#AI #MachineLearning #vLLM #InferenceOptimization #DeepConf
Our latest case study demonstrates how Perfalign empowers developers to boost AI inference performance on ARM-based platforms. Learn more: https://t.co/2As2UXpBYk
#AI #ARM #MachineLearning #DeepLearning #InferenceOptimization #Perfalign #MulticoreWare
Stanford Researchers Explore Inference Compute Scaling in Language Models: Achieving Enhanced Performance and Cost Efficiency through Repeated Sampling https://t.co/ROjoQDn9Ny
#AIAvancements #InferenceOptimization #RepeatedSampling #AIApplications #EvolveWithAI #ai #news #llm…
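The repeated-sampling idea from the Stanford work above can be illustrated as a simple majority vote over independent completions. This is a minimal sketch, not the paper's code: `generate` is a hypothetical stand-in for a real LLM call (one request to an inference endpoint at temperature > 0).

```python
from collections import Counter

def majority_vote(answers):
    """Return the most common final answer among sampled completions."""
    return Counter(answers).most_common(1)[0][0]

def repeated_sampling(prompt, generate, n=16):
    """Draw n independent completions and majority-vote over them.

    `generate` is a hypothetical callable standing in for an LLM call;
    more samples trade extra inference compute for higher accuracy.
    """
    return majority_vote([generate(prompt) for _ in range(n)])
```

In practice the vote is taken over extracted final answers (e.g. the boxed number in a math solution), not over raw generations.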
3/3 Explore our comprehensive guide on inference optimization strategies for LLMs here: https://t.co/zvxtn8fz2g 🔁 Share this thread with your audience by retweeting this tweet #MultimodalAI #LLMs #InferenceOptimization #AnkursNewsletter
MagicDec: Unlocking Up to 2x Speedup in LLaMA Models for Long-Context Applications https://t.co/o0yoLDQmM0
#LLaMA #MagicDec #AI #LanguageModels #InferenceOptimization #ai #news #llm #ml #research #ainews #innovation #artificialintelligence #machinelearning #technology #deeple…
Optimizing Large Language Models (LLMs) on CPUs: Techniques for Enhanced Inference and Efficiency https://t.co/vzg4LLidoz
#LLM #AI #CPUs #InferenceOptimization #BusinessTransformation #ai #news #llm #ml #research #ainews #innovation #artificialintelligence #machinelearning #t…
What we’re building 🏗️, shipping 🚢 and sharing 🚀 tomorrow: Inference Optimization with GPTQ Learn how GPTQ’s “one-shot weight quantization” compares to other leading techniques like AWQ Start optimizing: https://t.co/PqpbDHSP0P
#LLMs #GPTQ #InferenceOptimization
🧠💡 Meet OpenAI O3—a game‑changing multi‑task model that turns incremental gains into a whole new development platform. Already testing it? Tell us where O3 beats its predecessors. #AI #OpenAI #LLMs #O3 #AIdevelopment #SoftwareEngineering #InferenceOptimization
Talk: From Human Agents to GPU-Powered GenAI – A Data-Driven Transformation in Customer Service 🔗 https://t.co/Eeb2kw5vWF Register Now: https://t.co/pWarIbxJRk
#GenAIinSupport #CustomerServiceAI #InferenceOptimization #EnterpriseAI #AppliedAISummit
Supercharge your AI with lightning-fast inference! 🚀 Post-training quantization techniques like AWQ and GPTQ trim down your models without sacrificing smarts—boosting speed and slashing compute costs. Ready to optimize your LLMs for real-world performance? #InferenceOptimization
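The post-training quantization mentioned above can be illustrated with the simplest baseline: per-channel round-to-nearest int8. This is not AWQ or GPTQ themselves — AWQ additionally rescales salient channels using activation statistics, and GPTQ minimizes layer-wise error with second-order information — but it shows the quantize/dequantize round trip those methods refine.

```python
import numpy as np

def quantize_int8(w):
    """Per-output-channel symmetric round-to-nearest int8 quantization.

    Each row's largest absolute weight maps to 127; everything else is
    rounded to the nearest step of that row's scale.
    """
    scale = np.abs(w).max(axis=1, keepdims=True) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values and scales."""
    return q.astype(np.float32) * scale
```

Storing int8 weights plus one float scale per channel cuts weight memory roughly 4x versus float32, which is where the speed and cost wins come from.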
Check out how TensorRT-LLM Speculative Decoding can boost inference throughput by up to 3.6x! #TensorRT #InferenceOptimization #AI #NVIDIA #LLM #DeepLearning 🚀🔥 https://t.co/h1VooOoIGB
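Speculative decoding itself (independent of TensorRT-LLM's implementation) can be sketched with two toy next-token functions: a cheap draft model proposes a few tokens, and the target model verifies them, keeping the longest agreeing prefix. In a real system the verification is a single batched target-model pass — that is where the throughput gain comes from; this greedy sketch only shows that the output always matches what the target alone would produce.

```python
def speculative_decode(target_next, draft_next, prompt, k=4, max_new=8):
    """Greedy speculative decoding sketch.

    `draft_next` and `target_next` are toy callables mapping a token
    sequence to the next token. The draft proposes k tokens; the target
    checks them (conceptually in one batched pass) and accepts the
    longest agreeing prefix, emitting its own token at the mismatch.
    """
    out = list(prompt)
    while len(out) - len(prompt) < max_new:
        # 1) Cheap draft proposes k tokens autoregressively.
        ctx, proposal = list(out), []
        for _ in range(k):
            t = draft_next(ctx)
            proposal.append(t)
            ctx.append(t)
        # 2) Target verifies each position; accept until first mismatch.
        for t in proposal:
            if len(out) - len(prompt) >= max_new:
                break
            correct = target_next(out)
            out.append(correct)
            if correct != t:
                break  # reject the rest of the proposal
    return out[len(prompt):]
```

With greedy decoding the result is identical whether the draft is perfect or useless; a good draft just lets the target accept several tokens per verification pass.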
Running LLMs with TensorRT-LLM on the Nvidia Jetson AGX Orin - https://t.co/sNhSr0LmW5
#TensorRTLLM #NvidiaJetsonAGXOrin #LargeLanguageModels #InferenceOptimization
https://t.co/ldDDSXGDqu
L40S GPUs optimize Llama 3 7B inference at $0.00037/request. Achieve extreme throughput for small LLMs. Benchmark your model. https://t.co/MVw9nnlllg
#LLM #Llama3 #InferenceOptimization #CostPerRequest
AI’s “Sputnik Moment”: How Will DeepSeek Shake Up the Industry Giants? | TechFusionlabs on Binance Square #DeepSeek #AItraining #hardwaredemand #inferenceoptimization
https://t.co/ocbOZrInQB
Scaling Rufus, Amazon’s generative-AI-powered conversational shopping assistant, with over 80,000 AWS Inferentia and AWS Trainium chips for Prime Day | AWS Machine Learning Blog #AmazonRufus #GenerativeAI #AWSChips #InferenceOptimization
https://t.co/7CGW3NoKpc