
Together AI
@togethercompute
Followers: 51K · Following: 2K · Media: 626 · Statuses: 2K
AI pioneers train, fine-tune, and run frontier models on our GPU cloud platform.
San Francisco, CA
Joined November 2022
What if your LLM inference automatically got faster the more you used it? Introducing ATLAS from the Together AI Turbo research team. Read more: https://t.co/ASRNUpqoAE Here’s Together AI Founder and Chief Scientist @tri_dao introducing ATLAS:
The accelerator offers a host of benefits, including:
📈 GTM support
💳 Platform credits
🧠 Engineering expertise
👥 Peer community access
If you’re ready to take your AI startup to the next level, learn why our accelerator is the perfect resource for you.
🆕Building and scaling your apps just got a lot easier with the Together AI Startup Accelerator!
New post: How to build an AI voice app from scratch! I go over how I built UseWhisper, an OSS AI transcription app, with Next.js + AI SDK + Together AI:
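(For readers who want the gist of the transcription step without opening the post, here is a minimal TypeScript sketch. It assumes Together AI's OpenAI-compatible API surface, an /audio/transcriptions route, and a hosted Whisper-family model ID; the endpoint, model name, and file handling are assumptions for illustration, not details taken from the post.)

```typescript
// Minimal transcription sketch (assumptions: Together exposes an OpenAI-compatible
// /audio/transcriptions endpoint and hosts a Whisper-family model under the ID below).
import fs from "node:fs";
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.TOGETHER_API_KEY,      // Together AI API key
  baseURL: "https://api.together.xyz/v1",    // OpenAI-compatible base URL
});

async function transcribe(path: string): Promise<string> {
  // Upload the recorded audio and get plain text back.
  const result = await client.audio.transcriptions.create({
    file: fs.createReadStream(path),
    model: "openai/whisper-large-v3",        // hypothetical model ID on Together
  });
  return result.text;
}

transcribe("recording.wav").then(console.log).catch(console.error);
```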
Straight from our Turbo research team, ATLAS offers a fundamental reassessment of the way inference platforms are designed to work and evolve. Details: https://t.co/ASRNUppQL6
The results speak for themselves:
⚪ 500 TPS on DeepSeek-V3.1
⚪ Up to 4x faster inference vs. baseline
⚪ Up to nearly 2x faster than our Turbo speculator
Imagine your LLM inference automatically getting faster in production (by up to 400%!) 🆕 Enter ATLAS: a not-so-traditional speculator that adapts to your workload as it evolves. The more you use it, the better it performs.
Impressive multimodal thinking model @ 15B parameters from the @ServiceNow AI team! Try it on @togethercompute
We're excited to host Apriel-1.5-15b-Thinker by @ServiceNow's SLAM labs on Together AI! 👉15B parameters, fits on single GPU 👉On par with Deepseek-R1-0528 and Mistral-Medium-1.2 on the Artificial Analysis Intelligence Index Built by @SathwikTejaswi @ServiceNowRSRCH
ICYMI: @NeurIPSConf Paper Acceptance for #NeurIPS2025: Exploring Diffusion Transformer Designs via Grafting. Details: ⬇️
Grafting is accepted to #NeurIPS2025 as an Oral! New methods for converting a trained diffusion transformer into a new architecture (like Hyena, SSM, etc). Really top-notch work by @keshigeyan on this one. Check out his post below for demos, analysis, and models!
Ready to try frontier-level reasoning on a single GPU? ✨ Apriel-1.5-15b-Thinker is available for developers to try now on Together AI's highly scalable, reliable, and cost-efficient platform:
together.ai
15B multimodal reasoning model, 131K context, scores 52 on AA Intelligence Index, competitive with models 10x larger, text-SFT only
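(If you want to poke at the model programmatically, a minimal TypeScript call against Together's OpenAI-compatible chat completions endpoint could look like the sketch below. The model ID string is a guess at how Apriel-1.5-15b-Thinker would be listed; check the model page for the actual name.)

```typescript
// Minimal chat-completion sketch against Together AI's OpenAI-compatible API.
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.TOGETHER_API_KEY,
  baseURL: "https://api.together.xyz/v1",
});

async function main() {
  const completion = await client.chat.completions.create({
    model: "ServiceNow-AI/Apriel-1.5-15b-Thinker", // hypothetical listing name; verify on the model page
    messages: [
      { role: "user", content: "How many positive divisors does 2024 have? Show your reasoning." },
    ],
    max_tokens: 1024,
  });
  console.log(completion.choices[0].message.content);
}

main().catch(console.error);
```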
The performance speaks for itself:
📊 Matches models 10x larger on reasoning benchmarks
🧮 87% accuracy on the AIME'25 math competition
⚡ 131K context window
👁️ Text + image understanding
Built with innovative mid-training techniques, no RL required.
We're excited to host Apriel-1.5-15b-Thinker by @ServiceNow's SLAM labs on Together AI! 👉15B parameters, fits on single GPU 👉On par with Deepseek-R1-0528 and Mistral-Medium-1.2 on the Artificial Analysis Intelligence Index Built by @SathwikTejaswi @ServiceNowRSRCH
#SFTechWeek brought the energy! Our AI Builders & Innovators Brunch with @RunwareAI and @Kling_ai sparked some incredible discussions around:
🏗️ Building on Together, the AI-native cloud for multimodal workloads
👾 How our customers like Cursor push AI forward
🧠 Training,
Adaptive speculator that results "in a more than 60% reduction for overall RL training time without changing RL training algorithm." (Also great for regular inference, but this is very neat for RL.)
What if your LLM inference automatically got faster the more you used it? Introducing ATLAS from the Together AI Turbo research team. Read more: https://t.co/ASRNUpqoAE Here’s Together AI Founder and Chief Scientist @tri_dao introducing ATLAS:
1/ Everyone wants BIGGER AI models. Lux family co @togethercompute just made existing models 4x FASTER 🚄 Their trick? A system called ATLAS that learns from your actual usage patterns + gets better over time, like GPS that learns your daily route vs one that never updates.
.@SemiAnalysis_' new InferenceMAX benchmark confirms @nvidia Blackwell platform’s unmatched inference performance and efficiency. Together AI is proud to offer NVIDIA GB200 NVL72 and HGX B200 systems for both inference and training, enabling customers to scale production
📣 NVIDIA Blackwell sets the standard for AI inference on SemiAnalysis InferenceMAX. Our most recent results on the independent benchmarks show NVIDIA’s Blackwell Platform leads AI factory ROI. See how NVIDIA Blackwell GB200 NVL72 can yield $75 million in token revenue over
ATLAS delivers 400% faster LLM inference by learning from your workloads in real time ⚡ From today's coverage on @VentureBeat: "The shift from static to adaptive optimization represents a fundamental rethinking of how inference platforms should work." We couldn't agree more!
The results:
⚪ 4x faster vs. baseline
⚪ 500 TPS on DeepSeek-V3.1
⚪ Outperforms specialized hardware
The more you use it, the better it performs.
Unlike static speculators, ATLAS (AdapTive-LeArning Speculator System) learns from your live traffic in real time, automatically adapting as your workload evolves.
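(To make the idea concrete, here is a conceptual TypeScript sketch of speculative decoding with a draft model that is periodically re-fit on the traffic it just served. The interfaces and function names are hypothetical stand-ins to illustrate the adaptive-speculator pattern, not ATLAS internals.)

```typescript
// Conceptual sketch of an adaptive speculator: a small draft model proposes k tokens,
// the large target model verifies them in one pass, and the draft model is periodically
// re-fit on accepted traffic so its acceptance rate (and overall speed) improves over time.
// All names here are hypothetical stand-ins, not ATLAS internals.

type Token = number;

interface DraftModel {
  propose(context: Token[], k: number): Token[];     // cheap guess of the next k tokens
  update(acceptedTraces: Token[][]): void;           // adapt to the recent workload
}

interface TargetModel {
  verify(context: Token[], draft: Token[]): Token[]; // longest accepted prefix + one corrected token
}

function generate(
  target: TargetModel,
  draft: DraftModel,
  prompt: Token[],
  maxTokens: number,
  k = 4,
  refitEvery = 1000,
): Token[] {
  const output = [...prompt];
  const recentTraces: Token[][] = [];

  while (output.length - prompt.length < maxTokens) {
    const proposal = draft.propose(output, k);        // draft k tokens cheaply
    const accepted = target.verify(output, proposal); // one target pass checks them all
    output.push(...accepted);
    recentTraces.push(accepted);

    // The adaptive part: periodically re-fit the draft model on live traffic,
    // so the more you use it, the more tokens get accepted per target pass.
    if (recentTraces.length >= refitEvery) {
      draft.update(recentTraces);
      recentTraces.length = 0;
    }
  }
  return output;
}
```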