Ce Zhang
@ce_zhang
Followers
3K
Following
2K
Media
134
Statuses
729
CTO @ Together @togethercompute Neubauer Associate Professor @UChicago
San Francisco
Joined September 2016
A 7B model beyond Transformer architecture that matches, and sometimes outperforms, the strongest 7B Transformer! Thanks @Hessian_AI & @Teknium1 @theemozilla @NousResearch for the collaboration. Play with it here https://t.co/nTU6ZX3Pfn and give us feedback!
Announcing StripedHyena 7B: an open source model using an architecture that goes beyond Transformers, achieving faster performance and longer context. It builds on the lessons learned in the past year designing efficient sequence modeling architectures. https://t.co/UGLnfz0Dma
0
4
32
We're excited to host Apriel-1.5-15b-Thinker by @ServiceNow's SLAM labs on Together AI! 15B parameters, fits on a single GPU. On par with Deepseek-R1-0528 and Mistral-Medium-1.2 on the Artificial Analysis Intelligence Index. Built by @SathwikTejaswi @ServiceNowRSRCH
1
1
8
Breaking: @VFSGlobal x Together AI announce strategic partnership. We're partnering with VFS Global to scale secure, responsible, and high-performance AI solutions for global mobility. Millions of visa applications. 160+ countries. One mission: faster, more transparent, and
3
2
11
The Washington Post processes 1.79 billion tokens every month powering "Ask The Post AI". They needed reliable inference without vendor lock-in. Fixed costs. Full model ownership. Together AI's Dedicated Endpoints delivered.
1
1
8
Announcing ReceiptHero, an app to help people track their finances! It'll take in any receipts you have, extract the total $, and categorize it for you (dining, groceries, utilities, etc.). 100% free & open source. Powered by Llama 4 on @togethercompute.
16
32
538
I'm building a realtime video analysis app! It takes a screenshot every 500ms, sends it to Llama 4 on @togethercompute, and streams back the results. I want to extend it to be able to perform actions too (record my screen & send me a text when a video finishes, for example).
39
33
438
Together Instant Clusters, offering ready-to-use, self-service NVIDIA GPUs, are now Generally Available
2
4
24
Building AI agents for complex engineering tasks ≠ building chatbots. Most AI agents today excel at short, simple tasks. But automating multi-day engineering workflows? That's a whole different game. At Together AI, we learned this the hard way while optimizing LLM
4
20
68
Therapeutic AI isn't just "helpful" AI. @slingshotai_inc built a psychology foundation model that knows when to push back, stay silent, or offer new perspectives. And now - 50,000+ people are getting specialized mental health support.
2
4
27
OpenAI's open models are here. gpt-oss models just landed on Together AI. Achieves near-parity with o4-mini, trained using o3 techniques. Build anything, deploy anywhere
13
24
112
A small update - we had more traffic than anticipated. However, the endpoints are now scalable on Together AI for all models, including the 671B MoE. Test out the model here: https://t.co/Od1NXYVBxU (A huge thanks to the folks at @togethercompute for making this happen so
together.ai
671B mixture-of-experts model matching Deepseek R1 performance, 60% shorter reasoning chains, approaching o3 and Claude 4 capabilities
Today, we are releasing 4 hybrid reasoning models of sizes 70B, 109B MoE, 405B, 671B MoE under open license. These are some of the strongest LLMs in the world, and serve as a proof of concept for a novel AI paradigm - iterative self-improvement (AI systems improving themselves).
4
14
82
VirtueGuard is LIVE on Together AI. AI security and safety model that screens input and output for harmful content: Under 10ms response. 89% accuracy vs 76% (AWS Bedrock). Context-aware - adapts to your policies, not just keywords
4
5
24
We built an open source voice note taking app using our fast Whisper implementation! Check it out -> usewhisper.io https://t.co/t2mMWs4LqS
6
4
59
We now have the fastest speeds for DeepSeek R1 - up to 330 tokens/sec running on B200s! Here it is in action - video is not sped up!
Together AI Sets a New Bar: Fastest Inference for DeepSeek-R1-0528. We've upgraded the Together Inference Engine to run on @NVIDIA Blackwell GPUs, and the results speak for themselves: Highest known serverless throughput: 334 tokens/sec. Fastest time to first answer token:
8
8
77
Together AI Sets a New Bar: Fastest Inference for DeepSeek-R1-0528. We've upgraded the Together Inference Engine to run on @NVIDIA Blackwell GPUs, and the results speak for themselves: Highest known serverless throughput: 334 tokens/sec. Fastest time to first answer token:
7
14
106
Kimi K2 is now available on https://t.co/NO0aADGvEz for free!
MAJOR DROP: Kimi K2 just landed on Together AI. An open-source 1T parameter model that beats proprietary LLMs in creativity, coding, and tool use while delivering 60-70% cost savings. Built for agents. Priced for scale.
1
9
46
We just launched a new "dictate" feature on Together Chat powered by our new Whisper model! The video is not sped up - it's really that fast!
4
3
31
We just launched speech-to-text APIs designed for real-time applications. Our Whisper V3 Large deployment delivers transcription 15x faster than OpenAI while maintaining full accuracy. Sub-second processing that actually keeps up with conversation speed
4
4
52
Announcing DeepSWE: our fully open-sourced, SOTA software engineering agent trained purely with RL on top of Qwen3-32B. DeepSWE achieves 59% on SWEBench-Verified with test-time scaling (and 42.2% Pass@1), topping the SWEBench leaderboard for open-weight models. Built in
8
81
496
Together AI's first GB200 cluster built by Dell!
3
10
107