ben_athi Profile Banner
Ben Athiwaratkun Profile
Ben Athiwaratkun

@ben_athi

Followers
868
Following
2K
Media
59
Statuses
375

Leading Turbo Team @ Together AI. prev: @awscloud @MSFTResearch, @Cornell PhD.

Earth, the Milky Way.
Joined July 2014
Don't wanna be here? Send us removal request.
@ben_athi
Ben Athiwaratkun
1 year
Introducing: Bifurcated Attention -- Accelerating Massively Parallel Decoding with Shared Prefixes in LLMs @ #icml 2024. TL;DR -- We can generate hundreds of samples with the same latency as one without approximation for any attention-based LLMs. Poster session happening on
Tweet media one
2
11
53
@ben_athi
Ben Athiwaratkun
13 days
RT @togethercompute: šŸ¤–OpenAI's open models are here. gpt-oss models just landed on Together AI. Achieves near-parity with o4- mini, train….
0
23
0
@ben_athi
Ben Athiwaratkun
1 month
If you’re at icml and interested in LMM efficiency research, come chat with us at Together AI Booth!.
@togethercompute
Together AI
1 month
Together AI Sets a New Bar: Fastest Inference for DeepSeek-R1-0528. We’ve upgraded the Together Inference Engine to run on @NVIDIA Blackwell GPUs—and the results speak for themselves:.šŸ“ˆ Highest known serverless throughput: 334 tokens/sec.šŸƒā€Fastest time to first answer token:
Tweet media one
1
0
4
@ben_athi
Ben Athiwaratkun
1 month
TL;DR - one way to push the quality-efficiency frontier: . obtain high quality generations via a collection of LLMs -> distill to a smaller model -> get a higher quality small model that is more inference-efficient than the original collection of models. Poster session.
@JunlinWang3
Junlin Wang
1 month
Work done during my internship at Together AI is being presented at #icml25. Come and check it out! . We propose a new model alignment pipeline that harness collective intelligence from open-source llms!
Tweet media one
0
1
4
@ben_athi
Ben Athiwaratkun
1 month
Come check out our poster on speeding up LLM, happening now til 1.30 pm. TL;DR — we show that we can hide the latency of all reduce operations in tensor parallel setting by modifying residual architecture to overlap MLP and attention.
@zhang_muru
Muru Zhang
1 month
I'm at #ICML2025, presenting Ladder-Residual ( at the first poster session tomorrow morning (7/15 11am-1:30pm), looking forward to seeing you at.West Exhibition Hall B2-B3 #W-1000!
Tweet media one
0
0
6
@ben_athi
Ben Athiwaratkun
2 months
RT @togethercompute: Announcing DeepSWE šŸ¤–: our fully open-sourced, SOTA software engineering agent trained purely with RL on top of Qwen3-3….
0
80
0
@ben_athi
Ben Athiwaratkun
2 months
RT @togethercompute: šŸ”“āš” FLUX.1 Kontext [dev] just landed on Together AI. First open-weight model w/ proprietary-level image editing:. šŸŽØ Per….
0
6
0
@ben_athi
Ben Athiwaratkun
2 months
Open Deep Research app + fully open recipe ā˜ŗļø.
@togethercompute
Together AI
2 months
Introducing the Open Deep Research app!. Generate detailed reports on any topic with open source LLMs. Free & fully open source. We’re releasing everything: evaluation dataset, code, app, and blog.šŸ”„
0
3
11
@ben_athi
Ben Athiwaratkun
2 months
RT @JonSaadFalcon: How can we close the generation-verification gap when LLMs produce correct answers but fail to select them? .🧵 Introduci….
0
63
0
@ben_athi
Ben Athiwaratkun
2 months
RT @nutlope: Our open deep research app is launching in 24 hours!. Generate reports about any topic using OSS LLMs. 100% free & open sourc….
0
37
0
@ben_athi
Ben Athiwaratkun
2 months
RT @vipulved: .@togethercompute is building 2 gigawatts of AI factories (~100,000 GPUs) in the EU over the next 4 years with the first phas….
0
18
0
@ben_athi
Ben Athiwaratkun
2 months
RT @togethercompute: šŸš€ New research: YAQA — Yet Another Quantization Algorithm (yes, pronounced like yaca/jackfruit 🄭). Led by @tsengalb99,….
0
5
0
@ben_athi
Ben Athiwaratkun
3 months
RT @togethercompute: šŸ”” New blog post on how we can attain large speedups for our inference customers using custom speculators! šŸš€. Key benef….
0
5
0
@ben_athi
Ben Athiwaratkun
4 months
If you're at ICLR and passionate about optimizing language models for speed and efficiency, swing by the Together AI booth for a chat.
Tweet media one
1
31
70
@ben_athi
Ben Athiwaratkun
4 months
RT @LindaHe49140661: Excited to share our work on scaling LLMs to handle million-token contexts! Training models for ultra-long sequences i….
0
45
0
@ben_athi
Ben Athiwaratkun
4 months
RT @JunlinWang3: Excited to share work from my @togethercompute internship—a deep dive into inference‑time scaling methods 🧠. We rigorously….
0
54
0
@ben_athi
Ben Athiwaratkun
4 months
Deep Research with open source models (+ open recipe).
@togethercompute
Together AI
4 months
Introducing Open Deep Research!. A fully open-source Deep Research tool that:.• writes comprehensive reports.• does multi-hop search and reasoning.• generates cover images & pod-casts!. We’re releasing everything: evaluation dataset, code and blog.šŸ”„. Example output reportšŸ‘‡
0
1
6
@ben_athi
Ben Athiwaratkun
4 months
RT @AlpayAriyak: Excited to present our project in collaboration with Agentica: .14B LLM trained with Code RL that reaches OpenAI's o3-mini….
0
28
0
@ben_athi
Ben Athiwaratkun
4 months
RT @togethercompute: Announcing DeepCoder-14B – an o1 & o3-mini level coding reasoning model fully open-sourced!. We’re releasing everythin….
0
347
0