amiruci Profile Banner
Amir Haghighat Profile
Amir Haghighat

@amiruci

Followers
2K
Following
2K
Media
80
Statuses
660

Co-founder @basetenco

San Francisco, CA
Joined May 2009
@amiruci
Amir Haghighat
2 months
We closed our series D at $2.1b. It happened 8 months after our series C, which seems too fast until you consider the facts: 2 years' worth of growth in 8 months, virtually 0 customer churn, healthy margins, and QoQ NDR numbers that are considered top-tier YoY. The market demand
31
20
319
@amiruci
Amir Haghighat
5 days
Amazing work by @drfeifei and the @worldlabs team to push the boundaries of multimodal AI.
@theworldlabs
World Labs
5 days
Introducing Marble by World Labs: a foundation for a spatially intelligent future. Create your world at https://t.co/V267VJu1H9
3
0
18
@amiruci
Amir Haghighat
6 days
with the lowest time-to-first-token:
0
0
15
@amiruci
Amir Haghighat
6 days
A few days ago Kimi K2 Thinking significantly narrowed the capability gap between open and closed LLMs. Today Baseten is the only provider to deliver over 100 tok/sec on this massive 1T-parameter model.
13
46
636
@cline
Cline
7 days
The fastest provider for kimi-k2-thinking <now in Cline>
@basetenco
Baseten
7 days
It’s Monday, and we could all use a little help thinking. Thankfully we have the new Kimi K2 Thinking to do it for us. Kimi K2 Thinking is now live in our Model APIs with the most performant TTFT (0.3 sec) and TPS (140) on @OpenRouterAI & @ArtificialAnlys . If you’re looking
22
31
438
@tuhinone
Tuhin Srivastava
18 days
Cursor 2.0 feels game changing - fast agentic workflows unlock new levels of creativity and productivity. Congrats to the team!
@cursor_ai
Cursor
19 days
Introducing Cursor 2.0. Our first coding model and the best way to code with agents.
7
4
117
@amiruci
Amir Haghighat
24 days
And we have the highest tok/sec using nvidia GPUs:
0
0
5
@amiruci
Amir Haghighat
24 days
There's an obsession with tok/sec as *the* metric in LLM inference. But in latency-sensitive use cases the metric that matters more is time-to-first-token:
- Code edit use cases have short outputs and overall latency is heavily determined by ttft
- Voice AI use cases care about
5
5
35
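The tradeoff in that thread can be sketched with a toy latency model (illustrative numbers only, not Baseten benchmarks):

```python
# Toy model: end-to-end latency = time to first token + generation time.
# All numbers below are hypothetical, chosen only to show when each
# metric dominates.

def total_latency(ttft_s: float, tps: float, output_tokens: int) -> float:
    """End-to-end request latency in seconds."""
    return ttft_s + output_tokens / tps

# Short output (e.g. a code edit, ~50 tokens): TTFT dominates.
low_ttft  = total_latency(ttft_s=0.2, tps=100, output_tokens=50)  # 0.7 s
high_ttft = total_latency(ttft_s=1.0, tps=150, output_tokens=50)  # ~1.33 s

# Long output (~2000 tokens): tok/sec dominates instead.
high_tps = total_latency(ttft_s=1.0, tps=150, output_tokens=2000)  # ~14.3 s
low_tps  = total_latency(ttft_s=0.2, tps=100, output_tokens=2000)  # 20.2 s

print(low_ttft, high_ttft, high_tps, low_tps)
```

For short completions the provider with the lower TTFT wins even at a lower tok/sec, while for long generations the ranking flips.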
@ArtificialAnlys
Artificial Analysis
27 days
GLM-4.6 providers overview: we are benchmarking API endpoints offered by Baseten, GMI, Parasail, Novita, and Deepinfra. GLM-4.6 (Reasoning) from @Zai_org is one of the most intelligent open-weights models, with intelligence close to GPT-OSS-120b (high), DeepSeek V3.2 Exp (Reasoning)
16
23
272
@basetenco
Baseten
28 days
We're aware of the massive AWS outage. The Baseten web app is down, but inference, new deploys, training jobs, and the model management APIs are unaffected.
3
8
31
@cline
Cline
1 month
GLM 4.6 fans! @basetenco just soared to the top as the fastest provider on Artificial Analysis for the model: 114 TPS and <0.18s TTFT. That's 2x faster than the next best option on both metrics. Available now in Cline.
15
36
439
@amiruci
Amir Haghighat
1 month
Go team @FactoryAI!
@FactoryAI
Factory
1 month
Deploy and serve custom models with enterprise-grade infrastructure on @basetenco. Special promo for Factory users: receive $500 Model API credits when you fill out this form.
0
0
33
@basetenco
Baseten
2 months
What do Superhuman, Baseten, and Ricky Bobby all have in common? An obsession with speed. If you’re a Superhuman user, you know their email app lives and dies by how fast their users can get through all things email.
1
1
13
@basetenco
Baseten
2 months
If you see a doctor today, chances are they're using OpenEvidence for trustworthy, up-to-date medical information at their fingertips. We're thrilled to support OpenEvidence's mission with the speed (<160 ms latency) and reliability physicians require at the point of care.
3
9
29
@Madisonkanna
Madison Kanna
2 months
@rtfeldman teaching me zed live now!! https://t.co/cRPiwkD0sT
3
7
57
@NVIDIAAI
NVIDIA AI
2 months
📈 @basetenco users are scaling smarter with us: ✅ 5× throughput on high-traffic endpoints ✅ 50% lower cost per token ✅ Up to 38% lower latency on the largest LLMs Built on NVIDIA Blackwell + TensorRT-LLM + Dynamo on @googlecloud—driving efficiency, speed & adoption at scale.
8
21
109
@Madisonkanna
Madison Kanna
2 months
Next live stream is tomorrow at 10:30AM pst. Sharing some fun announcements, guests joining, and we're giving away a bunch of our shirts!
20
3
100
@lilyjclifford
lily clifford
3 months
🚀 Arcana v2 is here. Rime’s next-gen TTS makes voice AI sound truly human. More languages. More realism. More deployment options. 🧵👇
7
10
51
@amiruci
Amir Haghighat
3 months
It's important to support newly released open-weight models on day 1. But it's not noteworthy. What's noteworthy is having the inference optimization muscle to immediately blow the competition out of the water on latency and throughput. As measured by OpenRouter:
12
14
87
@tuhinone
Tuhin Srivastava
3 months
We're very excited to be an @OpenAI launch partner for GPT OSS. Today's a big day for open models, and we have day 0 support for GPT OSS 120b via our Model APIs: https://t.co/hMLzTy2dek We'll be rolling out more performance optimizations and benchmarks over the coming hours and
baseten.co
120B MoE open model by OpenAI
12
19
91