Benchmark @benchmark X Profile

Benchmark

@benchmark

Followers: 90K

Following: 6

Media: 171

Statuses: 11K

Benchmark focuses on early-stage venture investing in consumer, marketplaces, open-source, AI, infrastructure, and enterprise software.

https://t.co/2KZAOuP1w5

Joined June 2009

Mercor

@mercor_ai

3 days

We’ve doubled the size of APEX, our benchmark for measuring whether frontier models can perform economically valuable work across four jobs: investment banking associate, management consultant, big law associate, and primary care physician (MD). The new APEX leaderboard shows: -

3

28

56

Bret Taylor

@btaylor

2 days

Sierra uses 15+ frontier and open source models for low latency tool calling and decision-making, precision classification, long-context reasoning, and empathy/tone. We call this a constellation of models, and it’s a key ingredient to the state of the art performance of agents

sierra.ai

Agents built on Sierra are assembled from 15+ purpose-built models working in concert, so they can handle complex tasks with speed, precision, and on-brand execution.

23

22

403

Jeffrey Wang

@jeffzwang

3 days

Exa is now my default search engine! For the longest time, Exa wasn’t general enough or fast enough for daily driving. But we’ve dramatically improved our index, algorithm, and latency. What happens with simple searches? We detect that and return you useful information as fast

Exa

@ExaAILabs

3 days

Smarter than your default search engine, faster than your default chat app. Try the new https://t.co/cQ6UlWHnKY

23

16

378

Will Bryk

@WilliamBryk

5 days

Exa and Benchmark are hosting a special event at NeurIPS Thursday evening. DM me your spicy takes on semantic retrieval if you want to come aboard ⛴️

10

6

112

Peter Fenton

@peterfenton

15 days

This astonishing $100M milestone points to the real story—the human story. My third journey with @btaylor from the beginning has given me a front-row seat to his arc of growth. He and @claybavor embody decades of personal evolution, intersecting perfectly with the most explosive

Bret Taylor

@btaylor

15 days

Sierra just hit $100M in ARR, just seven quarters since we launched in February 2024. @claybavor and I are very grateful to our customers and proud of the Sierra team, who has redefined the meaning of intensity and craftsmanship. I have never had this much fun in my career.

8

7

171

Sunday

@sundayrobotics

16 days

It started with @tonyzzhao and @chichengcc in their apartment in April 2024. We moved out of our hacker house in Mountain View in February of 2025 with 10 people. Today, we have over 30 team members. Velocity of progress to a real, useful robot is everything to us. The progress

87

62

613

Bret Taylor

@btaylor

15 days

Sierra just hit $100M in ARR, just seven quarters since we launched in February 2024. @claybavor and I are very grateful to our customers and proud of the Sierra team, who has redefined the meaning of intensity and craftsmanship. I have never had this much fun in my career.

78

52

1K

Max Junestrand

@MaxJunestrand

16 days

We don’t just ship software. We help our customers succeed by being a partner during the most significant period of change the legal industry has ever seen. Tune in to the conversation I had with @chetanp at @SlushHQ. Full video: https://t.co/Bom850CAjN

3

6

21

Drishan Arora

@drishanarora

17 days

Today, we are releasing the best open-weight LLM by a US company: Cogito v2.1 671B. On most industry benchmarks and our internal evals, the model performs competitively with frontier closed and open models, while being ahead of any US open model (such as the best versions of

89

110

726

Sunday

@sundayrobotics

17 days

After 18 months in stealth, dozens of prototypes, millions of real-home demonstrations, and one final all-nighter, we’re thrilled for you to say hello to Memo

200

289

3K

Max Junestrand

@MaxJunestrand

18 days

Tomorrow, I’ll be on the Main Stage at @SlushHQ with Benchmark’s @chetanp to talk about one of my favourite parts of @WeAreLegora's story. Looking forward to this one! More details here: https://t.co/nR9F6uxorT

0

3

14

Manus

@ManusAI

18 days

Today we're launching Manus Browser Operator. Any browser can now become an AI browser. One extension. No download. No new setup. Your browser already works. Your logins. Your sessions. Your habits. Now with the full power of Manus.

111

334

3K

Everett Randle

@EverettRandle

22 days

En route to the Capital of Capital!

TBPN

@tbpn

22 days

BREAKING: @EverettRandle will be live on TBPN today at 12:30p PT

6

4

129

Lin Qiao

@lqiao

23 days

🚀 Fireworks Reinforcement Fine-Tuning (RFT) launched! After many months of iteration with real world use cases, we are excited to launch Fireworks RFT public preview. It’s a managed RL service that turns open frontier models (e.g. DeepSeek V3, Kimi K2) into custom agents for

22

49

346

LangChain

@LangChainAI

23 days

💻 Sandboxes for DeepAgents. We're excited to launch Sandboxes for DeepAgents, a new set of integrations that allow you to safely execute arbitrary DeepAgent code and bash commands in remote sandboxes. Supports @RunloopAI, @daytonaio, and @modal. Your DeepAgent runs locally (or

14

63

387

Cerebras

@cerebras

25 days

Introducing Cerebras for Nations, our global initiative to advance and scale sovereign AI. How it works: 1️⃣ We will build world-class AI supercomputers with our WSE-3 chips and CS-3 systems 2️⃣ Co-develop state-of-the-art models and deploy with the world’s fastest inference 3️⃣

8

23

143

se

@seyong

25 days

35K active users on fomo each week in the last month 👀

26

6

150

Andrew Feldman

@andrewdfeldman

27 days

The Wall Street Journal is starting to see what we’ve seen at @cerebras for years. The “reticle limit” - the boundary that defines how large a chip can be - has become the ceiling for progress. It has kept chips the size of postage stamps for more than 20 years. Every new

132

258

3K

Manus

@ManusAI

25 days

Excited to share how we're working with @NotionHQ to transform knowledge bases into execution engines. Here are some ways people are transforming their workflows👇🧵 https://t.co/JEUVhtt1bD

manus.im

Discover how real users are leveraging the Manus-Notion MCP integration to transform their workflows. Learn how the Model Context Protocol enables bidirectional data flow, turning Notion from a...

7

15

131

Everett Randle

@EverettRandle

26 days

Like meeting a long-lost sibling for the first time -- thanks for the great conversation Harry!

Harry Stebbings

@HarryStebbings

26 days

I am so bored of hearing podcasts with guests that have done the podcast tour. Benchmark added Ev Randle as their latest GP. This is his first public appearance as a Benchmark GP. - Why Margins Matter Less in AI - Why Mega Funds Will Not Produce Good Returns - OpenAI vs

2

5

107