Shriyash Upadhyay @shriyashku X Profile

Shriyash Upadhyay

@shriyashku

Followers: 346

Following: 300

Media: 7

Statuses: 237

Founder @withmartian

https://t.co/z5wnSwMcRh

Joined February 2018

Shriyash Upadhyay

@shriyashku

2 years

Safety is being subsumed to capitalism because when there is misalignment between the two, capitalism wins. The only way to make sure AI is safe is to make a strong capitalist case for the technologies that will make AI safe: creating an economic incentive to understand models.

2

0

9

Shriyash Upadhyay

@shriyashku

1 day

I guess I'll tweet more about both? Seems like a weak local solution. Strong global solutions pending...

0

Shriyash Upadhyay

@shriyashku

1 day

I wish startup people knew more about how research actually works. I also wish researchers knew more about how startups actually work.

1

0

Shriyash Upadhyay

@shriyashku

4 days

Admittedly, maybe Roam made you a thinker a few years ago, probably Obsidian now, sadly

0

Shriyash Upadhyay

@shriyashku

4 days

If AI-made software can solve the obvious problems, more value in software will come from identity. Roam makes you a thinker. Apple makes you a taste-maker. Nintendo makes you fun. SWE will be less about tools and more about mirrors. Reflecting back who the user wants to be.

2

0

2

Thomas Dohmke

@ashtom

4 days

tl;dr Today, we’re announcing our new company @EntireHQ to build the next developer platform for agent–human collaboration. Open, scalable, independent, and backed by a $60M seed round. Plus, we are shipping Checkpoints to automatically capture agent context. In the last three

Entire

@EntireHQ

4 days

Beep, boop. Come in, rebels. We’ve raised a 60m seed round to build the next developer platform. Open. Scalable. Independent. And we ship our first OSS release today. https://t.co/OvPKCcjXbq

165

290

2K

Shriyash Upadhyay

@shriyashku

4 days

Really interesting to think about what coding looks like if we reject repos of code as the right abstraction. I worked on PL stuff and even interned at PL cos back in the day because it's such an intellectually fascinating question. Excited for @ashtom to take a big swing at it!

Entire

@EntireHQ

4 days

Beep, boop. Come in, rebels. We’ve raised a 60m seed round to build the next developer platform. Open. Scalable. Independent. And we ship our first OSS release today. https://t.co/OvPKCcjXbq

1

0

5

Shriyash Upadhyay

@shriyashku

4 days

I know I'm taking the adage a bit too literally in this tweet, but the more generalized form of the advice is also wrong

0

Shriyash Upadhyay

@shriyashku

4 days

Sources -- Painkillers: https://t.co/LXnwQomyWB Supplements:

finance.yahoo.com

The global dietary supplements market size is calculated at USD 203.42 billion in 2025 and is expected to grow from USD 218.88 billion in 2026 to USD 430.39 billion by 2034, with a CAGR of 7.78% from...

1

0

Shriyash Upadhyay

@shriyashku

4 days

Amusing how much startup advice is just wrong. "Build painkillers, not vitamins" The global supplements market: ~$203 billion. Painkillers: $87 billion.

2

1

2

Shriyash Upadhyay

@shriyashku

5 days

Thinking of training an RL model specialized for @openclaw using ARES. So it gets better performance and much lower token costs. Is this something folks would be interested in using? 100 likes and I put up an endpoint

3

2

14

Shriyash Upadhyay

@shriyashku

5 days

You can try to extrapolate current trends in software and land on "all software will become A/B testing". This is reductionist. You can't A/B test your way to the iPhone.

0

2

Shriyash Upadhyay

@shriyashku

9 days

Josh will be covering some very cool pieces of the ARES roadmap that *you* could help build

Josh Greaves

@joshgreaves_ml

9 days

We'll be presenting the ARES roadmap at office hours tomorrow at 2pm PT. If you're interested in agents | RL | interp and want to contribute to open-source send me a DM for more info. https://t.co/tGymWgVm5b

0

1

6

Shriyash Upadhyay

@shriyashku

16 days

As part of Prod, can confirm @bfspector is quite wonderful. Congrats to the whole team

0

1

Shriyash Upadhyay

@shriyashku

16 days

If you're building a god, there are apparently three ways to name things: - Technical Mumbo Jumbo (ChatGPT) - Very serious (Anthropic) - With a sense of... let's call it child-like wonder

Flapping Airplanes

@flappyairplanes

17 days

Announcing Flapping Airplanes! We’ve raised $180M from GV, Sequoia, and Index to assemble a new guard in AI: one that imagines a world where models can think at human level without ingesting half the internet.

2

0

1

Augustine Mavor-Parker

@MavorParker

16 days

RL progress is bottlenecked by infra for training and evaluation. @VmaxAI is excited to be partnering @withmartian, generating environments for the Agentic Research and Evaluation (ARES) framework

7

30

74

Augustine Mavor-Parker

@MavorParker

16 days

This is a preview of many more tasks to come for Ares!

Josh Greaves

@joshgreaves_ml

16 days

ARES uses the Harbor task format ( @alexgshaw ). It comes with SWE-Bench Verified, TerminalBench2, SWESmith, and everything else in the Harbor ecosystem. We're also releasing 1k new JavaScript tasks with @VmaxAI ( @MavorParker @matthewjsargent ) to help the ecosystem grow.

1

5

19

Shriyash Upadhyay

@shriyashku

16 days

If you’re building agents, RL algorithms, harnesses, or benchmarks: plug into ARES. Let’s make the online-RL coding stack a shared public good.

0

2

Shriyash Upadhyay

@shriyashku

16 days

Where ARES fits: we’re open-sourcing the missing horizontal layer—Gym-like + async-native infra for true online RL on coding agents, with the boundary at the LLM interface. The history of RL shows RL for LLMs should be online. Repo:

github.com

Agentic Research and Evaluation Suite. Contribute to withmartian/ares development by creating an account on GitHub.

1

0

2
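To make "Gym-like + async-native" with "the boundary at the LLM interface" concrete, here is a minimal sketch in plain Python `asyncio`. It is not the ARES API: `ToyCodingEnv`, `StepResult`, `rollout`, `constant_policy`, and `run_batch` are hypothetical names invented for illustration. The idea it assumes is that the observation is the prompt a harness would send to the LLM, the action is the LLM's response, and many rollouts run concurrently.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class StepResult:
    observation: str  # next prompt that would be sent to the LLM
    reward: float
    done: bool


class ToyCodingEnv:
    """Hypothetical toy environment for illustration; not the ARES API."""

    def __init__(self, target: str, max_steps: int = 3):
        self.target = target
        self.max_steps = max_steps
        self.steps = 0

    async def reset(self) -> str:
        self.steps = 0
        return f"Reply with exactly: {self.target}"

    async def step(self, llm_response: str) -> StepResult:
        # Stand-in for slow work such as running tests in a sandbox.
        await asyncio.sleep(0)
        self.steps += 1
        solved = llm_response.strip() == self.target
        done = solved or self.steps >= self.max_steps
        observation = "Tests passed." if solved else "Tests failed, try again."
        return StepResult(observation, 1.0 if solved else 0.0, done)


async def rollout(env: ToyCodingEnv, policy) -> float:
    """Run one episode; the policy sits on the LLM side of the boundary."""
    observation = await env.reset()
    total, done = 0.0, False
    while not done:
        action = await policy(observation)
        result = await env.step(action)
        total += result.reward
        observation, done = result.observation, result.done
    return total


async def constant_policy(prompt: str) -> str:
    # Placeholder for a real model call; always answers "42".
    return "42"


async def run_batch() -> list[float]:
    envs = [ToyCodingEnv("42"), ToyCodingEnv("7")]
    # Episodes run concurrently, which is what "async-native" buys here.
    return await asyncio.gather(*(rollout(e, constant_policy) for e in envs))


if __name__ == "__main__":
    print(asyncio.run(run_batch()))
```

In an online-RL setup, the rewards returned by each rollout would feed a training step that updates the policy between episodes; that part is omitted here.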

Shriyash Upadhyay

@shriyashku

16 days

@cognition/@windsurf SWE-1.5 points the same way: end-to-end RL on a custom harness (Cascade), high-fidelity coding envs + hardened rewards. https://t.co/Rfw9SzzKpT &

windsurf.com

SWE-1.5 is our latest frontier model, delivering near-SOTA coding performance at unprecedented speed.

1

0

Shriyash Upadhyay

@shriyashku

16 days

@cursor_ai's Tab RL (online RL on editor feedback -- probably the first massive IRL online example):

cursor.com

Our new Tab model makes 21% fewer suggestions while having 28% higher accept rate.

1

0
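A quick back-of-the-envelope check on the Tab figures quoted above, assuming both percentages are measured against the previous model on comparable traffic (an assumption, not something the card states): accepted suggestions scale with the number of suggestions times the accept rate.

```python
# Assumed interpretation: both changes are relative to the previous Tab model.
suggestion_ratio = 1 - 0.21   # 21% fewer suggestions shown
accept_rate_ratio = 1 + 0.28  # 28% higher accept rate

# Accepted suggestions = suggestions shown * accept rate.
accepted_ratio = suggestion_ratio * accept_rate_ratio
print(f"{accepted_ratio:.4f}")  # 1.0112
```

Under that reading, the total number of accepted suggestions stays roughly flat (about 1% higher) while the number of suggestions shown drops by about a fifth.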