Rishi Mehta

@rishicomplex

Followers
3K
Following
4K
Media
37
Statuses
254

Solve i̶n̶t̶e̶l̶l̶i̶g̶e̶n̶c̶e̶ ̶ coding, use it to solve everything else | Research @AnthropicAI | Past: RL @GoogleDeepmind: AlphaProof co-lead, Gemini.

San Francisco, CA
Joined July 2009
@rishicomplex
Rishi Mehta
19 days
Btw one of the biggest perks of working at A\ is unlimited Claude Code :)
5
0
42
@rishicomplex
Rishi Mehta
19 days
We're hiring on the Code RL team at Anthropic! Small, fast-moving team. Low ego, high impact. If you're a star engineer/researcher excited to push the frontier of AI-powered SWE, there's nowhere better to be. We care about getting this right. DM or apply here!
job-boards.greenhouse.io
San Francisco, CA | New York City, NY
@claudeai
Claude
19 days
Introducing Claude Opus 4.5: the best model in the world for coding, agents, and computer use. Opus 4.5 is a step forward in what AI systems can do, and a preview of larger changes to how work gets done.
20
20
510
@rishicomplex
Rishi Mehta
19 days
We reported a broader suite of SWE evals with Opus 4.5 (SWE-bench, SWE-bench Pro, SWE-bench Multilingual, Aider). But as always, actually using the model is the real eval.
@redtachyon
Ariel
19 days
lfg we got new SOTA on django
3
0
29
@rishicomplex
Rishi Mehta
19 days
A litmus test I have when I work with a model is "frequency at which I feel the urge to swear". Haven't felt it yet with Opus 4.5.
0
1
1
@rishicomplex
Rishi Mehta
19 days
We launched a new Opus! Try coding with it. Besides being a jump on all the benchmarks, this model feels competent in a way I haven't felt with any other model.
@claudeai
Claude
19 days
Introducing Claude Opus 4.5: the best model in the world for coding, agents, and computer use. Opus 4.5 is a step forward in what AI systems can do, and a preview of larger changes to how work gets done.
1
1
54
@rishicomplex
Rishi Mehta
20 days
The number of papers I read is proportional to the number of hours I spend in flights
1
0
5
@rishicomplex
Rishi Mehta
20 days
In the ARC-AGI-2 human baseline of 60%, it looks like humans were given 2 attempts, with the second attempt conditioned on knowing the first attempt failed. In the AI attempts, the two attempts are independent. Seems like an important disparity? @fchollet @mikeknoop
3
0
28
@rishicomplex
Rishi Mehta
22 days
I'm in
@bryan_johnson
Bryan Johnson
22 days
ok, you've had a week to think about it. who's in? no sugar nov 21 - jan 2. eat all you want: blueberries, blackberries, raspberries, strawberries, cherries, apples, kiwi, pomegranate, etc. avoid cookies, pies, cakes, sweet drinks, chocolates, cocktails, anything with added
0
0
0
@rishicomplex
Rishi Mehta
22 days
Oh.
@karpathy
Andrej Karpathy
22 days
@TheVixhal your post challenged me. every one of your points is wrong but i had to think about each for a while :)
0
0
9
@rishicomplex
Rishi Mehta
23 days
London folks check this out
@sammcallister
sam mcallister
23 days
I feel like we should do this again. This weekend only for the last time? Air Mail. Chiltern Street, London. First come, first served. This weekend only.
2
0
12
@rishicomplex
Rishi Mehta
23 days
Asked Nano Banana Pro to make a comic about Suppandi (a simpleton Indian comic book character I grew up on). Pretty good - this is legit something I could imagine being published in Tinkle lol
1
0
5
@rishicomplex
Rishi Mehta
23 days
Note how the other models don't even get the board size right, let alone understand chess
1
0
0
@rishicomplex
Rishi Mehta
23 days
And this is Nano Banana (non-pro)
2
0
0
@rishicomplex
Rishi Mehta
23 days
This is ChatGPT for the same prompt
2
0
0
@rishicomplex
Rishi Mehta
23 days
Wow, Nano Banana Pro is amazing. Prompt: Make an image of an interesting mate-in-1 chess puzzle. Make the aspect ratio such that it's easy to view on a phone.
8
1
8
@rishicomplex
Rishi Mehta
26 days
This had better be good
@demishassabis
Demis Hassabis
26 days
It's nearly 3 here, my favourite part of the night shift… locked in... 💪🚀
2
0
11
@rishicomplex
Rishi Mehta
28 days
Looking to move to a new share house in SF in early Jan, DM me if you have any leads!
1
8
15
@rishicomplex
Rishi Mehta
29 days
Btw it's hidden away in the appendix of the AlphaProof paper, but we solved miniF2F-valid (the canonical theorem proving benchmark) last year! That doesn't happen with most benchmarks, even after they're saturated, because a few problems in the end prove too hard / ambiguous / unsolvable.
2
6
26
@rishicomplex
Rishi Mehta
1 month
"my role is to be on the bottom of things" beautiful
@thegautamkamath
Gautam Kamath
1 month
I admire Don Knuth for many things. But perhaps the most admirable is quitting email before I was even born.
0
0
2
@rishicomplex
Rishi Mehta
1 month
Nice article that reports the results of mathematicians trying AlphaProof on some real problems, and sometimes finding it useful
@TaliaRinger
Talia Ringer 🕊
1 month
New from me, in @Nature: Mathematicians put AI model AlphaProof to the test. A solicited News & Views article about @GoogleDeepMind AlphaProof that was an absolute joy to write! https://t.co/HBnQg22MEP
0
1
8