Mike Knoop @mikeknoop X Profile

Mike Knoop

@mikeknoop

Followers

23K

Following

10K

Media

251

Statuses

4K

co-founder @ndea and @zapier @arcprize

https://t.co/hkLF2Dr7bc

sf bay area

Joined July 2009

Mike Knoop

@mikeknoop

5 months

Today we’re releasing our first public preview of ARC-AGI-3: the first three games. Version 3 is a big upgrade over v1 and v2, which are designed to challenge pure deep learning and static reasoning. In contrast, v3 challenges interactive reasoning (e.g. agents). The full version

113

67

518

Mike Knoop

@mikeknoop

1 day

Humans and organizations pour untold resources into complexity management and still do it poorly. AI reasoning can now handle this. We'll prefer it not because it's cheaper but because it's better.

4

2

35

Mike Knoop

@mikeknoop

2 days

AI reasoning performance is tied to knowledge, which we don't want. Finding a renewable source of OOD data will help evaluate and inspire progress on this problem. One clear example is the future. Over sufficient time the future is OOD from the past. As a thought experiment,

6

3

42

Mike Knoop

@mikeknoop

2 days

Smart

Caleb Watney

@calebwatney

2 days

NSF is launching one of the most ambitious experiments in federal science funding in 75 years. The program is called Tech Labs, and the goal is to invest ~$1 billion to seed new institutions of science and technology for the 21st century. Instead of funding projects, the NSF

0

12

Sam Altman

@sama

3 days

390x cost reduction in a year!

ARC Prize

@arcprize

3 days

A year ago, we verified a preview of an unreleased version of @OpenAI o3 (High) that scored 88% on ARC-AGI-1 at est. $4.5k/task. Today, we’ve verified a new GPT-5.2 Pro (X-High) SOTA score of 90.5% at $11.64/task. This represents a ~390X efficiency improvement in one year.

279

401

6K
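The ~390X figure in the quoted post can be checked from the two per-task costs it gives. The sketch below is only a rough check: it assumes the "est. $4.5k/task" estimate is exactly $4,500 and treats efficiency as a plain cost ratio, ignoring the difference in score (88% vs 90.5%). The variable names are illustrative and do not come from ARC Prize.

```python
# Per-task costs as quoted in the post (USD). The o3 figure is an estimate
# ("est. $4.5k/task"), so treating it as exactly 4500 is an assumption.
o3_preview_cost_per_task = 4500.00
gpt_5_2_pro_cost_per_task = 11.64

# Efficiency improvement taken here as a simple ratio of cost per task.
ratio = o3_preview_cost_per_task / gpt_5_2_pro_cost_per_task

print(f"cost ratio: {ratio:.1f}x")
print(f"rounded to the nearest 10: {round(ratio, -1):.0f}x")
```

The ratio comes out a little under 390, which is consistent with the "~390X" stated in the post once rounded.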

Mike Knoop

@mikeknoop

3 days

On an energy basis, my best estimate is human efficiency for solving simple ARC v1 tasks is 1,000,000X higher than last December's unreleased o3 (High) preview. That number is now about 10,000X with today's GPT 5.2 Pro (X-High). ARC-AGI-1 pinpointed the advent of AI reasoning

ARC Prize

@arcprize

3 days

A year ago, we verified a preview of an unreleased version of @OpenAI o3 (High) that scored 88% on ARC-AGI-1 at est. $4.5k/task. Today, we’ve verified a new GPT-5.2 Pro (X-High) SOTA score of 90.5% at $11.64/task. This represents a ~390X efficiency improvement in one year.

13

16

274

Greg Kamradt

@GregKamradt

5 days

Isaac Liao made "ARC-AGI Without Pretraining". I spoke with a small lab researcher (can't be named) who said, "what Isaac did was wild, out of nowhere, we're still trying to unpack it"

ARC Prize

@arcprize

5 days

ARC Prize 2025 Winners Interviews. Paper Award 3rd Place: @LiaoIsaac91893 shares the story behind CompressARC - an MDL-based, single puzzle-trained neural code golf system that achieves ~20–34% on ARC-AGI-1 and ~4% on ARC-AGI-2 without any pretraining or external data.

6

19

149

Mike Knoop

@mikeknoop

3 days

Pretty clear the latest Nov/Dec 2025 family of AI reasoning systems has significantly improved fluid intelligence over knowledge domains they were trained on (e.g. code). Big step up from 6-9 months ago. ARC-AGI shows as much improvement.

Nathan Baschez

@nbashaw

4 days

the downstream effects of claude 4.5 opus will be studied

4

6

113

Cyril Gorlla

@CyrilGorlla

5 days

We just launched @CTGTInc's Mentat, an OpenAI-compatible API that gives enterprises deterministic control over LLM behavior. Benchmarks showed clear gains in accuracy, truthfulness, and hallucination prevention. We built it after seeing models ignore correct information or

11

17

85

ARC Prize

@arcprize

8 days

ARC Prize 2025 Winners Interview Series. Paper Award 1st Place: @jm_alexia shares details about the Tiny Recursive Model (TRM) - a ~7M-parameter, single-network recursive model w/ separate answer and latent states that attains ~45% on ARC-AGI-1 and ~8% on ARC-AGI-2

3

31

126

Mike Knoop

@mikeknoop

7 days

Really cool!

Parag Agrawal

@paraga

9 days

One of the integrations I’ve been using to automate my work.

0

2

Mike Knoop

@mikeknoop

8 days

This is a category of benchmarks I find interesting as the future is renewably OOD from the past. Success over sufficient time demonstrates adaptation.

Shen Zhuoran

@CMS_Flash

8 days

Future is the only unhackable evaluation. Stock market to future prediction is like code and math to reasoning.

4

2

32

Parag Agrawal

@paraga

9 days

2. Zapier https://t.co/6pJAzHJbgc

Parallel Web Systems

@p0

10 days

Parallel is now integrated with @zapier. Use this integration to give any automation access to powerful structured web search, for example:
- Slack bots that enrich leads
- Inbound emails to trigger deep research reports
- Market research agents
Here's @KSaunack for a demo:

2

31

Mike Knoop

@mikeknoop

9 days

One paper that was not officially submitted to ARC Prize 2025 but deserves recognition is @makingAGI's HRM. It kicked off a line of research around zero-pretraining DL, the exact sort of "new ideas" ARC Prize exists to inspire.

1

7

147

Mike Knoop

@mikeknoop

9 days

One idea: train foundation and process models on non-IID sets but evaluate them jointly. This gives a reasoning generalization score to grind against.

0

4

Mike Knoop

@mikeknoop

9 days

AI reasoning performance is currently bound to model knowledge. I think this is the root cause of "jagged intelligence". To make progress we have to figure out how to disentangle knowledge from reasoning.

14

3

78

Mike Knoop

@mikeknoop

9 days

Congrats to all the ARC Prize 2025 winners! AGI progress still depends on new ideas. This year we had over 90 new papers submitted, double the 2024 competition, and overall quality was much higher. Thank you to everyone working to push us forward.

ARC Prize

@arcprize

9 days

ARC Prize 2025 Paper Award Winners
1st / "Less is More: Recursive Reasoning with Tiny Networks" (TRM) / A. Jolicoeur-Martineau / $50k
2nd / "Self-Improving Language Models for Evolutionary Program Synthesis: A Case Study on ARC-AGI" (SOAR) / J. Pourcel et al. / $20k
3rd /

1

2

43

Mike Knoop

@mikeknoop

9 days

I've been reflecting on the progress of ARC in 2025: what it means for AGI progress and the future of ARC Prize. My full analysis here:

arcprize.org

Winners, analysis, and interviews.

ARC Prize

@arcprize

9 days

Announcing the ARC Prize 2025 Top Score & Paper Award winners. The Grand Prize remains unclaimed. Our analysis on AGI progress, marking 2025 the year of the refinement loop.

5

8

80

Mike Knoop

@mikeknoop

11 days

Great post. It's unfortunate that shifting AGI goalposts is associated with luddism. Pointing out flaws and building theories is how to drive progress -- in fact it's a strong bull signal as we get more humans studying the real issues, increasing the likelihood we solve them.

Dwarkesh Patel

@dwarkesh_sp

12 days

New post: Thoughts on AI progress (Dec 2025) 1. What are we scaling?

4

2

57

François Chollet

@fchollet

4 years

Two perfectly compatible messages I've been repeating for years: 1. Scaling up deep learning will keep paying off. 2. Scaling up deep learning won't lead to AGI, because deep learning on its own is missing key properties required for general intelligence.

62

153

1K

Mike Knoop

@mikeknoop

18 days

I've learned that building useful AI benchmarks has much in common with building useful products. You cannot design either in isolation. Both need strong contact with reality and iteration to make them great.

17

3

39