Steve Newman
@snewmanpv
Followers 4K · Following 4K · Media 96 · Statuses 904
Co-founder of Writely (aka Google Docs) and 7 other startups. Now at the Golden Gate Institute for AI, working to bring AI’s toughest questions into focus.
Joined October 2010
"If you don't have time to do it right, when will you have time to do it over?" used to be one of my favorite pearls of wisdom. But when "do it over" means 30 seconds tweaking a prompt...
How far will AI go? Well, the tech-industrial complex has tasted blood, motivating investment that may soon exceed US spending on WWII. A review of AI history suggests we may well be within a few decades of an Unrecognizable Age. My latest post (link ⬇️) presents the case.
Lighthaven has quickly become one of the premier conference venues in SF. Many of the most interesting events I get invited to are there. And of course, it's where @rootsofprogress hosts Progress Conference. LessWrong is a great resource too. They are fundraising now:
Lightcone Infrastructure's 2026 fundraiser is live! We build beautiful things for truth-seeking and world-saving. We run LessWrong, Lighthaven, Inkhaven, designed AI-2027, and so many more things. All for the price of less than one OpenAI staff engineer ($2M/yr). More in 🧵.
I have joined the ranks of those who can't get anything done when Claude is down. We are rapidly heading toward a future in which cognition has a single point of failure.
(Molly credits this idea to @the_sbell) Incidentally, the interview is a great overview of the impact of AI on work – where things stand, the wide range of possibilities for where things might go, and the key unknowns. Worth a listen:
humanetech.com
We're a nonprofit exposing the negative effects of persuasive technology and social media and empowering people to take action. Discover The Social Dilemma, our podcast, course, and more.
An interesting idea from @Mollykinder (being interviewed alongside @emollick): if we'd like to see a world where AI is augmenting workers, rather than replacing them, then we should invest in benchmarks for AIs working with people – not just AIs working alone.
A nice example of how complicated model behavior (and thus model training) can be. It seems that training models to resist prompt injection also trains them to "avoid mentioning anything that seems sketchy in tool call results" – even sketchy things you'd want to hear about!
One analysis from our pre-release audit of Opus 4.5 stands out to me. Our behavioral evals uncovered an example of apparent deception by the model. By analyzing the internal activations, we identified a suspected root cause, and cases of similar behavior during training. (1/7)
A new style of work is emerging, which I call "hyperproductivity". Practitioners don't do their job; they spend their time optimizing agents to do their job, including the part of their job where they're optimizing the agents. The early stages of the Singularity will look like
It's 2025, and the software development community has not managed to address the profound vulnerability-by-default of the Python "pickle" library. We're ngmi. (Translation: a standard feature of the Python language has been known for decades to be wildly insecure, but hasn't
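For readers unfamiliar with the pickle issue: the format is, by design, a set of instructions for the loader, and those instructions can tell it to call any importable callable. Loading untrusted bytes is therefore arbitrary code execution. A minimal sketch (the `eval` payload here is a stand-in for something genuinely malicious):

```python
import pickle

class Payload:
    def __reduce__(self):
        # __reduce__ tells pickle how to reconstruct this object:
        # "call eval with the string '6 * 7'". On load, the loader
        # obeys -- it could just as easily be os.system(...).
        return (eval, ("6 * 7",))

blob = pickle.dumps(Payload())
result = pickle.loads(blob)  # no Payload comes back; eval ran instead
print(result)  # → 42
```

This is why the `pickle` docs carry an explicit warning to never unpickle data from an untrusted source; safer serializers like `json` don't have this property.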
It just dawned on me: the @METR_Evals time horizon measurements don't (AFAIK) use Claude Code (or Codex). But those tools clearly represent the frontier of AI coding. @joel_bkr does METR have any plans to benchmark these coding agents?
We estimate that Claude Sonnet 4.5 has a 50%-time-horizon of around 1 hr 53 min (95% confidence interval of 50 to 235 minutes) on our agentic multi-step software engineering tasks. This estimate is lower than the current highest time-horizon point estimate of around 2 hr 15 min.
I applaud this catalogue of agreements, but I disagree with @sayashk @random_walker @DKokotajlo @eli_lifland @thlarsen on one point: by the end of 2029, AIs will certainly be able to book a flight to Paris, and I would place money on that at odds 😄 From the linked article:
This is a wonderfully clarifying article, presenting a long list of things that some people with divergent expectations for AI still agree on. The future is unclear, but some aspects are less murky, and some actions look sensible across a wide range of scenarios.
I wrote a piece for @snewmanpv ‘s blog: after the excellent Curve conference come pivotal times for AI policy. There’s common ground, but politics are getting ugly, and temptation to grab cheap wins on the flanks grows. In the next few months, we'll see if the center can hold.
Another guest post inspired by The Curve! @anton_d_leicht writes about the possibilities of cooperation around AI policy that come into view at an event like The Curve, and the need to extend that spirit beyond the boundaries of the conference.
I've been pathetically far behind on AI coding, but I finally installed Claude Code. I've started vibe-coding Chrome extensions to customize sites I visit a lot, and all I can think is: "Now I have a machine gun – ho ho ho!".