Arvind Narayanan
@random_walker
Followers: 126K
Following: 20K
Media: 896
Statuses: 13K
Princeton CS prof and Director @PrincetonCITP. Coauthor of "AI Snake Oil" and "AI as Normal Technology". https://t.co/ZwebetjZ4n Views mine.
Princeton, NJ
Joined December 2007
I’m excited to announce I’ve started a YouTube channel. I plan to publish videos regularly explaining my views on AI and its present and future impacts. My first video asks: What happens if there’s an AI crash? https://t.co/sEGeoCyHmk This is my first foray into video (beyond
17
50
334
📢📢 I'm looking for a postdoctoral fellow and so are many of my amazing faculty colleagues @PrincetonCITP. The center's mission is to understand and improve the relationship between tech and society. Apply soon for full consideration. Details: https://t.co/AGphhLkU60 The center
1
21
96
💯 absolutely right! "We think studying the coupling between models and scaffolds is an important research direction going forward, especially as more developers release scaffolds that their models might be finetuned to work well with"
CORE-Bench is solved (using Opus 4.5 with Claude Code) TL;DR: Last week, we released results for Opus 4.5 on CORE-Bench, a benchmark that tests agents on scientific reproducibility tasks. Earlier this week, Nicholas Carlini reached out to share that an updated scaffold that uses
0
3
50
Agent scaffolds are as important as models.
CORE-Bench is solved (using Opus 4.5 with Claude Code) TL;DR: Last week, we released results for Opus 4.5 on CORE-Bench, a benchmark that tests agents on scientific reproducibility tasks. Earlier this week, Nicholas Carlini reached out to share that an updated scaffold that uses
10
20
200
CORE-Bench is solved (using Opus 4.5 with Claude Code) TL;DR: Last week, we released results for Opus 4.5 on CORE-Bench, a benchmark that tests agents on scientific reproducibility tasks. Earlier this week, Nicholas Carlini reached out to share that an updated scaffold that uses
26
104
740
This paper really is groundbreaking. It solves a long-standing embarrassment in machine learning: despite all the hype around deep learning, traditional tree-based methods (XGBoost, CatBoost, random forests, etc) have dominated tabular data—the most common data format in
74
398
3K
This is a great piece with some mind-boggling statistics. - At Brown and Harvard, more than 20% of undergraduates are registered as disabled - At Amherst: more than 30 percent - At Stanford: nearly 40 percent Soon, many of these schools "may have more students receiving
987
3K
17K
Many of us intuitively feel that the field of mathematics is going to change, so let's unpack the likely outcomes, without resorting to hyperbole or doomerism.
10
30
267
Right now this "agentic reviewer" is like putting a newspaper online as HTML. It imitates existing referee reports and optimizes agreement with scores, so it learns the current tastes, fads, and biases of human reviewers rather than questioning them. From an economic perspective
Releasing a new "Agentic Reviewer" for research papers. I started coding this as a weekend project, and @jyx_su made it much better. I was inspired by a student who had a paper rejected 6 times over 3 years. Their feedback loop -- waiting ~6 months for feedback each time -- was
8
13
154
I think that Hao made a bad but honest mistake and I don't mean to attack her overall character as a journalist. In contrast, I would like to take this opportunity to directly attack the journalistic integrity of More Perfect Union, who are much more influential in the AI water
18
61
700
NYTimes article quotes someone saying they are "terrified" of Waymo in paragraph 6. Waits until paragraph 33 (out of 44 paragraphs) to mention that they are 91 percent safer than human drivers. How outraged would liberals be if a news outlet covered vaccines like this?
71
281
3K
When I was in college at UChicago, I dated another student from Appalachia. She once told me she had gotten straight As in high school calculus -- and when she took the AP exam, she got a 1, the worst possible grade. (And she was pretty talented, she later did pretty well at
The fact is that high schools are graduating kids with As and Bs in advanced math courses who haven't mastered foundational skills. The data from the UCSD report makes that clear. 20% took calculus in high school! Their GPA in math classes is ~3.6! This is happening all over the
123
442
6K
Don't make me tap the sign
We will now be able to discover new drugs 1000s of times faster. Thanks to AI, all diseases will be curable during the 2030s. MADD - Multi Agent Drug Discovery Orchestra, a multi agent AI system designed to massively accelerate the early stages of drug discovery, especially
82
500
6K
@snewmanpv @sayashk @DKokotajlo @eli_lifland @thlarsen IMO reliability is only one of the limitations; the others are preference elicitation and the adversarial nature of the environment. I've written a bit about it here: https://t.co/r2lqcoeEQ4 (That said I do think the reliability issue alone is pretty challenging — you'd need ~3
Google's Deep Research is an excellent application of agentic capabilities. One example of something it can do pretty well is search for all my podcasts and interviews and create a webpage listing them. Cuts down effort at least 10x compared to doing it manually. The reason it works
2
1
17
An LLM-generated paper is in the top 17% of ICLR submissions in terms of average reviewer score, having received two 8's. The paper has tons of BS jargon and hallucinated references. Fortunately, one reviewer actually looked at the paper and gave it a zero. 1/3
40
146
1K
Glad somebody did this (expert interviews on why LLMs are not currently AGI, and why they could be) feat: @random_walker, @DKokotajlo, @ben_j_todd, @daniel_d_kang, @rohinmshah
New AISI report mapping cruxes behind whether AI progress might be fast or slow on the path to systems near or beyond human-level at most cognitive tasks. The goal is not to resolve uncertainties but reflect them: we don't know how AI will go, and should plan accordingly!
2
22
141
Common ground between the authors of AI 2027 and AI as Normal Technology! Coauthored article below.
24
67
404
We enjoyed the opportunity for productive discussion with the authors of AI 2027 to find areas of common ground. We are also planning an “adversarial collaboration”.
11
31
256
Marc Andreessen has become the avatar of societal decay, the representation of what technology looks like with no vision of the good behind it.
74
286
3K