Arvind Narayanan

@random_walker

Followers
126K
Following
20K
Media
896
Statuses
13K

Princeton CS prof and Director @PrincetonCITP. Coauthor of "AI Snake Oil" and "AI as Normal Technology". https://t.co/ZwebetjZ4n Views mine.

Princeton, NJ
Joined December 2007
@random_walker
Arvind Narayanan
3 months
I’m excited to announce I’ve started a YouTube channel. I plan to publish videos regularly explaining my views on AI and its present and future impacts. My first video asks: What happens if there’s an AI crash? https://t.co/sEGeoCyHmk This is my first foray into video (beyond
17
50
334
@random_walker
Arvind Narayanan
2 days
📢📢 I'm looking for a postdoctoral fellow and so are many of my amazing faculty colleagues @PrincetonCITP. The center's mission is to understand and improve the relationship between tech and society. Apply soon for full consideration. Details: https://t.co/AGphhLkU60 The center
1
21
96
@abeirami
Ahmad Beirami ✈️ NeurIPS
2 days
💯 absolutely right! "We think studying the coupling between models and scaffolds is an important research direction going forward, especially as more developers release scaffolds that their models might be finetuned to work well with"
@sayashk
Sayash Kapoor
3 days
CORE-Bench is solved (using Opus 4.5 with Claude Code) TL;DR: Last week, we released results for Opus 4.5 on CORE-Bench, a benchmark that tests agents on scientific reproducibility tasks. Earlier this week, Nicholas Carlini reached out to share that an updated scaffold that uses
0
3
50
@AlexGDimakis
Alex Dimakis
2 days
Agent scaffolds are as important as models.
@sayashk
Sayash Kapoor
3 days
CORE-Bench is solved (using Opus 4.5 with Claude Code) TL;DR: Last week, we released results for Opus 4.5 on CORE-Bench, a benchmark that tests agents on scientific reproducibility tasks. Earlier this week, Nicholas Carlini reached out to share that an updated scaffold that uses
10
20
200
@sayashk
Sayash Kapoor
3 days
CORE-Bench is solved (using Opus 4.5 with Claude Code) TL;DR: Last week, we released results for Opus 4.5 on CORE-Bench, a benchmark that tests agents on scientific reproducibility tasks. Earlier this week, Nicholas Carlini reached out to share that an updated scaffold that uses
26
104
740
@burkov
BURKOV
3 days
This paper really is groundbreaking. It solves a long-standing embarrassment in machine learning: despite all the hype around deep learning, traditional tree-based methods (XGBoost, CatBoost, random forests, etc) have dominated tabular data—the most common data format in
74
398
3K
@DKThomp
Derek Thompson
4 days
This is a great piece with some mind-boggling statistics. - At Brown and Harvard, more than 20% of undergraduates are registered as disabled - At Amherst: more than 30 percent - At Stanford: nearly 40 percent Soon, many of these schools "may have more students receiving
987
3K
17K
@slotkinjr
Dr. Jon Slotkin
4 days
I have a guest essay in @nytimes today about autonomous vehicle safety. I wrote it because I’m tired of seeing children die. Done right, we can eliminate car crashes as a leading cause of death in the United States @Waymo recently released data covering nearly 100 million
326
1K
6K
@HarmonicMath
Harmonic
6 days
Many of us intuitively feel that the field of mathematics is going to change, so let's unpack the likely outcomes, without resorting to hyperbole or doomerism.
10
30
267
@itamarcaspi
Itamar Caspi
12 days
Right now this "agentic reviewer" is like putting a newspaper online as HTML. It imitates existing referee reports and optimizes agreement with scores, so it learns the current tastes, fads, and biases of human reviewers rather than questioning them. From an economic perspective
@AndrewYNg
Andrew Ng
12 days
Releasing a new "Agentic Reviewer" for research papers. I started coding this as a weekend project, and @jyx_su made it much better. I was inspired by a student who had a paper rejected 6 times over 3 years. Their feedback loop -- waiting ~6 months for feedback each time -- was
8
13
154
@AndyMasley
Andy Masley
17 days
I think that Hao made a bad but honest mistake and I don't mean to attack her overall character as a journalist. In contrast, I would like to take this opportunity to directly attack the journalistic integrity of More Perfect Union, who are much more influential in the AI water
18
61
700
@binarybits
Timothy B. Lee
21 days
NYTimes article quotes someone saying they are "terrified" of Waymo in paragraph 6. Waits until paragraph 33 (out of 44 paragraphs) to mention that they are 91 percent safer than human drivers. How outraged would liberals be if a news outlet covered vaccines like this?
71
281
3K
@johnloeber
John Loeber 🎢
22 days
When I was in college at UChicago, I dated another student from Appalachia. She once told me she had gotten straight As in high school calculus -- and when she took the AP exam, she got a 1, the worst possible grade. (And she was pretty talented, she later on did pretty well at
@KelleyKga
Kelley K
23 days
The fact is that high schools are graduating kids with As and Bs in advanced math courses who haven't mastered foundational skills. The data from the UCSD report makes that clear. 20% took calculus in high school! Their GPA in math classes is ~3.6! This is happening all over the
123
442
6K
@RuxandraTeslo
Ruxandra Teslo 🧬
22 days
Don't make me tap the sign
@Dr_Singularity
Dr Singularity
23 days
We will now be able to discover new drugs 1000s of times faster. Thanks to AI, all diseases will be curable during the 2030s. MADD - Multi Agent Drug Discovery Orchestra, a multi agent AI system designed to massively accelerate the early stages of drug discovery, especially
82
500
6K
@random_walker
Arvind Narayanan
22 days
@snewmanpv @sayashk @DKokotajlo @eli_lifland @thlarsen IMO reliability is only one of the limitations; the others are preference elicitation and the adversarial nature of the environment. I've written a bit about it here: https://t.co/r2lqcoeEQ4 (That said I do think the reliability issue alone is pretty challenging — you'd need ~3
@random_walker
Arvind Narayanan
1 year
Google's Deep Research is an excellent application of agentic capabilities. One example of something it can do pretty well is search for all my podcasts and interviews and create a webpage listing them. Cuts down effort at least 10x compared to doing it manually. The reason it works
2
1
17
@micahgoldblum
Micah Goldblum
23 days
An LLM-generated paper is in the top 17% of ICLR submissions in terms of average reviewer score, having received two 8's. The paper has tons of BS jargon and hallucinated references. Fortunately, one reviewer actually looked at the paper and gave it a zero. 1/3
40
146
1K
@g_leech_
gavin leech (Non-Reasoning)
23 days
Glad somebody did this (expert interviews on why LLMs are not currently AGI, and why they could be) feat: @random_walker, @DKokotajlo, @ben_j_todd, @daniel_d_kang, @rohinmshah
@geoffreyirving
Geoffrey Irving
1 month
New AISI report mapping cruxes behind whether AI progress might be fast or slow on the path to systems near or beyond human-level at most cognitive tasks. The goal is not to resolve uncertainties but reflect them: we don't know how AI will go, and should plan accordingly!
2
22
141
@DKokotajlo
Daniel Kokotajlo
24 days
Common ground between the authors of AI 2027 and AI as Normal Technology! Coauthored article below.
24
67
404
@random_walker
Arvind Narayanan
24 days
We enjoyed the opportunity for productive discussion with the authors of AI 2027 to find areas of common ground. We are also planning an “adversarial collaboration”.
11
31
256
@AndyMasley
Andy Masley
25 days
@rtwlz
Riley Walz
25 days
Video of a San Francisco Muni train flying 50mph out of the Sunset Tunnel after the driver fell asleep. Luckily the train didn't crash.
22
300
11K
@JeremiahDJohns
Jeremiah Johnson 🌐
26 days
Marc Andreessen has become the avatar of societal decay, the representation of what technology looks like with no vision of the good behind it.
74
286
3K