Peter Hase @peterbhase X Profile

Peter Hase

@peterbhase

Followers

3K

Following

2K

Media

57

Statuses

481

AI Institute Fellow at Schmidt Sciences. Postdoc at Stanford NLP Group. Previously: Anthropic, AI2, Google, Meta, UNC Chapel Hill

https://t.co/rmhRicOgvN

New York, NY

Joined April 2019

Don't wanna be here? Send us removal request.

Peter Hase

@peterbhase

1 year

My last PhD paper 🎉: fundamental problems with model editing for LLMs! We present *12 open challenges* with definitions/benchmarks/assumptions, inspired by work on belief revision in philosophy To provide a way forward, we test model editing against Bayesian belief revision 🧵

3

75

305

Schmidt Sciences

@schmidtsciences

2 days

We're excited to welcome 28 new AI2050 Fellows! This 4th cohort of researchers are pursuing projects that include building AI scientists, designing trustworthy models, and improving biological and medical research, among other areas. https://t.co/8oY7xdhxvF

6

27

172

Sarah Wiegreffe

@sarahwiegreffe

3 days

I am recruiting 2 PhD students to work on LM interpretability at UMD @umdcs starting in fall 2026! We are #3 in AI and #4 in NLP research on @CSrankings. Come join us in our lovely building just a few miles from Washington, D.C. Details in 🧵

12

154

715

Stewart Slocum

@StewartSlocum1

16 days

Techniques like synthetic document fine-tuning (SDF) have been proposed to modify AI beliefs. But do AIs really believe the implanted facts? In a new paper, we study this empirically. We find: 1. SDF sometimes (not always) implants genuine beliefs 2. But other techniques do not

5

37

185

Peter Hase

@peterbhase

16 days

I would encourage technical AI types to consider working in grantmaking! Schmidt Sciences is hiring for a unique position where you get to continue your own research at the same time Link:

jobs.lever.co

Summary Schmidt Sciences invites recent PhD graduates in AI and computer science to apply for a 12-18 month fellows-in-residence program. Reporting to the Director of the AI Institute at Schmidt...

4

29

145

Peter Hase

@peterbhase

2 months

My research code has never been sloppier than when written by AI. So many silently failing training runs What works well for me: - rubber ducking in a web app What costs me hours on a 1 week lag: - pressing tab

roon

@tszzl

2 months

right now is the time where the takeoff looks the most rapid to insiders (we don’t program anymore we just yell at codex agents) but may look slow to everyone else as the general chatbot medium saturates

0

7

Brenden Lake

@LakeBrenden

2 months

Our new lab for Human & Machine Intelligence is officially open at Princeton University! Consider applying for a PhD or Postdoc position, either through the depts. of Computer Science or Psychology. You can register interest on our new website https://t.co/fRPhtmJdrH (1/2)

10

64

595

Cas (Stephen Casper)

@StephenLCasper

2 months

📌📌📌 I'm excited to be on the faculty job market this fall. I updated my website with my CV. https://t.co/4Ddv6tN0jq

stephencasper.com

Visit the post for more.

8

22

173

Peter Hase

@peterbhase

3 months

Shower thought: LLMs still have very incoherent notions of evidence, and they update in strange ways when presented with information in-context that is relevant to their beliefs. I really wonder what will happen when LLM agents start doing interp on themselves and see the source

5

23

Hannah Rose Kirk

@hannahrosekirk

4 months

My team at @AISecurityInst is hiring! This is an awesome opportunity to get involved with cutting-edge scientific research inside government on frontier AI models. I genuinely love my job and the team 🤗 Link: https://t.co/poiWqKlmgt More Info: ⬇️

3

24

110

Nouha Dziri

@nouhadziri

4 months

Current agents are highly unsafe, o3-mini one of the most advanced models in reasoning score 71% in executing harmful requests 😱 We introduce a new framework for evaluating agent safety✨🦺 Discover more 👇 👩‍💻 Code & data: https://t.co/mw6XVDMc6q 📄 Paper:

Sanidhya Vijayvargiya

@sanidhya903

4 months

1/ AI agents are increasingly being deployed for real-world tasks, but how safe are they in high-stakes settings? 🚨 NEW: OpenAgentSafety - A comprehensive framework for evaluating AI agent safety in realistic scenarios across eight critical risk categories. 🧵

2

16

70

Miles Turpin

@milesaturpin

4 months

New @Scale_AI paper! 🌟 LLMs trained with RL can exploit reward hacks but not mention this in their CoT. We introduce verbalization fine-tuning (VFT)—teaching models to say when they're reward hacking—dramatically reducing the rate of undetected hacks (6% vs. baseline of 88%).

9

70

282

Peter Hase

@peterbhase

4 months

Overdue job update -- I am now: - A Visiting Scientist at @schmidtsciences, supporting AI safety and interpretability - A Visiting Researcher at the Stanford NLP Group, working with @ChrisGPotts I am so grateful I get to keep working in this fascinating and essential area, and

15

22

174

Fazl Barez@EMNLP 🇨🇳

@FazlBarez

4 months

Excited to share our paper: "Chain-of-Thought Is Not Explainability"! We unpack a critical misconception in AI: models explaining their Chain-of-Thought (CoT) steps aren't necessarily revealing their true reasoning. Spoiler: transparency of CoT can be an illusion. (1/9) 🧵

28

145

657

Michel

@JustenMichel

4 months

really interesting to see just how gendered excitement about AI is, even among AI experts

15

40

239

FAR.AI

@farairesearch

5 months

🤔 Can lie detectors make AI more honest? Or will they become sneakier liars? We tested what happens when you add deception detectors into the training loop of large language models. Will training against probe-detected lies encourage honesty? Depends on how you train it!

4

11

69

Jiaxin Wen

@jiaxinwen22

5 months

New Anthropic research: We elicit capabilities from pretrained models using no external supervision, often competitive or better than using human supervision. Using this approach, we are able to train a Claude 3.5-based assistant that beats its human-supervised counterpart.

37

162

1K

Dongkeun Yoon

@dongkeun_yoon

6 months

🙁 LLMs are overconfident even when they are dead wrong. 🧐 What about reasoning models? Can they actually tell us “My answer is only 60% likely to be correct”? ❗Our paper suggests that they can! Through extensive analysis, we investigate what enables this emergent ability.

9

49

302

Peter Hase

@peterbhase

6 months

colab: https://t.co/zodx4iOj5O For aficionados, the post also contains some musings on “tuning the random seed” and how to communicate uncertainty associated with this process

colab.research.google.com

Colab notebook

0

Peter Hase

@peterbhase

6 months

Are p-values missing in AI research? Bootstrapping makes model comparisons easy! Here's a new blog/colab with code for: - Bootstrapped p-values and confidence intervals - Combining variance from BOTH sample size and random seed (eg prompts) - Handling grouped test data Link ⬇️

1

3

9

Yu Su

@ysu_nlp

6 months

New AI/LLM Agents Track at #EMNLP2025! In the past few years, it feels a bit odd to submit agent work to *CL venues because one had to awkwardedly fit it into Question Answering or NLP Applications. Glad to see agent research finally finds home at *CL! Kudos to the PC for

9

25

186