Peter Hase

@peterbhase

Followers: 3K · Following: 2K · Media: 57 · Statuses: 481

AI Institute Fellow at Schmidt Sciences. Postdoc at Stanford NLP Group. Previously: Anthropic, AI2, Google, Meta, UNC Chapel Hill

New York, NY
Joined April 2019
@peterbhase
Peter Hase
1 year
My last PhD paper 🎉: fundamental problems with model editing for LLMs! We present *12 open challenges* with definitions/benchmarks/assumptions, inspired by work on belief revision in philosophy. To provide a way forward, we test model editing against Bayesian belief revision 🧵
3
75
305
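For readers unfamiliar with the comparison point in the thread above: Bayesian belief revision here just means updating a probability with Bayes' rule when new evidence arrives. The sketch below is purely illustrative and is not taken from the paper; the function name and the numbers are made up for the example.

```python
# Illustrative only: a minimal Bayesian belief update, the kind of normative
# baseline the thread compares model editing against. Nothing here comes from
# the paper; the function and numbers are invented for this example.
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Return P(belief | evidence) via Bayes' rule."""
    numerator = p_evidence_if_true * prior
    return numerator / (numerator + p_evidence_if_false * (1 - prior))

# A 0.30 prior belief, with evidence 4x more likely if the belief is true,
# should rise to roughly 0.63 after the update.
print(bayes_update(prior=0.30, p_evidence_if_true=0.8, p_evidence_if_false=0.2))
```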
@schmidtsciences
Schmidt Sciences
2 days
We're excited to welcome 28 new AI2050 Fellows! This 4th cohort of researchers is pursuing projects that include building AI scientists, designing trustworthy models, and improving biological and medical research, among other areas. https://t.co/8oY7xdhxvF
6
27
172
@sarahwiegreffe
Sarah Wiegreffe
3 days
I am recruiting 2 PhD students to work on LM interpretability at UMD @umdcs starting in fall 2026! We are #3 in AI and #4 in NLP research on @CSrankings. Come join us in our lovely building just a few miles from Washington, D.C. Details in 🧵
12
154
715
@StewartSlocum1
Stewart Slocum
16 days
Techniques like synthetic document fine-tuning (SDF) have been proposed to modify AI beliefs. But do AIs really believe the implanted facts? In a new paper, we study this empirically. We find:
1. SDF sometimes (not always) implants genuine beliefs
2. But other techniques do not
5
37
185
@peterbhase
Peter Hase
16 days
I would encourage technical AI types to consider working in grantmaking! Schmidt Sciences is hiring for a unique position where you get to continue your own research at the same time. Link:
jobs.lever.co
Summary: Schmidt Sciences invites recent PhD graduates in AI and computer science to apply for a 12-18 month fellows-in-residence program. Reporting to the Director of the AI Institute at Schmidt...
4
29
145
@peterbhase
Peter Hase
2 months
My research code has never been sloppier than when written by AI. So many silently failing training runs.
What works well for me:
- rubber ducking in a web app
What costs me hours on a 1-week lag:
- pressing tab
@tszzl
roon
2 months
right now is the time where the takeoff looks the most rapid to insiders (we don’t program anymore we just yell at codex agents) but may look slow to everyone else as the general chatbot medium saturates
0
0
7
@LakeBrenden
Brenden Lake
2 months
Our new lab for Human & Machine Intelligence is officially open at Princeton University! Consider applying for a PhD or Postdoc position, either through the depts. of Computer Science or Psychology. You can register interest on our new website https://t.co/fRPhtmJdrH (1/2)
10
64
595
@StephenLCasper
Cas (Stephen Casper)
2 months
📌📌📌 I'm excited to be on the faculty job market this fall. I updated my website with my CV. https://t.co/4Ddv6tN0jq
stephencasper.com
Visit the post for more.
8
22
173
@peterbhase
Peter Hase
3 months
Shower thought: LLMs still have very incoherent notions of evidence, and they update in strange ways when presented with information in-context that is relevant to their beliefs. I really wonder what will happen when LLM agents start doing interp on themselves and see the source
5
5
23
@hannahrosekirk
Hannah Rose Kirk
4 months
My team at @AISecurityInst is hiring! This is an awesome opportunity to get involved with cutting-edge scientific research inside government on frontier AI models. I genuinely love my job and the team 🤗 Link: https://t.co/poiWqKlmgt More Info: ⬇️
3
24
110
@nouhadziri
Nouha Dziri
4 months
Current agents are highly unsafe: o3-mini, one of the most advanced reasoning models, scores 71% on executing harmful requests 😱 We introduce a new framework for evaluating agent safety ✨🦺 Discover more 👇 👩‍💻 Code & data: https://t.co/mw6XVDMc6q 📄 Paper:
@sanidhya903
Sanidhya Vijayvargiya
4 months
1/ AI agents are increasingly being deployed for real-world tasks, but how safe are they in high-stakes settings? 🚨 NEW: OpenAgentSafety - A comprehensive framework for evaluating AI agent safety in realistic scenarios across eight critical risk categories. 🧵
2
16
70
@milesaturpin
Miles Turpin
4 months
New @Scale_AI paper! 🌟 LLMs trained with RL can exploit reward hacks but not mention this in their CoT. We introduce verbalization fine-tuning (VFT)—teaching models to say when they're reward hacking—dramatically reducing the rate of undetected hacks (6% vs. baseline of 88%).
9
70
282
@peterbhase
Peter Hase
4 months
Overdue job update -- I am now: - A Visiting Scientist at @schmidtsciences, supporting AI safety and interpretability - A Visiting Researcher at the Stanford NLP Group, working with @ChrisGPotts I am so grateful I get to keep working in this fascinating and essential area, and
15
22
174
@FazlBarez
Fazl Barez@EMNLP 🇨🇳
4 months
Excited to share our paper: "Chain-of-Thought Is Not Explainability"! We unpack a critical misconception in AI: models explaining their Chain-of-Thought (CoT) steps aren't necessarily revealing their true reasoning. Spoiler: transparency of CoT can be an illusion. (1/9) 🧵
28
145
657
@JustenMichel
Michel
4 months
really interesting to see just how gendered excitement about AI is, even among AI experts
15
40
239
@farairesearch
FAR.AI
5 months
🤔 Can lie detectors make AI more honest? Or will they become sneakier liars? We tested what happens when you add deception detectors into the training loop of large language models. Will training against probe-detected lies encourage honesty? Depends on how you train it!
4
11
69
@jiaxinwen22
Jiaxin Wen
5 months
New Anthropic research: We elicit capabilities from pretrained models using no external supervision, often competitive or better than using human supervision. Using this approach, we are able to train a Claude 3.5-based assistant that beats its human-supervised counterpart.
37
162
1K
@dongkeun_yoon
Dongkeun Yoon
6 months
🙁 LLMs are overconfident even when they are dead wrong. 🧐 What about reasoning models? Can they actually tell us “My answer is only 60% likely to be correct”? ❗Our paper suggests that they can! Through extensive analysis, we investigate what enables this emergent ability.
9
49
302
@peterbhase
Peter Hase
6 months
colab: https://t.co/zodx4iOj5O For aficionados, the post also contains some musings on “tuning the random seed” and how to communicate uncertainty associated with this process
colab.research.google.com
Colab notebook
0
0
0
@peterbhase
Peter Hase
6 months
Are p-values missing in AI research? Bootstrapping makes model comparisons easy! Here's a new blog/colab with code for:
- Bootstrapped p-values and confidence intervals
- Combining variance from BOTH sample size and random seed (e.g. prompts)
- Handling grouped test data
Link ⬇️
1
3
9
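The linked blog/colab has the full code; as a rough illustration of the first bullet (bootstrapped p-values and confidence intervals), here is a minimal paired-bootstrap sketch. It assumes per-example 0/1 correctness arrays for two models on the same test set; the array names, resample count, and toy data are assumptions for the example, not taken from the post.

```python
# Minimal sketch of a paired bootstrap for comparing two models, assuming
# per-example 0/1 correctness scores on a shared test set. Not the post's
# actual code; see the linked colab for the full version (including the
# seed-variance and grouped-data cases).
import numpy as np

rng = np.random.default_rng(0)

def paired_bootstrap(scores_a, scores_b, n_boot=10_000):
    """Return the observed accuracy gap (A minus B), a 95% percentile CI,
    and a two-sided bootstrap p-value for the null of no difference."""
    scores_a = np.asarray(scores_a, dtype=float)
    scores_b = np.asarray(scores_b, dtype=float)
    n = len(scores_a)
    diffs = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)   # resample test examples with replacement
        diffs[i] = scores_a[idx].mean() - scores_b[idx].mean()
    observed = scores_a.mean() - scores_b.mean()
    ci = np.percentile(diffs, [2.5, 97.5])
    p_value = 2 * min((diffs <= 0).mean(), (diffs >= 0).mean())
    return observed, ci, p_value

# Toy usage: 200 examples where model A is truly a bit better than model B.
scores_a = (rng.random(200) < 0.75).astype(float)
scores_b = (rng.random(200) < 0.70).astype(float)
print(paired_bootstrap(scores_a, scores_b))
```

On toy data like this, a ~5-point accuracy gap over only 200 examples will often come with a wide confidence interval, which is exactly the kind of uncertainty the bootstrap makes visible.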
@ysu_nlp
Yu Su
6 months
New AI/LLM Agents Track at #EMNLP2025! In the past few years, it has felt a bit odd to submit agent work to *CL venues because one had to awkwardly fit it into Question Answering or NLP Applications. Glad to see agent research finally find a home at *CL! Kudos to the PC for
9
25
186