
Peter Hase
@peterbhase
Followers: 3K · Following: 2K · Media: 57 · Statuses: 470
Visiting Scientist at Schmidt Sciences. Visiting Researcher at the Stanford NLP Group. Previously: Anthropic, AI2, Google, Meta, UNC Chapel Hill.
New York, NY
Joined April 2019
My last PhD paper: fundamental problems with model editing for LLMs! We present *12 open challenges* with definitions/benchmarks/assumptions, inspired by work on belief revision in philosophy. To provide a way forward, we test model editing against Bayesian belief revision. 🧵
3 replies · 74 reposts · 307 likes
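[The baseline named above is Bayesian belief revision. As a rough illustration only (not the paper's actual benchmarks; all numbers invented), a Bayes update over a single binary proposition looks like this:]

```python
# Hypothetical sketch: Bayesian belief revision for one binary proposition h.
# Everything here is illustrative, not taken from the paper.

def revise(prior: float, likelihood_true: float, likelihood_false: float) -> float:
    """Posterior P(h | e) via Bayes' rule for a binary hypothesis h."""
    evidence = likelihood_true * prior + likelihood_false * (1.0 - prior)
    return likelihood_true * prior / evidence

# A model initially assigns P(h) = 0.2 to a fact; an edit supplies evidence
# that is 9x more likely if the fact is true than if it is false.
posterior = revise(prior=0.2, likelihood_true=0.9, likelihood_false=0.1)
print(f"revised belief: {posterior:.3f}")  # 0.692
```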
RT @hannahrosekirk: My team at @AISecurityInst is hiring! This is an awesome opportunity to get involved with cutting-edge scientific resea…
0 replies · 24 reposts · 0 likes
RT @nouhadziri: Current agents are highly unsafe; o3-mini, one of the most advanced models in reasoning, scores 71% in executing harmful reque…
0 replies · 15 reposts · 0 likes
RT @milesaturpin: New @Scale_AI paper! LLMs trained with RL can exploit reward hacks but not mention this in their CoT. We introduce ver…
0 replies · 77 reposts · 0 likes
Overdue job update -- I am now:
- A Visiting Scientist at @schmidtsciences, supporting AI safety and interpretability.
- A Visiting Researcher at the Stanford NLP Group, working with @ChrisGPotts.
I am so grateful I get to keep working in this fascinating and essential area, and…
15 replies · 22 reposts · 174 likes
RT @FazlBarez: Excited to share our paper: "Chain-of-Thought Is Not Explainability"! We unpack a critical misconception in AI: models exp…
0 replies · 136 reposts · 0 likes
RT @JustenMichel: really interesting to see just how gendered excitement about AI is, even among AI experts
0 replies · 47 reposts · 0 likes
RT @farairesearch: Can lie detectors make AI more honest? Or will they become sneakier liars? We tested what happens when you add decept…
0 replies · 10 reposts · 0 likes
RT @jiaxinwen22: New Anthropic research: We elicit capabilities from pretrained models using no external supervision, often competitive or…
0 replies · 157 reposts · 0 likes
RT @dongkeun_yoon: LLMs are overconfident even when they are dead wrong. What about reasoning models? Can they actually tell us "My an…
0 replies · 49 reposts · 0 likes
colab: For aficionados, the post also contains some musings on "tuning the random seed" and how to communicate uncertainty associated with this process.
colab.research.google.com
Colab notebook
0 replies · 0 reposts · 0 likes
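[A minimal sketch of the "tuning the random seed" idea referenced above: run the same experiment under several seeds and report a mean with a confidence interval rather than a single-seed number. All function names and values are illustrative, not taken from the linked notebook.]

```python
# Hypothetical sketch: quantify seed-to-seed variation of an experiment.
import random
import statistics

def run_experiment(seed: int) -> float:
    """Stand-in for a full training/eval run; returns a scalar metric."""
    rng = random.Random(seed)
    return 0.75 + rng.gauss(0.0, 0.02)  # placeholder metric with seed noise

scores = [run_experiment(seed) for seed in range(10)]
mean = statistics.mean(scores)
sem = statistics.stdev(scores) / len(scores) ** 0.5  # standard error of the mean
print(f"metric: {mean:.3f} +/- {1.96 * sem:.3f} (95% CI over 10 seeds)")
```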
RT @ysu_nlp: New AI/LLM Agents Track at #EMNLP2025! In the past few years, it feels a bit odd to submit agent work to *CL venues because…
0 replies · 24 reposts · 0 likes
RT @vaidehi_patil_: Introducing our @TmlrOrg paper "Unlearning Sensitive Information in Multimodal LLMs: Benchmark and Attack-Defense Eva…
0 replies · 37 reposts · 0 likes
RT @EliasEskin: Extremely excited to announce that I will be joining @UTAustin @UTCompSci in August 2025 as an Assistant Professor! I'm…
0 replies · 65 reposts · 0 likes
RT @rowankwang: New Anthropic Alignment Science blog post: Modifying LLM Beliefs with Synthetic Document Finetuning. We study a technique f…
0 replies · 46 reposts · 0 likes
RT @amuuueller: Lots of progress in mech interp (MI) lately! But how can we measure when new mech interp methods yield real improvements ov…
0 replies · 38 reposts · 0 likes
RT @sydneymlevine: Announcement time! In Spring 2026, I will be joining the NYU Psych department as an Assistant Professor! My lab will s…
0 replies · 13 reposts · 0 likes
RT @maksym_andr: Excited to present our recent work on AI safety at this event! If you're coming to ICLR 2025 in S…
0 replies · 9 reposts · 0 likes
RT @yanda_chen_: My first paper @AnthropicAI is out! We show that Chains-of-Thought often don't reflect models' true reasoning, posing chal…
0 replies · 87 reposts · 0 likes