Sarah Schwettmann

@cogconfluence

Followers: 3K
Following: 6K
Media: 338
Statuses: 2K

Co-founder and Chief Scientist, @TransluceAI // Research Scientist, @MIT_CSAIL

dessert of the real
Joined October 2015
@TransluceAI
Transluce
1 month
We’re open-sourcing Docent under an Apache 2.0 license. Check out our public codebase to self-host Docent, peek under the hood, or open issues & pull requests! The hosted version remains the easiest way to get started with one click and use Docent with zero maintenance overhead.
@TransluceAI
Transluce
2 months
Docent, our tool for analyzing complex AI behaviors, is now in public alpha! It helps scalably answer questions about agent behavior, like “is my model reward hacking” or “where does it violate instructions.” Today, anyone can get started with just a few lines of code!
1
13
75
@sayashk
Sayash Kapoor
2 months
Agent benchmarks lose *most* of their resolution because we throw out the logs and only look at accuracy. I’m very excited that HAL is incorporating @TransluceAI’s Docent to analyze agent logs in depth. Peter’s thread is a simple example of the type of analysis this enables,
@PKirgis
Peter Kirgis
2 months
OpenAI claims hallucinations persist because evaluations reward guessing and that GPT-5 is better calibrated. Do results from HAL support this conclusion? On AssistantBench, a general web search benchmark, GPT-5 has higher precision and lower guess rates than o3!
3
12
70
@TransluceAI
Transluce
2 months
At Transluce, we train investigator agents to surface specific behaviors in other models. Can this approach scale to frontier LMs? We find it can, even with a much smaller investigator! We use an 8B model to automatically jailbreak GPT-5, Claude Opus 4.1 & Gemini 2.5 Pro. (1/)
5
39
244
@metasj
Sam Klein📚🏛️
2 months
@ImanolSchlag and team at SwissAI just released Apertus, a glorious 70B model trained on 1000+ languages. People across the #PublicAI network have been building a publicly hosted frontend for it: try it out via the new inference utility at https://t.co/Vy8bvBNyxX ! #SwissAIWeeks
publicai.co
A nonprofit, open-source service to make public and sovereign AI models more accessible.
@ptsankov
Petar Tsankov
2 months
We ran a full security & compliance evaluation of the just released 🇨🇭 Swiss LLM, 🤖 Apertus, developed by ETH Zurich & EPFL. Answers to most common questions below 👇 1/10
0
2
9
@cogconfluence
Sarah Schwettmann
2 months
and a skyspace from James Turrell, whose work with light inspired my Vision in Art and Neuroscience class at MIT for nearly a decade. sf has a way of sneaking up on my senses with the surprisingly familiar 🫶
0
0
14
@cogconfluence
Sarah Schwettmann
2 months
found two things in the de Young sculpture garden today that I had no idea were here! a beehive piece from Pierre Huyghe, who I worked with in 2022 to install a hive in simulation (along with a real one) on an island in Norway…
1
0
24
@TransluceAI
Transluce
2 months
Docent, our tool for analyzing complex AI behaviors, is now in public alpha! It helps scalably answer questions about agent behavior, like “is my model reward hacking” or “where does it violate instructions.” Today, anyone can get started with just a few lines of code!
6
35
200
@cogconfluence
Sarah Schwettmann
2 months
keeping you fed and hydrated 🫡
@verdakorz
verda🪄✨
2 months
went to a san francisco party yesterday evening
0
1
21
@TransluceAI
Transluce
3 months
This Friday we're hosting "From Theory to Practice to Policy", a fireside chat between Yo Shavit (@yonashav) and Shafi Goldwasser. If you're local to SF and interested in the relationship between new technologies and policy, register to join! https://t.co/Or3R9E79uk
luma.com
Join Yonadav Goldwasser Shavit (OpenAI) and Shafi Goldwasser (UC Berkeley) for a discussion spanning theory, practice, and policy. Topics we'll discuss…
2
7
25
@aryaman2020
Aryaman Arora
3 months
if you think data cleaning is beneath you then ngmi
@heeney_luke
Luke Heeney
3 months
Academia must be the only industry where extremely high-skilled PhD students spend much of their time doing low value work (like data cleaning). A 1st year management consultant outsources this immediately. Imagine the productivity gains if PhDs could focus on thinking
11
30
675
@lukebeehewitt
Luke 🐝 Hewitt
3 months
Largest ever (by far) randomized controlled trial evaluating the persuasive capabilities of LLMs
@KobiHackenburg
Kobi Hackenburg
3 months
Today (w/ @UniofOxford @Stanford @MIT @LSEnews) we’re sharing the results of the largest AI persuasion experiments to date: 76k participants, 19 LLMs, 707 political issues. We examine “levers” of AI persuasion: model scale, post-training, prompting, personalization, & more 🧵
1
3
6
@aryaman2020
Aryaman Arora
3 months
maybe I will live tweet the actionable interp workshop panel
11
8
100
@cogconfluence
Sarah Schwettmann
3 months
opportune moment for a pic of a talk written in blood @ActInterp
@ActInterp
Actionable Interpretability Workshop ICML2025
3 months
Huge thanks to Sarah Schwettmann for a fascinating keynote on "AI Investigators for Understanding AI Systems" 🤖 @cogconfluence @TransluceAI
0
0
22
@TransluceAI
Transluce
3 months
At #ICML2025? Come chat about investigator agents and model behavior with @ChowdhuryNeil and @_ddjohnson at West Exhibition Hall #1012, now until 1:30pm
0
3
16
@aryaman2020
Aryaman Arora
3 months
please come to East building poster #1108 (ballroom A) rn
@ZhengxuanZenWu
Zhengxuan Wu
4 months
ICML ✈️ this week. open to chat and learn mech interp from you. @aryaman2020 and i have cool ideas about steering, just come to our AxBench poster. new steering blog: https://t.co/ZPIIejq82M Chinese version:
2
8
43
@WiMLworkshop
WiML
4 months
First Panel at WiML @ ICML 2025! Join us for a candid convo on career pivots, leadership & growth with: Amy (@yayitsamyzhang) • Eleni (@Eleni30fillou) • Sarah (@cogconfluence) 🗓️ Wed 11am #WiML #ICML2025
0
7
16
@_ddjohnson
Daniel Johnson
4 months
I'll be at ICML! Stop by our Thursday morning poster to hear about our investigator agents. Also excited to talk to people about understanding LM behaviors and personas during the conference! Feel free to reach out, DMs open!
@TransluceAI
Transluce
4 months
We'll be at #ICML2025 🇨🇦 this week! Here are a few places you can find us: Monday: Jacob (@JacobSteinhardt) speaking at Post-AGI Civilizational Equilibria ( https://t.co/wtratbvRnF) Wednesday: Sarah (@cogconfluence) speaking at @WiMLworkshop at 10:15 and as a panelist at 11am
0
2
21
@WiMLworkshop
WiML
4 months
Exciting! Don’t miss Sarah (@cogconfluence) speaking at 10:15am and joining the Redefining Success panel at 11am. See you there! 🇨🇦 #WiML #ICML2025
@TransluceAI
Transluce
4 months
We'll be at #ICML2025 🇨🇦 this week! Here are a few places you can find us: Monday: Jacob (@JacobSteinhardt) speaking at Post-AGI Civilizational Equilibria ( https://t.co/wtratbvRnF) Wednesday: Sarah (@cogconfluence) speaking at @WiMLworkshop at 10:15 and as a panelist at 11am
0
1
5
@TransluceAI
Transluce
4 months
We'll be at #ICML2025 🇨🇦 this week! Here are a few places you can find us: Monday: Jacob (@JacobSteinhardt) speaking at Post-AGI Civilizational Equilibria ( https://t.co/wtratbvRnF) Wednesday: Sarah (@cogconfluence) speaking at @WiMLworkshop at 10:15 and as a panelist at 11am
1
7
40
@cogconfluence
Sarah Schwettmann
4 months
1
0
9