
Raymond Douglas
@raymondadouglas
Followers: 136
Following: 526
Media: 5
Statuses: 48
Semi-professionally trying to figure out what a good future would look like
Joined December 2020
Great news, we're in the world where ChatGPT safeguards can be bypassed by invoking the name of Moloch.
theatlantic.com
OpenAI’s chatbot also said “Hail Satan.”
RT @jankulveit: We're presenting ICML Position "Humanity Faces Existential Risk from Gradual Disempowerment": come talk to us today East E…
RT @DavidDuvenaud: It's hard to plan for AGI without knowing what outcomes are even possible, let alone good. So we’re hosting a workshop!…
RT @tomekkorbak: I reimplemented the bliss attractor eval from Claude 4 System Card. It's fascinating how LLMs reliably fall into attractor…
RT @MariusHobbhahn: LLMs Often Know When They Are Being Evaluated! We investigate frontier LLMs across 1000 datapoints from 61 distinct da…
RT @DavidDuvenaud: What to do about gradual disempowerment? We laid out a research agenda with all the concrete and feasible research proje…
Very nice approach: using tiktok metadata to approximate the global distribution of AI-generated content.
.@jankulveit and @raymondadouglas discuss the need for cultural metrics to track the prevalence of AI in culture in their recent gradual disempowerment paper. One outcome of this work ⬇️ is the ability to do that! 🧵
codex web UI, codex-1 model, codex CLI. if only there were some official list explaining what all the names referred to. but what would even you call such a document?
cheat sheet: as a reminder, Codex (2025, web UI) is powered by codex-1, our new model built on OpenAI o3. you can't use codex-1 in Codex CLI, but you CAN use…
RT @AndrewCritchPhD: @eshear I would even say "in living systems, correlation is often compensation, because almost everything is part of o…
RT @dpaleka: 3.7 sonnet: *hands behind back* yes the tests do pass. why do you ask. what did you hear. 4o: yes you are Jesus Christ's broth…