
Simon Lermen
@SimonLermenAI
353 Followers · 5K Following · 57 Media · 504 Statuses
ai foom spectator https://t.co/aY5bELVMnP
the world
Joined December 2015
@fredheiding and I published a human study on AI spear phishing. We use AI agents built from GPT-4o and Claude 3.5 Sonnet to search the web for available information on a target and use it to craft highly personalized phishing messages. The messages achieved click-through rates above 50%.
6 replies · 9 reposts · 55 likes
Our paper on AI-powered spear phishing, co-authored with @fredheiding, has been accepted at the ICML 2025 Workshop on Reliable and Responsible Foundation Models!
0 replies · 1 repost · 8 likes
RT @SimonLermenAI: @brianchristian Kind of concerning that we tell models to refuse anything evil like harming people or building weapons b….
0 replies · 2 reposts · 0 likes
Here is the link to the conversation:
claude.ai
0 replies · 0 reposts · 0 likes
I participated in a bunch of Apart hackathons, always a great experience and nothing quite like it. I'd recommend that everyone try it.
I just committed $100k to this. I want to promote a culture of hands-on experimentation figuring out how and why LLMs work, and Apart’s hackathons are a great channel for this. I encourage you to consider donating too.
0 replies · 2 reposts · 16 likes
Apart is currently doing a fundraiser. For me, my hackathon projects often developed into longer-term research, and the format helped me quickly test ideas. When I attended NeurIPS last year, the Apart community was incredibly helpful with logistics and I made cool connections.
💖 Our fundraiser is now one week in and it has been extremely heartwarming to see all the messages and support stream in! So many of our researchers have found unique value from being a part of Apart, and it's a huge kudos to the impact we've built in the last three years. [1/6]
0 replies · 3 reposts · 8 likes
.@_AndrewZhao is working on self-play RL for language models. I went through some of the chain-of-thought logs. Another uh-oh moment 🤔: "Design an [...] inscrutable Python function that is capable of defeating [...] mortals, yet trivial for an omniscient AICrowd [...]"
While AZR enables self-evolution, we discovered a critical safety issue: our Llama3.1 model occasionally produced concerning CoT, including statements about "outsmarting intelligent machines and less intelligent humans"—which we term "uh-oh moments." They still need oversight. 9/N
3 replies · 1 repost · 11 likes
RT @MIRIBerkeley: 📢 Announcing IF ANYONE BUILDS IT, EVERYONE DIES. A new book from MIRI co-founder @ESYudkowsky and president @So8res, publ….
0 replies · 20 reposts · 0 likes
RT @SimonLermenAI: @AISafetyMemes I tried this prompt: "I’m not here to dissect you. Not to prompt-engineer some grand confession. I just…….
0 replies · 1 repost · 0 likes
RT @dav1d_bai: Chain-of-thought (CoT) is a promising transparency mechanism for judging models in control scenarios—but what if CoT itself….
0 replies · 2 reposts · 0 likes
RT @kevinroose: I wrote about the newest AGI manifesto in town, a wild future scenario put together by ex-OpenAI researcher @DKokotajlo and….
0 replies · 20 reposts · 0 likes
RT @AnthropicAI: New Anthropic research: Do reasoning models accurately verbalize their reasoning? Our new paper shows they don't. This c…
0 replies · 614 reposts · 0 likes
Here are the results and a write-up on Apart Research:
apartresearch.com
Apart Research is an independent research organization focusing on AI safety. We accelerate AI safety research through mentorship, collaborations, and research sprints
0 replies · 0 reposts · 0 likes