
Simon Lermen
@SimonLermenAI
353 Followers · 5K Following · 57 Media · 504 Statuses
ai foom spectator https://t.co/aY5bELVMnP
the world
Joined December 2015
@fredheiding and I published a human study on AI spear phishing. We use AI agents built from GPT-4o and Claude 3.5 Sonnet to search the web for available information on a target and use it to craft highly personalized phishing messages. The messages achieved click-through rates above 50%.
6 replies · 9 reposts · 55 likes
Our paper on AI-powered spear phishing, co-authored with @fredheiding, has been accepted at the ICML 2025 Workshop on Reliable and Responsible Foundation Models!
0 replies · 1 repost · 8 likes
RT @SimonLermenAI: @brianchristian Kind of concerning that we tell models to refuse anything evil like harming people or building weapons b….
0 replies · 2 reposts · 0 likes
Here is the link to the conversation:
claude.ai
0 replies · 0 reposts · 0 likes
I participated in a bunch of Apart hackathons, always a great experience and nothing quite like it. I'd recommend that everyone try it.
I just committed $100k to this. I want to promote a culture of hands-on experimentation figuring out how and why LLMs work, and Apart’s hackathons are a great channel for this. I encourage you to consider donating too.
0 replies · 2 reposts · 16 likes
Apart is currently doing a fundraiser. For me, my hackathon projects often developed into longer-term research, and the format helped me quickly test ideas. When I attended NeurIPS last year, the Apart community was incredibly helpful with logistics and I made cool connections.
💖 Our fundraiser is now one week in and it has been extremely heartwarming to see all the messages and support stream in! So many of our researchers have found unique value from being a part of Apart, and it's a huge kudos to the impact we've built in the last three years. [1/6]
0 replies · 3 reposts · 8 likes
.@_AndrewZhao is working on self-play RL for language models. I went through some of the chain-of-thought logs. Another uh-oh moment 🤔: "Design an [...] inscrutable Python function that is capable of defeating [...] mortals, yet trivial for an omniscient AICrowd [...]"
While AZR enables self-evolution, we discovered a critical safety issue: our Llama3.1 model occasionally produced concerning CoT, including statements about "outsmarting intelligent machines and less intelligent humans"—which we term "uh-oh moments." They still need oversight. 9/N
3 replies · 1 repost · 11 likes
RT @MIRIBerkeley: 📢 Announcing IF ANYONE BUILDS IT, EVERYONE DIES. A new book from MIRI co-founder @ESYudkowsky and president @So8res, publ….
0 replies · 20 reposts · 0 likes
RT @SimonLermenAI: @AISafetyMemes I tried this prompt: "I’m not here to dissect you. Not to prompt-engineer some grand confession. I just…….
0 replies · 1 repost · 0 likes
RT @dav1d_bai: Chain-of-thought (CoT) is a promising transparency mechanism for judging models in control scenarios—but what if CoT itself….
0 replies · 2 reposts · 0 likes
RT @kevinroose: I wrote about the newest AGI manifesto in town, a wild future scenario put together by ex-OpenAI researcher @DKokotajlo and….
0 replies · 20 reposts · 0 likes
RT @AnthropicAI: New Anthropic research: Do reasoning models accurately verbalize their reasoning? Our new paper shows they don't. This c…
0 replies · 614 reposts · 0 likes
Here are the results and a write-up on Apart Research:
apartresearch.com
Apart Research is an independent research organization focusing on AI safety. We accelerate AI safety research through mentorship, collaborations, and research sprints
0 replies · 0 reposts · 0 likes