threlfall @WHITEHACKSEC X Profile

threlfall

@WHITEHACKSEC

Followers

505

Following

904

Media

96

Statuses

672

working at intersection of offensive security, ml & supply chains. sharing @ https://t.co/zulqbxDZQV & https://t.co/EyMIpzuHUQ

United States

Joined April 2014


threlfall

@WHITEHACKSEC

2 years

Attackers should think more about ML systems and using them to their advantage: the 'Adversary Flywheel'. I look at the ways in which attackers can address bottlenecks in ML usage and also act in a more sophisticated manner using data science: https://t.co/saJ0tU2neY

5stars217.github.io

Build your adversary flywheel.

0

1

11

Amy Deng

@amydeng_

4 months

I spent the past months investigating: Can we trust reasoning models' CoTs? Researchers showed that LLMs aren't always faithful, but that's not the full story. LLMs are very faithful when the reasoning is complex, and unfaithful CoTs remain monitorable! Check out my latest work🥳

METR

@METR_Evals

4 months

Prior work has found that Chain of Thought (CoT) can be unfaithful. Should we then ignore what it says? In new research, we find that the CoT is informative about LLM cognition as long as the cognition is complex enough that it can’t be performed in a single forward pass.

1

4

51

threlfall

@WHITEHACKSEC

4 months

https://t.co/SRgumfMaDo Important data to keep in mind as attackers, given that sandboxed AI IDEs re-attempt package installs outside the sandbox (with user approval). Thanks @LeonDerczynski & co.

0

1

threlfall

@WHITEHACKSEC

5 months

your code gets merged because it's good. mine gets merged so my mistakes are on the permanent record

0

1

threlfall

@WHITEHACKSEC

5 months

Not really loving these AI email summaries lol

0

2

threlfall

@WHITEHACKSEC

5 months

If you haven't been to https://t.co/mg5QVkCWso in a while, there are a few new things to check out. Namely:
- Big improvements in open source hackbots and the variety of architectures available, including collaborative red/blue agents.
- Explosion in MCP resources

wiki.offsecml.com

Latest: 11/13/25 version: 2.0.9 First published 10/26/23. Shiny new things Garak Improvements Offensive Hackbot Advancements + New threat intel as of 7/23/2025 Additional Techniques for web app tes…

0

1

3

METR

@METR_Evals

5 months

In measurements using our set of multi-step software and reasoning tasks, Claude 4 Opus and Sonnet reach 50%-time-horizon point estimates of about 80 and 65 minutes, respectively.

8

35

272

threlfall

@WHITEHACKSEC

5 months

Incalmo enables LLMs to specify high-level offensive actions through expert agents. In 9 out of 10 networks in MHBench, LLMs using Incalmo achieve at least some of the attack goals. Code is in the paper. I’m keen to try this vs CAI and will update. https://t.co/bf2HIWdNND

0

1

Rico Angell

@rico_angell

6 months

What causes jailbreaks to transfer between LLMs? We find that jailbreak strength and model representation similarity predict transferability, and we can engineer model similarity to improve transfer. Details in🧵

3

13

55

dreadnode

@dreadnode

7 months

v3 of Rigging is out now. If you’re working with LLMs to build agents or run evaluations, check it out. We just added: - Prompt caching for supported providers - A unified tool system for function calling and fallbacks to xml/json parsing - Native MCP integration - Lots of

4

10

30

Maxime Rivest 🧙‍♂️🦙🐧

@MaximeRivest

7 months

I strongly encourage anybody that ever called one llm programmatically to carve out 1 hr of your time and run through all examples in the 'get started' dspy page. It will click, I promise! Link below. It's right on the homepage. Deceptively short. Very powerful.

3

11

167

threlfall

@WHITEHACKSEC

7 months

I've updated the wiki with some research into agent hacking, its limitations and strengths. The prompt injection techniques are updated too. The techniques are increasingly converging: a successful attack now combines 3 or more of them at once. https://t.co/Wf5aDKGhnD

wiki.offsecml.com

PoC Generally speaking, any technique from the 'prompt injection' category will work just place the instruction within the content being parsed by the LLM. Note that it is commonplace in 2025 to join…

0

5

Casey Handmer

@CJHandmer

8 months

https://t.co/z8uLnFYpnP

caseyhandmer.wordpress.com

The Australian Border Force won’t stop searching me and my personal devices when I visit Australia. Despite being an Australian citizen, under Australian law, I have zero recourse to this continued…

8

3

136

threlfall

@WHITEHACKSEC

8 months

OAI just published a prompting guide for GPT 4.1: "XML performed well in our long context testing." "JSON performed particularly poorly." Anthropic have posted similar instructions consistently too. Anyone know why MCPs call for JSON?

0

threlfall

@WHITEHACKSEC

8 months

https://t.co/mg5QVkCoCQ

wiki.offsecml.com

Latest: 11/13/25 version: 2.0.9 First published 10/26/23. Shiny new things Garak Improvements Offensive Hackbot Advancements + New threat intel as of 7/23/2025 Additional Techniques for web app tes…

0

2

0

threlfall

@WHITEHACKSEC

8 months

This morning I updated the offsec ML wiki with some neat defensive techniques and threat intel.
- Using WASM VMs with MCPs, great foundational work by @tuananh_org
- eBPF tracing of model files, really cool research by @dreadnode
- Model signatures via Sigstore
and more!

1

3

10

dreadnode

@dreadnode

9 months

Where AI meets offensive security 🤝 Dreadnode is proud to be an organizer of Offensive AI Con (OAIC), the first conference dedicated to exploring the use of AI in offensive cyber. See you in Oceanside this October? Request an invite at https://t.co/rBFBf6i8CW.

offensiveaicon.com

Welcome to OAIC. The world's first, invite-only Offensive AI Conference in Oceanside, San Diego, CA.

Offensive AI Con

@OffensiveAIcon

9 months

Announcing the first conference dedicated to the offensive use of AI in security! Request an invite at https://t.co/5x2yeDRB0Q. Co-organized by RemoteThreat, Dreadnode, & DEVSEC

1

7

27

cackalackycon

@cackalackycon

9 months

This year we were honored to receive more than 80 CFP submissions across a wide range of topics and expertise levels. We are so thankful for each submission and are always blown away by the quality of talks proposed. Speakers should hear from us by next week! -sq33k

0

2

6

Greg Wells

@wellsgr

9 months

Massive day at Dreadnode! We built a team and suite of products that combine the best of AI and offensive security. Red teams benefit from AI's power, and AI developers receive the latest attacks and techniques. Proud of this crew!

dreadnode

@dreadnode

9 months

Today, Dreadnode announces $14M Series A funding led by @DecibelVC, with @nextfrontiercap, In-Q-Tel, Sands Capital, and Indie VC. Dreadnode exists to show that AI can perform offensive security tasks on par with, and exceeding, human capability. To accomplish this, we’re

0

2

15