Joshua Saxe

@joshua_saxe

Followers
3K
Following
26K
Media
236
Statuses
3K

AI+cybersecurity at Meta; past lives in academic history, labor / community organizing, classical/jazz piano, hacking scene

Wichita, KS
Joined May 2013
@joshua_saxe
Joshua Saxe
2 months
Today our AI security team @ Meta launched open source tools to support the open source GenAI ecosystem, including:
- LlamaFirewall, a security-first guardrail framework for mitigating agentic prompt injection, misalignment, and insecure coding risks:
3
46
131
@joshua_saxe
Joshua Saxe
4 days
0
1
0
@joshua_saxe
Joshua Saxe
4 days
AI security notes, July 8th: Improving our (poor) ability to forecast AI so we can do better AI security work
1
1
8
@joshua_saxe
Joshua Saxe
7 days
RT @nabeelqu: Ok, a few reflections on the book: 1. qntm defines antimemes as self-erasing information, but this book has a different (but…
0
117
0
@joshua_saxe
Joshua Saxe
7 days
RT @natolambert: Helen (@hlntnr) is one of those people who you should be following closely if you care about where AI is heading and the (…
0
39
0
@joshua_saxe
Joshua Saxe
9 days
RT @xwang_lk: The teasing LeCun gets from some LLM believers today might be nothing compared to the skepticism he faced in the 90s. Back th…
0
36
0
@joshua_saxe
Joshua Saxe
14 days
RT @EdwardRaffML: Hark, a book has appeared! "How Large Language Models Work" with @drewfarris & @BlancheMinerva, is finally officially do…
0
4
0
@joshua_saxe
Joshua Saxe
15 days
0
0
1
@joshua_saxe
Joshua Saxe
15 days
Thoughts on how fundamental LLM limitations lead to insecure vibe code (and what we can do about this), and also on the right way to think about prompt injection guardrail bypasses
1
0
1
@joshua_saxe
Joshua Saxe
15 days
RT @corbtt: I wouldn't have said this 6 months ago, but I now believe all serious agents will be RL'd on their specific task. The gains are…
0
34
0
@joshua_saxe
Joshua Saxe
22 days
RT @francoisfleuret: BTW, if you don't know those terms:
- "aleatoric randomness" is real randomness, and
- "epistemic randomness" is app…
0
19
0
@joshua_saxe
Joshua Saxe
1 month
RT @AlexGDimakis: There are still posts about 'new papers showing AI models cannot reason'. There are unfortunately problems into how these…
0
19
0
@joshua_saxe
Joshua Saxe
1 month
RT @ezraklein: Many of my more leftist friends (and frenemies) have pushed me on whether Abundance has “a theory of power.” I often say i…
0
406
0
@joshua_saxe
Joshua Saxe
1 month
Link to post:
0
0
2
@joshua_saxe
Joshua Saxe
1 month
My (aspirationally) weekly attempt at tracking AI security, on why this might be the highest leverage period for the AI security community, new risks in AI coding, AI social engineering, prompt injection as boring appsec, and the speculative-to-real risk conveyor belt.
2
0
16
@joshua_saxe
Joshua Saxe
2 months
RT @vijaybolina: Introducing Agent Name Service (ANS): DNS for AI agents. Secure, interoperable, and PKI-backed discovery for the future of…
0
20
0
@joshua_saxe
Joshua Saxe
2 months
Link:
0
1
2
@joshua_saxe
Joshua Saxe
2 months
Weekly attempt to think in public about AI security; this week's topics:
- A society of agents should defend our computer networks
- Social engineering as 'patient 0' for attacker AI adoption
- Reinforcement learning as a security data unblocker, and security risk
3
1
8
@joshua_saxe
Joshua Saxe
2 months
RT @francoisfleuret: Years ago I made a list of how to cheat in ML for a course, and it remains surprisingly up-to-date. In increasing ord…
0
32
0
@joshua_saxe
Joshua Saxe
2 months
RT @DrJimFan: The Physical Turing Test: your house is a complete mess after a Sunday hackathon. On Monday night, you come home to an immacu…
0
213
0
@joshua_saxe
Joshua Saxe
2 months
Idea: augment osquery with the ability to mix in calls to a local small LLM that processes query result data and outputs JSON. Now all the endpoints in your enterprise can do LLM analytics on stuff like shell, process tree, and registry data, in parallel, at zero API token cost.
1
0
4
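
A rough sketch of what the osquery idea above could look like on one endpoint. This is only an illustration under assumptions, not anything from the post: `run_osquery`, `annotate_rows`, and `keyword_stub` are hypothetical names, the query is run by shelling out to `osqueryi --json`, and the "local small LLM" is modeled as any callable that takes a prompt string and returns a JSON string. The stub here is a keyword check standing in for a real model so the example runs without one.

```python
import json
import subprocess
from typing import Callable


def run_osquery(sql: str) -> list[dict]:
    """Run one query through the osqueryi shell and parse its JSON rows.

    Assumes osqueryi is installed and on PATH.
    """
    out = subprocess.run(
        ["osqueryi", "--json", sql],
        capture_output=True, text=True, check=True,
    ).stdout
    return json.loads(out)


def annotate_rows(rows: list[dict], llm: Callable[[str], str],
                  instruction: str) -> list[dict]:
    """Attach a model verdict (parsed JSON) to each result row.

    `llm` is a hypothetical hook: any function mapping a prompt to a
    JSON string, e.g. a wrapper around a locally hosted small model.
    """
    annotated = []
    for row in rows:
        prompt = (
            f"{instruction}\n"
            f"Row: {json.dumps(row, sort_keys=True)}\n"
            "Answer with one JSON object."
        )
        try:
            verdict = json.loads(llm(prompt))
        except json.JSONDecodeError:
            # Small models do not always emit valid JSON; record that.
            verdict = {"error": "model did not return valid JSON"}
        annotated.append({**row, "llm": verdict})
    return annotated


def keyword_stub(prompt: str) -> str:
    """Stand-in for a local model: flags 'curl ... | sh' style commands."""
    suspicious = "curl" in prompt and "| sh" in prompt
    return json.dumps({"suspicious": suspicious})
```

On a real endpoint, usage might be `annotate_rows(run_osquery("SELECT pid, cmdline FROM processes;"), my_local_model, "Is this command line suspicious?")`, where `my_local_model` is whatever local inference wrapper is available. Because inference stays on the endpoint, each machine processes its own rows in parallel and no API tokens are spent, which is the point of the original idea.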