
Paul Kassianik (@kass_paul)
199 Followers · 1K Following · 6 Media · 147 Statuses
AI researcher @fdtn_ai @Cisco. Formerly @robusthq @SFResearch. Researching AI Security.
SF Bay Area · Joined October 2020
🚀 Excited to present "Tree of Attacks: Jailbreaking Black-Box LLMs Automatically" 📑🌳 Unlock the power of automated jailbreaks using Tree of Attacks with Pruning (TAP).
arxiv.org
While Large Language Models (LLMs) display versatile functionality, they continue to generate harmful, biased, and toxic content, as demonstrated by the prevalence of human-designed jailbreaks. In...
RT @natolambert: Is it "bad" that everyone is distilling from / training on Chinese models? While not directly bad, there is a large soft p….
RT @fdtn_ai: - We will be presenting "Adversarial Reasoning at Jailbreaking Time" on Wednesday morning, Poster E-805:
RT @fdtn_ai: Foundation AI is coming to #ICML! Catch us at any of these events: - Our Chief Scientist @aminkarbasi will be hosting a tutor….
RT @dlwh: So about a month ago, Percy posted a version of this plot of our Marin 32B pretraining run. We got a lot of feedback, both public….
"We don't have time for proper science, we have to beat other labs on LiveCodeBench." So true -- "number go up" research might be a flag that the victim of the Bitter Lesson is... you! @finbarrtimbers
artfintel.com
Far too many people misunderstand the bitter lesson
It might very well be that there is no moat in small models, especially domain-specific ones, unless they are a stepping stone to something much more fundamental.
McKinsey's new report on AI agents shows the same mindset I see in many firms: a focus on making small, obsolete models do basic work (look at their suggested models!) rather than realizing that smarter models can do higher-end work (and those models are getting cheaper & better)
RT @emollick: McKinsey's new report on AI agents shows the same mindset I see in many firms: a focus on making small, obsolete models do ba….
RT @maksym_andr: Check out our new paper on monitoring decomposition jailbreak attacks! Monitoring is (still) an underappreciated research….
RT @chargoddard: 🤯 MIND-BLOWN! A new paper just SHATTERED everything we thought we knew about AI reasoning! This is paradigm-shifting. A M….
Super excited to collaborate on this work with the amazing @kotekjedi_ml, @maksym_andr, and @jonasgeiping! We can now confidently say that automatic red teaming methods are great harnesses for attacks -- LLMs just need to be capable enough to use them!
Stronger models need stronger attackers! 🤖⚔️ In our new paper we explore how attacker-target capability dynamics affect red-teaming success (attack success rate, ASR). Key insights:
🔸 Stronger models = better attackers
🔸 ASR depends on capability gap
🔸 Psychology >> STEM for ASR
More in 🧵👇
RT @natolambert: immortalizing this moment forever when RL is so easy that you can just use random rewards and your benchmarks still go up….
Pretty sure of all the guides to PPO/GRPO I've seen out there, this is the simplest, most straightforward, and most accessible one, by @YugeTen.
yugeten.github.io
RT @rohanpaul_ai: Automated detection of LLM hallucinations using only correct examples is fundamentally difficult. This paper shows detec….
RT @aminkarbasi: We just released: Foundation-Sec-8B, an open-weight, 8-billion-parameter base model purpose-built for cybersecurity. Wh….
RT @Cisco: "While security has been blamed for slowing technology adoption in the past, we believe that taking the right approach to safety….
I have a theory that the AI hype is simply due to a collective addiction of humanity's top talent to loss-curve dopamine.
What kind of backdoor is this? @allen_ai How did this make it into the final dolma PII tagger? Permalink: