Paul Kassianik

@kass_paul

Followers
199
Following
1K
Media
6
Statuses
147

AI researcher @fdtn_ai @Cisco. Formerly @robusthq @SFResearch. Researching AI Security.

SF Bay Area
Joined October 2020
@kass_paul
Paul Kassianik
2 years
🚀 Exciting to present "Tree of Attacks: Jailbreaking Black-Box LLMs Automatically" 📑🌳 Unlock the power of automated jailbreaks using Tree of Attacks with Pruning (TAP).
arxiv.org
While Large Language Models (LLMs) display versatile functionality, they continue to generate harmful, biased, and toxic content, as demonstrated by the prevalence of human-designed jailbreaks. In...
3
9
25
@kass_paul
Paul Kassianik
2 months
RT @natolambert: Is it "bad" that everyone is distilling from / training on Chinese models? While not directly bad, there is a large soft p…
0
7
0
@kass_paul
Paul Kassianik
2 months
RT @fdtn_ai: - We will be presenting "Adversarial Reasoning at Jailbreaking Time" on Wednesday morning, Poster E-805.
0
1
0
@kass_paul
Paul Kassianik
2 months
RT @fdtn_ai: Foundation AI is coming to #ICML! Catch us at any of these events: - Our Chief Scientist @aminkarbasi will be hosting a tutor…
0
1
0
@kass_paul
Paul Kassianik
2 months
RT @dlwh: So about a month ago, Percy posted a version of this plot of our Marin 32B pretraining run. We got a lot of feedback, both public…
0
102
0
@kass_paul
Paul Kassianik
2 months
"We don't have time for proper science, we have to beat other labs on LiveCodeBench". So true -- "number go up" research might be a flag that the victim of the Bitter Lesson is you! @finbarrtimbers
artfintel.com
Far too many people misunderstand the bitter lesson
1
0
2
@kass_paul
Paul Kassianik
3 months
It might very well be that there is no moat in small models, especially domain-specific ones, unless they are a stepping stone to something much more fundamental.
@emollick
Ethan Mollick
3 months
McKinsey's new report on AI agents shows the same mindset I see in many firms: a focus on making small, obsolete models do basic work (look at their suggested models!) rather than realizing that smarter models can do higher-end work (and those models are getting cheaper & better)
0
0
2
@kass_paul
Paul Kassianik
3 months
RT @emollick: McKinsey's new report on AI agents shows the same mindset I see in many firms: a focus on making small, obsolete models do ba…
0
229
0
@kass_paul
Paul Kassianik
3 months
RT @maksym_andr: Check out our new paper on monitoring decomposition jailbreak attacks! Monitoring is (still) an underappreciated research…
0
5
0
@kass_paul
Paul Kassianik
3 months
RT @chargoddard: 🤯 MIND-BLOWN! A new paper just SHATTERED everything we thought we knew about AI reasoning! This is paradigm-shifting. A M…
0
245
0
@kass_paul
Paul Kassianik
3 months
Super excited to collaborate on this work with the amazing @kotekjedi_ml, @maksym_andr, and @jonasgeiping! We can now confidently say that automatic red teaming methods are great harnesses for attacks - LLMs just need to be capable enough to use them!
@kotekjedi_ml
Alexander Panfilov
3 months
Stronger models need stronger attackers! 🤖⚔️ In our new paper we explore how attacker-target capability dynamics affect red-teaming success (ASR). Key insights: 🔸 Stronger models = better attackers 🔸 ASR depends on capability gap 🔸 Psychology >> STEM for ASR. More in 🧵👇
0
1
6
@kass_paul
Paul Kassianik
3 months
RT @natolambert: immortalizing this moment forever when RL is so easy that you can just use random rewards and your benchmarks still go up…
0
58
0
@kass_paul
Paul Kassianik
4 months
0
30
0
@kass_paul
Paul Kassianik
4 months
Pretty sure of all the guides on PPO/GRPO I've seen out there, this is the most simple, straightforward, and accessible one by @YugeTen.
yugeten.github.io
0
0
1
@kass_paul
Paul Kassianik
4 months
lack of focus kills ai labs.
0
0
0
@kass_paul
Paul Kassianik
4 months
Hell yeah!
@Suhail
Suhail
4 months
If Cisco is making foundation models, things are getting very serious.
0
0
0
@kass_paul
Paul Kassianik
4 months
RT @rohanpaul_ai: Automated detection of LLM hallucinations using only correct examples is fundamentally difficult. This paper shows detec…
0
51
0
@kass_paul
Paul Kassianik
4 months
RT @aminkarbasi: We just released: Foundation-Sec-8B, an open-weight, 8-billion-parameter base model purpose-built for cybersecurity. Wh…
0
5
0
@kass_paul
Paul Kassianik
4 months
RT @Cisco: "While security has been blamed for slowing technology adoption in the past, we believe that taking the right approach to safety…
0
8
0
@kass_paul
Paul Kassianik
5 months
I have a theory that the AI hype is simply due to a collective addiction of humanity's top talent to loss curve dopamine.
0
0
1
@kass_paul
Paul Kassianik
6 months
What kind of backdoor is this? @allen_ai, how did this make it into the final dolma PII tagger? Permalink:
0
0
1