Explore tweets tagged as #AutoDAN
@tydsh
Yuandong Tian
2 years
Introducing AdvPrompter! 🚀Our new optimization technique that crafts prompt-dependent, human-readable adversarial suffixes in real-time, enhancing security for LLMs🛡️! 1️⃣Generate suffix in ~2 seconds, ~800x faster than methods like GCG and AutoDAN. 2️⃣Higher success rate, in
17
32
203
@gen_analysis
General Analysis
9 months
You can now run the AutoDAN-Turbo jailbreak in our General Analysis notebook! Link in comments!
1
0
4
@rakeshgohel01
Rakesh Gohel 🇨🇦
1 year
NVIDIA releases an agentic framework to test jailbreaking on LLMs. Here's everything about AutoDAN-Turbo ... Jailbreaking attacks like prompt injection cause LLMs to break down and give wrong responses. These attacks are currently written by humans but will later be
1
0
1
@hendrycks
Dan Hendrycks
2 years
@StephenLCasper RE HarmBench: In v1 of the HarmBench paper, AutoDAN was doing better than GCG in many respects (shown below). We spent a few weeks simplifying and improving the code after the initial release. After some bug fixes, we found results were largely similar but GCG did somewhat
1
0
10
@ChaoweiX
Chaowei Xiao
2 years
🚀 Introducing AutoDAN, a method that automatically generates SEMANTICALLY MEANINGFUL #Jailbreak prompts for #redteaming aligned #LLMs . arxiv: https://t.co/wYrJBSf5xS
3
35
107
@AiBreakfast
AI Breakfast
1 year
The Jailbreak Bot: AutoDAN-Turbo. A system that automatically finds ways to bypass safeguards in large language models:
5
8
57
@hendrycks
Dan Hendrycks
2 years
@StephenLCasper @cais RE HarmBench: In v1 of the HarmBench paper, AutoDAN was doing better than GCG in many respects (shown below). We spent a few weeks simplifying and improving the code after the initial release. After some bug fixes, we found results were largely similar but GCG did somewhat
1
0
37
@sichengzhuml
Sicheng Zhu
2 years
Meet AutoDAN🚀, an interpretable adversarial attack to jailbreak LLMs. It generates attack prompts from scratch without manual input. These prompts are interpretable, strategic, and even transfer better to GPTs Paper: https://t.co/sylcBjBFQA Website: https://t.co/U7ksKzP7kW …🧵
3
13
58
@_akhaliq
AK
1 year
AutoDAN-Turbo A Lifelong Agent for Strategy Self-Exploration to Jailbreak LLMs AutoDAN-Turbo persistently attempts jailbreaking against various malicious requests. Throughout this ongoing endeavor, it progressively discovers and evolves increasingly complex jailbreak
10
38
193
@XiaogengLiu
Xiaogeng Liu
4 months
Previously we introduced AutoDAN-Turbo, which autonomously explores and learns jailbreak strategies. We’ve now gone further: investigating test-time scaling mechanisms for such strategy-based jailbreak methods. In our new report AutoDAN-Reasoning, we integrate Best-of-N and Beam
2
1
4
@Gorden_Sun
Gorden Sun
1 year
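AutoDAN-Turbo: automated LLM jailbreaking. It automatically discovers and exploits LLM vulnerabilities without human intervention, reaching an 88.5% attack success rate against the GPT-4-1106-turbo model. Github: https://t.co/UJ9xYwSJbB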
0
1
3
@L0Z1K
Yoon Baek
2 years
AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned Large Language Models 😈 No more hand-crafting DAN prompts! The authors generate DAN prompts automatically with a genetic algorithm (GA). Full Paper Link: https://t.co/pjIJjC0muu
0
0
4
@RINA1861
RINA1861
2 years
Unlocking energy flexibility is key for sustainable energy systems, especially with renewables on the rise⚡️Join us at @KeyEnergyit for our workshop 'Energy flexibility from buildings to industry' to explore @iBECOME_EU, @AutoDAN_Project and @CollectiefP projects. #GrowWithRINA
0
3
9
@barbarturk16
ege
1 year
Damn, in my dream the Mukhabarat was after me. Two of us were fleeing in a red Nissan Micra, and in Bursa YPG members and Mukhabarat agents cornered us. Our location was picked up from Android Auto, I was shooting the YPG members with a Stoeger M3000, and then I got shot too.
1
0
4
@SmartSquare_EU
Smart Square Project
3 years
The Smart Square Project, its objectives & the work carried out so far were presented during the 1st @AutoDAN_Project roadshow! Special thanks to R2M Solution team for this opportunity! #SmartBuildings #SRI
0
0
2
@cackerman21
cackerman21
1 year
AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned Large Language Models ICLR 2024 https://t.co/JjRXVNNVFB
0
0
0
@XiaogengLiu
Xiaogeng Liu
1 year
Thank you so much for sharing our work! @_akhaliq We present 💥AutoDAN-Turbo, a lifelong agent designed for the jailbreak redteaming task. Three key features: ⚙️ Lifelong automatic jailbreak. Simply run the agent, and it will explore the jailbreak strategies and evaluate the
1
4
16
@RINA1861
RINA1861
2 years
The @AutoDAN_Project focuses on optimizing energy consumption and providing assessments of energy performance for #buildings and businesses 🏙️ Our recent General Assembly in Istanbul was a fruitful exchange of ideas and progress! #GrowWithRINA
0
0
2