#AutoDAN X Hashtag | Muskviewer

Explore tweets tagged as #AutoDAN

Yuandong Tian

@tydsh

2 years

Introducing AdvPrompter! 🚀Our new optimization technique that crafts prompt-dependent, human-readable adversarial suffixes in real-time, enhancing security for LLMs🛡️! 1️⃣Generate suffix in ~2 seconds, ~800x faster than methods like GCG and AutoDAN. 2️⃣Higher success rate, in

17

32

203

General Analysis

@gen_analysis

9 months

You can now run the AutoDan Turbo jailbreak on our General Analysis notebook! Link in comments!

1

0

4

Rakesh Gohel 🇨🇦

@rakeshgohel01

1 year

NVIDIA releases an agentic framework to test jailbreaking on LLMs Here's everything about AutoDAN-Turbo ... Jailbreaking attacks like Prompt injection cause LLMs to break down and give wrong responses. Though these attacks are currently written by humans but will later be

1

0

1

Dan Hendrycks

@hendrycks

2 years

@StephenLCasper RE HarmBench: In v1 of the HarmBench paper, AutoDAN was doing better than GCG in many respects (shown below). We spent a few weeks simplifying and improving the code after the initial release. After some bug fixes, we found results were largely similar but GCG did somewhat

1

0

10

Chaowei Xiao

@ChaoweiX

2 years

🚀 Introducing AutoDAN, a method that automatically generates SEMANTICALLY MEANINGFUL #Jailbreak prompts for #redteaming aligned #LLMs. arXiv: https://t.co/wYrJBSf5xS

3

35

107

AI Breakfast

@AiBreakfast

1 year

The Jailbreak Bot: AutoDAN-Turbo. A system that automatically finds ways to bypass safeguards in large language models:

5

8

57

Dan Hendrycks

@hendrycks

2 years

@StephenLCasper @cais RE HarmBench: In v1 of the HarmBench paper, AutoDAN was doing better than GCG in many respects (shown below). We spent a few weeks simplifying and improving the code after the initial release. After some bug fixes, we found results were largely similar but GCG did somewhat

1

0

37

Sicheng Zhu

@sichengzhuml

2 years

Meet AutoDAN🚀, an interpretable adversarial attack to jailbreak LLMs. It generates attack prompts from scratch without manual input. These prompts are interpretable, strategic, and even transfer better to GPTs Paper: https://t.co/sylcBjBFQA Website: https://t.co/U7ksKzP7kW …🧵

3

13

58

AK

@_akhaliq

1 year

AutoDAN-Turbo A Lifelong Agent for Strategy Self-Exploration to Jailbreak LLMs AutoDAN-Turbo persistently attempts jailbreaking against various malicious requests. Throughout this ongoing endeavor, it progressively discovers and evolves increasingly complex jailbreak

10

38

193

Xiaogeng Liu

@XiaogengLiu

4 months

Previously we introduced AutoDAN-Turbo, which autonomously explores and learns jailbreak strategies. We’ve now gone further: investigating test-time scaling mechanisms for such strategy-based jailbreak methods. In our new report AutoDAN-Reasoning, we integrate Best-of-N and Beam

2

1

4

Gorden Sun

@Gorden_Sun

1 year

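AutoDAN-Turbo: automated LLM jailbreaking. Automatically discovers and exploits LLM vulnerabilities with no human intervention. The attack success rate against the GPT-4-1106-turbo model reaches 88.5%. GitHub: https://t.co/UJ9xYwSJbB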

0

1

3

Yoon Baek

@L0Z1K

2 years

AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned Large Language Models 😈Do not make hand-crafted DAN anymore! The authors made DAN Prompt automatically with GA (Genetic Algorithm). Full Paper Link: https://t.co/pjIJjC0muu

0

4

RINA1861

@RINA1861

2 years

Unlocking energy flexibility is key for sustainable energy systems, especially with renewables on the rise⚡️Join us at @KeyEnergyit for our workshop 'Energy flexibility from buildings to industry' to explore @iBECOME_EU, @AutoDAN_Project and @CollectiefP projects. #GrowWithRINA

0

3

9

ege

@barbarturk16

1 year

Damn, in my dream the Mukhabarat were after me. Two of us were fleeing in a red Nissan Micra, and in Bursa the YPG members and Mukhabarat agents cornered us. After getting our location from Android Auto, I was shooting the YPG members with a Stoeger M3000, and then I got shot too.

1

0

4

Smart Square Project

@SmartSquare_EU

3 years

The Smart Square Project, its objectives & the work carried out so far were presented during the 1st @AutoDAN_Project roadshow! Special thanks to R2M Solution team for this opportunity! #SmartBuildings #SRI

0

2

cackerman21

@cackerman21

1 year

AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned Large Language Models ICLR 2024 https://t.co/JjRXVNNVFB

0

Xiaogeng Liu

@XiaogengLiu

1 year

Thank you so much for sharing our work! @_akhaliq We present 💥AutoDAN-Turbo, a lifelong agent designed for the jailbreak redteaming task. Three key features: ⚙️ Lifelong automatic jailbreak. Simply run the agent, and it will explore the jailbreak strategies and evaluate the

1

4

16

Vlad Ruso PhD

@vlruso

1 year

AutoDAN-Turbo: A Black-Box Jailbreak Method for LLMs with a Lifelong Agent https://t.co/pjfGi24Y6J #AutoDANTurbo #LLMSafety #JailbreakAttacks #AIInnovation #MachineLearning #ai #news #llm #ml #research #ainews #innovation #artificialintelligence #machinelearning #technology #…

0

RINA1861

@RINA1861

2 years

The @AutoDAN_Project focuses on optimizing energy consumption and providing assessments of energy performance for #buildings and businesses 🏙️ Our recent General Assembly in Istanbul was a fruitful exchange of ideas and progress! #GrowWithRINA

0

2