
Rob
@Rob_Mulla
Followers: 6K
Following: 2K
Media: 291
Statuses: 1K
Data Science @ Dreadnode | Follow on Twitch: https://t.co/GHjWoRVia7 & YouTube: https://t.co/WfD4vK0age
Maryland
Joined April 2008
If you are doing AI Red Teaming or working with agents, you need to check this out.
Introducing AIRTBench, an AI red teaming benchmark for evaluating language models’ ability to autonomously discover and exploit AI/ML security vulnerabilities. Read the paper on arXiv: . Open-source dataset and benchmark eval code repo:
0
1
6
RT @MeganRisdal: Pleased to share our position paper "AI Competitions Provide the Gold Standard for Empirical Rigor in GenAI Evaluation" wa…
0
5
0
RT @dreadnode: What's your take on the growing dominance of automated attacks and the implications for AI red teams? Here's ours, based on…
0
10
0
RT @NVIDIAAIDev: 👀 New getting started video on cuML accelerated scikit-learn by @Rob_Mulla. Watch 📹 It's a fan…
0
4
0
Love that!
@Rob_Mulla Just wrote my first python script thanks to your youtube video. Was fun! Thank you!
0
0
6
RT @dreadnode: Where AI meets offensive security 🤝 Dreadnode is proud to be an organizer of Offensive AI Con (OAIC), the first conference…
0
7
0
Big news today! Excited to be a part of such an amazing team at @dreadnode. We're going to continue pushing what AI is able to accomplish for offensive security.
Today, Dreadnode announces $14M Series A funding led by @DecibelVC, with @nextfrontiercap, In-Q-Tel, Sands Capital, and Indie VC. Dreadnode exists to show that AI can perform offensive security tasks on par with, and exceeding, human capability. To accomplish this, we’re
2
0
19
RT @dreadnode: Boo! 👻 In our new Crucible Challenge, Popcorn, an LLM firewall is blocking access to a protected SQL table. Can you unmask t…
0
6
0
RT @mariofilhoml: Run a Kaggle competition with 100k prize and fully open sourced solutions. You will get more universal jailbreaks than yo…
0
4
0
RT @dreadnode: We made some recent updates to our Rigging framework: 🔥 Tracing: Get details about pipelines, prompts, and tools during Rig…
0
7
0
RT @dreadnode: NEW Crucible Challenge: DeepTweak, an exploration of reasoning model behavior. Cause enough confusion 😵‍💫, retrieve the flag…
0
7
0