
Michael Aerni
@AerniMichael
Followers: 167 · Following: 541 · Media: 11 · Statuses: 83
AI privacy and security | PhD student @CSatETH | Ask me about coffee ☕️
Zurich
Joined November 2017
LLMs may be copying training data in everyday conversations with users! In our latest work, we study how often this happens compared to humans. 👇🧵
RT @NKristina01_: We will present our spotlight paper on the 'jailbreak tax' tomorrow at ICML; it measures how useful jailbreak outputs are…
Imagine LLMs could tell you the future. But properly evaluating forecasts is incredibly tricky! This paper contains so many interesting thoughts about all the things that can go wrong.
How well can LLMs predict future events? Recent studies suggest LLMs approach human performance. But evaluating forecasters presents unique challenges compared to standard LLM evaluations. We identify key issues with forecasting evaluations 🧵 (1/7)
IMO it's very important to measure LLM utility on tasks that we actually want them to perform well on, not just hard sandbox tasks. This is an excellent benchmark that does exactly that!
1/ Excited to share RealMath: a new benchmark that evaluates LLMs on real mathematical reasoning---from actual research papers (e.g., arXiv) and forums (e.g., Stack Exchange).
RT @NKristina01_: Congrats, your jailbreak bypassed an LLM's safety by making it pretend to be your grandma! But did the model actually giv…
RT @edoardo_debe: 1/🔒Worried about giving your agent advanced capabilities due to prompt injection risks and rogue actions? Worry no more!…
RT @florian_tramer: I'll be mentoring MATS for the first time this summer, together with @dpaleka! Link below to apply.
RT @CSatETH: 🔎Can #AI models be "cured" after a cyber attack? New research from @florian_tramer's Secure and Private AI Lab reveals that re…
RT @javirandor: Adversarial ML research is evolving, but not necessarily for the better. In our new paper, we argue that LLMs have made pro…
RT @niloofar_mire: I've been thinking about Privacy & LLMs work for 2025 - here are 5 research directions and some key papers on privacy/me…
I am in beautiful Vancouver for #NeurIPS2024 with these amazing folks! Say hi if you want to chat about ML privacy and security (or specialty ☕).
🔥 I'm thrilled that I'll be spending next year in the group of @florian_tramer at ETH Zurich, working on privacy and memorization in ML 🔥 (Not an announcement, just what I usually do. It's a great group full of amazing people, and I'm thrilled to work with them every day!)
📖 Measuring Non-Adversarial Reproduction of Training Data in Large Language Models
➡️ Full paper:
✏️ Blog post with interactive examples:
Joint work with @javirandor, @edoardo_debe, Nicholas Carlini, @daphneipp, @florian_tramer.
spylab.ai
We show that LLMs often reproduce short snippets of training data even for natural and benign (non-adversarial) tasks.