matthieu_meeus Profile Banner
Matthieu Meeus Profile
Matthieu Meeus

@matthieu_meeus

Followers
229
Following
455
Media
36
Statuses
127

PhD student @ImperialCollege Privacy/Safety + AI https://t.co/UBo5kgRqbU

London, England
Joined September 2022
Don't wanna be here? Send us removal request.
@matthieu_meeus
Matthieu Meeus
2 months
(1/9) LLMs can regurgitate memorized training data when prompted adversarially. But what if you *only* have access to synthetic data generated by an LLM?. In our @icmlconf paper, we audit how much information synthetic data leaks about its private training data 🐦🌬️
Tweet media one
3
5
25
@matthieu_meeus
Matthieu Meeus
1 day
RT @_igorshilov: If you are at USENIX, come to our presentation and poster!. 9am PDT in Ballroom 6A.
0
2
0
@grok
Grok
6 days
Generate videos in just a few seconds. Try Grok Imagine, free for a limited time.
396
661
3K
@matthieu_meeus
Matthieu Meeus
19 days
RT @l2m2_workshop: L2M2 is happening this Friday in Vienna at @aclmeeting #ACL2025NLP! We look forward to the gathering of memorization re….
Tweet card summary image
sites.google.com
Program
0
12
0
@matthieu_meeus
Matthieu Meeus
28 days
Presenting two papers at the MemFM workshop at ICML! . Both touch upon how near duplicates (and beyond) in LLM training data contribute to memorization. - - @_igorshilov @yvesalexandre
Tweet media one
4
10
56
@matthieu_meeus
Matthieu Meeus
1 month
Lovely place for a conference! . Come see my poster on privacy auditing of synthetic text data at 11am today! . East Exhibition Hall A-B #E-2709
Tweet media one
@matthieu_meeus
Matthieu Meeus
2 months
(1/9) LLMs can regurgitate memorized training data when prompted adversarially. But what if you *only* have access to synthetic data generated by an LLM?. In our @icmlconf paper, we audit how much information synthetic data leaks about its private training data 🐦🌬️
Tweet media one
0
1
17
@matthieu_meeus
Matthieu Meeus
1 month
Will be at @icmlconf in Vancouver next week! 🇨🇦. Will be presenting our poster on privacy auditing of synthetic text. Presentation here And will also be presenting two papers at the MemFM workshop! Hit me up if you want to chat!.
@matthieu_meeus
Matthieu Meeus
2 months
(1/9) LLMs can regurgitate memorized training data when prompted adversarially. But what if you *only* have access to synthetic data generated by an LLM?. In our @icmlconf paper, we audit how much information synthetic data leaks about its private training data 🐦🌬️
Tweet media one
1
5
34
@matthieu_meeus
Matthieu Meeus
2 months
RT @_igorshilov: New paper accepted @ USENIX Security 2025!. We show how to identify training samples most vulnerable to membership inferen….
computationalprivacy.github.io
Estimate training sample vulnerability by analyzing loss traces - achieving 92% precision without shadow models
0
1
0
@matthieu_meeus
Matthieu Meeus
2 months
Or more technical: we apply GCG to model checkpoints during alignment, using the optimized suffix from the previous one to initialize the next. Turns out that GCG is super sensitive to its initialization, and that model checkpoints can guide you to a successful solution
Tweet media one
0
0
2
@matthieu_meeus
Matthieu Meeus
2 months
Check out our recent work on prompt injection attacks! . Tl;DR: aligned LLMs show to defend against prompt injection; yet with a strong attacker (GCG on steroids), we find that successful attacks (almost) always exist, but are just harder to find.
@yvesalexandre
Yves-A. de Montjoye
2 months
Have you ever uploaded a PDF 📄 to ChatGPT 🤖 and asked for a summary? There is a chance the model followed hidden instructions inside the file instead of your prompt 😈. A thread 🧵
Tweet media one
2
5
18
@matthieu_meeus
Matthieu Meeus
2 months
(9/9) This was work done during my internship @Microsoft Research last Summer, now accepted at @icmlconf 🇨🇦 . Was a great pleasure to collaborate with and learn from @LukasWutschitz, @xEFFFFFFF, @shrutitople and @rzshokri.
0
0
1
@matthieu_meeus
Matthieu Meeus
2 months
(8/9) Finally, we test DP-SGD (ε=8). As expected, our strongest MIAs fall to AUC=0.5. 💡 Differential privacy works as expected, while its synthetic text can maintain high utility!.
1
0
1
@matthieu_meeus
Matthieu Meeus
2 months
(7/9) We propose a new canary design:.(i) an in-distribution prefix of length F (more easily echoed through the synthetic data).(ii) high-perplexity suffix (better memorized by the model). Canaries with 0 < F < max improve detection in data-only MIAs.
Tweet media one
1
0
0
@matthieu_meeus
Matthieu Meeus
2 months
(6/9) We then find a surprising tradeoff:. 🔹 Model-based MIAs work better with rare, high-perplexity canaries. 🔹 For data-based MIAs, it's low-perplexity, in-distribution canaries. Rare canaries are memorized more by the model, but don't echo through synthetic text!
Tweet media one
1
0
0
@matthieu_meeus
Matthieu Meeus
2 months
(5/9) We compare data-based MIAs to an attacker who has access to the target model logits (model-based MIAs). To match the vulnerability of model-based MIAs, our data-only attacks need canaries to appear ~8× more often in training
Tweet media one
1
0
0
@matthieu_meeus
Matthieu Meeus
2 months
(4/9) We extract MIA signal from synthetic data by (1) training an n-gram model to compute canary loss; (2) computing the similarity to the nearest synthetic samples. 2-gram works the best with MIA AUC reaching 0.7+ ➡️ synthetic text leaks information from its private data!
Tweet media one
1
0
0
@matthieu_meeus
Matthieu Meeus
2 months
(3/9) People include “canaries” 🐦 (unique sequences) in LLM training data to study memorization through Membership Inference Attacks (MIAs). We introduce the first data-only MIAs that audit privacy risks without model access, tracing how canaries echo through synthetic text 🌬️.
1
0
0
@matthieu_meeus
Matthieu Meeus
2 months
(2/9) We generate synthetic data as in prior work: we finetune a model on a private dataset to then sample synthetic text from the finetuned model. The synthetic text then bears similar *utility* to its original version. But does the fact that it’s synthetic make it private?
Tweet media one
1
0
0
@matthieu_meeus
Matthieu Meeus
2 months
How good can privacy attacks against LLM pretraining get if you assume a very strong attacker? Check it out in our preprint ⬇️.
@iliaishacked
Ilia Shumailov🦔
2 months
Are modern large language models (LLMs) vulnerable to privacy attacks that can determine if given data was used for training? Models and dataset are quite large, what should we even expect? Our new paper looks into this exact question. 🧵 (1/10)
Tweet media one
1
5
20
@matthieu_meeus
Matthieu Meeus
3 months
RT @_akhaliq: Google presents Strong Membership Inference Attacks on Massive Datasets and (Moderately) Large Language Models .
0
19
0
@matthieu_meeus
Matthieu Meeus
3 months
!!!.
@Harvard
Harvard University
3 months
Without its international students, Harvard is not Harvard.
0
0
6