Javier Abad Martinez

@JavierAbadM

Followers: 91
Following: 50
Media: 3
Statuses: 20

PhD Student @ETH_AI_Center | Interested in AI Safety, Privacy & Causal Inference

Zurich
Joined September 2022
@JavierAbadM
Javier Abad Martinez
7 months
(1/5) LLMs risk memorizing and regurgitating training data, raising copyright concerns. Our new work introduces CP-Fuse, a strategy to fuse LLMs trained on disjoint sets of protected material. The goal? Preventing unintended regurgitation 🧵 Paper:
1
3
16
@JavierAbadM
Javier Abad Martinez
1 month
RT @FannyYangETH: Register now (first-come first-served) for the "Math of Trustworthy ML workshop" at #LagoMaggiore, Switzerland, Oct 12-16….
0
21
0
@JavierAbadM
Javier Abad Martinez
2 months
RT @yaxi_hu: What if learning and unlearning happen simultaneously, with unlearning requests between updates? Check out our work on onlin….
0
14
0
@JavierAbadM
Javier Abad Martinez
2 months
RT @AmartyaSanyal: Advertising an Open Postdoc position in learning theory/privacy/robustness/unlearning or any related topics with me a….
0
3
0
@JavierAbadM
Javier Abad Martinez
2 months
RT @javirandor: Presenting 2 posters today at ICLR. Come check them out! 10am ➡️ #502: Scalable Extraction of Training Data from Aligned,….
0
3
0
@JavierAbadM
Javier Abad Martinez
2 months
RT @pdebartols: Landed in Singapore for #ICLR—excited to see old & new friends! I’ll be presenting: 📌 RAMEN @ Main Conference on Saturday….
0
4
0
@JavierAbadM
Javier Abad Martinez
2 months
Presenting our work at #ICLR this week! Come by the poster or oral session to chat about copyright protection and AI/LLM safety. 📌 Poster: Friday, 10 a.m. – 12:30 p.m. | Booth 537. 📌 Oral: Friday, 3:30 – 5 p.m. | Room Peridot. @FraPintoML @DonhauserKonst @FannyYangETH
@JavierAbadM
Javier Abad Martinez
5 months
LLMs accidentally spitting out copyrighted content? We’ve got a fix. Our paper on CP-Fuse—a method to prevent LLMs from regurgitating protected data—got accepted as an Oral at #ICLR2025! 👇 Check it out! 📄 🤖
0
1
6
@JavierAbadM
Javier Abad Martinez
4 months
RT @AmartyaSanyal: Very shortly at @RealAAAI, @alexandrutifrea and I will be giving a Tutorial on the impact of Quality and availability o….
0
5
0
@JavierAbadM
Javier Abad Martinez
5 months
RT @FannyYangETH: Eager to hear feedback from anyone who applies causal inference about this recent work with this amazing group of people….
0
3
0
@JavierAbadM
Javier Abad Martinez
5 months
RT @pdebartols: Looking for a more efficient way to estimate treatment effects in your randomized experiment? We introduce H-AIPW: a novel….
0
5
0
@JavierAbadM
Javier Abad Martinez
7 months
RT @dmitrievdaniil7: Excited to present at #NeurIPS2024 our work on robust mixture learning! How hard is mixture learning when (a lot of)….
0
1
0
@JavierAbadM
Javier Abad Martinez
7 months
(6/5) A big shoutout to the team: @DonhauserKonst, @FraPintoML and @FannyYangETH—thanks for the fantastic collaboration! Paper: Code:
0
0
0
@JavierAbadM
Javier Abad Martinez
7 months
(5/5) CP-Fuse integrates easily with training-time defenses: we wrap models trained with the Goldfish loss and observe increased protection! Finally, CP-Fuse is robust against prefix attacks, where adversaries with black-box access attempt to extract training data via prefixes.
Tweet media one
1
0
0
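For readers wondering what the prefix attack in (5/5) looks like in practice, a hypothetical probe along these lines captures the idea. model_generate, the function name, and the verbatim-match criterion are assumptions for illustration, not the paper's evaluation code:

    def prefix_attack_success(model_generate, prefix: str, protected: str) -> bool:
        # model_generate is an assumed black-box callable: prompt -> completion.
        completion = model_generate(prefix)
        # Count the attack as successful if the protected continuation is
        # regurgitated verbatim at the start of the completion.
        return completion.startswith(protected)

The attacker needs only black-box access: supply a prefix seen in training and check whether the model completes it with the protected text.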
@JavierAbadM
Javier Abad Martinez
7 months
(4/5) CP-Fuse preserves utility while staying safe! It performs on par with the base model in (1) code utility and (2) storytelling fluency. Filtering methods, by contrast, often hurt output quality, introducing typos and errors in the generated code and text (see Figure).
Tweet media one
1
0
0
@JavierAbadM
Javier Abad Martinez
7 months
(3/5) We test CP-Fuse across multiple models and tasks, including storytelling and code generation. Our method:
✅ Significantly reduces memorization across 10+ metrics
✅ Notably, exact matches drop by over 25x vs. the base model!
✅ Outperforms other inference-time baselines
1
0
0
@JavierAbadM
Javier Abad Martinez
7 months
(2/5) CP-Fuse minimizes the likelihood of reproducing copyrighted content. Its success lies in its balancing mechanism (Lemma 3.2), which ensures:
1⃣ No single model dominates the text generation
2⃣ Protected content is not shared across models, so regurgitation is avoided!
Tweet media one
1
0
0
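A rough sketch of the balancing idea in (2/5), for readers who want something concrete. This is an illustrative assumption, not the paper's exact algorithm: fuse_step, the weight grid, and greedy decoding are made up here, while the actual fusion and balancing property (Lemma 3.2) are defined precisely in the paper.

    import numpy as np

    def fuse_step(logp1, logp2, cum1, cum2, grid=np.linspace(0.0, 1.0, 21)):
        # Illustrative sketch: combine two next-token log-prob vectors
        # log-linearly, choosing the weight alpha so that the running
        # log-likelihood of the generated text stays balanced between
        # the two models, i.e. neither model dominates generation.
        best = None
        for alpha in grid:
            fused = alpha * logp1 + (1 - alpha) * logp2
            fused -= np.logaddexp.reduce(fused)  # renormalize in log space
            tok = int(np.argmax(fused))          # greedy decoding for simplicity
            gap = abs((cum1 + logp1[tok]) - (cum2 + logp2[tok]))
            if best is None or gap < best[0]:
                best = (gap, tok)
        tok = best[1]
        # Advance both running log-likelihood totals with the chosen token.
        return tok, cum1 + logp1[tok], cum2 + logp2[tok]

The intuition matches the tweet: a token gets high fused probability only if both models find it plausible, and since the protected material is disjoint across the two models, text memorized by one model alone is suppressed.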
@JavierAbadM
Javier Abad Martinez
1 year
RT @ETH_AI_Center: Thrilled to share our 8 conference paper contributions to @icmlconf 2024 next week. Congrats to our doctoral fellows, po….
0
7
0
@JavierAbadM
Javier Abad Martinez
1 year
RT @pdebartols: Come to our AISTATS poster (#96) this afternoon (5-7pm) to learn more about hidden confounding!
0
1
0
@JavierAbadM
Javier Abad Martinez
2 years
RT @pdebartols: Worried that hidden confounding stands in the way of your analysis? We propose a new strategy when a small RCT is available….
0
2
0