Mansi Sakarvadia
@Mansi__S
Bluesky: @mansisakarvadia.bsky.social Computer Science/Machine Learning Ph.D. Student @UChicago & @globus. @doecsgf Computational Science Graduate Fellow.
Chicago, IL
Joined August 2015
"Without long-term, foundational, and high-risk federal research investments, the seeds of innovation cannot take root," Rebecca Willett and Henry Hoffman write in a new commentary piece.
fortune.com
America’s future economic growth depends on federal research investment in AI and computing.
Check out a recent interview in which I discuss the recent Nobel Prizes and some thoughts on the impact on both the domain sciences and ML communities.
New 🎙️episode: @Mansi__S of @UChicago and @JoshVermaas of @MSUDOEPlantLab discuss the #AI #Nobels and their impact on computing and research: https://t.co/PJOhOHC4oE Both are part of the @doecsgf community. #HPC
Reflecting on my 2024 PhD journey: passed my qualifying exam, spent the summer at Berkeley, mentored undergrad students, and tackled the fast pace of AI/ML research. It’s been a year of milestones and growth! Read more here: https://t.co/8LiZaZd8Rn
#PhDJourney #AIResearch
Congrats to Jordan for winning 1st place at SC24 student poster competition! It was super fun to mentor him this summer on his project "Mind Your Manners: Detoxifying Language Models via Attention Head Intervention".
I am super proud of Jordan Pettyjohn, an undergraduate student I had the privilege of working with this past summer, for winning the Student Research Competition at the @Supercomputing conference! 🏆 His work studied ablation strategies for toxicity in #LLMs.
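The attention-head intervention behind "Mind Your Manners" can be caricatured in a few lines of numpy: zero out the outputs of heads implicated in toxic generations before they are mixed back into the residual stream. This is only a toy sketch with invented dimensions, not the paper's actual implementation, which operates on a real transformer's hidden states.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy per-head attention outputs: (n_heads, seq_len, d_head); sizes are illustrative.
head_outputs = rng.normal(size=(4, 5, 8))

def ablate_heads(head_outputs, heads_to_ablate):
    """Zero the outputs of the selected heads before the output projection,
    leaving every other head untouched."""
    patched = head_outputs.copy()
    patched[list(heads_to_ablate)] = 0.0
    return patched

# Suppose heads 1 and 3 were identified as contributing to toxic outputs.
patched = ablate_heads(head_outputs, heads_to_ablate=[1, 3])
```

In practice the hard part is the one this sketch skips: identifying which heads to ablate and verifying that general language-modeling quality survives.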
Towards Interpreting Language Models: A Case Study in Multi-Hop Reasoning paper: https://t.co/QL7VV6rOF7 This method improves multi-hop reasoning in language models by injecting “memories” into key attention heads, increasing accuracy in complex tasks. An open-source tool,
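The core idea of memory injection, adding an encoded "memory" to a targeted attention head's output at inference time so a missing fact reaches the next-token prediction, might look roughly like this in toy numpy. The dimensions, head choice, and scale below are all hypothetical; the method in the paper works on a real language model's activations.

```python
import numpy as np

rng = np.random.default_rng(0)

n_heads, seq_len, d_head = 4, 5, 8                 # illustrative toy sizes
head_outputs = rng.normal(size=(n_heads, seq_len, d_head))
memory = rng.normal(size=(d_head,))                # stand-in for an encoded fact

def inject_memory(head_outputs, head_idx, memory, scale=1.0):
    """Add a scaled memory vector to one head's output at the final token
    position, where the next-token prediction is formed."""
    patched = head_outputs.copy()
    patched[head_idx, -1, :] += scale * memory
    return patched

patched = inject_memory(head_outputs, head_idx=2, memory=memory, scale=0.5)
```

Only the targeted head's final-position output changes; everything else passes through unmodified, which is what makes the intervention cheap enough to apply during inference.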
Excited to share our latest work: "SoK: On Finding Common Ground in Loss Landscapes Using Deep Model Merging Techniques"! 🧠 https://t.co/XAG79rS3ij By @arhamk1216, @toddknife, @Nchudson95, @Mansi__S, @dcgrzenda, @aswathy__ajith, Jordan Pettyjohn, @chard_kyle, and @ianfoster.
7/ 🙏 Special Thanks: A huge shoutout to my incredible co-authors from multiple institutions for their contributions to this work: @aswathy__ajith, Arham Khan, @Nchudson95, @calebgeniesse, @nsfzyzz, @chard_kyle, @ianfoster, Michael Mahoney
6/ 🌍 Scalable Impact: Our methods aren’t just for small models! We show that they scale effectively to larger LMs, providing robust memorization mitigation without compromising performance across different sizes of models. Exciting progress for real-world applications!
5/💡Best Approach: Our proposed unlearning method, BalancedSubnet, outperforms others by effectively removing memorized info while maintaining high accuracy.
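The weight-masking intuition behind an approach like BalancedSubnet, remove the small subnetwork most responsible for memorization while sparing weights needed for general performance, can be sketched in numpy. The attribution scores and 5% pruning budget here are invented for illustration; the actual method optimizes a sparse mask rather than thresholding precomputed scores.

```python
import numpy as np

rng = np.random.default_rng(1)

W = rng.normal(size=(16, 16))                    # a toy weight matrix
mem_score = np.abs(rng.normal(size=W.shape))     # attribution to memorized sequences
util_score = np.abs(rng.normal(size=W.shape))    # attribution to clean language modeling

def balanced_prune(W, mem_score, util_score, frac=0.05):
    """Zero the fraction of weights with the highest memorization-to-utility
    ratio: a rough stand-in for learning a sparse unlearning mask."""
    ratio = mem_score / (util_score + 1e-8)
    k = int(frac * W.size)
    # Flat indices of the k weights most implicated in memorization.
    idx = np.unravel_index(np.argsort(ratio, axis=None)[-k:], W.shape)
    W_pruned = W.copy()
    W_pruned[idx] = 0.0
    return W_pruned

W_pruned = balanced_prune(W, mem_score, util_score)
```

The "balanced" part is the ratio: a weight is only removed if it matters much more for reproducing memorized data than for ordinary language modeling.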
4/🧪 Key Findings: Unlearning-based methods are faster and more effective than regularization or fine-tuning in mitigating memorization.
3/⚡Introducing TinyMem: We created TinyMem, a suite of small, efficient models designed to help test and benchmark memorization mitigation techniques. TinyMem allows for quick experiments with lower computational costs.
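One simple way a TinyMem-style benchmark can quantify memorization (my sketch, not necessarily the suite's actual metric) is the fraction of a training sequence the model reproduces verbatim when prompted with its prefix:

```python
def memorization_rate(generated, reference):
    """Fraction of positions at which the model's continuation matches the
    training sequence token-for-token."""
    n = min(len(generated), len(reference))
    if n == 0:
        return 0.0
    return sum(g == r for g, r in zip(generated, reference)) / n

# A continuation that reproduces 3 of 4 reference tokens scores 0.75.
```

Because TinyMem's models are small, a mitigation method can be run end-to-end and its before/after memorization rates compared in minutes rather than GPU-days.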
2/ 🚨 Main Methods: We test 17 methods—regularization, fine-tuning, and unlearning—5 of which we propose. These methods aim to remove memorized info from LMs while preserving performance.
1/🧵New Research on Language Models! Language models (LMs) often "memorize" data, leading to privacy risks. This paper explores ways to reduce that! Paper: https://t.co/OBtYz9mJON Code: https://t.co/x0C5I77CG3 Blog: https://t.co/nA6AH5rnXV
Language models can memorize sensitive data! 🔒 Our new research by the team (@Mansi__S, @Nchudson95, and others) with TinyMem shows unlearning methods like BalancedSubnet effectively mitigate memorization while keeping performance high. #AI #Privacy
https://t.co/usBb1wLSID
mansisak.com
A study of regularization, fine-tuning, and machine unlearning methods to curb memorization in LMs
Jordan Pettyjohn, @Nchudson95, @Mansi__S, @aswathy__ajith, and @chard_kyle just published new work demonstrating detoxification strategies for Language Model outputs at @BlackboxNLP! "Mind Your Manners: Detoxifying Language Models via Attention Head Intervention" Congrats All!
🎉 I successfully defended my Master's dissertation in the area of interpretable Language Modeling! Check out my work's applications in better understanding multi-hop reasoning, bias localization, and malicious prompt detection in my talk:
@Mansi__S presented her Master's thesis on "Memory Injections: Correcting Multi-Hop Reasoning Failures during Inference in Transformer-Based Language Models". Watch the recording here:
✨Trillion Parameter Models in Science✨ We present an initial vision for a shared ecosystem to take the next step in large language models for scientific research: Trillion Parameter Models (TPMs). #LLM are becoming more powerful by the day. But there is still work to be done.
I had a great time at #EMNLP2023 and am now at #NeurIPS23. I am very excited to meet new people. Feel free to DM to meet up and say 👋. I will be presenting Attention Lens (https://t.co/dkHj6xoSy6) as a poster at the Attributing Model Behavior at Scale workshop on Friday!