Cecilia Ferrando
@ceci_ferrando
Followers: 1K
Following: 17K
Media: 154
Statuses: 1K
CS PhD candidate at UMass Amherst. Researcher in private machine learning. When not researching, probably gaming.
Joined October 2019
Thrilled to share that my recent paper with Dan Sheldon "Private Regression via Data-Dependent Sufficient Statistic Perturbation" (DD-SSP) is now published in Transactions on Machine Learning Research (TMLR)! Paper: https://t.co/qEJeIOYCMM Reviews:
openreview.net
Sufficient statistic perturbation (SSP) is a widely used method for differentially private linear regression. SSP adopts a data-independent approach where privacy noise from a simple distribution...
1
2
15
Holy fuck guys we’re not "pushing hard" for or replacing concept artists with AI. We have a team of 72 artists of which 23 are concept artists and we are hiring more. The art they create is original and I’m very proud of what they do. I was asked explicitly about concept art
gamespot.com
Here are all the reasons why Larian believes machine learning-powered automation is the future.
5K
4K
60K
I am recruiting PhD students at @NYU_Courant to conduct research in learning theory, algorithmic statistics, and trustworthy machine learning, starting Fall 2026. Please share widely! Deadline to apply is December 12, 2025.
12
135
583
I'm really excited about our new paper!! 📣 'Reinforcement Learning Improves Traversal of Hierarchical Knowledge in LLMs' Contrary to belief that RL ft degrades memorized knowledge, RL-enhanced models consistently outperform base/SFT on knowledge recall by 24pp! RL teaches
18
51
422
Starting my last first day of school with 100 citations -- thank you!
11
4
165
I can't* fathom why the top picture, and not the bottom picture, is the standard diagram for an autoencoder. The whole idea of an autoencoder is that you complete a round trip and seek cycle consistency—why lay out the network linearly?
94
222
3K
The DD-SSP framework is general:
- Any data-dependent DP marginal releasing mechanism can be used (we use AIM).
- By finding the appropriate target workload and approximations, it can be extended to other models with finite or approximate suff stats (including other GLMs).
0
0
1
Finally, we connect regression under DD-SSP to training on DP synthetic data. When the right marginal workload is targeted, training ML models on synthetic data is itself a form of data-dependent SSP, where utility depends on how well marginals are preserved.
1
0
1
For logistic regression (where finite suff stats don’t exist), we design the first data-dependent SSP-style algorithm using polynomial approximations. It outperforms objective perturbation and is competitive with DP-SGD -- without the burden of hyperparameter tuning.
1
0
1
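As an illustration of how a polynomial approximation can give logistic regression finite sufficient statistics, here is a worked second-order Taylor expansion. This is only one standard choice and an assumption on my part; the thread does not say which polynomial approximation the paper uses. Labels are assumed to be coded as $y_i \in \{-1, +1\}$.

```latex
% Logistic loss as a function of the margin z = y x^T theta
\ell(z) = \log\bigl(1 + e^{-z}\bigr)
        \approx \log 2 - \tfrac{1}{2} z + \tfrac{1}{8} z^{2}
        \quad \text{(Taylor expansion around } z = 0\text{)}

% Summing over the data, using y_i^2 = 1:
\sum_{i=1}^{n} \ell\bigl(y_i x_i^{\top}\theta\bigr)
  \approx n \log 2
  - \tfrac{1}{2}\, \theta^{\top} X^{\top} y
  + \tfrac{1}{8}\, \theta^{\top} X^{\top} X\, \theta
```

Under this approximation the objective depends on the data only through $X^{\top}X$ and $X^{\top}y$, the same statistics SSP perturbs for linear regression.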
In this paper, we introduce DD-SSP, a data-dependent SSP method that reconstructs sufficient statistics from privately released marginals. We show empirically that for linear regression DD-SSP outperforms traditional SSP methods on a variety of real-world datasets.
1
0
1
Sufficient Statistic Perturbation (SSP) is a classic approach for DP regression--it adds noise to X^TX and X^Ty. But: traditional SSP is data-independent. Our insight: many suff stats are just linear queries, which can be released more accurately via data-dependent mechanisms.
1
0
1
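A minimal sketch of the classic, data-independent SSP baseline described in the tweet above, not the DD-SSP method from the paper. It assumes every row satisfies ||x_i||_2 <= 1 and |y_i| <= 1 (so the joint L2 sensitivity of X^TX and X^Ty is at most sqrt(2) under add/remove of one record), uses the standard Gaussian mechanism calibration for epsilon < 1, and adds a small ridge term for numerical stability. The function names and the `ridge` parameter are hypothetical.

```python
import numpy as np


def gaussian_sigma(sensitivity, epsilon, delta):
    # Classic Gaussian mechanism calibration (valid for 0 < epsilon < 1).
    return sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon


def ssp_linear_regression(X, y, epsilon, delta, rng, ridge=1e-3):
    """Data-independent SSP sketch: perturb X^T X and X^T y, then solve.

    Assumes ||x_i||_2 <= 1 and |y_i| <= 1 for every record (caller must clip).
    """
    d = X.shape[1]
    # One record changes X^T X by x x^T (Frobenius norm <= 1) and
    # X^T y by y x (L2 norm <= 1), so the joint L2 sensitivity is <= sqrt(2).
    sigma = gaussian_sigma(np.sqrt(2.0), epsilon, delta)

    noisy_xtx = X.T @ X + rng.normal(scale=sigma, size=(d, d))
    noisy_xtx = (noisy_xtx + noisy_xtx.T) / 2.0  # symmetrize (post-processing)
    noisy_xty = X.T @ y + rng.normal(scale=sigma, size=d)

    # Ridge term keeps the noisy system well conditioned (an assumption here).
    return np.linalg.solve(noisy_xtx + ridge * np.eye(d), noisy_xty)
```

The noise scale here is fixed by the assumed data bounds and the privacy parameters, independent of the actual data, which is the limitation the thread says data-dependent mechanisms address.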
🔊Open access version of the book 📖 "Differential Privacy in AI: From Theory to Practice" is now available! 👉 https://t.co/5494N4j7kX This was a tremendous effort of so many leaders in the DP community who contributed to it Hope it will be a useful resource for many!
emerald.com
The ebook edition of this title is Open Access and freely available to read online. Differential Privacy in Artificial Intelligence: From Theory to Practic
3
18
66
Really enjoyed @playWUCHANG. Highlights: level design, combat system, art direction, and the rich historical/mythological inspiration that makes the world so immersive.
0
0
1
I'll be at #JSM2025 in Nashville, TN for the next couple of days presenting some of my research on differentially private ML at "Maintaining Privacy in Increasingly Public Societies". DM me if you're around and want to chat about privacy. First time in Nashville -- looking forward!
0
0
0
Earlier this year, a 17-year-old high school student named Hannah Cairo solved a 40-year-old mystery about how waves behave, surprising and exciting mathematicians. @KSHartnett reports:
quantamagazine.org
After finding the homeschooling life confining, the teen petitioned her way into a graduate class at Berkeley, where she ended up disproving a 40-year-old conjecture.
90
588
5K
A gentle reminder that TMLR is a great journal that allows you to submit your papers when they are ready rather than rushing to meet conference deadlines. The review process is fast, there are no artificial acceptance rates, and you have more space to present your ideas in the
15
31
330
🪄We made a 1B Llama BEAT GPT-4o by... making it MORE private?!
LoCoMo results:
🔓GPT-4o: 80.6%
🔐1B Llama + GPT-4o (privacy): 87.7% (+7.1!⏫)
💡How? GPT-4o provides reasoning ("If X then Y"), the local model fills in the blanks with your private data to get the answer!
7
41
186