Sanae Lotfi Profile
Sanae Lotfi

@LotfiSanae

Followers: 3K · Following: 2K · Media: 46 · Statuses: 411

AI Research Scientist @MetaAI (FAIR) | PhD from @nyuniversity

Menlo Park, CA
Joined August 2020
@KempeLab
Julia Kempe
3 months
Grateful for this great summary of our recent work!
@arankomatsuzaki
Aran Komatsuzaki
3 months
Soft Tokens, Hard Truths • First scalable RL method for continuous CoT • Learns “soft” tokens (mixtures + noise) → richer reasoning paths • Matches discrete CoTs at pass@1, beats them at pass@32 (more diversity) • Best setup: train w/ soft tokens, infer w/ hard tokens
0
2
14
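A minimal sketch of what a "soft" token could look like, based only on the summary above (a mixture of token embeddings plus noise); the function name, shapes, and noise scale are my assumptions, not taken from the paper:

```python
import torch

def soft_token(logits, embedding_matrix, noise_std=0.1):
    """Illustrative 'soft' token: a probability-weighted mixture of token
    embeddings plus Gaussian noise (an interpretation of the tweet, not
    the paper's exact method)."""
    probs = torch.softmax(logits, dim=-1)       # (vocab_size,)
    mixture = probs @ embedding_matrix          # (d_model,) embedding mixture
    return mixture + noise_std * torch.randn_like(mixture)

# Toy usage: feed the soft token back in place of a sampled discrete token.
vocab, d_model = 100, 32
emb = torch.randn(vocab, d_model)
logits = torch.randn(vocab)
print(soft_token(logits, emb).shape)  # torch.Size([32])
```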
@mrtnm
Martin Marek
3 months
Getting small batch sizes to work in bfloat16 precision can be challenging. In our recent paper on batch size, we ran all experiments in float32, but memory-constrained settings demand lower precision. Here are two tricks that we used to enable bf16 training at small batch sizes:
7
26
258
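The two tricks themselves aren't quoted in this snapshot, so as a stand-in, here is one widely used remedy for the same failure mode: an fp32 master copy of the weights, so that tiny small-batch updates aren't rounded away by bf16's short mantissa. This is an assumption on my part, not necessarily one of the paper's two tricks:

```python
import torch

torch.manual_seed(0)
model = torch.nn.Linear(16, 16).bfloat16()   # toy bf16 model

# fp32 master weights: updates accumulate in fp32, then get copied back to
# bf16, so small per-step updates survive bf16's ~8-bit mantissa.
master = {n: p.detach().float().clone() for n, p in model.named_parameters()}
opt = torch.optim.SGD(list(master.values()), lr=1e-3)

x = torch.randn(2, 16).bfloat16()            # batch size 2: small-batch regime
loss = model(x).pow(2).mean()
loss.backward()

for n, p in model.named_parameters():
    master[n].grad = p.grad.float()          # route bf16 grads to fp32 copies
opt.step()
opt.zero_grad()

with torch.no_grad():
    for n, p in model.named_parameters():
        p.copy_(master[n])                   # cast updated weights back to bf16
```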
@LotfiSanae
Sanae Lotfi
4 months
Huge thanks to my amazing labmates, mentors, collaborators at Amazon, Meta, and Microsoft Research, and to my friends and family. I can’t name everyone, but I’m truly grateful for all your support. Special shoutout to Ethan for being the first to cite me as Dr. Lotfi!
1
1
28
@LotfiSanae
Sanae Lotfi
4 months
I also want to thank my incredible PhD committee: @KempeLab, @furongh, @Qi_Lei_, Jonathan Niles-Weed and Benjamin Peherstorfer. It was amazing to have such brilliant people in one room who not only cared about me as a PhD student but truly believed in my potential!
1
1
28
@LotfiSanae
Sanae Lotfi
4 months
First, I’m very thankful to my advisor, @andrewgwils, for his mentorship, for guiding me to grow as an independent researcher, and for creating a lab that is both a home to brilliant collaborators and a community of supportive friends. I never took any of this for granted!
1
0
22
@LotfiSanae
Sanae Lotfi
4 months
Excited to share two milestones: I have officially completed my PhD at NYU, and I have joined Meta AI’s Fundamental AI Research (FAIR) team in the Bay Area as a Research Scientist! I’m so grateful to many people who made this possible; more in this thread 🧵
52
25
745
@micahgoldblum
Micah Goldblum
5 months
🚨 Did you know that small-batch vanilla SGD without momentum (i.e. the first optimizer you learn about in intro ML) is virtually as fast as AdamW for LLM pretraining on a per-FLOP basis? 📜 1/n
28
116
841
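For concreteness, the recipe the tweet describes is just the textbook update below; the model and hyperparameters are illustrative stand-ins, not values from the paper. Unlike AdamW, it keeps no momentum or second-moment buffers, which also saves optimizer memory:

```python
import torch

model = torch.nn.Linear(512, 512)  # stand-in for an LLM

# Vanilla SGD: no momentum, no adaptive moments, so no optimizer state
# beyond the gradients themselves. lr is illustrative, not from the paper.
opt = torch.optim.SGD(model.parameters(), lr=3e-4, momentum=0.0)

for step in range(100):
    x = torch.randn(1, 512)          # batch size 1: the small-batch regime
    loss = model(x).pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```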
@andrewgwils
Andrew Gordon Wilson
9 months
My new paper "Deep Learning is Not So Mysterious or Different": https://t.co/AgHdSQkals. Generalization behaviours in deep learning can be intuitively understood through a notion of soft inductive biases, and formally characterized with countable hypothesis bounds! 1/12
16
322
2K
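For reference, a countable hypothesis bound in its standard Occam form (the paper may state a different variant): for a countable class $\mathcal{H}$, a prior $P$ over $\mathcal{H}$, and $n$ i.i.d. samples, with probability at least $1-\delta$, simultaneously for all $h \in \mathcal{H}$,

```latex
R(h) \;\le\; \widehat{R}(h) \;+\; \sqrt{\frac{\log\frac{1}{P(h)} + \log\frac{1}{\delta}}{2n}}
```

Hypotheses given large prior mass $P(h)$, i.e. those favored by the inductive bias, pay only a small complexity penalty; that is one way soft inductive biases enter a formal generalization guarantee.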
@LotfiSanae
Sanae Lotfi
1 year
I’m excited to be a keynote speaker and panelist at the machine learning and compression workshop @NeurIPSConf ( https://t.co/qr55V2Eizr). Find me in meeting room 211-214 at 1:25pm and 3:50pm to talk about compression bounds!
1
8
88
@zdeborova
Lenka Zdeborova
1 year
We need more of *Science of Deep Learning* in the major ML conferences. This year’s @NeurIPSConf workshop @scifordl on this topic is just starting, and I hope it is NOT the last edition!!!
1
17
152
@LotfiSanae
Sanae Lotfi
1 year
Very excited to be co-organizing the Science of Deep Learning workshop, which will take place on Sunday. Please stop by; we have an amazing lineup of speakers and panelists. We’ll also announce the winners of the challenge on debunking commonly held beliefs in DL 🔥
@NYUDataScience
NYU Center for Data Science
1 year
CDS researchers @FlorentinGuth, @LotfiSanae, and recent grad @ZKadkhodaie, et al., are leading a new approach to studying deep learning at #NeurIPS2024. Their workshop (@scifordl) promotes a science of controlled experiments to understand deep nets. https://t.co/cMCbHoim0J
1
4
30
@andrewgwils
Andrew Gordon Wilson
1 year
Nice crowd at our #NeurIPS2024 poster today with @LotfiSanae presenting on token-level generalization bounds for LLMs with billions of parameters! https://t.co/rU2VZG0TLC
5
7
69
@jyo_pari
Jyo Pari
1 year
Over the past year, I have been working on using multiple specialized models collectively to solve novel tasks. We investigated Mixture of Experts (MoE)-style routing for merging. However, we find that feature-based merging is likely not a scalable paradigm. Read on!
2
27
98
@LotfiSanae
Sanae Lotfi
1 year
I was fortunate to collaborate with this incredible team during my internship at MSR. Not only do they work on important and timely research questions, but they are also some of the most supportive and uplifting people you’ll collaborate with. Highly recommend this position!!
@murefil
Alessandro Sordoni
1 year
The ML team at @MSFTResearch Montréal 🍁 is hiring a Senior Researcher with a background in ML / NLP!!! Come work with us at the intersection of interactivity, modularity and reasoning in foundation models 😊 MSR is a highly collaborative environment where risky ideas are
1
0
38
@Napoolar
Thomas Fel
1 year
🎭Recent work shows that models’ inductive biases for 'simpler' features may lead to shortcut learning. What do 'simple' vs 'complex' features look like? What roles do they play in generalization? Our new paper explores these questions. https://t.co/aW2PrlYQF4 #Neurips2024
7
105
506
@micahgoldblum
Micah Goldblum
1 year
📢I’ll be admitting multiple PhD students this winter to Columbia University 🏙️ in the most exciting city in the world! If you are interested in dissecting modern deep learning systems to probe how they work, advancing AI safety, or automating data science, apply to my group.
6
145
560
@LotfiSanae
Sanae Lotfi
1 year
My experience with other researchers in the ML community has been more uplifting than not! Unexpected words of encouragement and acts of kindness go a long way! To all (senior) researchers who are inclusive, helpful and welcoming: you're amazing and make a huge difference!
1
2
90
@LotfiSanae
Sanae Lotfi
1 year
Excited to share that we’re organizing a #neurips2024 workshop on scientific methods for understanding deep learning with outstanding speakers & panelists 🥳 Submit your best papers demonstrating why and when deep learning works by **Sep 10** & stay tuned for more details ;)
@scifordl
Scientific Methods for Understanding Deep Learning
1 year
📢Excited to announce the Workshop on Scientific Methods for Understanding Deep Learning #NeurIPS2024 🥳 ➡️Submission Deadline: Sep 10 ‘24 ➡️Speaker lineup: https://t.co/MmrlYngPTY ➡️Call for papers: https://t.co/GMHdMfJpzg ➡️Our ✨Debunking✨ challenge: https://t.co/VAzhYWCjc0
0
5
57
@LotfiSanae
Sanae Lotfi
1 year
We find that as models are quantized more aggressively, their ability to recall memorized facts from their pretraining data deteriorates faster than their ability to recognize structured patterns, echoing the findings of @tjingrant et al. on the effect of down-scaling LLMs. 7/8
1
0
9
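To make "quantized more aggressively" concrete, here is a minimal round-to-nearest quantizer in which bit width is the aggressiveness knob; the scheme is illustrative and not necessarily the one used in the paper:

```python
import torch

def quantize_symmetric(w: torch.Tensor, bits: int) -> torch.Tensor:
    # Symmetric uniform round-to-nearest quantization: fewer bits means a
    # coarser grid, i.e. more aggressive quantization.
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().max() / qmax
    return (w / scale).round().clamp(-qmax, qmax) * scale

w = torch.randn(4, 4)
for bits in (8, 4, 2):  # increasingly aggressive
    err = (w - quantize_symmetric(w, bits)).abs().mean()
    print(f"{bits}-bit mean abs error: {err:.4f}")
```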