Sanae Lotfi Profile
Sanae Lotfi

@LotfiSanae

Followers: 3K · Following: 2K · Media: 46 · Statuses: 411

AI Research Scientist @MetaAI (FAIR) | PhD from @nyuniversity

Menlo Park, CA
Joined August 2020
@KempeLab
Julia Kempe
3 months
Grateful for this great summary of our recent work!
@arankomatsuzaki
Aran Komatsuzaki
3 months
Soft Tokens, Hard Truths • First scalable RL method for continuous CoT • Learns “soft” tokens (mixtures + noise) → richer reasoning paths • Matches discrete CoTs at pass@1, beats them at pass@32 (more diversity) • Best setup: train w/ soft tokens, infer w/ hard tokens
0
2
14
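A minimal sketch of what a "soft" token could look like, based only on the summary above (a mixture of token embeddings plus noise); the function name, shapes, and noise scale are my assumptions, not taken from the paper:

```python
import torch

def soft_token(logits, embedding_matrix, noise_std=0.1):
    """Illustrative 'soft' token: a probability-weighted mixture of token
    embeddings plus Gaussian noise (an interpretation of the tweet, not
    the paper's exact method)."""
    probs = torch.softmax(logits, dim=-1)       # (vocab_size,)
    mixture = probs @ embedding_matrix          # (d_model,) embedding mixture
    return mixture + noise_std * torch.randn_like(mixture)

# Toy usage: feed the soft token back in place of a sampled discrete token.
vocab, d_model = 100, 32
emb = torch.randn(vocab, d_model)
logits = torch.randn(vocab)
print(soft_token(logits, emb).shape)  # torch.Size([32])
```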
@mrtnm
Martin Marek
3 months
Getting small batch sizes to work in bfloat16 precision can be challenging. In our recent paper on batch size, we ran all experiments in float32, but memory-constrained settings demand lower precision. Here are two tricks that we used to enable bf16 training at small batch sizes:
7
26
258
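The two tricks themselves aren't quoted in this snapshot, so as a stand-in, here is one widely used remedy for the same failure mode: an fp32 master copy of the weights, so that tiny small-batch updates aren't rounded away by bf16's short mantissa. This is an assumption on my part, not necessarily one of the paper's two tricks:

```python
import torch

torch.manual_seed(0)
model = torch.nn.Linear(16, 16).bfloat16()   # toy bf16 model

# fp32 master weights: updates accumulate in fp32, then get copied back to
# bf16, so small per-step updates survive bf16's ~8-bit mantissa.
master = {n: p.detach().float().clone() for n, p in model.named_parameters()}
opt = torch.optim.SGD(list(master.values()), lr=1e-3)

x = torch.randn(2, 16).bfloat16()            # batch size 2: small-batch regime
loss = model(x).pow(2).mean()
loss.backward()

for n, p in model.named_parameters():
    master[n].grad = p.grad.float()          # route bf16 grads to fp32 copies
opt.step()
opt.zero_grad()

with torch.no_grad():
    for n, p in model.named_parameters():
        p.copy_(master[n])                   # cast updated weights back to bf16
```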
@LotfiSanae
Sanae Lotfi
4 months
Huge thanks to my amazing labmates, mentors, collaborators at Amazon, Meta, and Microsoft Research, and to my friends and family. I can’t name everyone, but I’m truly grateful for all your support. Special shoutout to Ethan for being the first to cite me as Dr. Lotfi!
1
1
28
@LotfiSanae
Sanae Lotfi
4 months
I also want to thank my incredible PhD committee: @KempeLab, @furongh, @Qi_Lei_, Jonathan Niles-Weed and Benjamin Peherstorfer. It was amazing to have such brilliant people in one room who not only cared about me as a PhD student but truly believed in my potential!
1
1
28
@LotfiSanae
Sanae Lotfi
4 months
First, I’m very thankful to my advisor, @andrewgwils, for his mentorship, for guiding me to grow as an independent researcher, and for creating a lab that is both a home to brilliant collaborators and a community of supportive friends. I never took any of this for granted!
1
0
22
@LotfiSanae
Sanae Lotfi
4 months
Excited to share two milestones: I have officially completed my PhD at NYU, and I have joined Meta AI’s Fundamental AI Research (FAIR) team in the Bay Area as a Research Scientist! I’m so grateful to many people who made this possible; more in this thread 🧵
52
25
745
@micahgoldblum
Micah Goldblum
5 months
🚨 Did you know that small-batch vanilla SGD without momentum (i.e. the first optimizer you learn about in intro ML) is virtually as fast as AdamW for LLM pretraining on a per-FLOP basis? 📜 1/n
28
116
841
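For concreteness, the recipe the tweet describes is just the textbook update below; the model and hyperparameters are illustrative stand-ins, not values from the paper. Unlike AdamW, it keeps no momentum or second-moment buffers, which also saves optimizer memory:

```python
import torch

model = torch.nn.Linear(512, 512)  # stand-in for an LLM

# Vanilla SGD: no momentum, no adaptive moments, so no optimizer state
# beyond the gradients themselves. lr is illustrative, not from the paper.
opt = torch.optim.SGD(model.parameters(), lr=3e-4, momentum=0.0)

for step in range(100):
    x = torch.randn(1, 512)          # batch size 1: the small-batch regime
    loss = model(x).pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```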
@andrewgwils
Andrew Gordon Wilson
9 months
My new paper "Deep Learning is Not So Mysterious or Different": https://t.co/AgHdSQkals. Generalization behaviours in deep learning can be intuitively understood through a notion of soft inductive biases, and formally characterized with countable hypothesis bounds! 1/12
16
322
2K
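For reference, a countable hypothesis bound in its standard Occam form (the paper may state a different variant): for a countable class $\mathcal{H}$, a prior $P$ over $\mathcal{H}$, and $n$ i.i.d. samples, with probability at least $1-\delta$, simultaneously for all $h \in \mathcal{H}$,

```latex
R(h) \;\le\; \widehat{R}(h) \;+\; \sqrt{\frac{\log\frac{1}{P(h)} + \log\frac{1}{\delta}}{2n}}
```

Hypotheses given large prior mass $P(h)$, i.e. those favored by the inductive bias, pay only a small complexity penalty; that is one way soft inductive biases enter a formal generalization guarantee.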
@LotfiSanae
Sanae Lotfi
1 year
I’m excited to be a keynote speaker and panelist at the machine learning and compression workshop @NeurIPSConf ( https://t.co/qr55V2Eizr). Find me in meeting room 211-214 at 1:25pm and 3:50pm to talk about compression bounds!
1
8
88
@zdeborova
Lenka Zdeborova
1 year
We need more of *Science of Deep Learning* in the major ML conferences. This year’s @NeurIPSConf workshop @scifordl on this topic is just starting, and I hope it is NOT the last edition!!!
1
17
152
@LotfiSanae
Sanae Lotfi
1 year
Very excited to be co-organizing the Science of Deep Learning workshop, which will take place on Sunday. Please stop by; we have an amazing lineup of speakers and panelists. We’ll also announce the winners of the challenge on debunking commonly held beliefs in DL 🔥
@NYUDataScience
NYU Center for Data Science
1 year
CDS researchers @FlorentinGuth, @LotfiSanae, and recent grad @ZKadkhodaie, et al., are leading a new approach to studying deep learning at #NeurIPS2024. Their workshop (@scifordl) promotes a science of controlled experiments to understand deep nets. https://t.co/cMCbHoim0J
1
4
30
@andrewgwils
Andrew Gordon Wilson
1 year
Nice crowd at our #NeurIPS2024 poster today with @LotfiSanae presenting on token-level generalization bounds for LLMs with billions of parameters! https://t.co/rU2VZG0TLC
5
7
69
@jyo_pari
Jyo Pari
1 year
Over the past year, I have been working on using multiple specialized models collectively to solve novel tasks. We investigated Mixture of Experts (MoE)-style routing for merging. However, we find that feature-based merging is likely not a scalable paradigm. Read on!
2
27
98
@LotfiSanae
Sanae Lotfi
1 year
I was fortunate to collaborate with this incredible team during my internship at MSR. Not only do they work on important and timely research questions, but they are also some of the most supportive and uplifting people you’ll collaborate with. Highly recommend this position!!
@murefil
Alessandro Sordoni
1 year
The ML team at @MSFTResearch Montréal 🍁 is hiring a Senior Researcher with a background in ML / NLP!!! Come work with us at the intersection of interactivity, modularity and reasoning in foundation models 😊 MSR is a highly collaborative environment where risky ideas are
1
0
38
@Napoolar
Thomas Fel
1 year
🎭Recent work shows that models’ inductive biases for 'simpler' features may lead to shortcut learning. What do 'simple' vs 'complex' features look like? What roles do they play in generalization? Our new paper explores these questions. https://t.co/aW2PrlYQF4 #Neurips2024
7
105
506
@micahgoldblum
Micah Goldblum
1 year
📢I’ll be admitting multiple PhD students this winter to Columbia University 🏙️ in the most exciting city in the world! If you are interested in dissecting modern deep learning systems to probe how they work, advancing AI safety, or automating data science, apply to my group.
6
145
560
@LotfiSanae
Sanae Lotfi
1 year
My experience with other researchers in the ML community has been more uplifting than not! Unexpected words of encouragement and acts of kindness go a long way! To all (senior) researchers who are inclusive, helpful and welcoming: you're amazing and make a huge difference!
1
2
90
@LotfiSanae
Sanae Lotfi
1 year
Excited to share that we’re organizing a #neurips2024 workshop on scientific methods for understanding deep learning with outstanding speakers & panelists 🥳 Submit your best papers demonstrating why and when deep learning works by **Sep 10** & stay tuned for more details ;)
@scifordl
Scientific Methods for Understanding Deep Learning
1 year
📢Excited to announce the Workshop on Scientific Methods for Understanding Deep Learning #NeurIPS2024 🥳 ➡️Submission Deadline: Sep 10 ‘24 ➡️Speaker lineup: https://t.co/MmrlYngPTY ➡️Call for papers: https://t.co/GMHdMfJpzg ➡️Our ✨Debunking✨ challenge: https://t.co/VAzhYWCjc0
0
5
57
@LotfiSanae
Sanae Lotfi
1 year
We find that as models are quantized more aggressively, their ability to recall memorized facts from their pretraining data deteriorates faster than their ability to recognize structured patterns, echoing the findings of @tjingrant et al. on the effect of down-scaling LLMs. 7/8
1
0
9
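To make "quantized more aggressively" concrete, here is a minimal round-to-nearest quantizer in which bit width is the aggressiveness knob; the scheme is illustrative and not necessarily the one used in the paper:

```python
import torch

def quantize_symmetric(w: torch.Tensor, bits: int) -> torch.Tensor:
    # Symmetric uniform round-to-nearest quantization: fewer bits means a
    # coarser grid, i.e. more aggressive quantization.
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().max() / qmax
    return (w / scale).round().clamp(-qmax, qmax) * scale

w = torch.randn(4, 4)
for bits in (8, 4, 2):  # increasingly aggressive
    err = (w - quantize_symmetric(w, bits)).abs().mean()
    print(f"{bits}-bit mean abs error: {err:.4f}")
```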