Berfin Simsek

@bsimsek13

Followers: 769 · Following: 984 · Media: 9 · Statuses: 178

Research fellow @FlatironCCM & @NYU, previously: Ph.D. @EPFL, intern @MetaAI. DL Theory 😎, Math 🥰, AI 🤗 Slowly migrating to @bsimsek.bsky.social

New York, USA
Joined December 2017
@bsimsek13
Berfin Simsek
3 months
Come see our analysis of a Gaussian multi-index model at #AISTATS2025 on Sunday at Hall A-E 183. My favorite result: when the dot product between the ideal vectors exceeds a threshold, gradient flow fails to separate them under the correlation loss! 😎 (toy sketch below)
1
0
12
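A minimal numerical sketch of the phenomenon described above, not the paper's exact setup: plain gradient descent on an empirical correlation loss for a two-index Gaussian target, with the two student weights renormalized to the unit sphere after every step. The ReLU activation, learning rate, sample size, and the normalize-after-step heuristic (instead of a proper Riemannian projection) are all illustrative assumptions.

```python
# Toy sketch (not the paper's exact setup): gradient descent on an empirical
# correlation loss for a two-index Gaussian target. ReLU activation, learning
# rate, sample size, and the renormalization step are illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
d, n, lr, steps = 50, 10_000, 0.1, 500

def make_targets(dot):
    """Two unit 'ideal' vectors v1, v2 with a prescribed dot product."""
    v1 = np.zeros(d); v1[0] = 1.0
    v2 = np.zeros(d); v2[0] = dot; v2[1] = np.sqrt(1.0 - dot**2)
    return np.stack([v1, v2])

def run(dot):
    V = make_targets(dot)                              # (2, d) ideal vectors
    X = rng.standard_normal((n, d))                    # Gaussian inputs
    y = np.maximum(X @ V.T, 0.0).sum(axis=1)           # target: sum of ReLUs
    W = rng.standard_normal((2, d))
    W /= np.linalg.norm(W, axis=1, keepdims=True)      # student on the sphere
    for _ in range(steps):
        pre = X @ W.T                                  # (n, 2) pre-activations
        # correlation loss: -mean(y * relu(w_j . x)), summed over neurons j
        grad = -(X.T @ ((pre > 0) * y[:, None])) / n   # (d, 2)
        W -= lr * grad.T
        W /= np.linalg.norm(W, axis=1, keepdims=True)  # heuristic projection
    return W @ V.T                                      # overlaps <w_i, v_j>

for dot in (0.2, 0.95):
    print(f"dot(v1, v2) = {dot}:\n{np.round(run(dot), 2)}")
```

With a large dot product between v1 and v2, the printed overlap matrix tends to show both student neurons collapsing toward roughly the same direction (near the average of the two ideal vectors), while for a small dot product they can split across the two directions; the precise threshold is the paper's result and is not guaranteed to be reproduced by this toy.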
@bsimsek13
Berfin Simsek
1 month
Very exciting research direction! 🙌🏼
@patrickshafto
Patrick Shafto
1 month
NY Times article on expMath, my AI-for-math @darpa program, with commentary from mathematicians Andrew Granville, Bryna Kra, and Jordan Ellenberg, and context from @the_IAS professor @alondra and @AnthropicAI CEO @DarioAmodei.
0
0
5
@bsimsek13
Berfin Simsek
3 months
Cross-posted at @bsimsek.bsky.social (I'm slowly migrating there).
0
0
0
@bsimsek13
Berfin Simsek
3 months
Another result is a tight characterization of the time complexity of gradient flow early in the dynamics, generalizing the known result for single-index models to multi-index models. The generalization applies to arbitrary geometries as well. 🙃 (back-of-the-envelope version below)
1
0
0
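For context, here is the standard back-of-the-envelope version of the single-index computation that the tweet says is being generalized, under idealized assumptions (population gradient flow, information exponent ℓ ≥ 3 for the link function, random initialization with overlap of order d^{-1/2}); the precise constants and the multi-index / arbitrary-geometry statements are the paper's results and are not reproduced here.

```latex
% Heuristic escape-time computation for a single-index model (standard folklore,
% not the paper's exact statement). Let m(t) = <w(t), v> be the overlap with the
% ideal vector and \ell the information exponent of the link function.
\[
  \frac{\mathrm{d}m}{\mathrm{d}t} \;\asymp\; m^{\ell-1},
  \qquad m(0) \;\asymp\; d^{-1/2},
\]
\[
  \Longrightarrow\quad
  T_{\mathrm{escape}}
  \;\asymp\; \int_{m(0)}^{\Theta(1)} m^{1-\ell}\,\mathrm{d}m
  \;\asymp\; m(0)^{2-\ell}
  \;\asymp\; d^{(\ell-2)/2}
  \qquad (\ell \ge 3).
\]
% The early-phase time complexity grows polynomially in the dimension,
% with the exponent set by the information exponent.
```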
@bsimsek13
Berfin Simsek
3 months
Below this threshold, a mild overparameterization (a log k factor for k index vectors) is sufficient to match the neurons to the ideal vectors, by a coupon-collector argument. 🤠 (worked bound below)
1
0
0
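A worked version of the coupon-collector bound referenced above, under the idealized assumption (mine, not stated in the tweet) that each randomly initialized neuron falls into the basin of one of the k ideal vectors uniformly and independently:

```latex
% Coupon-collector bound: probability that some ideal vector attracts no neuron
% when m neurons are initialized, each landing on one of k basins uniformly.
\[
  \Pr\bigl[\exists\, j:\ \text{no neuron matched to } v_j\bigr]
  \;\le\; k\Bigl(1-\tfrac{1}{k}\Bigr)^{m}
  \;\le\; k\,e^{-m/k},
\]
\[
  \text{so}\quad m \;=\; k\log\frac{k}{\delta}
  \quad\text{neurons suffice for failure probability at most } \delta,
\]
% i.e. a multiplicative log k overparameterization, as in the tweet.
```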
@bsimsek13
Berfin Simsek
3 months
When the ideal vectors form an equiangular frame, all learned weights converge to their average (no matter how much overparameterization is used): past a certain threshold of the dot product, the average turns from a saddle into a local minimum. 😲 (definitions below)
1
0
0
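For reference, the objects in this tweet written out; these are standard definitions only, and the saddle-to-local-minimum threshold itself is the paper's result:

```latex
% Equiangular frame of k unit vectors with common pairwise dot product rho,
% and the (normalized) average direction the weights converge to.
\[
  \|v_i\| = 1, \qquad \langle v_i, v_j\rangle = \rho \quad (i \ne j),
\]
\[
  \bar v \;=\; \frac{\sum_{i=1}^{k} v_i}{\bigl\|\sum_{i=1}^{k} v_i\bigr\|},
  \qquad
  \Bigl\|\sum_{i=1}^{k} v_i\Bigr\|^{2} \;=\; k + k(k-1)\rho,
\]
% which requires rho > -1/(k-1) for the average direction to be well defined.
```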
@bsimsek13
Berfin Simsek
3 months
Searching for an exact inverse map from learned weights to ideal (concept) vectors is an intricate geometry question, even for "simple" idealized models. 🧐
1
0
0
@bsimsek13
Berfin Simsek
5 months
I now have a new account on Bluesky. Follow me!
0
0
0
@bsimsek13
Berfin Simsek
5 months
Here is a link to my talk on distillation for neural networks at Les Houches, together with many other talks on algorithmic theories of learning 🙌 Thanks to the organizers @_brloureiro and Vittorio.
videos.univ-grenoble-alpes.fr
An introduction to the university, its campuses, its organization, and its strategy.
1
0
3
@bsimsek13
Berfin Simsek
7 months
Nice talk by Jarod Alper at JMM’25 🙌🏼
[image]
0
0
4
@bsimsek13
Berfin Simsek
7 months
I don't mind if o1 does not think clearly like humans. It's great for computing formulas like integrals, even better in combination with Wolfram Alpha 🙌🏼
@roydanroy
Dan Roy
7 months
o1 may be superhuman in some respects, but its ability to think clearly and mathematically about integration is still not equal to that of a strong high schooler.
1
0
4
@bsimsek13
Berfin Simsek
7 months
RT @jacobandreas: Ekin Akyürek (@akyurekekin) builds tools for understanding & controlling algorithms that underlie reasoning in language m….
0
7
0
@bsimsek13
Berfin Simsek
8 months
6/n My application package is available at [link]; feel free to reach out! (n=6)
[image]
0
0
2
@bsimsek13
Berfin Simsek
8 months
5/n I'm enthusiastic about championing Gaussian multi-index models as mathematically analyzable and insightful models for MLPs. I continue to develop new results for this model; it is fun!
1
0
1
@bsimsek13
Berfin Simsek
8 months
3/n It is unclear whether LLMs in the wild can be robustly interpreted by ad hoc methods. My approach is to analyze toy models that give insight into the non-linear feature compression of MLPs. Fascinating math challenges are to be expected on this journey!
1
0
1
@bsimsek13
Berfin Simsek
8 months
2/n Studying the loss landscape is essential for understanding optimization and generalization in deep learning. I developed a combinatorial complexity framework that quantifies the non-convexity of deep learning loss landscapes. (the underlying symmetry is written out below)
[image]
1
0
4
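The basic symmetry such a framework builds on, stated for a one-hidden-layer network; this is standard background, and the combinatorial counts over overparameterized widths are the paper's contribution, not reproduced here:

```latex
% Permutation invariance of a one-hidden-layer network: relabeling the m hidden
% neurons leaves the function, and hence the loss, unchanged.
\[
  f_\theta(x) \;=\; \sum_{i=1}^{m} a_i\,\sigma(w_i^\top x),
  \qquad
  L(\theta) \;=\; L(\pi \cdot \theta)
  \quad \text{for every permutation } \pi \in S_m,
\]
% so a critical point with no neuron-level symmetry comes in up to m!
% equivalent copies, the kind of count a combinatorial framework can track.
```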
@bsimsek13
Berfin Simsek
8 months
📢 I'm on the faculty job market this year! My research explores the foundations of deep learning and analyzes learning and feature geometry for Gaussian inputs. I detail my major contributions below 👇 Retweet if you find it interesting and help me spread the word! DMs are open. 1/n
1
22
75
@bsimsek13
Berfin Simsek
8 months
RT @deepcohen: The Center for Computational Mathematics at Flatiron Institute is hiring research fellows (postdocs) to start next year -- a….
0
12
0