Daniel Kunin

@KuninDaniel

Followers: 744 · Following: 169 · Media: 29 · Statuses: 70

PhD student @ICMEStanford · Creator @SeeingTheory

Stanford University
Joined December 2020
@KuninDaniel
Daniel Kunin
1 year
🌟Announcing NeurIPS spotlight paper on the transition from lazy to rich🔦 We reveal, through exact gradient flow dynamics, how unbalanced initializations promote rapid feature learning. Co-led with @AllanRaventos and @ClementineDomi6, with @FCHEN_AI, @klindt_david, @SaxeLab, and @SuryaGanguli.
5 replies · 41 reposts · 237 likes
@ClementineDomi6
Clémentine Dominé, Phd 🍊
5 months
🚀 Exciting news! Our paper "From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks" has been accepted at ICLR 2025! https://t.co/6B7T1ROrc2 A thread on how relative weight initialization shapes learning dynamics in deep networks. 🧵 (1/9)
2 replies · 62 reposts · 236 likes
@TTIC_Connect
TTIC
5 months
Wednesday, April 9th at 11AM: TTIC's Young Researcher Seminar Series presents Daniel Kunin (@KuninDaniel) of @StanfordEng with a talk titled "Learning Mechanics of Neural Networks: Conservation Laws, Implicit Biases, and Feature Learning." Please join us in Room 530, 5th floor.
0 replies · 1 repost · 3 likes
@FCHEN_AI
FENG CHEN
7 months
1/ Our new paper: “Rethinking Fine-Tuning when Scaling Test-Time Compute: Limiting Confidence Improves Mathematical Reasoning” on how to change training to better exploit test-time compute! co-led by @AllanRaventos, w/ Nan Cheng, @SuryaGanguli & @ShaulDr https://t.co/xM49OB6sk7
1 reply · 5 reposts · 19 likes
@MasonKamb
Mason Kamb
8 months
Excited to finally share this work w/ @SuryaGanguli. TL;DR: we find the first closed-form analytical theory that replicates the outputs of the very simplest diffusion models, with median pixel-wise r^2 values of 90%+. https://t.co/SYkAAh6k4C
20 replies · 153 reposts · 940 likes
@KuninDaniel
Daniel Kunin
9 months
Come check out our #NeurIPS2024 spotlight poster on feature learning tomorrow! 📍East Exhibit Hall A-C #2102 📅Thu 12 Dec 4:30 p.m. — 7:30 p.m. PST
@KuninDaniel
Daniel Kunin
1 year
🌟Announcing NeurIPS spotlight paper on the transition from lazy to rich🔦 We reveal, through exact gradient flow dynamics, how unbalanced initializations promote rapid feature learning. Co-led with @AllanRaventos and @ClementineDomi6, with @FCHEN_AI, @klindt_david, @SaxeLab, and @SuryaGanguli.
0 replies · 7 reposts · 49 likes
@klindt_david
David Klindt
1 year
Great job, it was an honor being part of this amazing project! Congrats to the team 💪
@KuninDaniel
Daniel Kunin
1 year
🌟Announcing NeurIPS spotlight paper on the transition from lazy to rich🔦 We reveal, through exact gradient flow dynamics, how unbalanced initializations promote rapid feature learning. Co-led with @AllanRaventos and @ClementineDomi6, with @FCHEN_AI, @klindt_david, @SaxeLab, and @SuryaGanguli.
0 replies · 2 reposts · 19 likes
@KuninDaniel
Daniel Kunin
1 year
Also, big shoutout to @yasamanbb, @CPehlevan, and @HSompolinsky for coordinating last year's 'Deep Learning from Physics and Neuroscience' program @KITP_UCSB. Our amazing team met there, and this project is a direct result of the conversations we had!
0 replies · 1 repost · 6 likes
@KuninDaniel
Daniel Kunin
1 year
We provide empirical evidence that an unbalanced rich regime drives feature learning in deep networks, promotes interpretability of early layers in CNNs, reduces the sample complexity of learning hierarchical data, and decreases the time to grokking in modular arithmetic.
1 reply · 1 repost · 17 likes
@KuninDaniel
Daniel Kunin
1 year
Applying our function space analysis to shallow ReLU networks, we find that rapid feature learning arises from unbalanced initializations that promote faster learning in the early layers, driving a large change in activation patterns but only a small change in parameter space.
1 reply · 1 repost · 7 likes
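A minimal diagnostic sketch of the claim above, written against a generic two-layer ReLU setup in PyTorch; the toy network, the random data, and the helpers activation_pattern and relative_param_change are illustrative assumptions, not the paper's code.

```python
# Compare how much the ReLU activation patterns change during training
# with how far the parameters move (both relative to initialization).
import copy
import torch
import torch.nn as nn

def activation_pattern(net, X):
    # Which hidden ReLUs are "on" for each input (sign of the pre-activations).
    return net[0](X) > 0

def relative_param_change(net_init, net_now):
    num = sum((p1 - p0).norm() ** 2 for p0, p1 in zip(net_init.parameters(), net_now.parameters()))
    den = sum(p0.norm() ** 2 for p0 in net_init.parameters())
    return (num / den).sqrt().item()

X = torch.randn(512, 2)
net = nn.Sequential(nn.Linear(2, 128), nn.ReLU(), nn.Linear(128, 1))
net_init = copy.deepcopy(net)
pattern_init = activation_pattern(net_init, X)

# ... train `net` here (e.g. full-batch gradient descent on a toy task) ...

flipped = (activation_pattern(net, X) != pattern_init).float().mean().item()
moved = relative_param_change(net_init, net)
print(f"fraction of flipped ReLU patterns: {flipped:.3f}, relative parameter change: {moved:.3f}")
```

Under the picture described in the tweet, an unbalanced initialization in the rich regime would show a large fraction of flipped activation patterns alongside a small relative parameter change.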
@KuninDaniel
Daniel Kunin
1 year
We find three regimes in function space: (1) lazy, akin to linear regression; (2) rich, akin to silent alignment (Atanasov et al. 2021); (3) delayed-rich, initially lazy and then rich. We extend this analysis (with mirror flows and implicit biases) to wide & deep linear networks.
1 reply · 1 repost · 4 likes
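For readers skimming the thread, a brief restatement of the textbook linearization picture behind the "lazy" regime, stated here for context; the paper's precise characterization of the three regimes may differ.

```latex
% Lazy regime: the network stays close to its linearization around the
% initialization \theta_0, so training reduces to linear regression on the
% fixed tangent features \nabla_\theta f(x;\theta_0).
\[
  f_{\mathrm{lin}}(x;\theta) \;=\; f(x;\theta_0)
  \;+\; \nabla_\theta f(x;\theta_0)^{\top}\,(\theta - \theta_0).
\]
% Rich regime: the features \nabla_\theta f(x;\theta) themselves move during
% training (e.g. silent alignment, where directions align before norms grow).
% Delayed-rich: an initial lazy phase followed by feature movement.
```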
@KuninDaniel
Daniel Kunin
1 year
We derive exact gradient flow solutions for a minimal two-layer linear model displaying lazy and rich learning, which reveal that the relative scale between layers influences feature learning through conserved quantities that constrain the geometry of learning trajectories.
1 reply · 1 repost · 7 likes
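As a worked illustration of how the relative scale between layers can enter through a conserved quantity, here is the standard balance law for the simplest scalar two-layer linear model f(x) = a w^T x trained with square loss under gradient flow; the paper's exact model and notation may differ.

```latex
% Gradient flow on the square loss for f(x) = a\, w^\top x, with residual
% r := a\, w^\top x - y.
\[
  \mathcal{L} = \tfrac{1}{2}\,\mathbb{E}\!\left[(a\, w^\top x - y)^2\right],
  \qquad
  \dot{w} = -\,\mathbb{E}[\,r\, a\, x\,],
  \qquad
  \dot{a} = -\,\mathbb{E}[\,r\, w^\top x\,].
\]
% The layer imbalance is conserved along the flow:
\[
  \frac{d}{dt}\!\left(a^2 - \lVert w \rVert^2\right)
  = 2a\,\dot{a} - 2\,w^\top \dot{w}
  = -2\,\mathbb{E}[\,r\, a\, w^\top x\,] + 2\,\mathbb{E}[\,r\, a\, w^\top x\,]
  = 0 .
\]
% So \delta := a(0)^2 - \lVert w(0) \rVert^2 is fixed by the relative scale of
% the two layers at initialization and constrains the hyperbola on which the
% trajectory lives: balanced (\delta \approx 0) and unbalanced (|\delta| large)
% initializations trace qualitatively different geometries.
```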
@KuninDaniel
Daniel Kunin
1 year
Reproducing Fig. 1 in Chizat et al. 2019, we find that even at small overall scale, the relative scale between layers can transition the network between rich and lazy learning, and the best generalization occurs at small overall scale and large relative scale!
1 reply · 1 repost · 9 likes
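A hypothetical sketch of how one might separate an overall initialization scale from a relative scale between layers when running a Chizat et al. (2019)-style sweep; the names sigma, gamma, and init_two_layer_relu, and the particular scaling convention, are assumptions rather than the paper's setup.

```python
# Two-layer ReLU network whose initialization is controlled by an overall
# scale `sigma` and a relative scale between layers `gamma`.
import torch
import torch.nn as nn

def init_two_layer_relu(d_in, width, d_out, sigma=1e-3, gamma=100.0):
    """First layer initialized at scale sigma * gamma, second at sigma / gamma,
    so the product of the two layer scales stays fixed while gamma controls
    how unbalanced the two layers are."""
    net = nn.Sequential(nn.Linear(d_in, width, bias=False),
                        nn.ReLU(),
                        nn.Linear(width, d_out, bias=False))
    with torch.no_grad():
        net[0].weight.normal_(0.0, sigma * gamma / d_in ** 0.5)
        net[2].weight.normal_(0.0, sigma / gamma / width ** 0.5)
    return net

# Sweeping sigma (overall scale) and gamma (relative scale), training each
# network, and plotting test error is one way to reproduce the kind of
# phase plot described in the tweet above.
net = init_two_layer_relu(d_in=2, width=256, d_out=1)
```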
@KuninDaniel
Daniel Kunin
1 year
Really cool theory project on feature learning. If you are at the HiLD workshop @icmlconf, check it out!
@AllanRaventos
Allan Raventós
1 year
Interested in exactly solvable models of learning dynamics and implicit bias? Come check out our "Get Rich Quick" poster at the HiLD Workshop @icmlconf at 10am! With @KuninDaniel, myself, @ClementineDomi6, @FCHEN_AI, @klindt_david, @SaxeLab, and @SuryaGanguli.
1 reply · 1 repost · 14 likes
@AllanRaventos
Allan Raventós
1 year
Interested in exactly solvable models of learning dynamics and implicit bias? Come check out our "Get Rich Quick" poster at the HiLD Workshop @icmlconf at 10am! With @KuninDaniel, myself, @ClementineDomi6, @FCHEN_AI, @klindt_david, @SaxeLab, and @SuryaGanguli.
0 replies · 13 reposts · 48 likes
@DebOishi
Oishi Deb
2 years
Reminder! Happening Tomorrow! @ELLISforEurope
@DebOishi
Oishi Deb
2 years
We are delighted to announce that our next speakers for the @ELLISforEurope RG are from @Stanford: the authors of the NeurIPS2023 paper (https://t.co/FNikSjwN97), @FCHEN_AI, @KuninDaniel, @atsushi_y1230 & @SuryaGanguli, on 13th Feb '24 @ 5pm CET on Zoom. Save the date! Link to join the RG👇
0 replies · 1 repost · 5 likes
@KuninDaniel
Daniel Kunin
2 years
To get the zoom link and get notified about other interesting talks check out
0 replies · 1 repost · 2 likes
@KuninDaniel
Daniel Kunin
2 years
🚨@FCHEN_AI, @atsushi_y1230, and I will be presenting our NeurIPS2023 paper https://t.co/6NuqR0nRJA to the ELLIS Mathematics of Deep Learning reading group tomorrow, Feb 13, at 5pm CET. Join to learn more about stochastic collapse!
arxiv.org
In this work, we reveal a strong implicit bias of stochastic gradient descent (SGD) that drives overly expressive networks to much simpler subnetworks, thereby dramatically reducing the number of...
1 reply · 2 reposts · 15 likes
@DebOishi
Oishi Deb
2 years
We are delighted to announce that our next speakers for the @ELLISforEurope RG are from @Stanford: the authors of the NeurIPS2023 paper (https://t.co/FNikSjwN97), @FCHEN_AI, @KuninDaniel, @atsushi_y1230 & @SuryaGanguli, on 13th Feb '24 @ 5pm CET on Zoom. Save the date! Link to join the RG👇
arxiv.org
In this work, we reveal a strong implicit bias of stochastic gradient descent (SGD) that drives overly expressive networks to much simpler subnetworks, thereby dramatically reducing the number of...
@DebOishi
Oishi Deb
3 years
I am delighted to be a chair for an @ELLISforEurope Reading Group on Mathematics of Deep Learning, along with @LinaraAdylova and Sidak @unregularized. The link to join the group is here: https://t.co/lkq0IlUDbA. Looking forward to meeting new people! @CompSciOxford @oxengsci
0 replies · 4 reposts · 17 likes