Sungmin Cha Profile
Sungmin Cha (@_sungmin_cha)
Followers: 344 · Following: 1K · Media: 20 · Statuses: 253

Faculty Fellow @nyuniversity | PhD @SeoulNatlUni

Manhattan, NY
Joined July 2019

Sungmin Cha (@_sungmin_cha) · 30 days ago
🚨 New paper accepted to the #ICML2025 Workshop on Machine Unlearning for Generative AI (#MuGen)! Title: "Reference-Specific Unlearning Metrics Can Hide the Truth." We show that current forgetting metrics can mislead, and we propose FADE, a better way to measure forgetting in LLMs. 📄

Sungmin Cha (@_sungmin_cha) · 7 days ago
RT @micahgoldblum: 🚨 Did you know that small-batch vanilla SGD without momentum (i.e. the first optimizer you learn about in intro ML) is v….

Sungmin Cha (@_sungmin_cha) · 7 days ago
RT @ErnestRyu: Alongside recent articles reporting that 56 Seoul National University professors have 'departed' for overseas universities, public opinion is forming that SNU's brain drain is cause for concern. But I find it a bit regrettable that the articles focus only on the professors' insufficient salaries. I….

Sungmin Cha (@_sungmin_cha) · 7 days ago
RT @QuanquanGu: This explains why LLaMA 4 failed. The tokens per parameter (TPP) is way off. You can’t defy scaling laws and expect miracle….

Sungmin Cha (@_sungmin_cha) · 7 days ago
RT @andrewgwils: Conventional wisdom is that SGD doesn't work nearly as well as AdamW for big transformers. We show it's not the case if yo….

Sungmin Cha (@_sungmin_cha) · 8 days ago
RT @aaditsh: Andrej Karpathy literally shows how to build apps by prompting in 30 mins.

Sungmin Cha (@_sungmin_cha) · 8 days ago
RT @NYUDataScience: CDS Prof. @kchonyc's 2014 "attention" paper was recently the Runner-Up for the ICLR 2025 Test of Time Award. The paper….

Sungmin Cha (@_sungmin_cha) · 12 days ago
RT @_sungmin_cha: @abeirami Hi all! On the topic of why knowledge distillation works well in generative models, Kyunghyun Cho (@kchonyc) an….

Sungmin Cha (@_sungmin_cha) · 13 days ago
RT @docmilanfar: This isn’t surprising anymore, but it should be shocking.

Sungmin Cha (@_sungmin_cha) · 14 days ago
RT @omarsar0: things are getting weird

Sungmin Cha (@_sungmin_cha) · 14 days ago
RT @_sungmin_cha: Curious about why Knowledge Distillation works so well in generative models? In our latest paper, we offer a minimal work….

Sungmin Cha (@_sungmin_cha) · 16 days ago
RT @kchonyc: i was asked by a few (inc. @YuanqingWang ) what i meant by this earlier tweet, and since i'm pretty busy, i decided to write a….

Sungmin Cha (@_sungmin_cha) · 16 days ago
RT @kuchaev: Post-training of LLMs is increasingly important and RLHF remains a necessary step for an overall great model. Today we are rel….

Sungmin Cha (@_sungmin_cha) · 17 days ago
RT @michahu8: 📢 today's scaling laws often don't work for predicting downstream task performance. For some pretraining setups, smooth and p….

Sungmin Cha (@_sungmin_cha) · 17 days ago
RT @cwolferesearch: Reward models have transformed LLM research by incorporating human preferences into the training process. Here’s how th….

Sungmin Cha (@_sungmin_cha) · 17 days ago
RT @saurabhalonee: i really like review papers like this. so much detail in it!

Sungmin Cha (@_sungmin_cha) · 17 days ago
RT @omarsar0: Small Language Models are the Future of Agentic AI. Lots to gain from building agentic systems with small language models. C….

Sungmin Cha (@_sungmin_cha) · 17 days ago
RT @sanghyunwoo1219: Introducing BlenderFusion: Reassemble your visual elements—objects, camera, and background—to compose a new visual nar….

Sungmin Cha (@_sungmin_cha) · 19 days ago
RT @innostudy: Hidden 'write a good review' AI-prompt phrases found in papers… "KAIST also has 3 cases." "'AI, give a positive evaluation': hidden commands found in papers from 14 universities in Korea, the US, Japan, and elsewhere."

Sungmin Cha (@_sungmin_cha) · 20 days ago
RT @andrewgwils: You don't _need_ a PhD (or any qualification) to do almost anything. A PhD is a rare opportunity to grow as an independent….

Sungmin Cha (@_sungmin_cha) · 20 days ago
RT @s_scardapane: *NoProp: Training Neural Networks without Backpropagation or Forward-propagation*.by @yeewhye et al. They use a neural n….