@kellerjordan0
Random features are still quite powerful: you can get near-optimal accuracy on ResNet-18/CIFAR-10 (95%) by training only the first few layers + the last layer, roughly 15% of the weights. In this setting the gradients are only large for the first and last layers.
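A rough PyTorch sketch of that setup (which layers count as “the first few” is my guess; the script prints the actual trainable fraction):

```python
import torch
import torchvision

# Build a ResNet-18 for CIFAR-10 and freeze everything by default.
model = torchvision.models.resnet18(num_classes=10)
for p in model.parameters():
    p.requires_grad = False

# Unfreeze an early prefix plus the classifier head.
# (Exactly which prefix to train is an assumption.)
for module in (model.conv1, model.bn1, model.layer1, model.fc):
    for p in module.parameters():
        p.requires_grad = True

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"training {trainable / total:.1%} of the weights")

# Only hand the unfrozen parameters to the optimizer.
opt = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad),
    lr=0.1, momentum=0.9, weight_decay=5e-4,
)
```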
I’ll be at
#NeurIPS2023
presenting two works at UniReps! I’ll be there all week, so I’m looking to meet up to find collaborators, PhD positions, or just talk about new research. Feel free to reach out.
Links to the papers below ⬇️
@EvMill
Why not just zero/mask off the output if a head is deemed “unconfident”? This would encourage expert attention heads that only activate when needed, similar to how ReLU creates modularity in MLPs. Softmax would still be needed for confident heads in this case, though.
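One hedged sketch of what that gate could look like, taking a head’s confidence as its peak attention weight averaged over queries (both the measure and the threshold are my assumptions):

```python
import torch

def gate_unconfident_heads(attn_weights, head_out, tau=0.5):
    """Zero a head's output when it is 'unconfident'.

    attn_weights: (batch, heads, q_len, k_len) post-softmax weights
    head_out:     (batch, heads, q_len, d_head) per-head outputs
    tau:          confidence threshold (hypothetical hyperparameter)
    """
    # Confidence = each head's max attention weight, averaged over queries.
    conf = attn_weights.max(dim=-1).values.mean(dim=-1)   # (batch, heads)
    gate = (conf > tau).to(head_out.dtype)[..., None, None]
    return head_out * gate                                # ReLU-style masking
```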
@finbarrtimbers
Always wondered this. A 14B-param model still has lots of activations.
In the pruning literature, removing individual weights works significantly better than removing whole neurons, because keeping a large representation matters more than the exact transformation the weights compute.
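A quick illustration of the two regimes with PyTorch’s built-in pruning utilities (the 50% amount is arbitrary):

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

layer_a = nn.Linear(512, 512)
layer_b = nn.Linear(512, 512)

# Unstructured: drop the 50% of individual weights with smallest |w|;
# all 512 output features survive, so the representation is preserved.
prune.l1_unstructured(layer_a, name="weight", amount=0.5)

# Structured: drop 50% of whole rows (output neurons) by L2 norm;
# this shrinks the representation itself, which tends to hurt more.
prune.ln_structured(layer_b, name="weight", amount=0.5, n=2, dim=0)
```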
@0xcd16
@vsbuffalo
entirely research focused + 10 min walk to the beach + pretty big school + lack of rowdy students
All of it makes for a great research school; it only lacks the name recognition/selectivity that the other schools have.
@elan_learns
@EvMill
Yes, attention heads need modularity/to become experts in different features. Maybe there needs to be some ReLU-style component paired with softmax to disable non-confident activations. Dynamic sparsity is essential.
Gradual Fusion Transformer (GraFT) advances re-identification (ReID) in computer vision with fusion tokens. It captures features efficiently, surpasses benchmarks, and is optimized for the size-performance trade-off using neural pruning.
Just waiting for someone to open-source a foundation-model-sized supernet for LLMs. The pretraining cost is massive, yet academics could then cheaply sample the search space for their own use cases.
@finbarrtimbers
@cosminnegruseri
Look for N:M, mixed, or semi-structured sparsity; those are all names I’ve seen for this kind of sparsity. A100s can do 2:4 sparsity well, but most GPUs can’t, so people tend to stick to vanilla structured pruning instead.
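A minimal reference implementation of the 2:4 constraint, just to show the pattern (not the accelerated kernel):

```python
import torch

def mask_2_4(w):
    """In every contiguous group of 4 weights along the last dim,
    keep the 2 largest magnitudes and zero the other 2."""
    groups = w.reshape(-1, 4)
    keep = groups.abs().topk(2, dim=-1).indices   # 2 survivors per group
    mask = torch.zeros_like(groups)
    mask.scatter_(-1, keep, 1.0)
    return (groups * mask).reshape(w.shape)

w = torch.randn(8, 16)
print(mask_2_4(w))  # exactly 2 nonzeros in every group of 4
```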
@kellerjordan0
If you keep track of the gradients during training via a running sum of their magnitudes, then with that post-training info you can retrain the same initialization with far fewer gradient updates (freeze the weights with the lowest accumulated movement).
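A toy sketch of the idea (the model, the single training step, and the 85% freeze ratio are placeholders):

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
grad_sums = {n: torch.zeros_like(p) for n, p in model.named_parameters()}

# During training, after each loss.backward(), accumulate |grad|.
x, y = torch.randn(16, 32), torch.randint(0, 10, (16,))
nn.functional.cross_entropy(model(x), y).backward()
for n, p in model.named_parameters():
    grad_sums[n] += p.grad.abs()

# Post-training: mark the weights that moved least for freezing.
flat = torch.cat([g.flatten() for g in grad_sums.values()])
threshold = flat.quantile(0.85)        # freeze ~85% (ratio is a guess)
freeze = {n: g <= threshold for n, g in grad_sums.items()}

# On the rerun from the same init, zero their grads after each backward().
for n, p in model.named_parameters():
    p.grad[freeze[n]] = 0.0
```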
To find these subnetworks, we leverage distilled data during the retraining stage of IMP (iterative magnitude pruning) to take advantage of the compressed representations. 5/n
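A hedged sketch of what that loop could look like; `train_epoch`, `distilled_loader`, the Linear-only pruning, and the 20% rate are stand-ins, not the paper’s actual pipeline, and rewinding to the original init is omitted:

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

def train_epoch(model, loader, lr=1e-2):
    # Hypothetical stand-in for the retraining stage.
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for x, y in loader:
        opt.zero_grad()
        nn.functional.cross_entropy(model(x), y).backward()
        opt.step()

def imp_with_distilled_data(model, distilled_loader, rounds=5, rate=0.2):
    for _ in range(rounds):
        # Globally prune 20% of the remaining weights by magnitude...
        params = [(m, "weight") for m in model.modules()
                  if isinstance(m, nn.Linear)]
        prune.global_unstructured(
            params, pruning_method=prune.L1Unstructured, amount=rate)
        # ...then retrain the surviving subnetwork on the (tiny) distilled set.
        train_epoch(model, distilled_loader)
    return model
```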
thanks!
@hadesinwinter
“I am sorry to inform you that, after significant consideration, I have to reject this rejection. Best of luck with your other applicants; see you in the fall.”
@kellerjordan0
The final blocks (besides the FC layer) have the lowest gradient movement, whereas the middle ones can still be important. I can send checkpoints later today.
For now, I have a high-performing model with 90% of the weights frozen; here’s the distribution of frozen weights 👇