
Yacine Mahdid
@yacinelearning
Followers: 12K
Following: 25K
Media: 1K
Statuses: 8K
(neuro/ai) I make technical deep learning tutorials 👺
Montreal Canada
Joined January 2019
if there is one thing that you must not do, it is surrender. don’t surrender your dreams, your passion, your curiosity or your freedom. never
6
19
230
this weekend we are going to figure out if the clankers have a theory of mind or not
2
1
30
oh this ties up so well to that neuro paper review we did 3 weeks ago
This paper finds LLMs' ability to understand that others have different beliefs (Theory of Mind) comes from 0.001% of their parameters. Break those specific weights & the model loses both its ability to track what others know AND language comprehension. Interesting implications.
2
3
25
tldw on muon-clip in kimi k2:
- regular muon with weight decay
- save the max attention logit per attention head (s_max in the thumbnail)
- for each head calculate the scaling factor mu (at most 1)
- scale the weights of Q and K for the next iteration with mu and alpha
- profit
0
0
26
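a rough sketch of the muon-clip tldw above, covering only the per-head rescaling of Q and K (the muon update itself is left out). everything beyond `s_max`, `mu` and `alpha` is an assumption and not from the post: the threshold `tau` and its default, the choice `mu = min(1, tau / s_max)`, the way `alpha` splits the scaling between Q and K, and all function names are hypothetical.

```python
# Sketch of the per-head Q/K rescaling described in the tldw above.
# Assumptions (not in the original post): the threshold `tau`, its default,
# mu = min(1, tau / s_max), and the alpha / (1 - alpha) exponent split.


def qk_clip_factors(s_max_per_head, tau=100.0, alpha=0.5):
    """Return one (q_factor, k_factor) pair per attention head."""
    factors = []
    for s_max in s_max_per_head:
        # mu is at most 1: heads whose max logit is under tau are left alone.
        mu = min(1.0, tau / s_max) if s_max > 0 else 1.0
        # Split mu between Q and K so their product scales the logits by mu.
        factors.append((mu ** alpha, mu ** (1.0 - alpha)))
    return factors


def scale_rows(weight, factor):
    """Multiply every entry of a list-of-lists weight matrix by factor."""
    return [[factor * value for value in row] for row in weight]


def qk_clip(w_q_heads, w_k_heads, s_max_per_head, tau=100.0, alpha=0.5):
    """Rescale each head's Q and K weights for the next iteration."""
    factors = qk_clip_factors(s_max_per_head, tau, alpha)
    new_q = [scale_rows(w, fq) for w, (fq, _) in zip(w_q_heads, factors)]
    new_k = [scale_rows(w, fk) for w, (_, fk) in zip(w_k_heads, factors)]
    return new_q, new_k
```

since the attention logit is a product of a query and a key, scaling Q by `mu ** alpha` and K by `mu ** (1 - alpha)` scales that head's logits by `mu` overall, which is why a head with `s_max` above `tau` gets pulled back while the others stay untouched.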
for teacher day we are going to figure out what the heck muon-clip is today live. so hop in and do drop your deep learning questions in the chat
4
0
33
TO ALL THE HATERS OF THE WAFFLEHOUSE ON HACKERNEWS MICHAEL PLANNED HIS WHOLE LIFE TOO OK THIS IS 100% NORMAL BEHAVIOR
0
0
14
a lot of folks are confused about linkedin but it’s not hard: it’s a year-long conference with networking events. yeah there’s the cringe talk on stage about this or that product. yeah everyone is boasting about their career. but you’re there to meet people and get their dm info.
9
1
43
that makes so much sense.
One might think shampoo is that weird radical-ass-new-optimizer that promises wild performance. let me tell you something reassuring. shampoo (with blocks) is a *generalization* of adam. That is, shampoo with specific hyperparameters is adam. It is logically impossible for.
0
0
4
the only bits of info we have from ilya is the goddamn meme hairline merch? THE MEME HAIRLINE MERCH?????
5
0
30
that type of work for free whaaaaaat.
Fuck it. Today, we open source FineVision: the finest curation of datasets for VLMs, over 200 sources!
> 20% improvement across 10 benchmarks
> 17M unique images
> 10B answer tokens
> New capabilities: GUI navigation, pointing, counting
FineVision 10x’s open-source VLMs.
1
0
17
here it is folks: . btw we’ll cover muon-clip soon so do not despair.
0
3
16
here are 7min of semi-rough talk to beginners in deep learning feeling stuck right now. the tldr my guys is that deep learning is just a tool, you gotta figure out what you want to apply it to. but also be pragmatic about your current situation and the market environment 🫂
12
16
383
the one I hate the most is “we are only using 10% of our brain capacity”. my sweet dear child why would you want to have a generalized seizure.
there are few pop science neuroscience theories i hate more than "the brain is actually quantum", it's surface level, reddit-tier neil degrasse tyson fan kinda bullshit. there's no selection pressure for that kind of complexity, if anything, there was selection against it.
4
2
57
*coffee spilling all over the table*
*barista look at us aghast*
*sirens in the distance*
3
2
26
I met a business analyst yesterday and he asked me what I thought the next frontier for LLMs is. you will never guess what I said.
5
0
39