David Marx (@digthatdata.bsky.social)
@DigThatData
Followers: 4K · Following: 11K · Media: 1K · Statuses: 10K
Generative AI MLE, FOSS toolmaker, innovation catalyst @CoreWeave + @AiEleuther. https://t.co/z0fpuhlWRs
Seattle, WA
Joined November 2013
Remember: by participating on Twitter, you are adding value to it, thereby incentivizing others to join/stay, and amplifying the voice/reach of the people who set its algorithm's biases. Is this the ideology you want to amplify? Because if you are here, you are amplifying this.
1 reply · 0 reposts · 0 likes
'we're in this bizarre world where the best way to learn about llms... is to read papers by chinese companies. i do not think this is a good state of the world' - US labs keeping their architectures and algorithms secret is ultimately hurting AI development in the US.
5 replies · 26 reposts · 125 likes
> Republicans: "We love America! It has the greatest system of governance. Look, I even carry the constitution next to my heart like a little bible :*) "
> Also Republicans: "DISMANTLE THE GOVERNMENT! FUCK THE SEPARATION OF POWERS! GOD KING PRESIDENT CULT OF PERSONALITY!"
0 replies · 0 reposts · 2 likes
Are you ever dissatisfied with the imprecise names in vision-language datasets? At #NeurIPS2024, we introduce RENOVATE, showing how better segmentation dataset names lead to better training & evaluation. Let's dive in!
1 reply · 7 reposts · 35 likes
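A toy sketch of the underlying idea (the label mapping below is hypothetical, not the paper's): open-vocabulary segmenters embed class names as text prompts, so replacing an imprecise dataset label with a renovated, more descriptive name changes both the training signal and the evaluation target.

```python
# Toy illustration of renaming segmentation class names before they are
# embedded as text prompts. All names here are made-up examples.

# Imprecise original label -> more descriptive "renovated" name
RENOVATED_NAMES = {
    "table": "wooden dining table",
    "plant": "potted houseplant",
    "screen": "flat-panel television screen",
}

def prompt_for(label: str) -> str:
    """Build the text prompt an open-vocab segmenter would embed."""
    name = RENOVATED_NAMES.get(label, label)  # fall back to the raw label
    return f"a photo of a {name}"

for label in ["table", "plant", "screen", "person"]:
    print(label, "->", prompt_for(label))
```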
Yo this paper is wild.
Announcing our NeurIPS spotlight paper on the transition from lazy to rich! We reveal through exact gradient flow dynamics how unbalanced initializations promote rapid feature learning. Co-led by @AllanRaventos and @ClementineDomi6, with @FCHEN_AI @klindt_david @SaxeLab @SuryaGanguli
0 replies · 1 repost · 1 like
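A toy illustration of the lazy-vs-rich effect (not the paper's setup, just a minimal sketch): gradient descent on a two-layer scalar linear network w2·w1, where a small balanced initialization sits on a long plateau while an unbalanced one (one layer large, one small) escapes it quickly.

```python
import numpy as np

def run(w1, w2, target=1.0, lr=1e-2, steps=2000):
    """Gradient descent on L = 0.5*(w2*w1 - target)^2 for a scalar
    two-layer linear net f(x) = w2*w1*x; returns the loss curve."""
    losses = []
    for _ in range(steps):
        err = w2 * w1 - target
        losses.append(0.5 * err ** 2)
        g1, g2 = err * w2, err * w1  # chain rule through the product
        w1, w2 = w1 - lr * g1, w2 - lr * g2
    return np.array(losses)

balanced   = run(w1=0.01, w2=0.01)  # small, balanced init: long plateau
unbalanced = run(w1=1.0,  w2=0.01)  # unbalanced init: rapid feature learning

for name, curve in [("balanced", balanced), ("unbalanced", unbalanced)]:
    hit = (curve < 1e-3)
    steps_to = int(np.argmax(hit)) if hit.any() else -1
    print(f"{name:10s} steps to loss < 1e-3: {steps_to}")
```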
New NanoGPT training speed record: 3.28 FineWeb val loss in 4.66 minutes
Previous record: 5.03 minutes
Changelog:
- FlexAttention blocksize warmup
- hyperparameter tweaks
8 replies · 22 reposts · 244 likes
This is officially the new record! Congrats @hi_tysam (who is also an OG of CIFAR-10 speedrunning) https://t.co/95LfOB66Ev
> New NanoGPT training speed record: 3.28 FineWeb val loss in 4.66 minutes
> Previous record: 5.03 minutes
> Changelog:
> - FlexAttention blocksize warmup
> - hyperparameter tweaks
3 replies · 11 reposts · 144 likes
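The changelog doesn't spell out what "FlexAttention blocksize warmup" means in detail; a plausible reading is growing the attention window over early training steps. A hedged PyTorch sketch under that assumption (requires torch >= 2.5 for FlexAttention; the schedule constants are made up):

```python
import torch
from torch.nn.attention.flex_attention import flex_attention, create_block_mask

# Hypothetical warmup schedule: linearly grow the causal attention window
# from `start` to `final` tokens over the first `warmup_steps` steps.
def window_size(step: int, warmup_steps: int = 1000,
                start: int = 128, final: int = 1024) -> int:
    frac = min(step / warmup_steps, 1.0)
    size = int(start + frac * (final - start))
    return (size // 128) * 128  # keep it a multiple of the kernel block size

def sliding_window_mask(seq_len: int, window: int):
    """Causal sliding-window block mask for flex_attention."""
    def mask_mod(b, h, q_idx, kv_idx):
        return (q_idx - kv_idx >= 0) & (q_idx - kv_idx < window)
    return create_block_mask(mask_mod, B=None, H=None,
                             Q_LEN=seq_len, KV_LEN=seq_len)

# Usage inside a training loop (q, k, v shaped [B, H, seq_len, head_dim]):
#   mask = sliding_window_mask(seq_len, window_size(step))
#   out = flex_attention(q, k, v, block_mask=mask)
```

Rebuilding the block mask only when the window actually changes (it is quantized to multiples of 128 above) avoids paying the mask-construction cost every step.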
Is KL-regularization the right tool for language model alignment? The χPO algorithm: we show that a one-line change to DPO, moving from KL to chi-squared regularization, is sufficient to achieve state-of-the-art theoretical guarantees, provably alleviating over-optimization.
3 replies · 25 reposts · 206 likes
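For concreteness, a minimal PyTorch sketch of that one-line change as I read it: DPO scores each response by the log density ratio log(π/π_ref), and χPO swaps the log link for a mixed link φ(z) = z + log z, i.e. it adds the raw density ratio to the log-ratio. The exact link function is my reading of the paper and should be treated as an assumption.

```python
import torch
import torch.nn.functional as F

def dpo_loss(logr_w, logr_l, beta=0.1):
    """Standard DPO: logr_* = log pi(y|x) - log pi_ref(y|x)
    for the chosen (w) and rejected (l) responses."""
    return -F.logsigmoid(beta * (logr_w - logr_l)).mean()

def chipo_loss(logr_w, logr_l, beta=0.1):
    """chi-PO sketch (assumption): replace the log link phi(z) = log z
    with the mixed chi^2 + KL link phi(z) = z + log z, which amounts to
    adding the raw ratio exp(logr) to each log ratio."""
    link_w = torch.exp(logr_w) + logr_w
    link_l = torch.exp(logr_l) + logr_l
    return -F.logsigmoid(beta * (link_w - link_l)).mean()

logr_w, logr_l = torch.randn(4), torch.randn(4)  # dummy log ratios
print(dpo_loss(logr_w, logr_l).item(), chipo_loss(logr_w, logr_l).item())
```

Intuitively, the z term blows up as π drifts far above π_ref, so the implicit reward penalizes over-optimization much more aggressively than the log alone.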
Narrative on X: 🦋 has no AI/ML and just talks about itself. My actual feed on 🦋:
61 replies · 64 reposts · 1K likes
You can choose which feeds appear on your homepage -- I personally like the "Popular With Friends" feed (shown in the tweet above). The default is "Following", which looks like this for me:
4 replies · 5 reposts · 123 likes
Because X has tended to censor discussion of other social networks, I won't link directly -- but look for this post to get an instant AI/ML feed, thanks to @maosbot
10 replies · 14 reposts · 196 likes
We announce LAION-DISCO-12M - a collection of 12 million links to publicly available YouTube samples paired with metadata to support basic machine learning research in foundation models for generic audio and music. https://t.co/rREm3aeFuU
0 replies · 44 reposts · 190 likes
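If the collection is mirrored on the Hugging Face Hub (the repo id below is an assumption; check laion.ai for the canonical release), streaming a few rows of the link-plus-metadata records might look like:

```python
from datasets import load_dataset

# Assumed Hub location -- verify against the official LAION announcement.
ds = load_dataset("laion/LAION-DISCO-12M", split="train", streaming=True)

for row in ds.take(3):
    print(row)  # expect a YouTube link/ID plus metadata fields per record
```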