Musk Viewer
rohan anil
@_arohan_
2 years
To improve replication in ML, why not ship the 0th-step weights (init values) and 1st-step weights (after a single optimizer step) with every architecture release?
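A minimal sketch of what shipping both checkpoints could look like. The tiny linear "model" (a plain list of weights), the dummy gradient, and the JSON release format here are all illustrative stand-ins, not any particular framework's API:

```python
import json
import random

# Toy "architecture": one linear layer, weights as a plain list.
random.seed(0)
w0 = [random.uniform(-0.1, 0.1) for _ in range(4)]  # step 0: init values

# One SGD update on a dummy gradient (stand-in for a real optimizer step).
lr = 0.01
grad = [0.5, -0.25, 0.1, 0.0]
w1 = [w - lr * g for w, g in zip(w0, grad)]         # step 1: after one update

# Ship both alongside the architecture release.
blob = json.dumps({"step0_weights": w0, "step1_weights": w1})

# Anyone replicating can diff their own init and first step against these.
loaded = json.loads(blob)
assert loaded["step0_weights"] == w0
assert loaded["step1_weights"] == w1
```

Matching step 0 verifies the initialization; matching step 1 additionally verifies the loss, gradient, and optimizer update in one shot.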
Replies
Ludger Paehler
@ludgerpaehler
2 years
@_arohan_
At high-performance computing conferences we have an artifact evaluation that provides all of the above information, and badges are awarded based on it. That might also be a recipe for machine learning conferences?
rohan anil
@_arohan_
2 years
@ludgerpaehler
This is a great initiative and idea, hoping ML conferences appreciate the engineering aspects :)
Vincent Lordier
@vlordier
2 years
@_arohan_
Wouldn't a seed with the init code be enough for the 0th step?
rohan anil
@_arohan_
2 years
@vlordier
No, because frameworks don't agree on the type of init or its parameterization.
Yaroslav Bulatov
@yaroslavvb
2 years
@_arohan_
This may help, while also uncovering a slew of differences due to the low-level software stack (like this one from the pre-GPU era).
Protim
@pr0timr
2 years
@_arohan_
I think for a stochastic process, the entire trajectory is required. Are the seed, initialized weights, and 1st opt step enough for GD or Adam?
Shanqing Cai
@sqcai
2 years
@_arohan_
Also a checksum of the data.
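A data checksum is straightforward to produce and publish. A minimal sketch with the stdlib, streaming the file so large datasets do not have to fit in memory (the temp file here is just a stand-in for a real training set):

```python
import hashlib
import tempfile

def data_checksum(path, chunk_size=1 << 20):
    """SHA-256 of a dataset file, streamed in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Demo on a throwaway file standing in for the training data.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"hello")
    path = f.name

digest = data_checksum(path)  # publish this digest with the release
```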
Cristian Garcia
@cgarciae88
2 years
@_arohan_
Data can be the trickiest part for reproducibility.
Thomas Capelle
@capetorch
2 years
@_arohan_
@weights_biases
has entered the chat
Leshem Choshen
@LChoshen
2 years
@_arohan_
And the data shuffling order?
Stephen Roller
@stephenroller
2 years
@_arohan_
Another helpful one is releasing all the model predictions on the final dataset; not just an average metric.
JFPuget
@JFPuget
2 years
@_arohan_
and random seed.
Pranav Chaturvedi
@pranavchatman
2 years
@_arohan_
Yup yup! Great idea.