@_arohan_
rohan anil
2 years
To improve replication in ML, why not ship the 0th-step weights (init values) and the 1st-step weights (after a single opt step) with every architecture release?
11
3
82
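A minimal sketch of the proposal, assuming a toy numpy linear model and hypothetical file names (`step0.npy`, `step1.npy`); a real release would save the framework's own checkpoint format:

```python
# Sketch: ship the "0th step" (init values) and "1st step" (weights after
# one optimizer update) alongside an architecture release.
import numpy as np

rng = np.random.default_rng(0)
W0 = rng.normal(0.0, 0.02, size=(4, 2))   # init values ("0th step")
np.save("step0.npy", W0)

X = rng.normal(size=(8, 4))               # toy batch
Y = rng.normal(size=(8, 2))
grad = X.T @ (X @ W0 - Y) / len(X)        # MSE gradient
W1 = W0 - 0.1 * grad                      # a single SGD step
np.save("step1.npy", W1)                  # weights after 1 opt step
```

A replicator can then diff their own init and first step against these two files to localize where a run diverges.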

Replies

@ludgerpaehler
Ludger Paehler
2 years
@_arohan_ At high-performance computing conferences we have an artifact evaluation that provides all of the above information, and badges are awarded based on it () - might that also be a recipe for machine learning conferences?
1
2
5
@_arohan_
rohan anil
2 years
@ludgerpaehler This is a great initiative and idea, hoping ML conferences appreciate the engineering aspects :)
0
0
2
@vlordier
Vincent Lordier
2 years
@_arohan_ wouldn't a seed plus the init code be enough for the 0th step?
2
0
1
@_arohan_
rohan anil
2 years
@vlordier No, because frameworks don't agree on the type of init or its parameterization.
1
0
3
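A small illustration of that point (numpy stand-ins for two common init schemes, Glorot-uniform and He-normal, chosen for illustration): fixing the seed does not fix the weights if the parameterization differs.

```python
# Same seed, two init parameterizations -> different weights.
import numpy as np

fan_in, fan_out = 256, 128

rng = np.random.default_rng(42)
limit = np.sqrt(6.0 / (fan_in + fan_out))        # Glorot/Xavier uniform
w_glorot = rng.uniform(-limit, limit, (fan_in, fan_out))

rng = np.random.default_rng(42)                  # identical seed
std = np.sqrt(2.0 / fan_in)                      # He/Kaiming normal
w_he = rng.normal(0.0, std, (fan_in, fan_out))

print(np.allclose(w_glorot, w_he))               # False: seed alone is not enough
```

Shipping the init values themselves sidesteps the whole question of which scheme and which RNG a framework used.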
@yaroslavvb
Yaroslav Bulatov
2 years
@_arohan_ This may help, while also uncovering a slew of differences due to the low-level software stack (like this one from the pre-GPU era)
1
0
6
@pr0timr
Protim
2 years
@_arohan_ I think for a stochastic process, the entire trajectory is required. Are the seed, initialized weights, and 1st opt step enough for GD or Adam?
0
0
0
@sqcai
Shanqing Cai
2 years
@_arohan_ Also a checksum of the data.
1
0
1
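A stdlib-only sketch of that suggestion (the filename is hypothetical): publish a checksum of the training data so replicators can verify they are training on identical bytes.

```python
# Compute a SHA-256 checksum of a data file in streaming fashion,
# so arbitrarily large datasets fit in constant memory.
import hashlib

def sha256_of(path: str, chunk: int = 1 << 20) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()
```

Usage would be e.g. `sha256_of("train.bin")`, with the resulting digest published next to the weights.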
@cgarciae88
Cristian Garcia
2 years
@_arohan_ Data can be the trickiest part for reproducibility.
0
0
1
@capetorch
Thomas Capelle
2 years
@_arohan_ @weights_biases has entered the chat 😎
0
0
3
@LChoshen
Leshem Choshen πŸ€–πŸ€—
2 years
@_arohan_ And the data shuffling order?
0
0
2
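One way to pin that down, sketched here with numpy and a hypothetical file name: instead of trusting every framework's shuffle to replay identically from a seed, ship the epoch's example order as an explicit permutation.

```python
# Save the exact shuffle order for an epoch as a permutation array.
import numpy as np

num_examples = 1000
order = np.random.default_rng(7).permutation(num_examples)
np.save("epoch0_order.npy", order)

# A replicator indexes the dataset with the saved order, e.g.:
# batch = dataset[np.load("epoch0_order.npy")[:batch_size]]
```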
@stephenroller
Stephen Roller
2 years
@_arohan_ Another helpful one is releasing all the model predictions on the final dataset, not just an average metric.
0
0
6
@JFPuget
JFPuget πŸ‡ΊπŸ‡¦
2 years
@_arohan_ And the random seed.
0
0
3
@pranavchatman
Pranav Chaturvedi
2 years
@_arohan_ Yup yup! Great idea.
0
0
1