
サメQCU
@sameQCU
Followers: 1K · Following: 70K · Media: 8K · Statuses: 35K
back to the regularly scheduled cryptic posts. DMs open.
1 regional flight from you
Joined September 2020
@kalomaze strangely, fixing errors made the 'critical batch size' *even higher*. im using a geometric recursive dynamic accumulation schedule, which can shrink as fast as it wants (but can only grow by 25% per fast gradient-noise-scale estimation window). and it wants a total batch of 1500
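(for concreteness: the tweet doesn't include code, but a minimal sketch of that kind of schedule could look like the python below. the class name, the two-batch-size noise-scale estimator in the spirit of McCandlish et al.'s B_simple = tr(Σ)/|G|², and every number are assumptions for illustration, not the actual implementation behind the tweet.)

import numpy as np

class DynamicAccumulationSchedule:
    # Hypothetical sketch: the accumulation target may shrink arbitrarily fast,
    # but may only grow by 25% per gradient-noise-scale estimation window.
    def __init__(self, initial_accum=8, micro_batch=32, max_growth=1.25):
        self.accum = initial_accum        # current accumulation steps
        self.micro_batch = micro_batch    # examples per micro-batch
        self.max_growth = max_growth      # growth cap per estimation window

    def critical_batch_from_noise(self, grad_small, grad_large, b_small, b_large):
        # Gradient noise scale a la McCandlish et al. (2018): B ≈ tr(Σ) / |G|²,
        # estimated from gradients measured at two batch sizes b_small < b_large.
        g2_small = float(np.dot(grad_small, grad_small))
        g2_large = float(np.dot(grad_large, grad_large))
        g2 = (b_large * g2_large - b_small * g2_small) / (b_large - b_small)
        trace_sigma = (g2_small - g2_large) / (1.0 / b_small - 1.0 / b_large)
        return max(trace_sigma / max(g2, 1e-12), 1.0)

    def update(self, critical_batch):
        # Aim for accum * micro_batch ≈ critical batch; shrinking is free,
        # growth is clamped to +25% of the current value per window.
        proposed = critical_batch / self.micro_batch
        ceiling = self.accum * self.max_growth
        self.accum = max(1, int(round(min(proposed, ceiling))))
        return self.accum

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true_grad = rng.normal(size=1000) * 0.03
    # toy "measured" gradients: true gradient plus noise shrinking with batch size
    g_32 = true_grad + rng.normal(size=1000) / np.sqrt(32)
    g_256 = true_grad + rng.normal(size=1000) / np.sqrt(256)
    sched = DynamicAccumulationSchedule(initial_accum=8, micro_batch=32)
    b_crit = sched.critical_batch_from_noise(g_32, g_256, 32, 256)
    accum = sched.update(b_crit)
    # noise scale wants a much larger batch, but growth is capped at 8 * 1.25 = 10
    print(b_crit, accum, accum * sched.micro_batch)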
@kalomaze "human importslop".
anyways, no matter how bad this debugging might sound, it is:
1: better than using human-developed importslop. im sorry but i can't use your normie dataset handling code, it literally doesn't work.
2: better than the upstream repositories im adapting. again, im sorry, i just can't use -
@lumpenspace if we look past the slopwords and think about what the language model is doing here in pragmatic terms, it's choosing more flexible, metaphorical language to *talk down* to the user (but in a congenial way). doing this makes 'higher signal' synthesis more likely!
see @lumpenspace's perpetual argument ad Improv. the argument ad Improv is totally correct! i'm not saying that you must *always* 'play low status' relative to the language model to get it to be helpful, but that the status-axis exists and changes what it can write.
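(purely illustrative: one way to make that status-axis concrete is two framings of the same request in chat-message form. the wording below is invented, not anyone's actual prompt, and which framing a given model answers more usefully is an empirical question.)

# Two framings of one request. The "high status" framing commands;
# the "low status" framing plays the improv partner who needs help.
high_status = [
    {"role": "user",
     "content": "Rewrite this function. Do not explain. Output code only."},
]
low_status = [
    {"role": "user",
     "content": "i keep mangling this function and i'm a bit out of my depth. "
                "could you rewrite it and walk me through what i was missing?"},
]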