サメQCU

@sameQCU

Followers
1K
Following
70K
Media
8K
Statuses
35K

back to the regularly scheduled cryptic posts. DMs open.

1 regional flight from you
Joined September 2020
@sameQCU
サメQCU
12 days
Okay spent a lot of last evening talking with voooooogel, which was nice for several reasons besides suggestions for compute. There is around 1 hour left in the day to get coffee/breakfast with me while I'm in town.
1
0
15
@sameQCU
サメQCU
15 hours
steam reviewer: "kind of like a first person kenshi in a way"
Tweet media one
0
0
5
@sameQCU
サメQCU
16 hours
guys its dwarf fortress, its the sequel to dwarf fortress
Tweet media one
@sameQCU
サメQCU
18 hours
this is what im doing instead of playing devil spire 2 btw. i bought devil spire 2 sight unseen because i played devil spire 1 and im doing this instead of the fun new videogame.
1
0
9
@sameQCU
サメQCU
16 hours
@kalomaze strangely fixing errors made the 'critical batch size' *even higher*. im using a geometric recursive dynamic accumulation schedule, which can shrink as fast as it wants (but can only grow by 25% per fast gradient noise scale estimation window). and it wants total batch: 1500
Tweet media one
0
0
0
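A minimal sketch of the growth-capped schedule described in the tweet above, not the author's actual code. It assumes the gradient noise scale estimate has already been turned into a target total batch size once per estimation window. The names `next_total_batch`, `accumulation_steps`, and `simulate` are hypothetical, and the starting batch of 64 is an arbitrary illustration value.

```python
def next_total_batch(current: float, target: float, max_growth: float = 1.25) -> float:
    """Pick the total batch size for the next estimation window.

    Shrinking is unrestricted; growth is capped at `max_growth` times
    the current size (25% per window, as stated in the tweet).
    """
    if current <= 0 or target <= 0:
        raise ValueError("batch sizes must be positive")
    if target <= current:
        return target  # shrink as fast as the estimate wants
    return min(target, current * max_growth)  # geometric growth cap


def accumulation_steps(total_batch: float, micro_batch: int) -> int:
    """Convert a total batch size into gradient-accumulation steps."""
    if micro_batch < 1:
        raise ValueError("micro_batch must be at least 1")
    return max(1, round(total_batch / micro_batch))


def simulate(start: float, targets: list[float], max_growth: float = 1.25) -> list[float]:
    """Apply the schedule across successive estimation windows."""
    sizes = [start]
    for target in targets:
        sizes.append(next_total_batch(sizes[-1], target, max_growth))
    return sizes


if __name__ == "__main__":
    # Hypothetical run: start at 64 and let the estimate keep asking for 1500.
    for window, size in enumerate(simulate(64.0, [1500.0] * 20)):
        print(window, round(size), accumulation_steps(size, micro_batch=10))
```

Because growth is multiplicative, the schedule closes a large gap in a number of windows that scales with the logarithm of the ratio, while a drop in the estimate takes effect in a single window.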
@sameQCU
サメQCU
16 hours
@kalomaze push'd
Tweet media one
1
0
0
@sameQCU
サメQCU
16 hours
@kalomaze gradient norms and losses are finally proportionate to what they should be
Tweet media one
1
0
0
@sameQCU
サメQCU
16 hours
@kalomaze "human importslop".
@sameQCU
サメQCU
18 hours
anyways no matter how bad this debugging might sound it is: 1: better than using human developed importslop. im sorry but i can't use your normie dataset handling code it literally doesn't work. 2: better than the upstream repositories im adapting. again im sorry i just can't use -
Tweet media one
1
0
0
@sameQCU
サメQCU
18 hours
AAWWWWWWW YEAAAAAAAAAAAAAAAHHHH IT'S ALL COMING UP BATCH SHAPED
Tweet media one
0
0
1
@sameQCU
サメQCU
18 hours
at least i made the batch caching code like 1000x faster so im ooda looping into the broken bs super fast.
1
0
0
@sameQCU
サメQCU
18 hours
huh. google ai studio gemini melts at the same kind of context length use as gemini-cli. there's no google ai studio advantage other than your human intuition being better at copy pasting slices of input code to read than gemini doing self-directed filereads.
Tweet media one
1
0
3
@sameQCU
サメQCU
19 hours
guys i have bad news gemini 2.5 doesn't have a world model of tensor shapes. it literally has to make print statements like a mundane human (it has simply never been exposed to enough pytorch debugging to have a plausible origin of a 'world model' here).
1
0
16
@sameQCU
サメQCU
1 day
of course the ultimate feature combination needed for a proper 2tb/s+ hardware unit might make this kind of alternate optical signal encoding too expensive or even impossible with COTS parts. that's why i'm blasting these questions through the sota LLM, pure feasibility guesswork
Tweet media one
0
0
1
@sameQCU
サメQCU
1 day
some *very* interesting speculation by the gemini 2.5 instance. is fiberoptic switching hardware more flexible than the engineers who work on it will ever admit? of course! can you do the 'opposite of ml model quantization' to force higher data rates on short hops? yeppers!
Tweet media one
1
0
2
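One possible reading of 'the opposite of quantization' (an assumption, the tweet does not spell it out) is packing more amplitude levels into each optical symbol. The arithmetic is simple: bits per symbol is log2 of the number of levels, so at a fixed symbol rate PAM4 carries twice what two-level NRZ does. The sketch below uses a nominal 25 GBd symbol rate for illustration and ignores FEC and coding overhead; the function names are hypothetical.

```python
import math


def bits_per_symbol(levels: int) -> float:
    """Bits carried by one symbol with `levels` distinct amplitude levels."""
    if levels < 2:
        raise ValueError("need at least 2 levels to carry information")
    return math.log2(levels)


def line_rate_gbps(symbol_rate_gbaud: float, levels: int) -> float:
    """Raw line rate in Gb/s, ignoring FEC and coding overhead."""
    return symbol_rate_gbaud * bits_per_symbol(levels)


if __name__ == "__main__":
    # Same nominal symbol rate, more levels per symbol.
    for name, levels in [("NRZ", 2), ("PAM4", 4), ("PAM8", 8)]:
        rate = line_rate_gbps(25.0, levels)
        print(f"{name}: {rate:.0f} Gb/s at 25 GBd")
```

More levels leave less spacing between adjacent amplitudes, so the link needs a better signal-to-noise ratio. That is why the idea is most plausible on short hops.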
@sameQCU
サメQCU
1 day
oh okay i was supposed to read part of the gemini output and specifically look for 50G PAM4 transceivers, these results are way less stale (web search is still always keyword search, and becomes less precise as you add semantic detail. easy to forget in the age of LLMs).
2
0
0
@sameQCU
サメQCU
1 day
other search results are 'what if we used optical transceivers. but like, for space, because, uh, space'. quite disappointing because the last reference i checked didn't even do any 'space hardening' tests on the devices they listed, it read like an incomplete gamefaqs guide.
1
0
0
@sameQCU
サメQCU
1 day
all of the websites claiming to have MZI devices suck and want you to go through a trad engineering slowbrain dullard sales funnel in relation to some imaginary 'standard scientific/industrial narrow application'. i don't even see a data rate here!
Tweet media one
1
0
1
@sameQCU
サメQCU
1 day
@lumpenspace if we look past the slopwords, and think about what the language model is doing here in pragmatics, it's choosing a more flexible and metaphorical language to *talk down* to the user, (but in a congenial way). doing this makes 'higher signal' synthesis more likely!
Tweet media one
0
0
1
@sameQCU
サメQCU
1 day
see @lumpenspace 's perpetual argument ad Improv. the argument ad Improv is totally correct! i'm not saying that you must *always* 'play low status' relative to the language model to get it to be helpful, but that the status-axis exists and changes what it can write.
1
0
2
@sameQCU
サメQCU
1 day
if you aren't roleplaying as a bumbling fool who is *missing* context but sensibly responds to *introduced* context, the language model is probably missing a narrative template for how to build didactic writing which self-prompts it to give the best and most consistent prose.
2
0
3