サメQCU

@sameQCU

Followers 1K · Following 70K · Media 8K · Statuses 35K

back to the regularly scheduled cryptic posts. DMs open.

1 regional flight from you
Joined September 2020
@sameQCU
サメQCU
13 days
Okay, spent a lot of last evening talking with voooooogel, which was nice for several reasons besides suggestions for compute. There is around 1 hour left in the day to get coffee/breakfast with me while I'm in town.
1
0
15
@sameQCU
サメQCU
3 hours
this means going from contortedly simple, low-spatial-resolution test cases to more complex conditioning chains, judgement systems, and networks, in lots of embarrassingly simple steps, of course.
0
0
0
@sameQCU
サメQCU
3 hours
ideally you want every single stage in this to be vibecodeable (by the standards of the nauseating accelerando of gemini 2.5 'vibe coding' lmao) so you can do 4 years' worth of ICLR papers and responses to papers in like, idk, 1-2 weeks.
1
0
0
@sameQCU
サメQCU
3 hours
starting by making a model draw roundels and contour lines on progressively more complicated local geometries on fantasy vehicles is a somewhat sensible first stage in a model design sequence that ends in (text, R^2, SO(3)) multi-reference generative modeling of whole textures.
1
0
1
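To make that first stage concrete, here is a minimal sketch of a synthetic roundel generator; everything in it (names, canvas size, quantization) is my assumption, not something from the thread. It emits concentric circles as quantized M/L/Z polyline commands, the kind of 'vector graphics instructions' the following tweets talk about turning into training objectives.

```python
# hypothetical sketch: roundels (concentric circles) as quantized
# M/L/Z polyline commands on a 32x32 canvas -- synthetic data for a
# "draw roundels" first stage. names and parameters are made up.
import math
import random

def random_roundel(num_rings=3, canvas=32, segs=12):
    """Return commands like ("M", x, y), ("L", x, y), ("Z",) tracing
    num_rings concentric circles, coordinates quantized to canvas cells."""
    # keep the largest ring (radius 3 + 2*(num_rings-1) = 7) on-canvas
    cx = random.randint(12, canvas - 13)
    cy = random.randint(12, canvas - 13)
    commands = []
    for k in range(num_rings):
        r = 3 + 2 * k
        pts = [(round(cx + r * math.cos(2 * math.pi * t / segs)),
                round(cy + r * math.sin(2 * math.pi * t / segs)))
               for t in range(segs)]
        commands.append(("M", *pts[0]))                    # start the ring
        commands.extend(("L", x, y) for x, y in pts[1:])   # walk around it
        commands.append(("Z",))                            # close the path
    return commands

print(random_roundel())  # e.g. [('M', 22, 15), ('L', 22, 17), ..., ('Z',)]
```

progressive complication is then just curriculum knobs: more rings, contour lines, warped local geometry, and eventually the SO(3)-posed references the sequence ends in.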
@sameQCU
サメQCU
3 hours
the crux of it is that i think there's some really blunt and obvious way to turn what are no more complicated than vector-graphics instructions into path-continuation autoregression objectives, which lets you test the feasibility of, like, sensory-driven self-supervised learning.
1
0
1
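Here is one blunt way that could look, as a sketch under my own assumptions (PathContinuer, the vocabulary, and all sizes are invented for illustration): quantize path commands into tokens, and then 'continue the path' is literally shift-by-one next-token prediction.

```python
# sketch: vector-graphics commands -> path-continuation autoregression.
# all names/sizes are hypothetical; the objective is plain next-token
# prediction over a tiny path-command vocabulary.
import torch
import torch.nn as nn
import torch.nn.functional as F

# path ops plus 32 quantized coordinate bins (matching a 32x32 canvas)
VOCAB = ["<pad>", "<bos>", "<eos>", "M", "L", "C", "Z"] + \
        [f"coord_{i}" for i in range(32)]
stoi = {t: i for i, t in enumerate(VOCAB)}

def tokenize_path(commands):
    """commands: tuples like ("M", 3, 4), ("L", 7, 9), ("Z",)."""
    toks = [stoi["<bos>"]]
    for cmd, *coords in commands:
        toks.append(stoi[cmd])
        toks.extend(stoi[f"coord_{c}"] for c in coords)
    toks.append(stoi["<eos>"])
    return torch.tensor(toks)

class PathContinuer(nn.Module):
    """tiny causal transformer: given a path prefix, predict the next token."""
    def __init__(self, vocab=len(VOCAB), d=64, n_layers=2):
        super().__init__()
        self.emb = nn.Embedding(vocab, d)
        layer = nn.TransformerEncoderLayer(d, nhead=4, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d, vocab)

    def forward(self, toks):
        T = toks.shape[1]
        mask = torch.full((T, T), float("-inf")).triu(1)  # causal mask
        return self.head(self.blocks(self.emb(toks), mask=mask))

toks = tokenize_path([("M", 3, 4), ("L", 7, 9), ("L", 7, 20), ("Z",)])[None]
model = PathContinuer()
logits = model(toks[:, :-1])  # predict positions 1..T from 0..T-1
loss = F.cross_entropy(logits.reshape(-1, len(VOCAB)), toks[:, 1:].reshape(-1))
```

nothing fancy: the 'sensory-driven' part is just that the token stream is a drawing, so held-out continuation quality is directly inspectable by rendering it.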
@sameQCU
サメQCU
3 hours
okay i still don't know what a holonomy is but i might want to do a holonomy generative model anyways
[image]
1
0
2
@sameQCU
サメQCU
4 hours
but it sure does make winograds meaningless. like, once you discover the winograd <-> sphereogradient divergence shows up, e.g. as you maximize a simple, easy-to-record surrogate score while never adding richer or more concrete 'examples' to your 'general intelligence corpora'.
0
0
0
@sameQCU
サメQCU
4 hours
this isn't 'ai pessimistic' either, as by definition we have two different models, both of which we expect to produce useful feature detectors, predictors, and classifiers when cut from the same cloth (opt'd from the same pytorch).
1
0
0
@sameQCU
サメQCU
4 hours
similarly, you can compare weak 'model B' tools against strong 'model A' tools by any scaling property you like, and vice versa. this isn't a loss function, you know? you're measuring divergence over a feature embedding, to see if certain 'generalizations' are ever possible!
[image]
1
0
0
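One concrete instrument for 'divergence over a feature embedding' (my pick for illustration, not necessarily the thread's) is linear CKA (Kornblith et al., 2019): a similarity read-out between two models' activations on the same probe inputs. Notably it is a measurement, not a loss anyone descends on.

```python
# sketch: linear CKA between model A and model B features on the same
# probe set. a read-out over feature embeddings, not a training loss.
import torch

def linear_cka(X, Y):
    """X: (n, d_a), Y: (n, d_b) activations for the same n inputs.
    Returns similarity in [0, 1]; low values = divergent embeddings."""
    X = X - X.mean(dim=0, keepdim=True)  # center each feature dimension
    Y = Y - Y.mean(dim=0, keepdim=True)
    num = (Y.T @ X).norm() ** 2          # ||Y^T X||_F^2
    den = (X.T @ X).norm() * (Y.T @ Y).norm()
    return (num / den).item()

# stand-in activations: text-only model A vs physics-grounded model B
feats_A = torch.randn(512, 768)
feats_B = torch.randn(512, 128)
print("CKA(A, B) =", linear_cka(feats_A, feats_B))
```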
@sameQCU
サメQCU
4 hours
what's delightful and funny about what i've just written is that there's simply no stopping condition or output action implied by this measure! if you find that one dataset for 'model A' produces huge divergences in predictions of molecule properties vs 'model B', dataset scored!
1
0
1
@sameQCU
サメQCU
4 hours
you can then, instead of measuring your 'agi model' in terms of 'loss on own dataset', measure the kl divergence between the outputs of 'model A' on molecule property prediction tasks and the outputs of 'model B', which has been given unabstracted direct access to physics.
1
0
1
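A minimal sketch of that measurement, assuming (my assumption, not stated in the thread) that both models emit logits over the same discretized property bins per molecule:

```python
# sketch: mean KL(model A || model B) over molecule property predictions.
# assumes both models output logits over the same n_bins discretization.
import torch
import torch.nn.functional as F

def mean_kl_a_to_b(logits_a, logits_b):
    """logits_*: (n_molecules, n_bins). Returns mean KL(A || B)."""
    log_p_a = F.log_softmax(logits_a, dim=-1)
    log_p_b = F.log_softmax(logits_b, dim=-1)
    # F.kl_div(input, target) computes KL(target || input), so pass B as
    # input and A as the (log-space) target to get KL(A || B).
    return F.kl_div(log_p_b, log_p_a, log_target=True, reduction="batchmean")

# stand-in outputs: A = text-trained model, B = physics-grounded model
logits_a = torch.randn(1000, 50)
logits_b = torch.randn(1000, 50)
print("KL(A || B):", mean_kl_a_to_b(logits_a, logits_b).item())
```

and, as noted above, no stopping condition is implied: a large value just scores the dataset as never having taught 'model A' what 'model B' reads off the physics directly.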
@sameQCU
サメQCU
4 hours
'model B' is a spherical convolution network trained on the literal physical properties of molecules.
[image]
1
0
1
@sameQCU
サメQCU
4 hours
like lemme give you a little secret: if you wanted to measure how well 'intelligence' generalizes, you would train two models, model A and model B. 'model A' gets 'general intelligence data', like textbooks about molecules, and gets *no sensory data* outside of graphemes.
1
0
0
@sameQCU
サメQCU
4 hours
well, like, somewhat obviously, we're discovering not 'tokenization errors', 'architecture issues', 'unfair benchmarks', or 'x-modality'. instead we're discovering we bought into a *type of guy's* personality, one who wants irrelevant metrics to *mean too much* to *too many other people*.
1
0
2
@sameQCU
サメQCU
4 hours
if the best and most expensive computer programs in the world can't 'learn in context' to do what we would think of as relatively simple compositional tasks ('what volumes should make up a house out of blocks? okay, now how should we arrange blocks along 3 spatial dims to form those?')
[image]
2
0
1
@sameQCU
サメQCU
4 hours
it should be clear to the rest of us, for now, that the crushing of the so-called 'wordcel disciplines' by extremely fast and high-powered wikipedia-engines was not only overrated, but that confusion about out-of-domain llm-intelligence was our fault.
@sameQCU
サメQCU
1 day
guys i have bad news: gemini 2.5 doesn't have a world model of tensor shapes. it literally has to make print statements like a mundane human (it has simply never been exposed to enough pytorch debugging to have a plausible origin of a 'world model' here).
1
0
0
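For the record, the 'mundane human' loop being described is just this (a generic pytorch shape-debugging example of mine, not Gemini's actual output):

```python
# no model of shape propagation? then you print and look.
import torch

x = torch.randn(8, 3, 64, 64)            # (batch, channels, H, W)
conv = torch.nn.Conv2d(3, 16, kernel_size=3, padding=1)
h = conv(x)
print(h.shape)                            # torch.Size([8, 16, 64, 64])
h = h.flatten(1)                          # collapse everything after batch
print(h.shape)                            # torch.Size([8, 65536])
# 16 * 64 * 64 = 65536, which is what the next Linear layer must expect:
head = torch.nn.Linear(16 * 64 * 64, 10)
print(head(h).shape)                      # torch.Size([8, 10])
```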
@sameQCU
サメQCU
4 hours
someday we might get predictive processing residual stream plan-embeddings, which could allow those artifacts we call 'language models' not only an interface to control social play team deathmatch videogames, but also to *experience* them.
@sameQCU
サメQCU
5 hours
so what do we get when the Weak Man At The End Of History gets such delightful and wonderful tools as the adam adaptive optimizer? well, they don't use it to play quake or halo or minecraft with the computer! those are games you have to play in a group and *hang out*.
1
0
0
@sameQCU
サメQCU
4 hours
anyways it's a good thing that the LLM?=AGI question resolved to 'ain't no fuckin way lmao what are you kidding lol lmao? lol?' way before ~2030, as we finally plugged LLMs into minecraft and found that, of course, they need vision, world experience history, etc. to do anything.
1
0
1
@sameQCU
サメQCU
4 hours
but if you're reading this thread, you know i know you know that we're not talking about the motion solvers and generative models for pose/gesture/telerobotics when an allusion to the 'prototypical 2010s ai guy' slips out, right?
1
0
0
@sameQCU
サメQCU
4 hours
the energy-based models clique, EDM, resnet inventors, etc. are ofc still out there contradicting exactly what i just wrote (they are doing new things in open domains and slamming cool publications, and in so doing creating new history, violating even the 'end of history' joke!).
1
0
0
@sameQCU
サメQCU
4 hours
they're also pretty good at maximizing benchmarks if they know the benchmarks ahead of time (the 2010s ai guy is an ai guy after all) but as far as i can tell they cannot do anything new in any open domain. we can't pretend that they're all d kingma, you know?
1
0
0