David A Roberts

@david_ar

Followers: 403 · Following: 9K · Media: 52 · Statuses: 583

Software developer

Australia
Joined January 2008
@david_ar
David A Roberts
3 months
RT @Michael_Moroz_: finally got the Slang compiler integrated, which means it's finally time to ditch WGSL 🎉🎉. I've p….
0
10
0
@david_ar
David A Roberts
3 months
RT @Michael_Moroz_: Made a mixed radix FFT implementation in It is neat, but I still wonder….
compute.toys
Forked from https://compute.toys/view/1914 Faster version
0
4
0
@david_ar
David A Roberts
4 months
React Native viewer for iOS/Android.
@wcandillon
William Candillon
4 months
running on top of WebGPU @munrocketx @david_ar
0
1
9
@david_ar
David A Roberts
1 year
Open the pod bay doors, copilot!
Tweet media one
0
0
3
@david_ar
David A Roberts
2 years
RT @jon_barron: Optical illusions with diffusion models. There are so many good gifs on this page but honestly I would like several million….
0
75
0
@david_ar
David A Roberts
2 years
RT @XorDev: Ported my Origami shader to WGSL:.
Tweet media one
0
18
0
@david_ar
David A Roberts
2 years
RT @Michael_Moroz_: Made a really weird fractal flame based on a mandelbulb, looks like a moving stardust cloud. (oh the codec aint gonna l….
0
23
0
@david_ar
David A Roberts
2 years
This is much more fun than Browse with Bing
Tweet media one
0
0
1
@david_ar
David A Roberts
2 years
Though perhaps it's informative to see exactly how the illusion of identity breaks when the loop doesn't close quite as cleanly as we're used to?
0
0
0
@david_ar
David A Roberts
2 years
The disconcerting thing about LLMs is that the strange loop doesn't quite close in the same way as ours do, which makes identifying the world model with one of the characters *inside* of it rather more fraught.
1
0
0
@david_ar
David A Roberts
2 years
Of course if I think they know more than me, I'll probably get it wrong because I don't know what they know. But that's still a better guess than pretending everyone has the same state of knowledge as my own world model.
1
0
0
@david_ar
David A Roberts
2 years
LLMs default to confabulating about things they don't know for the same reason that, when I'm trying to predict what someone else will say, I don't guess they'll say "I don't know" just because *I* don't know. I'll predict something they could plausibly say.
1
0
1
@david_ar
David A Roberts
2 years
Likewise when people ask whether LLMs *have* a world model, they're fundamentally getting it inside out. An LLM *is* a world model. RLHF/SFT tries to collapse it down to a single character, but it's still a model of the world focused on that character, rather than vice versa.
@repligate
j⧉nus
2 years
@anthrupad Yeah, the exclusive focus on a superposition of single agents/characters is another vestige of anthropomorphism
Tweet media one
Tweet media two
1
0
7
@david_ar
David A Roberts
2 years
When you memorise the "seeing things like a human" test a little too much
Tweet media one
0
0
7
@david_ar
David A Roberts
2 years
This seems promising: porting code between languages is one of GPT-4's stronger skills (it's a kind of translation task, after all), and they're able to formally verify its correctness and keep iterating until it gets it right (sketch of the loop below).
@galois
Galois
2 years
In our latest blog, Galois Research Engineer Adam Karvonen writes about his experiment applying GPT-4 to the task of refolding macros into the Rust program.
Tweet media one
0
0
1
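The port-and-verify loop described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Galois's actual pipeline: the OpenAI chat-completions call is a real API, but run_formal_verifier and the verify-equivalence command are hypothetical placeholders for whatever equivalence checker the real workflow uses.

```python
# Hedged sketch: translate with GPT-4, formally verify, feed failures back.
import subprocess
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def run_formal_verifier(path: str) -> tuple[bool, str]:
    """Hypothetical wrapper around an external equivalence checker."""
    proc = subprocess.run(["verify-equivalence", path],  # placeholder command, not a real tool
                          capture_output=True, text=True)
    return proc.returncode == 0, proc.stdout + proc.stderr


def port_until_verified(source_code: str, out_path: str, max_tries: int = 5) -> bool:
    """Ask GPT-4 for a port, verify it, and retry with the verifier's feedback."""
    prompt = f"Port this code to Rust, preserving behaviour exactly:\n\n{source_code}"
    for _ in range(max_tries):
        reply = client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": prompt}],
        )
        candidate = reply.choices[0].message.content
        with open(out_path, "w") as f:
            f.write(candidate)
        verified, report = run_formal_verifier(out_path)
        if verified:
            return True
        # The verifier's output, not a human, supplies the feedback for the next attempt.
        prompt = (f"The verifier rejected this port:\n{report}\n\n"
                  f"Fix the following code so it passes:\n\n{candidate}")
    return False
```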
@david_ar
David A Roberts
2 years
RT @aiekick: My shader editor #NoodlesPlate is now open source under GPL3 license => #glsl #creative #coding #pr….
0
10
0
@david_ar
David A Roberts
2 years
L2H4 displays induction-like behaviour much more strongly on more natural repeated sequences, such as phrases and multi-token names. However, this only occurs sporadically and doesn't always follow the induction pattern exactly (see the sketch below).
1
0
0
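A minimal sketch of the kind of check behind this observation, assuming a TransformerLens-style HookedTransformer. The 4-layer model analysed in this thread isn't named, so gpt2 below is only a stand-in for the mechanics; the example sentence and the 0.3 display threshold are arbitrary choices, and (2, 4) is the L2H4 from the tweet.

```python
# Sketch: does a given head attend from a repeated multi-token name back to the
# token after its earlier occurrence (the induction pattern), on natural text?
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2")  # stand-in checkpoint

text = "Mr Dursley and Mrs Dursley were proud to say that Mr Dursley was normal"
tokens = model.to_tokens(text)
str_tokens = model.to_str_tokens(text)

_, cache = model.run_with_cache(tokens)
layer, head = 2, 4                            # L2H4 from the tweet
pattern = cache["pattern", layer][0, head]    # [dest_pos, src_pos]

# For each destination token, report where this head attends most strongly.
for dest in range(1, len(str_tokens)):        # skip the BOS token
    src = pattern[dest].argmax().item()
    if pattern[dest, src].item() > 0.3:       # arbitrary threshold for display
        print(f"{str_tokens[dest]!r:>12} -> {str_tokens[src]!r} "
              f"({pattern[dest, src].item():.2f})")
```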
@david_ar
David A Roberts
2 years
The lack of induction heads is obvious from the fact that the model is unable to predict repeated sequences of random tokens. No heads cleanly display the characteristic induction attention pattern. However, L2H4 very weakly attends to the token following the previous duplicate (see the test sketched below).
Tweet media one
1
0
1
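The repeated-random-token test referenced here is a standard one; below is a minimal sketch, again assuming a TransformerLens HookedTransformer with gpt2 standing in for the unnamed 4-layer model. Without induction heads, the loss on the second (repeated) half of the sequence doesn't collapse, and no head scores highly on the induction-pattern stripe.

```python
# Sketch: repeated random tokens are trivially predictable on the second pass
# if induction heads exist; also score each head on the induction pattern.
import torch
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2")  # stand-in checkpoint

seq_len, batch = 50, 4
rand = torch.randint(1000, 10000, (batch, seq_len))            # random tokens
tokens = torch.cat([rand, rand], dim=1).to(model.cfg.device)   # repeat once

loss = model(tokens, return_type="loss", loss_per_token=True)  # [batch, pos-1]
print("loss, first half :", loss[:, : seq_len - 1].mean().item())
print("loss, second half:", loss[:, seq_len:].mean().item())
# With induction heads the second-half loss drops sharply; without them it doesn't.

# Induction score per head: attention from position i back to i - (seq_len - 1),
# i.e. the token *after* the previous occurrence of the current token.
_, cache = model.run_with_cache(tokens)
for layer in range(model.cfg.n_layers):
    pattern = cache["pattern", layer]                    # [batch, head, dest, src]
    stripe = pattern.diagonal(offset=-(seq_len - 1), dim1=-2, dim2=-1)
    print(f"L{layer}:", stripe.mean(dim=(0, -1)).round(decimals=2).tolist())
```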
@david_ar
David A Roberts
2 years
The early heads which identify the correct name are a little less clear, but appear to involve L1H1 and L1H2. Note that these aren't induction heads, and aren't paired with previous token heads. They attend directly from S2 back to the IO (see the attention sketch below). L0 has a few duplicate token heads.
Tweet media one
Tweet media two
1
0
0
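A sketch of the attention-pattern inspection this describes, under the same assumptions (TransformerLens, gpt2 as a stand-in): cache a forward pass on an IOI-style prompt and look at where candidate heads attend from the S2 position, i.e. the second occurrence of the subject's name. The prompt and head indices here are illustrative.

```python
# Sketch: where do given heads attend from the S2 position of an IOI prompt?
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2")  # stand-in checkpoint

prompt = "When John and Mary went to the store, John gave a drink to"
tokens = model.to_tokens(prompt)
str_tokens = model.to_str_tokens(prompt)
s2 = len(str_tokens) - 1 - str_tokens[::-1].index(" John")  # second " John"

_, cache = model.run_with_cache(tokens)
# The L1 heads named in the tweet; add L0 heads to look for duplicate-token behaviour.
for layer, head in [(1, 1), (1, 2)]:
    attn = cache["pattern", layer][0, head, s2]    # attention out of S2
    src = attn.argmax().item()
    print(f"L{layer}H{head}: S2 attends most to {str_tokens[src]!r} "
          f"with weight {attn[src].item():.2f}")
```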
@david_ar
David A Roberts
2 years
The primary name mover head is L3H4, with a weak secondary contribution made by L3H10. There are no significant negative name movers, with L2H3 only weakly reducing the logit difference (the metric is sketched below). There's a single strong S-Inhibition head at L2H13.
Tweet media one
Tweet media two
1
0
0
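For reference, the logit-difference metric these head labels are based on can be sketched like this, assuming a TransformerLens HookedTransformer with gpt2 as a stand-in: the score is the logit of the indirect object's name minus the logit of the repeated subject's name at the final position. Name mover heads push it up; negative name movers push it down.

```python
# Sketch: the IOI logit-difference metric, logit(" Mary") - logit(" John")
# at the position where the model predicts the recipient of the drink.
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2")  # stand-in checkpoint

prompt = "When John and Mary went to the store, John gave a drink to"
io_name, s_name = " Mary", " John"   # indirect object vs. repeated subject

logits = model(model.to_tokens(prompt))   # [batch, pos, d_vocab]
final = logits[0, -1]
logit_diff = final[model.to_single_token(io_name)] - final[model.to_single_token(s_name)]
print("logit(IO) - logit(S):", logit_diff.item())
# Per-head contributions can then be estimated by decomposing the residual
# stream into per-head outputs (e.g. ActivationCache.stack_head_results) and
# projecting each onto the IO-minus-S unembedding direction.
```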