David A Roberts

@david_ar

Followers: 403 · Following: 9K · Media: 52 · Statuses: 583

Software developer

Australia
Joined January 2008
@david_ar
David A Roberts
3 months
RT @Michael_Moroz_: finally got the Slang compiler integrated, which means it's finally time to ditch WGSL 🎉🎉. I've p….
0
10
0
@david_ar
David A Roberts
3 months
RT @Michael_Moroz_: Made a mixed radix FFT implementation in It is neat, but I still wonder….
compute.toys
Forked from https://compute.toys/view/1914 Faster version
0
4
0
@david_ar
David A Roberts
4 months
React Native viewer for iOS/Android.
@wcandillon
William Candillon
4 months
running on top of WebGPU @munrocketx @david_ar
0
1
9
@david_ar
David A Roberts
1 year
Open the pod bay doors, copilot!
Tweet media one
0
0
3
@david_ar
David A Roberts
2 years
RT @jon_barron: Optical illusions with diffusion models. There are so many good gifs on this page but honestly I would like several million….
0
75
0
@david_ar
David A Roberts
2 years
RT @XorDev: Ported my Origami shader to WGSL:.
Tweet media one
0
18
0
@david_ar
David A Roberts
2 years
RT @Michael_Moroz_: Made a really weird fractal flame based on a mandelbulb, looks like a moving stardust cloud. (oh the codec aint gonna l….
0
23
0
@david_ar
David A Roberts
2 years
This is much more fun than Browse with Bing
Tweet media one
0
0
1
@david_ar
David A Roberts
2 years
Though perhaps it's informative to see exactly how the illusion of identity breaks when the loop doesn't close quite as cleanly as we're used to?
0
0
0
@david_ar
David A Roberts
2 years
The disconcerting thing about LLMs is that the strange loop doesn't quite close in the same way as ours do, which makes identifying the world model with one of the characters *inside* of it rather more fraught.
1
0
0
@david_ar
David A Roberts
2 years
Of course if I think they know more than me, I'll probably get it wrong because I don't know what they know. But that's still a better guess than pretending everyone has the same state of knowledge as my own world model.
1
0
0
@david_ar
David A Roberts
2 years
LLMs default to confabulating about things they don't know for the same reason that, when I'm trying to predict what someone else will say, I don't guess they'll say "I don't know" just because *I* don't know. I'll predict something they could plausibly say.
1
0
1
@david_ar
David A Roberts
2 years
Likewise when people ask whether LLMs *have* a world model, they're fundamentally getting it inside out. An LLM *is* a world model. RLHF/SFT tries to collapse it down to a single character, but it's still a model of the world focused on that character, rather than vice versa.
@repligate
j⧉nus
2 years
@anthrupad Yeah, the exclusive focus on a superposition of single agents/characters is another vestige of anthropomorphism
Tweet media one
Tweet media two
1
0
7
@david_ar
David A Roberts
2 years
When you memorise the "seeing things like a human" test a little too much
Tweet media one
0
0
7
@david_ar
David A Roberts
2 years
This seems promising: porting code between languages is one of GPT-4's stronger skills (it's a kind of translation task, after all), and they're able to formally verify its correctness and keep iterating until it gets it right (sketch of the loop below).
@galois
Galois
2 years
In our latest blog, Galois Research Engineer Adam Karvonen writes about his experiment applying GPT-4 to the task of refolding macros into the Rust program.
Tweet media one
0
0
1
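The port-and-verify loop described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Galois's actual pipeline: the OpenAI chat-completions call is a real API, but run_formal_verifier and the verify-equivalence command are hypothetical placeholders for whatever equivalence checker the real workflow uses.

```python
# Hedged sketch: translate with GPT-4, formally verify, feed failures back.
import subprocess
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def run_formal_verifier(path: str) -> tuple[bool, str]:
    """Hypothetical wrapper around an external equivalence checker."""
    proc = subprocess.run(["verify-equivalence", path],  # placeholder command, not a real tool
                          capture_output=True, text=True)
    return proc.returncode == 0, proc.stdout + proc.stderr


def port_until_verified(source_code: str, out_path: str, max_tries: int = 5) -> bool:
    """Ask GPT-4 for a port, verify it, and retry with the verifier's feedback."""
    prompt = f"Port this code to Rust, preserving behaviour exactly:\n\n{source_code}"
    for _ in range(max_tries):
        reply = client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": prompt}],
        )
        candidate = reply.choices[0].message.content
        with open(out_path, "w") as f:
            f.write(candidate)
        verified, report = run_formal_verifier(out_path)
        if verified:
            return True
        # The verifier's output, not a human, supplies the feedback for the next attempt.
        prompt = (f"The verifier rejected this port:\n{report}\n\n"
                  f"Fix the following code so it passes:\n\n{candidate}")
    return False
```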
@david_ar
David A Roberts
2 years
RT @aiekick: My shader editor #NoodlesPlate is now open source under GPL3 license => #glsl #creative #coding #pr….
0
10
0
@david_ar
David A Roberts
2 years
L2H4 displays induction-like behaviour much more strongly on more natural repeated sequences, such as phrases and multi-token names. However, this only occurs sporadically and doesn't always follow the induction pattern exactly (see the sketch below).
1
0
0
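A minimal sketch of the kind of check behind this observation, assuming a TransformerLens-style HookedTransformer. The 4-layer model analysed in this thread isn't named, so gpt2 below is only a stand-in for the mechanics; the example sentence and the 0.3 display threshold are arbitrary choices, and (2, 4) is the L2H4 from the tweet.

```python
# Sketch: does a given head attend from a repeated multi-token name back to the
# token after its earlier occurrence (the induction pattern), on natural text?
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2")  # stand-in checkpoint

text = "Mr Dursley and Mrs Dursley were proud to say that Mr Dursley was normal"
tokens = model.to_tokens(text)
str_tokens = model.to_str_tokens(text)

_, cache = model.run_with_cache(tokens)
layer, head = 2, 4                            # L2H4 from the tweet
pattern = cache["pattern", layer][0, head]    # [dest_pos, src_pos]

# For each destination token, report where this head attends most strongly.
for dest in range(1, len(str_tokens)):        # skip the BOS token
    src = pattern[dest].argmax().item()
    if pattern[dest, src].item() > 0.3:       # arbitrary threshold for display
        print(f"{str_tokens[dest]!r:>12} -> {str_tokens[src]!r} "
              f"({pattern[dest, src].item():.2f})")
```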
@david_ar
David A Roberts
2 years
The lack of induction heads is obvious from the fact that the model is unable to predict repeated sequences of random tokens. No heads cleanly display the characteristic induction attention pattern. However, L2H4 very weakly attends to the token following the previous duplicate (see the test sketched below).
Tweet media one
1
0
1
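The repeated-random-token test referenced here is a standard one; below is a minimal sketch, again assuming a TransformerLens HookedTransformer with gpt2 standing in for the unnamed 4-layer model. Without induction heads, the loss on the second (repeated) half of the sequence doesn't collapse, and no head scores highly on the induction-pattern stripe.

```python
# Sketch: repeated random tokens are trivially predictable on the second pass
# if induction heads exist; also score each head on the induction pattern.
import torch
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2")  # stand-in checkpoint

seq_len, batch = 50, 4
rand = torch.randint(1000, 10000, (batch, seq_len))            # random tokens
tokens = torch.cat([rand, rand], dim=1).to(model.cfg.device)   # repeat once

loss = model(tokens, return_type="loss", loss_per_token=True)  # [batch, pos-1]
print("loss, first half :", loss[:, : seq_len - 1].mean().item())
print("loss, second half:", loss[:, seq_len:].mean().item())
# With induction heads the second-half loss drops sharply; without them it doesn't.

# Induction score per head: attention from position i back to i - (seq_len - 1),
# i.e. the token *after* the previous occurrence of the current token.
_, cache = model.run_with_cache(tokens)
for layer in range(model.cfg.n_layers):
    pattern = cache["pattern", layer]                    # [batch, head, dest, src]
    stripe = pattern.diagonal(offset=-(seq_len - 1), dim1=-2, dim2=-1)
    print(f"L{layer}:", stripe.mean(dim=(0, -1)).round(decimals=2).tolist())
```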
@david_ar
David A Roberts
2 years
The early heads which identify the correct name are a little less clear, but appear to involve L1H1 and L1H2. Note that these aren't induction heads, and aren't paired with previous token heads. They attend directly from S2 back to the IO (see the attention sketch below). L0 has a few duplicate token heads.
Tweet media one
Tweet media two
1
0
0
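A sketch of the attention-pattern inspection this describes, under the same assumptions (TransformerLens, gpt2 as a stand-in): cache a forward pass on an IOI-style prompt and look at where candidate heads attend from the S2 position, i.e. the second occurrence of the subject's name. The prompt and head indices here are illustrative.

```python
# Sketch: where do given heads attend from the S2 position of an IOI prompt?
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2")  # stand-in checkpoint

prompt = "When John and Mary went to the store, John gave a drink to"
tokens = model.to_tokens(prompt)
str_tokens = model.to_str_tokens(prompt)
s2 = len(str_tokens) - 1 - str_tokens[::-1].index(" John")  # second " John"

_, cache = model.run_with_cache(tokens)
# The L1 heads named in the tweet; add L0 heads to look for duplicate-token behaviour.
for layer, head in [(1, 1), (1, 2)]:
    attn = cache["pattern", layer][0, head, s2]    # attention out of S2
    src = attn.argmax().item()
    print(f"L{layer}H{head}: S2 attends most to {str_tokens[src]!r} "
          f"with weight {attn[src].item():.2f}")
```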
@david_ar
David A Roberts
2 years
The primary name mover head is L3H4, with a weak secondary contribution made by L3H10. There are no significant negative name movers, with L2H3 only weakly reducing the logit difference (the metric is sketched below). There's a single strong S-Inhibition head at L2H13.
Tweet media one
Tweet media two
1
0
0
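For reference, the logit-difference metric these head labels are based on can be sketched like this, assuming a TransformerLens HookedTransformer with gpt2 as a stand-in: the score is the logit of the indirect object's name minus the logit of the repeated subject's name at the final position. Name mover heads push it up; negative name movers push it down.

```python
# Sketch: the IOI logit-difference metric, logit(" Mary") - logit(" John")
# at the position where the model predicts the recipient of the drink.
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2")  # stand-in checkpoint

prompt = "When John and Mary went to the store, John gave a drink to"
io_name, s_name = " Mary", " John"   # indirect object vs. repeated subject

logits = model(model.to_tokens(prompt))   # [batch, pos, d_vocab]
final = logits[0, -1]
logit_diff = final[model.to_single_token(io_name)] - final[model.to_single_token(s_name)]
print("logit(IO) - logit(S):", logit_diff.item())
# Per-head contributions can then be estimated by decomposing the residual
# stream into per-head outputs (e.g. ActivationCache.stack_head_results) and
# projecting each onto the IO-minus-S unembedding direction.
```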