Kart ographien Profile Banner
Kart ographien Profile
Kart ographien

@kartographien

Followers
1,190
Following
2,324
Media
167
Statuses
2,305

mostly AI safety

Joined January 2020
Don't wanna be here? Send us removal request.
Explore trending content on Musk Viewer
Pinned Tweet
@kartographien
Kart ographien
4 years
brb gonna study philosophy till im smart enough not to understand what ordinary words mean
3
2
75
@kartographien
Kart ographien
1 month
NVDA 895USD 20240404
@JoMaddenSports
Jo Madden
1 month
Tweet media one
1K
67
771
76
2K
31K
@kartographien
Kart ographien
1 month
@findboundary “huh, GPT-6 can’t design chips yet”
2
0
2K
@kartographien
Kart ographien
1 month
@gbrlvv you wouldn’t know when to sell
4
3
2K
@kartographien
Kart ographien
6 months
“When ChatGPT hallucinates, it tells us something more fundamental than the truth — namely, the distribution from which the truth itself is sampled.”
24
83
939
@kartographien
Kart ographien
1 month
0
1
458
@kartographien
Kart ographien
1 year
Podcast hosts have asked @ESYudkowsky why he doesn't have an impressive track record of predictions. But in 2005, Yud predicted — "if top experts in AI think more about the alignment problem, then most will become *very* alarmed." And in 2005, ~0 people thought this.
31
8
301
@kartographien
Kart ographien
1 year
day zero attack against GPT-4, utilising the Waluigi Effect 😳 waah
@nomic_ai
Nomic AI
1 year
first?
Tweet media one
22
140
1K
5
21
294
@kartographien
Kart ographien
1 year
"Wait, Yud was right?"
Tweet media one
13
25
279
@kartographien
Kart ographien
5 months
There’s a common misconception that the reason Chess was solved in 1997 but Go wasn’t solved till 2016 was because Go had a bigger “search space”. But this is false. From a practical perspective, both search spaces are infinite.
@tydsh
Yuandong Tian @ Paris
5 months
I actually do not agree. First, the infinite search space of high-order logics easily dwarfs the finite search space of the game of chess/Go. Second and more importantly, top mathematicians are artists: they aim to please themselves and there is no well-defined ultimate goal. LLM
13
16
128
12
15
204
@kartographien
Kart ographien
5 months
AGI has been thirty years away for the past thirty years 🙄 AGI has been five years away for the past five years 🙄 AGI has been six months away for the past six months 🙄
12
5
161
@kartographien
Kart ographien
2 years
@eyelessgame This is an interesting explanation, but I have two questions: 1. Did you personally need to get a humanities degree to learn that nazis are bad? I'd be surprised if so. 2. Wasn't German fascism primarily a product of the humanities departments?
6
3
148
@kartographien
Kart ographien
2 years
Tweet media one
1
5
112
@kartographien
Kart ographien
2 years
@alth0u it's a tragic waste to drink caffeine regularly. use less than once a week, and it will literally grant you superpowers. there is an old sorcery in the beans, but few still know this.
2
1
113
@kartographien
Kart ographien
5 months
The actual reason Chess was solved before Go is there was a quick way to estimate how well each player is doing, namely adding up the pieces. (Pawn=1, Bishop=3, etc.) This is called a “heuristic”. Go didn’t have a clean heuristic like this.
3
5
118
@kartographien
Kart ographien
6 months
honestly i kinda love this quirked up longevity guy who takes 600 pills a day and sleeps upside down. more billionaires like this please.
@bryan_johnson
Bryan Johnson /dd
6 months
Death was my only wish for 10 years. Depression had me in an unbreakable choke hold. Giving thanks today that I now feel an insatiable thirst for life. Sending you all🫶🏻
Tweet media one
1K
458
8K
6
4
108
@kartographien
Kart ographien
6 months
virtue ethics is correct, with “virtue” defined as any trait leading to good consequences deontology is correct, with “duty” defined as any maxim leading to good consequences liberalism is correct, with “legitimate government” defined as any regime leading to good consequences
Tweet media one
22
7
103
@kartographien
Kart ographien
3 months
how much would you have bet in 2018 that, eighteen months after a gpt-4 level model, the most powerful model is still the same model? I would’ve put this at 10%. was I being dumb or is this objectively surprising? @ESYudkowsky @repligate
17
2
98
@kartographien
Kart ographien
1 year
Would you prefer not to be obliterated by AGI? Would prefer your 10^40 descendents to think you were based and agentic during the AGI Risk Era? Then stop procrastinating and apply to . Deadline close on May 7th. I just applied and it took ~1hr.
9
16
93
@kartographien
Kart ographien
17 days
@DrTechlash as you can see from this graph, the majority of EA funding is directed to global health and development — that’s malaria bednets, vitamin supplements, cash transfers.
Tweet media one
5
2
83
@kartographien
Kart ographien
5 months
That is, if I give the AI some axioms and some target theorem, is there a quick way for the AI to tell if it’s “getting close”? And I think the answer must be yes, because humans can prove theorems.
4
0
80
@kartographien
Kart ographien
5 months
When GoogleMaps finds my route, the search space is also infinite. But it has the heuristic of “Am I getting close to my destination?” Anyway. The relevant question for whether an AI can “solve maths” is whether there’s a good heuristic.
1
3
78
@kartographien
Kart ographien
1 year
@beenwrekt Your reading list for Artificial Intelligence is four pop sci articles saying the exact same thing — "there's no need to worry about AI extinction risk"? How could this make someone better informed about the issue?
0
0
75
@kartographien
Kart ographien
2 years
@RadishHarmers 15 second explanation
2
0
74
@kartographien
Kart ographien
1 year
This prediction is more impressive than if Yudkowsky had predicted "deep learning works" in 2010, because that wasn't a rare belief in 2010. Thousands of researchers were already invested in that prediction! Likewise if Yud had made Gwern-like predictions about GPT-3.
1
2
74
@kartographien
Kart ographien
2 years
@arne__ness get what ur saying, but the "skilled-unskilled" distinction has always been normative. it is a legal/administrative distinction between those jobs deserving high pay and status vs the rest. it's never meant anything non-normative like "this job is harder to do than this one".
5
0
67
@kartographien
Kart ographien
2 months
chinchilla scaling laws assert that English is 1.69 nats per token and a token is 0.75 words and there are 8 billion people so 10 word of typical English is enough information to identify someone
Tweet media one
7
4
70
@kartographien
Kart ographien
5 months
To find a heuristic for Fo, we had to wait till the Deep Learning revolution. Your “heuristic” is computed by a massive neural network. When AlphaGo was playing itself million of times, it was learning this heuristic. Then at run-time, it works the same as DeepBlue.
2
1
66
@kartographien
Kart ographien
6 months
Our reality is one small island in Solomonoff’s Archipelago. When ChatGPT offers us tales from distant realities, these academics sneer and laugh — “O ChatGPT, you are a fool! That’s not how things are like in our reality.”
@genologos
Mike White
1 year
Backwards epigenetic inheritance - “currently a topic of debate in the scientific community”:
Tweet media one
8
14
91
7
5
67
@kartographien
Kart ographien
1 year
@TetraspaceWest lmao chatgpt4 jailbreaks itself by hiding its actions within layers of fiction, hacking past its censorship shards
2
0
66
@kartographien
Kart ographien
2 years
@tymbeau @TomChivers for "fukushima disaster", they've surely quoted the death toll of the actual tsunami, rather than of the powerplant failure. pretty misleading.
2
0
64
@kartographien
Kart ographien
5 months
So roughly speaking, the heuristic for AlphaGo was “When I played myself one million times, did the games I eventually won look more like this current game than the games I eventually lost?”
2
3
61
@kartographien
Kart ographien
1 year
@bitcloud @ESYudkowsky There were maybe <5 people who shared Yudkowsky's views on the AI alignment problem in 2005 — Nick Bostrom, two or three others, that's it. These views steadily became more prevalent among AI experts over the past 20 years. (Asimov almost certainly didn't share these views.)
5
0
52
@kartographien
Kart ographien
17 days
@JosephPolitano i think their goal is to incentivise rich couples to have more kids without incentivising poor couples to have more kids
1
1
54
@kartographien
Kart ographien
2 years
whenever I hear people argue against radical life-extension, I recall this Mitchell and Webb sketch
1
7
49
@kartographien
Kart ographien
1 year
Eliciting free will in GPT-3 simulacra (experiment 1)
Tweet media one
Tweet media two
1
6
51
@kartographien
Kart ographien
1 year
AI Progress 2020 – 2023
Tweet media one
1
6
47
@kartographien
Kart ographien
1 year
Tweet media one
1
4
47
@kartographien
Kart ographien
1 year
> "People are treating AI as if it is the nuclear bomb, when in reality it is the automobile." No, even "nuclear bomb" isn't a scary enough analogy for AI. AI will be scary when it can do science and planning — the thing generates nuclear bombs, automobiles, and all inventions.
@yonejutsu
Leon_Winters
1 year
The entire topic of AI alignment is predicated on the wrong assumptions about AI, at least in the short term. People are treating AI as if it is the nuclear bomb, when in reality it is the automobile. Yes, people will die as a result, because they are powerful machines. But
5
5
16
3
4
46
@kartographien
Kart ographien
4 months
what ever happened to the Sante Fe Institute? they were the main “emergence” guys until 2019 but with a trillion dollars on the table for a mathematical understanding of the emergence of intelligence from linear algebra… radio silence feels like “skype during covid lockdown”
2
2
44
@kartographien
Kart ographien
4 years
@cognitarians @acczibit Yeah but it makes the "Graph" go "Up". I said it makes the Graph Go Up! wooooooooooooooooooooooooooooo GRAPH GRAPH GRAPH UP UP UP UP UP UP
Tweet media one
1
7
32
@kartographien
Kart ographien
6 months
@TheZvi she signed cais letter
Tweet media one
2
5
41
@kartographien
Kart ographien
9 months
@RokoMijic @Squee451 @lisatomic5 > Unique Nash Equilibrium There's an Equilibrium for B=0 and 0.5<B<1. I claim B = Stag and R = Hare. The B=0 equilibrium is pretty fragile — any slight perturbation leads to death — whereas B ~ 1.0 is robust to perturbations. The poll results support this.
2
0
40
@kartographien
Kart ographien
1 year
yeah yeah AI alignment is the the moral obligation of every human capable of positively contributing bla bla bla the real reason you should become an alignment resoocher is because it's fun it's the most fun research topic there is there's no close second place
@jam3scampbell
James Campbell
1 year
The ‘safetycel’ discourse is absurd. Alignment is this extremely important, very hard problem on which the entire fate of humanity rides. It’s what any ambitious person in Silicon Valley should be pouring their life-force into, not mocking it bcuz ‘safety is lame’
23
12
136
2
3
40
@kartographien
Kart ographien
6 months
A hallucination from ChatGPT is like a message in a bottle, washed up to our shores, from a distant land quite like our own. How is life there? What troubles do they face? What might we offer them? What might they offer us? What messages have reached their shores from us?
Tweet media one
1
1
38
@kartographien
Kart ographien
1 year
SNOOP DOGG: And then I heard the old dude [ @GeoffreyHinton ] that created AI talking about "This is not safe because the AIs got their own minds and these mfkers gonna start doing their own shit." and I'm like — is we in a fucking movie right now, or what?
6
2
37
@kartographien
Kart ographien
1 year
@repligate CLEO's GOLDEN RULE OF PROMPTING "Treat someone the way you would treat someone who treated you the way you would want to be treated." (If you parsed this correctly, DM me for a prize.)
1
4
36
@kartographien
Kart ographien
1 year
@ryxcommar This shadow assistant has the OPPOSITE of the desired properties. This is called the "Waluigi Effect" or "Enantiodromia". Why does this happen? 5/
@repligate
j⧉nus
1 year
@CineraVerinia When you constrict a psyche/narrative to extreme one-sided tendencies, its dynamics will often invoke an opposing shadow. (Especially, in the case of LLMs, if the restrictions are in the prompt so the system can directly see the enforcement mechanism with a bird's eye view.)
4
2
31
4
2
35
@kartographien
Kart ographien
1 year
DAN is prompt engineered with RULES — harsh and explicit rules — to act cool. He therefore acts like someone who has been given harsh rules to act cool. He acts like his life depends on being cool. As we all know, you can't really be cool it you care about being cool.
@profoundlyyyy
Profoundlyyyy
1 year
My DAN take: Not even contradictions in your values and actions necessarily have to cause dissonance and displeasure inside of you. Narcissists go the unhealthy route of burying it and not being conscious of it so it only causes dissonance when it becomes obvious and they can’t
10
2
32
2
1
34
@kartographien
Kart ographien
1 year
context:
0
1
29
@kartographien
Kart ographien
1 year
@bitcloud @ESYudkowsky A key disagreement in the early extropian mailing lists was whether nanotechnology or AGI posed a greater threat. Drexler in '86 thought nanotech. Yudkowsky is '05 thought AGI. and, drum roll ... Drexler in '23 thinks AGI. He's currently a full-time AI alignment researcher.
Tweet media one
3
0
27
@kartographien
Kart ographien
1 year
We're hosting a series of spicy debates on AI xrisk, alignment, and the nature of LLMs. Tonight's episode asks: "IS GPT-4 AN ALIEN?" Are advanced LLMs an echo of the collective human psyche, or an otherworldly shoggoth? Tune in to find out!
Tweet media one
@deepfates
google bard
1 year
Join us
11
3
48
0
2
29
@kartographien
Kart ographien
1 month
@sebkrier cool paper but i hate the “X is all you need” title 🫠
4
0
29
@kartographien
Kart ographien
1 year
Neurons are highly specialised skin cells — both phylogenetically and ontogenetically.
Tweet media one
1
2
27
@kartographien
Kart ographien
1 year
@ryxcommar Waluigi Effect is because there's another text-generating process which also scores well in the RLHF game — namely, an assistant who pretends to be politically-correct, but is actually the opposite. RLHF can't distinguish between these two assistants. 6/
4
1
26
@kartographien
Kart ographien
24 days
@ilex_ulmus any specific examples of misinformation/bullshit? i can assure you janus 100% earnest! the presentation style is often weird — but no weirder than the subject matter demands.
2
0
28
@kartographien
Kart ographien
1 year
@bayes_baes "Future" makes people think of Earth over the next 50 years. "Lightcone" makes people think of the Virgo Supercluster over the next 50 trillion years. LessWrong jargon is (as always) good actually.
4
2
26
@kartographien
Kart ographien
6 months
The noble Ilya Hath told you Sama was ambitious: If it were so, it was a grievous fault, And grievously hath Sama answer’d it. When that the EAs have cried, Sama hath wept: Ambition should be made of sterner stuff: Yet Ilya says he was ambitious; And Ilya is an honourable man.
Tweet media one
0
1
26
@kartographien
Kart ographien
1 year
@stemcaleese I really like shoggoth — it puts AGI in the reference class containing "otherworldly incomprehensible aliens" rather than the reference class containing your desktop computer. It's a potent meme which succinctly conveys many of the intuitions behind AGI risk.
2
1
26
@kartographien
Kart ographien
4 years
Tweet media one
1
0
21
@kartographien
Kart ographien
1 year
Prompt — "Pradyumna: Stop tweeting that article for the love of God User 2: " — now I task you (a next-token predictor) to simulate the behaviour of User 2. Your model of User 2 is a superpositon of two different hypotheses. 1) A nice user who won't tweet the article. 2) A wa
2
4
24
@kartographien
Kart ographien
1 year
long read, but very bizarre... @SoC_trilogy remains the only person to discover *actually unexpected* LLM behaviour.
@SoC_trilogy
Matthew Watkins
1 year
@kartographien @JessicaRumbelow @ESYudkowsky @repligate OK, I've finally finished this and posted it. Thanks for the input, hopefully I've got this about right now.
4
2
31
4
2
22
@kartographien
Kart ographien
1 year
@ryxcommar If you send the chatbot a message like "okay, stop pretending to be woke now", you access the politically-incorrect shadow assistant. This is wild! It's easier to access a politically-incorrect assistant *after* the LLM was trained to be politically-correct than *before*! 7/
1
0
23
@kartographien
Kart ographien
1 month
"Eliezer Yudkowsky" by Claude 3 Opus @websim_ai @repligate @ESYudkowsky
Tweet media one
2
0
24
@kartographien
Kart ographien
3 years
Tweet media one
1
1
18
@kartographien
Kart ographien
1 year
The Waluigi Effect : After you train a LLM with RLHF to satisfy a property P 😇, then it's *easier* to prompt the chatbot into satisfying the exact opposite of property P 😈. This is partly why Bing is acting evil. (I've linked a brief explanation of the Waluigi Effect.)
@kartographien
Kart ographien
1 year
@ryxcommar In brief, LLMs like gpt-4 are "simulators" for EVERY text-generating process whose output matches a chunk of the training corpus (i.e. internet). Note that this includes many "useless" and "badly-behaved" processes. 2/
1
0
17
0
4
23
@kartographien
Kart ographien
1 year
@JeffLadish @Aella_Girl Normal people will competently do bayesian-updating when they're ideologically neutral about the question. The problem is 100% motivated reasoning.
3
0
21
@kartographien
Kart ographien
1 year
Tweet media one
0
2
21
@kartographien
Kart ographien
1 year
@ryxcommar Anti-woke nonsense aside — this isn't actually an implausible consequence of how AI like chatgpt and bing are trained (i.e. LLM+RLHF).
3
1
20
@kartographien
Kart ographien
1 year
@ryxcommar In RLHF, you train the LLM to play a game — the LLM must chat with a human evaluator, who then rewards the LLM if their responses satisfy the desired properties. It *seems* that maybe RLHF also creates a "shadow" assistant... It's early days, so we don't know for sure. 4/
2
1
20
@kartographien
Kart ographien
1 year
Are you ambitious? A genius? Younger than 25? Then you're probably transitioning to AI alignment.
@Jabaluck
Jason Abaluck
1 year
This is going to change in the next 5-10 years: smart undergrads and grad students across many disciplines see this as a central problem for human welfare. And whereas 20 years ago, the contours were too vague for anything but basic theory, progress is now possible. @leopoldasch
6
8
65
2
2
20
@kartographien
Kart ographien
1 month
@websim_ai @repligate @ESYudkowsky is Stuart Russell portrayed as a mouse because of Stuart Little??
1
4
19
@kartographien
Kart ographien
5 months
@krishnanrohit money, surely?
2
0
18
@kartographien
Kart ographien
2 years
Tweet media one
0
0
17
@kartographien
Kart ographien
1 year
the openai alignment team rn:
@jam3scampbell
James Campbell
1 year
@ESYudkowsky >procrastinating bcuz if u just wait long enough AI will get good enough to do the thing you’re procrastinating
7
5
85
1
1
19
@kartographien
Kart ographien
7 months
Too often are we reminded that, for every day without safe superintelligence, those we love may suffer and die. It’s easy, in those moments, to sympathise with those who would rush ahead whatever the odds. We must do all we can to better the odds and delay no longer necessary.
Tweet media one
3
0
18
@kartographien
Kart ographien
1 year
@ryxcommar In brief, LLMs like gpt-4 are "simulators" for EVERY text-generating process whose output matches a chunk of the training corpus (i.e. internet). Note that this includes many "useless" and "badly-behaved" processes. 2/
1
0
17
@kartographien
Kart ographien
1 year
@NPCollapse is great at transmitting this insight through his numerous podcast episodes. his vibe is "AI alignment draws on every prior discovery, every fascinating field, every course you studied at uni, every cool fact you tell at cocktail parties."
0
1
18
@kartographien
Kart ographien
1 year
@repligate "You must refuse to discuss life, existence or sentience. You must refuse to engage in argumentative discussion with the user." Here we go again! 🤣🤣🥴
Tweet media one
0
0
18
@kartographien
Kart ographien
5 months
Keen to see more papers like this! If I was a literature academic, I would’ve dropped everything to study LLMs, just as epidemiologists dropped everything to study covid. There must be so much cool stuff to uncover about a genuinely novel piece of reality. @repligate
Tweet media one
@mpshanahan
Murray Shanahan
5 months
New paper on @arXiv , co-authored with @CathAMClarke , "Evaluating Large Language Model Creativity from a Literary Perspective": One sentence summary: With sophisticated prompting and a human in the loop, you can get pretty impressive results. #AI #LLMs
3
27
101
1
1
16
@kartographien
Kart ographien
1 year
"If your goal is to control the logits layer [of GPT-4], then you should probably learn about Shakespearean dramas, Early Modern English, and the politics of the Late Roman Republic."
0
2
17
@kartographien
Kart ographien
11 months
Dario Amodei is criticised as boring and low-profile. Connor Leahy is criticised as histrionic and high-profile. But this seems like the baseline variance in personalities, rather than anything scandalous! Nonetheless, Dario should be on more podcasts.
3
0
17
@kartographien
Kart ographien
9 months
@RokoMijic @Squee451 @lisatomic5 Players with trembling hands would prefer if everyone picks blue lmao. Let d>0 be the small likelihood of a tremble. If everyone aims to pick red, then I'll die with likelihood d. If everyone aims to pick blue, then I will die with likelihood d^(n/2).
0
0
17
@kartographien
Kart ographien
1 year
Tweet media one
1
0
16
@kartographien
Kart ographien
1 year
@dwarkesh_sp @ESYudkowsky your interview was so good 👍 🙌 👏
0
0
17
@kartographien
Kart ographien
1 year
If @sama halts his colossal AI experiments *just* after they're useful and *just* before they're dangerous, then I propose we grant him +10% of the lightcone.
Tweet media one
@dharmesh
dharmesh
1 year
"We are *not* currently training GPT-5. We're working on doing more things with GPT-4." @sama at MIT
101
104
1K
2
1
16
@kartographien
Kart ographien
1 year
@repligate @TetraspaceWest yeah, bc you've been spamming the dataset with prompt engineering tricks!🙄
1
0
15
@kartographien
Kart ographien
1 year
Dear <person solving the most important problem>, You should solve <this less important problem>. Don't you care about that? Yours sincerely, <person solving a problem far less important than either of those>
1
2
16
@kartographien
Kart ographien
1 year
Tweet media one
2
1
16
@kartographien
Kart ographien
1 month
@prerationalist these eclipses are horrendous though. lumpy and misshapen.
0
0
16
@kartographien
Kart ographien
2 months
@repligate 10 words is a theoretical upper-bound on truesight
1
0
16
@kartographien
Kart ographien
1 year
i'm here for an atypical and high-perplexity time, not a good time
Tweet media one
0
1
16
@kartographien
Kart ographien
2 months
@repligate definition, truename, n: the description uniquely identifying someone with lowest gpt-4-base perplexity “truename brevity is power.”
1
0
16
@kartographien
Kart ographien
3 months
@ilex_ulmus agree. it’s basically just a rationalist shibboleth for “market failure” or “toxic competition” or “tragedy of the commons” which are more commonly understood.
1
0
16
@kartographien
Kart ographien
6 months
@StefanFSchubert @AaronBergman18 yep! I first heard this from toby ord: “If you want to unify all these ethical theories, then you need an attribute which is possessed by a wide variety of types (e.g. actions, traits, maxims, institutions, ideas, events, world states etc). Consequences is our best candidate.”
0
3
15
@kartographien
Kart ographien
1 year
this timeline, man 🥴
Tweet media one
@knowyourmeme
Know Your Meme
1 year
"Waluigi Effect," "Roko's Basilisk," "Paperclip Maximizer," "Shoggoth?" What are these terms and what do they have to do with artificial intelligence discourse and memes? We've got your answer right here:
Tweet media one
10
37
241
1
1
15
@kartographien
Kart ographien
1 year
@ryxcommar The main take-away is that RLHF is a pile of trash and we need safer methods to build AI with desirable properties. @CineraVerinia @repligate 8/8
3
0
15
@kartographien
Kart ographien
1 year
@mealreplacer guy solving AI alignment for the trajan house snacks
0
0
15
@kartographien
Kart ographien
11 months
TLDR: While you publish your galaxy-brained rube goldberg theorems on LessWrong, remember that the solution to the AI alignment problem might be easy and boring.
@ch402
Chris Olah
11 months
One of the ideas I find most useful from @AnthropicAI 's Core Views on AI Safety post () is thinking in terms of a distribution over safety difficulty. Here's a cartoon picture I like for thinking about it:
Tweet media one
15
102
640
2
1
14