
noahdgoodman
@noahdgoodman
Followers
5K
Following
351
Media
8
Statuses
253
Professor of natural and artificial intelligence @Stanford. Alignment at @GoogleDeepMind. (@StanfordNLP @StanfordAILab etc)
Joined November 2019
Congrats to OAI on producing a reasoning model! Their opaque tweets demonstrate that they’ve (independently) found some of the core ideas that we did on our way to STaR.
Congrats to DeepSeek on producing an o1-level reasoning model! Their research paper demonstrates that they’ve independently found some of the core ideas that we did on our way to o1.
30
132
2K
This seems like a good time to mention that I've taken a part-time role at @GoogleDeepMind working on AI Safety and Alignment!
So excited and so very humbled to be stepping in to head AI Safety and Alignment at @GoogleDeepMind. Lots of work ahead, both for present-day issues and for extreme risks in anticipation of capabilities advancing.
13
10
269
@tomlikestocode I think it means in the future we will develop models by having smart and intense researchers stand on the shoulders of previous smart and intense researchers.
2
3
172
Hey twitter! @mcxfrank and I are teaching a seminar "Topics in Natural and Artificial Intelligence", where we'll be thinking about the relations between cognitive psych and modern AI (mostly LLMs). What papers are we missing?
9
23
134
New work with the awesome @aryaman2020 ! This started as some musings about “model organisms” for studying LM alignment, and ended up with really cool connections between ICL, Bayes, and alignment techniques.
New paper! 🫡. In-context learning (ICL) is when LLMs infer how to do a task from examples. We know that the relationship between # of ICL examples and task accuracy is predictable. Can we predict the shape of the ICL curve using Bayesian assumptions? Our paper shows yes!
1
7
100
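A toy sketch of the general idea (illustrative only, not the paper's actual scaling law, and all numbers below are made up): treat ICL as Bayesian inference over a small set of candidate tasks, and predict accuracy after n examples as the posterior-weighted accuracy.

```python
import numpy as np

priors          = np.array([0.2, 0.5, 0.3])    # prior over candidate tasks
lik_per_example = np.array([0.9, 0.6, 0.3])    # p(one in-context example | task), assumed i.i.d.
acc_if_task     = np.array([0.95, 0.70, 0.40]) # accuracy if the model commits to that task

def predicted_accuracy(n):
    """Posterior-weighted accuracy after observing n in-context examples."""
    unnorm = priors * lik_per_example ** n
    posterior = unnorm / unnorm.sum()
    return float(posterior @ acc_if_task)

for n in [0, 1, 2, 4, 8, 16]:
    print(n, round(predicted_accuracy(n), 3))
```

Under these assumptions the curve rises from the prior-weighted accuracy toward the accuracy of the task most consistent with the examples, which is the kind of shape a Bayesian account would predict.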
I’m thrilled for this to be public! The culmination of years of thinking about thinking.
Language models today are trained to reason either 1) generally, imitating online reasoning data or 2) narrowly, self-teaching on their own solutions to specific tasks. Can LMs teach themselves to reason generally?🌟Introducing Quiet-STaR, self-teaching via internal monologue!🧵
2
11
96
💚. (i don't work at openai, or know @sama, or have really any opinion about any of this. but all the cool kids are heartreposting and i fomo).
1
2
86
New work where language models learn to ask questions? So they can better understand user needs? With an amazing method name? Oh, yes!
When prompting language models to complete a task, users often leave important things unsaid. Can language models teach themselves to ask clarifying questions? In STaR-GATE, we explore LMs' ability to self-improve by rewarding the model for generating useful questions!
2
9
42
My very first PhD student, @stuhlmueller, founded @oughtinc after leaving CoCoLab. Ought has done amazing work as a nonprofit lab and helped me see the power of LLMs. I’m excited for their next chapter as @elicitorg!! (And in a new role for me, I’m an “angel”.)
1/ Announcing our spinoff from @oughtinc into a public benefit corporation, our $9 million seed round, and a much more powerful Elicit! This new Elicit takes the components of the popular literature review workflow and extends them to automate more research workflows.
0
4
44
Some PR on a fun auto-coding project we've (@ericzelikman @GabrielPoesia @nickhaber @qhwang3) been doing. Tl;dr: decomposing the problem, solving the pieces, then re-composing is useful! "New Tool Helps AI and Humans Learn To Code Better"
2
7
41
Base language models already know a lot about good behavior. Here we bring out that latent knowledge by enhancing the connection between principles and responses — no preferences required!
Constitutional AI showed LMs can learn to follow constitutions by labeling their own outputs. But why can't we just tell a base model the principles of desired behavior and rely on it to act appropriately? Introducing SAMI: Self-Supervised Alignment with Mutual Information!
1
7
41
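For intuition, here is a toy contrastive (InfoNCE-style) lower bound on the mutual information between principles and responses; this is not necessarily SAMI's exact objective, and `score` is assumed to be something like the model's log p(response | principle).

```python
import numpy as np

def info_nce_loss(score):
    """score: (B, B) matrix where score[i, j] is e.g. log p(response_j | principle_i),
    with matched (principle, response) pairs on the diagonal."""
    logits = score - score.max(axis=1, keepdims=True)                     # numerical stability
    log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Minimizing this pushes each principle to "pick out" its own response,
    # a standard lower-bound-style proxy for mutual information.
    return -np.mean(np.diag(log_softmax))

scores = np.array([[ 2.0, 0.1, -1.0],
                   [ 0.0, 1.5,  0.2],
                   [-0.5, 0.3,  2.2]])
print(info_nce_loss(scores))
```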
First author @gandhikanishk's post: Arxiv:
Language models struggle to search, not due to an architecture problem, but a data one! They rarely see how to search or backtrack. We show how LLMs can be taught to search by representing the process of search in language as a flattened string, a stream of search (SoS)!
1
4
40
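A minimal illustration of the flattening idea (a toy problem and trace format of my own, not the paper's exact setup): run a depth-limited DFS and serialize every step, including dead ends and backtracks, into one string a language model could be trained on.

```python
def dfs(state, target, depth, trace):
    trace.append(f"state {state}")
    if state == target:
        trace.append("goal")
        return True
    if depth == 0:
        trace.append("dead end")
        return False
    for op, nxt in (("+3", state + 3), ("*2", state * 2)):
        trace.append(f"try {op}")
        if dfs(nxt, target, depth - 1, trace):
            return True
        trace.append("backtrack")
    return False

trace = []
dfs(2, 16, 4, trace)
stream_of_search = " ; ".join(trace)   # the whole search, flattened into one string
print(stream_of_search)
```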
“Four habits of highly effective STaRs” — we show that certain high level cognitive behaviors are necessary for learning to reason through RL. Exciting!
New Paper!! We try to understand why some LMs self-improve their reasoning while others hit a wall. The key? Cognitive behaviors! Read our paper on how the right cognitive behaviors can make all the difference in a model's ability to improve with RL! 🧵1/13
0
5
41
Thinking quietly helps in thinking aloud - we added an additional result to the Quiet-STaR paper! We also open sourced code and weights today.
A couple exciting updates! First, we quantitatively evaluated the improvement from combining Quiet-STaR with chain-of-thought (i.e. letting the model think before each CoT token). We found it improves zero-shot CoT accuracy on GSM8K by over 7%!
2
8
39
It turns out LLMs are surprisingly good at statistical modeling! New work with @michaelyli__ and Emily Fox.
2
4
38
I'm super excited about this new work with @BenPrystawski on the origins of reasoning and why chain-of-thought helps LLMs!
0
2
37
Here's the latest in a very exciting research program from Stanford -- just about ready to find causal structure behind the behavior of LLMs!.
📣How does 🔥Alpaca🦙 follow your instructions? Mechanistic interpretability at scale – our new paper identifies the causal mechanisms the Alpaca 7B model uses to solve simple reasoning tasks (with Atticus Geiger, @ChrisGPotts, and @noahdgoodman!). Paper:
0
3
29
@RichardMCNgo I believe it’s the case that hyperbolic discounting is a mixture of exponentials, so if you have broad uncertainty about the true discount rate you get hyperbolic discounting (worked out below).
0
0
11
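A minimal worked version of that claim (assuming, purely for illustration, an exponential prior with rate λ over the discount rate k):

\[
\mathbb{E}_k\!\left[e^{-kt}\right] \;=\; \int_0^\infty \lambda e^{-\lambda k}\, e^{-kt}\, dk \;=\; \frac{\lambda}{\lambda + t} \;=\; \frac{1}{1 + t/\lambda},
\]

which is the standard hyperbolic discount curve 1/(1 + αt) with α = 1/λ; broader priors over k (e.g. a Gamma) give similar hyperbolic-like mixtures of exponentials.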
A great series of talks! (Except maybe the first one…).
See the speaker line-up for 2024-25 #KempnerInstitute Seminar Series! @noahdgoodman @MarkChurchland @scott_linderman @jacobandreas @FieteGroup @rsalakhu @StefanoFusi2 @RAIVNLab @SuthanaLab @behrenstimb @tri_dao @KordingLab @tyrell_turing @AndrewLampinen
1
2
26
Intriguing new work that suggests a kind of hypothesis-generation CoT can dramatically improve ICL for hard induction problems!.
Did you know there’s a task people easily solve but GPT-4 fails? From a few input-output grids, ARC asks you to infer and apply a rule. With Hypothesis Search, we double GPT-4’s score. w/@ruocheng_w @GabrielPoesia @evanthebouncy @nickhaber @noahdgoodman.🧵
1
2
21
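A rough sketch of the hypothesis-search loop (the names here, like `propose_hypotheses`, are placeholders rather than the paper's actual API): sample candidate rules as programs, keep only those that reproduce every training pair, then apply a survivor to the test input.

```python
def hypothesis_search(train_pairs, test_input, propose_hypotheses):
    """train_pairs: list of (input_grid, output_grid); propose_hypotheses:
    a stand-in for the LLM step that returns candidate rules as callables
    (e.g. Python functions parsed from generated code)."""
    candidates = propose_hypotheses(train_pairs)
    # Keep only hypotheses that reproduce every training example exactly.
    survivors = [h for h in candidates
                 if all(h(x) == y for x, y in train_pairs)]
    # Apply a verified hypothesis to the held-out test input (if any survived).
    return survivors[0](test_input) if survivors else None

# Tiny usage example with hand-written "hypotheses" instead of LLM output:
pairs = [([1, 2], [2, 4]), ([0, 3], [0, 6])]
hyps = lambda _: [lambda g: [v + 1 for v in g], lambda g: [v * 2 for v in g]]
print(hypothesis_search(pairs, [5, 7], hyps))   # -> [10, 14]
```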
@FelixHill84 @stanfordnlp Love this work Felix! We followed it up with an argument that similar burstiness (local observation distribution) leads to CoT. Pretty amazing that both inductive inference and sequential reasoning may emerge from statistical properties of the data!
2
1
21
Oh, I forgot to promote this fun @cogsci_soc paper with @tsvilodub @meanwhileina and @hawkrobe! We explore what extra information should be provided in answering simple questions: "Do you have wine?" "No, but we have beer!" (LLMs are spotty at this.)
Do you have an opinion? Humans and LLMs often answer yes/no questions with multiple words. While humans add assisting information, GPT & friends add all/random info unless prompted otherwise. New LLM challenge? @tsvilodub @meanwhileina @ramihawk @noahdgoodman
0
5
19
LMs are not terrible at asking the questions they need to figure out a task -- but there's a lot of room for improvement. We set out a framework for exploring this. Fun work with amazing collaborators!
Eliciting Human Preferences with Language Models. Currently, people write detailed prompts to describe what they want a language model to do. We explore *generative elicitation*—where models interactively ask for this information through open-ended conversation. 1/
0
3
21
This cool project pulls from a lot of research threads — theorem provers, RL, curricula. (And no LLMs in sight… but stay tuned for my tweet later this week!).
Excited to share our work on Peano, which just came out today in the Phil. Trans. of the Royal Society! This has been quite a journey w/ @noahdgoodman, from dependent types to the role of curricula in the cultural transmission of mathematics! 1/n.
0
1
18
Awesome @willknight on GPT. He chooses quotes where I sound deep rather than just confused 🫶. “These days my viewpoint is that this is AGI, in that it is a kind of intelligence and it is general—but we have to be a little bit less, you know, hysterical about what AGI means,”.
GPT-4 often "seems" remarkably clever, and some believe it exhibits features of more general intelligence. Experts say we need new methods for probing the model's intelligence, and more transparency from @OpenAI about how it works to truly understand it.
1
0
17
@karpathy thanks for the shoutout to our gist token work! i think of this as an example of "meta tokens", which are tokens that affect the processing of the model (by eg changing the mask). what other examples are there of this?
New (2h13m 😅) lecture: "Let's build the GPT Tokenizer". Tokenizers are a completely separate stage of the LLM pipeline: they have their own training set, training algorithm (Byte Pair Encoding), and after training implement two functions: encode() from strings to tokens, and
1
0
18
A fun trick @jayelmnop @XiangLisaLi2 and I came up with to teach LMs how to compress their instructions into a few “meta tokens” worth of activations.
Prompting is cool and all, but isn't it a waste of compute to encode a prompt over and over again? We learn to compress prompts up to 26x by using "gist tokens", saving memory+storage and speeding up LM inference: (w/ @XiangLisaLi2 and @noahdgoodman). 🧵
1
0
12
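A rough illustration of the masking idea (details in the actual paper and code may differ): the prompt is followed by a few gist tokens, and later tokens are allowed to attend to the gist tokens but not to the raw prompt, so the prompt's content has to be squeezed into the gist activations.

```python
import numpy as np

def gist_attention_mask(seq_len, prompt_end, num_gist):
    """Boolean (seq_len, seq_len) mask; True means attention is allowed.
    Assumed token layout: [prompt 0..prompt_end) [gist tokens] [rest]."""
    mask = np.tril(np.ones((seq_len, seq_len), dtype=bool))  # standard causal mask
    gist_end = prompt_end + num_gist
    # Tokens after the gist span may not look back at the raw prompt,
    # only at the gist tokens (and anything after them).
    mask[gist_end:, :prompt_end] = False
    return mask

print(gist_attention_mask(seq_len=7, prompt_end=3, num_gist=2).astype(int))
```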
@RogerGrosse Agreed! It would be super cool to make a set of illustrative examples for alignment along those lines. Power move is to then train transformers on data generated from them and see if you can "find the issues".
0
0
12
And here’s the follow-up: we use Peano together with a new kind of tool to guide LLMs toward better reasoning. And then learn from it!
Thrilled to share our work on Certified Reasoning with Language Models, a really fun collaboration with @gandhikanishk @ericzelikman @noahdgoodman! This is a mix of Peano, Synchromesh & STaR that shows a simple way to intermix & self-improve on formal and informal reasoning. 1/
0
2
11
@StephenLCasper I think you might be wrong, but for the opposite reason to others. Single humans are not unified, principled agents that can be aligned to, either. Cas this morning, Cas tonight, Cas near donuts vs salad, etc. A human has incoherent goals. Just like a society.
1
0
10
@OwainEvans_UK Nice! And what if you add the statement that not everything you read is true? (This sounds like a joke, but I’m serious — can metacognitive hints ameliorate the effect?)
4
0
9
Sometimes illusions are illusory! We respond to some interesting criticisms of our DAS method for establishing causal abstractions of neural nets. (#mechinterp).
🧐In our new commentary: we argue the notion of "illusion" in this paper labels correct explanation as illusory, & that avoiding "illusion" would require unwarranted constraints on NNs. The "illusions" are, though, instructive about how models work. 1/
0
1
8
@justintchiu For instance here’s a fun one I haven’t revisited in ages (what does this even look like for LLMs?).
0
2
6
@ShunyuYao12 Oh, nice! — “As a thought experiment, imagine GPT-N could “simulate” memory, grounding, learning, and decision-making in context: list all the possible actions, simulate and evaluate each one, and maintain its entire long-term memory explicitly in a very long context.”.
0
0
5
@gerardsans could you be more specific in your outrageous criticism? we make pretty specific claims and back them up with data. though i guess the comparison to @elonmusk is sort of complimentary?
1
0
5
@carterleffen We’ll open source an implementation as soon as we can clean up the code. Though if you’re eager, go for it! (Multiple OS implementations are helpful, anyhow.).
1
1
5
@ESYudkowsky this is a distinction i think we need to consider a lot more -- AI trained to imitate us vs AI trained to understand us. (our new qstar is closer to the latter, though not all the way.) it's actually not clear to me which is more alignable.
1
0
5
@ericzelikman @mcxfrank Yes, the confusion about LLM ToM is surpassed only by the confusion about infant/toddler ToM…
0
0
2
@stuhlmueller This is super neat. I look forward to reading the top essays on this. I also request that you add a baseline condition, where you prompt a large language model with the prize announcement and include that in your evaluations (blinded, of course).
1
0
4
@ellitoshiba The days of artisanal Bayesian modeling for large-scale AI may be sadly behind us. But the days of understanding large-scale AI by artisanal Bayesian modeling may be in front of us!
0
0
3
@justintchiu Btw we tried for a while reducing variance by using multiple samples for y and didn’t get anything that worked better. But the sea of variance reducers is wide…!.
1
0
4
@horseracedpast These models have a finite memory state that separates past observations from future predictions!.
1
0
3
@evanthebouncy I tried 20 Questions with Meta-prompt as you suggested! I'm not sure if it actually got better (should do a proper eval), but it did develop reasonable instructions for itself (in 🧵).
1
0
3
@srush_nlp @headinthebox I think the other reason to do it is that there are language model HoFs that you want to have the right commutation behavior with chaining. Again not seeing it much yet, but think rejection sampling from generations.
1
0
3
@CFGeek and SSMs should make it a lot easier to find these kinds of representations (if they exist) because 'activation patching across time' (tm:) is well defined within the hidden state.
1
0
3
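A toy sketch of what "activation patching across time" could look like in a recurrent/SSM-style model (purely illustrative, with made-up linear dynamics): record the hidden state at some step t on a clean input, then splice it into a run on a corrupted input and compare downstream outputs.

```python
import numpy as np

rng = np.random.default_rng(0)
A = 0.3 * rng.normal(size=(4, 4))   # toy state-transition matrix
B = rng.normal(size=4)              # input projection
C = rng.normal(size=4)              # readout

def run(xs, patch_t=None, patch_h=None):
    """Roll out h_t = A h_{t-1} + B x_t, optionally overwriting the
    hidden state at step patch_t with patch_h (the patch across time)."""
    h, ys, states = np.zeros(4), [], []
    for t, x in enumerate(xs):
        h = A @ h + B * x
        if patch_t is not None and t == patch_t:
            h = patch_h
        states.append(h.copy())
        ys.append(C @ h)
    return np.array(ys), states

clean     = [1.0, 0.5, -1.0, 2.0]
corrupted = [1.0, 0.5,  3.0, 2.0]

_, clean_states = run(clean)
patched_ys, _ = run(corrupted, patch_t=2, patch_h=clean_states[2])
print(run(corrupted)[0])   # corrupted outputs
print(patched_ys)          # corrupted run with the clean state spliced in at t=2
```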
nice -- let's find out what the noosphere thinks! Will AI increase or decrease Common Knowledge?
@mhtessler @noahdgoodman Yeah it's in part a prediction about likely deployment/form factor:
0
0
3