
Stephanie Chan
@scychan_brains
Followers: 5K · Following: 5K · Media: 33 · Statuses: 721
Staff Research Scientist at Google DeepMind. Artificial & biological brains 🤖 🧠 Views are my own.
San Francisco, CA
Joined November 2018
Check out our new work: Generalization from context often outperforms generalization from finetuning. And you might get the best of both worlds by spending extra compute at train-time.
How do language models generalize from information they learn in-context vs. via finetuning? We show that in-context learning can generalize more flexibly, illustrating key differences in the inductive biases of these modes of learning — and ways to improve finetuning. Thread: 1/
Replies: 4 · Retweets: 20 · Likes: 206
New ideas for our information ecosystem.
🚨🚨 Excited to share a new paper led by @Li_Haiwen_ with the @CommunityNotes team! LLMs will reshape the information ecosystem. Community Notes offers a promising model for keeping human judgment central, but how best to integrate LLMs remains an open question. Thread 👇
Replies: 0 · Retweets: 0 · Likes: 9
An important line of research -- understanding complementarity between humans and AIs.
How do we ensure humans can still effectively oversee increasingly powerful AI systems? In our blog, we argue that achieving human-AI complementarity is an underexplored yet vital piece of this puzzle! It's hard, but we achieved it. 🧵 (1/10)
Replies: 0 · Retweets: 0 · Likes: 12
RT @GoogleDeepMind: We're bringing powerful AI directly onto robots with Gemini Robotics On-Device 🤖 It's our first vision-language-actio…
Replies: 0 · Retweets: 554 · Likes: 0
RT @oswaldjoh: Super happy and proud to share our novel scalable RNN model, the MesaNet! This work builds upon beautiful ideas of 𝗹𝗼𝗰𝗮𝗹𝗹…
Replies: 0 · Retweets: 64 · Likes: 0
RT @coolboi95: I'm really excited to announce @GeneralistAI_! Our mission is to make general-purpose robots a reality. Getting to this "Ch…
Replies: 0 · Retweets: 13 · Likes: 0
RT @emollick: McKinsey's new report on AI agents shows the same mindset I see in many firms: a focus on making small, obsolete models do ba…
Replies: 0 · Retweets: 231 · Likes: 0
RT @cogscikid: Excited to share a project specifying a research direction I think will be particularly fruitful for theory-driven cognitive s…
Replies: 0 · Retweets: 44 · Likes: 0
RT @mpshanahan: Does It Make Sense to Speak of Introspection in Large Language Models? New paper with Iulia M. Comsa (@astronomind). https:…
Replies: 0 · Retweets: 14 · Likes: 0
Very proud of @Aaditya6284 for garnering an ICML Oral Award for this work, even while moving countries and starting a new job! The paper shows: when there are two different circuits to solve the same problem, the circuits can compete *and* cooperate at the same time. This…
Dropping a few high-level takeaways in this thread. For more details, please see Aaditya's thread, or the paper itself.
Replies: 0 · Retweets: 2 · Likes: 31
RT @METR_Evals: At METR, we've seen increasingly sophisticated examples of "reward hacking" on our tasks: models trying to subvert or explo…
Replies: 0 · Retweets: 46 · Likes: 0
RT @EkdeepL: Paper alert: accepted as a NeurIPS *Spotlight*! 🧵👇 We build on our past work relating emergence to task compositionality and an…
Replies: 0 · Retweets: 92 · Likes: 0
RT @chengmyra1: Do people actually like human-like LLMs? In our #ACL2025 paper HumT DumT, we find a kind of uncanny valley effect: users di…
Replies: 0 · Retweets: 19 · Likes: 0
This work shows theoretically how emergence occurs when learning sparse attention, and that these dynamics match real transformers (@NicolasZucchet @dngfra). 4/5
Replies: 1 · Retweets: 2 · Likes: 30
These results build on this analysis of S-curves in the context of multi-component tasks (@AdamSJermyn @bshlgrs). 3/5
Replies: 1 · Retweets: 1 · Likes: 22
Emergence can occur due to multiple interacting subcircuits, even if each subcircuit is learned in a smooth, gradual way. We showed this in a mathematical model and with extensive experiments in small transformers (@Aaditya6284). 2/5
Replies: 1 · Retweets: 3 · Likes: 44
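To make the "smooth parts, emergent whole" point concrete: if a task requires two subcircuits and each is learned along its own gradual sigmoid, task accuracy behaves like their product, which hugs zero until both components mature and then rises sharply. A minimal numpy sketch of this toy picture (my own illustration of the idea, not the paper's actual model; onsets and rates are arbitrary):

import numpy as np

steps = np.linspace(0, 100, 201)

def sigmoid(t, onset, rate=0.2):
    # Smooth, gradual learning curve for a single subcircuit.
    return 1.0 / (1.0 + np.exp(-rate * (t - onset)))

circuit_a = sigmoid(steps, onset=40)  # learned earlier
circuit_b = sigmoid(steps, onset=60)  # learned later
task_acc = circuit_a * circuit_b      # task requires BOTH subcircuits

# Each factor changes gradually, but their product stays near zero until
# the second circuit matures, then shoots up: an abrupt S-curve built
# entirely from smooth components.
for t in (20, 40, 60, 80):
    i = int(np.searchsorted(steps, t))
    print(f"step {t:3d}: A={circuit_a[i]:.2f}  B={circuit_b[i]:.2f}  task={task_acc[i]:.2f}")

Printing the curve shows task accuracy near 0.00 at steps 20 and 40 even as circuit A is already halfway learned, then a jump to ~0.5 by step 60 and ~0.98 by step 80.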
RT @Aaditya6284: Was super fun to be a part of this work! Felt very satisfying to bring the theory work on ICL with linear attention a bit…
Replies: 0 · Retweets: 5 · Likes: 0
RT @jxmnop: new paper from our work at Meta! **GPT-style language models memorize 3.6 bits per param** we compute capacity by measuring t…
Replies: 0 · Retweets: 385 · Likes: 0
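Taking the quoted 3.6 bits-per-parameter figure at face value, a quick back-of-the-envelope conversion shows what that capacity means in familiar units (a rough sketch; the model sizes below are illustrative, not from the paper):

# Back-of-the-envelope: total memorization capacity implied by a
# bits-per-parameter estimate (3.6, per the quoted result).
BITS_PER_PARAM = 3.6

def capacity_megabytes(n_params: float) -> float:
    # bits -> bytes -> megabytes
    return n_params * BITS_PER_PARAM / 8 / 1e6

# Hypothetical model sizes, chosen only for illustration.
for n_params in (125e6, 1e9, 7e9):
    print(f"{n_params/1e9:.3f}B params -> ~{capacity_megabytes(n_params):,.0f} MB memorized")

Under this assumption, a 1B-parameter model's raw memorization budget works out to roughly 450 MB of data, which is why models of that size can regurgitate only a small slice of their training corpus verbatim.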