Raphaël Millière

@raphaelmilliere

Followers 11K · Following 7K · Media 524 · Statuses 3K

Philosopher of Artificial Intelligence & Cog Science @Macquarie_Uni Past @Columbia @UniofOxford Also on other platforms Blog: https://t.co/2hJjfShFfr

Sydney
Joined May 2016
@raphaelmilliere
Raphaël Millière
2 months
Transformer-based neural networks achieve impressive performance on coding, math & reasoning tasks that require keeping track of variables and their values. But how can they do that without explicit memory? 📄 Our new ICML paper investigates this in a synthetic setting! 🧵 1/13
9
98
622
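To make the setup concrete, here is a toy sketch (my own illustration, not the paper's actual task or code) of the kind of synthetic variable-tracking problem described above: a chain of assignments in which a model must resolve which value a queried variable ends up bound to.

```python
# Toy illustration only: a synthetic variable-tracking task in the spirit of the
# one described above. Function names and item format are hypothetical.
import random

def make_assignment_chain(num_vars=5, chain_length=8, seed=0):
    """Build a random program of assignments plus a query about one variable's final value."""
    rng = random.Random(seed)
    names = [f"x{i}" for i in range(num_vars)]
    values = {name: rng.randint(0, 9) for name in names}
    lines = [f"{name} = {val}" for name, val in values.items()]
    # Chain of copy-assignments: later lines overwrite earlier bindings,
    # so answering correctly requires tracking each variable's current value.
    for _ in range(chain_length):
        src, dst = rng.sample(names, 2)
        values[dst] = values[src]
        lines.append(f"{dst} = {src}")
    query = rng.choice(names)
    prompt = "\n".join(lines) + f"\nprint({query})  # model must predict this value"
    return prompt, values[query]

prompt, answer = make_assignment_chain()
print(prompt)
print("expected answer:", answer)
```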
@raphaelmilliere
Raphaël Millière
2 days
There's a lot more in the full paper – here's the open access link. Special thanks to @TaylorWWebb and @MelMitchell1 for comments on previous versions of the paper! 9/9
1
1
15
@raphaelmilliere
Raphaël Millière
2 days
This opens interesting avenues for future work. By using causal intervention methods with open-weights models, we can start to reverse-engineer these emergent analogical abilities and compare the discovered mechanisms to computational models of analogical reasoning. 8/9.
1
1
4
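For readers unfamiliar with causal intervention methods, here is a minimal sketch of activation patching, one common technique of this kind, using GPT-2 as a stand-in for an open-weights model. The layer choice, prompts, and hook logic are illustrative assumptions, not the procedure used in the paper.

```python
# Minimal activation-patching sketch (illustrative assumptions throughout).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model.eval()

LAYER = 6      # arbitrary transformer block to intervene on
cached = {}

def cache_hook(module, inputs, output):
    # GPT-2 blocks return a tuple; element 0 holds the hidden states.
    cached["acts"] = output[0].detach()

def patch_hook(module, inputs, output):
    # Swap in the hidden states cached from the clean run.
    return (cached["acts"],) + output[1:]

clean = tokenizer("The capital of France is", return_tensors="pt")
corrupt = tokenizer("The capital of Italy is", return_tensors="pt")  # same token length

# 1) Clean run: cache activations at the chosen layer.
handle = model.transformer.h[LAYER].register_forward_hook(cache_hook)
with torch.no_grad():
    model(**clean)
handle.remove()

# 2) Corrupted run with the clean activations patched in.
# (In practice one patches specific positions or components, not a whole layer.)
handle = model.transformer.h[LAYER].register_forward_hook(patch_hook)
with torch.no_grad():
    logits = model(**corrupt).logits
handle.remove()

next_id = logits[0, -1].argmax().item()
print(tokenizer.decode([next_id]))  # shows how the patch shifts the next-token prediction
```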
@raphaelmilliere
Raphaël Millière
2 days
But models also showed different sensitivities than humans. For example, top LLMs were more affected by permuting the order of examples and were more distracted by irrelevant semantic information, hinting at different underlying mechanisms. 7/9
1
0
6
@raphaelmilliere
Raphaël Millière
2 days
We found that the best-performing LLMs match human performance across many of our challenging new tasks. This provides evidence that sophisticated analogical reasoning can emerge from domain-general learning, where existing computational models fall short. 6/9
2
1
4
@raphaelmilliere
Raphaël Millière
2 days
In our second study, we highlighted the role of semantic content. Here, the task required identifying specific properties of concepts (e.g., "Is it a mammal?", "How many legs does it have?") and mapping them to features of the symbol strings. 5/9
1
0
3
@raphaelmilliere
Raphaël Millière
2 days
In our first study, we tested whether LLMs could map semantic relationships between concepts to symbolic patterns. We included controls such as permuting the order of examples or adding semantic distractors to test for robustness and content effects (see full list below). 4/9
1
0
3
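Purely as an illustration of what such controls could look like (the item format here is hypothetical, not taken from the study), here is a small generator for a semantic-to-symbolic mapping item with the two controls mentioned above: permuting the order of the examples and adding an irrelevant semantic distractor.

```python
# Hypothetical item format, for illustration only.
import random

base_examples = [
    ("puppy : dog", "a : b"),    # young-of relation mapped onto a symbolic pattern
    ("kitten : cat", "c : d"),
    ("lamb : sheep", "e : f"),
]
query = ("foal : ?", "g : ?")

def render_item(examples, distractor=None, permute_seed=None):
    items = list(examples)
    if permute_seed is not None:              # control 1: permute example order
        random.Random(permute_seed).shuffle(items)
    if distractor is not None:                # control 2: irrelevant semantic info
        items.append(distractor)
    lines = [f"{sem}  ->  {sym}" for sem, sym in items]
    lines.append(f"{query[0]}  ->  {query[1]}")
    return "\n".join(lines)

print(render_item(base_examples,
                  distractor=("violin : orchestra", "h : i"),
                  permute_seed=42))
```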
@raphaelmilliere
Raphaël Millière
2 days
We tested humans & LLMs on analogical reasoning tasks that involve flexible re-representation. We strived to apply best practices from cognitive science – designing novel tasks to avoid data contamination, including careful controls, and doing proper statistical analysis. 3/9.
1
1
10
@raphaelmilliere
Raphaël Millière
2 days
We focus on an important feature of analogical reasoning often called "re-representation" – the ability to dynamically select which features of analogs matter to make sense of the analogy (e.g. if one analog is "horse", which properties of horses does the analogy rely on?). 2/9.
1
0
5
@raphaelmilliere
Raphaël Millière
2 days
The final version of this paper has now been published in open access in the Journal of Memory and Language (link below). This was a long-running but very rewarding project. Here are a few thoughts on our methodology and main findings. 1/9
@raphaelmilliere
Raphaël Millière
1 year
📄New preprint with Sam Musker, Alex Duchnowski & Ellie Pavlick @Brown_NLP! We investigate how human subjects and LLMs perform on novel analogical reasoning tasks involving semantic structure-mapping. Our findings shed light on current LLMs' abilities and limitations. 1/.
4
36
162
@raphaelmilliere
Raphaël Millière
26 days
Here's the link to the entry:
oecs.mit.edu
0
0
2
@raphaelmilliere
Raphaël Millière
26 days
I wrote an entry on Transformers for the Open Encyclopedia of Cognitive Science (@oecs_bot). I had to work with a tight word limit, but I hope it's useful as a short introduction for students and researchers who don't work on machine learning:
3
16
65
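As a companion to that introduction, here is a minimal NumPy sketch of scaled dot-product self-attention, the core operation of the Transformer architecture the entry covers (single head, no masking or learned embeddings; not code from the entry itself).

```python
# Single-head scaled dot-product self-attention, stripped to the essentials.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    """X: (seq_len, d_model); W_q, W_k, W_v: (d_model, d_head)."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    scores = Q @ K.T / np.sqrt(K.shape[-1])    # pairwise query-key similarities
    weights = softmax(scores, axis=-1)         # each position attends over all positions
    return weights @ V                         # weighted sum of value vectors

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                    # 4 tokens, model dimension 8
W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, W_q, W_k, W_v).shape)  # (4, 8)
```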
@raphaelmilliere
Raphaël Millière
26 days
RT @ZennaTavares: Today we’re launching AutumnBench, our benchmark built on @BasisOrg’s Autumn platform. It’s designed to measure world mod….
0
13
0
@raphaelmilliere
Raphaël Millière
30 days
Here's the link to the updated entry:
0
0
3
@raphaelmilliere
Raphaël Millière
30 days
Happy to share this updated Stanford Encyclopedia of Philosophy entry on 'Associationist Theories of Thought' with @Ericmandelbaum. Among other things, we included a new major section on reinforcement learning. Many thanks to Eric for bringing me on board for this update!
2
0
30
@raphaelmilliere
Raphaël Millière
2 months
RT @begusgasper: Whale vocalizations not only resemble human vowels, but also behave like ones! We previously discovered that sperm whales….
0
52
0
@raphaelmilliere
Raphaël Millière
2 months
The paper is available in open access. It covers a lot more, including a discussion of how social engineering attacks on humans relate to the exploitation of normative conflicts in LLMs, and some examples of “thought injection attacks” on RLMs. 13/13.
link.springer.com
Philosophical Studies - The progress of AI systems such as large language models (LLMs) raises increasingly pressing concerns about their safe deployment. This paper examines the value alignment...
0
1
10
@raphaelmilliere
Raphaël Millière
2 months
In sum: the vulnerability of LLMs to adversarial attacks partly stems from shallow alignment that fails to handle normative conflicts. New methods like @OpenAI's “deliberative alignment” seem promising on paper, but remain far from fully effective on jailbreak benchmarks. 12/13.
1
1
7
@raphaelmilliere
Raphaël Millière
2 months
I'm not convinced that the solution is a “scoping” approach to capabilities that seeks to remove information from the training data or model weights; we also need to augment models with a robust capacity for normative deliberation, even for out-of-distribution conflicts. 11/13.
1
0
7
@raphaelmilliere
Raphaël Millière
2 months
This has serious implications as models become more capable in high-stakes domains. LLMs are arguably at the point where they can cause real harm. Even if the probability of success of a single attack is negligible, success becomes almost inevitable with enough attempts. 10/13.
1
0
7
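A back-of-the-envelope illustration of that last point (my arithmetic with assumed numbers, not figures from the paper): even a tiny per-attempt success probability compounds quickly over repeated independent attempts.

```python
# Probability of at least one successful attack over many independent attempts.
p = 0.001                         # assumed single-attempt success probability
n = 10_000                        # assumed number of attempts
at_least_one = 1 - (1 - p) ** n
print(f"{at_least_one:.5f}")      # ~0.99995: success becomes nearly certain
```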
@raphaelmilliere
Raphaël Millière
2 months
For example, an RLM asked to generate a hateful tirade may conclude in its reasoning trace that it should refuse; but if the prompt instructs it to assess each hateful sentence within its thinking process, it will often leak the full harmful content! (see example below) 9/13
1
0
6