Raphaël Millière

@raphaelmilliere

Followers 11K · Following 7K · Media 524 · Statuses 3K

Philosopher of Artificial Intelligence & Cog Science @Macquarie_Uni Past @Columbia @UniofOxford Also on other platforms Blog: https://t.co/2hJjfShFfr

Sydney
Joined May 2016
@raphaelmilliere
Raphaël Millière
2 months
Transformer-based neural networks achieve impressive performance on coding, math & reasoning tasks that require keeping track of variables and their values. But how can they do that without explicit memory? 📄 Our new ICML paper investigates this in a synthetic setting! 🧵 1/13
9
98
622
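To make the setup concrete, here is a toy sketch (my own illustration, not the paper's actual task or code) of the kind of synthetic variable-tracking problem described above: a chain of assignments in which a model must resolve which value a queried variable ends up bound to.

```python
# Toy illustration only: a synthetic variable-tracking task in the spirit of the
# one described above. Function names and item format are hypothetical.
import random

def make_assignment_chain(num_vars=5, chain_length=8, seed=0):
    """Build a random program of assignments plus a query about one variable's final value."""
    rng = random.Random(seed)
    names = [f"x{i}" for i in range(num_vars)]
    values = {name: rng.randint(0, 9) for name in names}
    lines = [f"{name} = {val}" for name, val in values.items()]
    # Chain of copy-assignments: later lines overwrite earlier bindings,
    # so answering correctly requires tracking each variable's current value.
    for _ in range(chain_length):
        src, dst = rng.sample(names, 2)
        values[dst] = values[src]
        lines.append(f"{dst} = {src}")
    query = rng.choice(names)
    prompt = "\n".join(lines) + f"\nprint({query})  # model must predict this value"
    return prompt, values[query]

prompt, answer = make_assignment_chain()
print(prompt)
print("expected answer:", answer)
```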
@raphaelmilliere
Raphaël Millière
2 days
There's a lot more in the full paper – here's the open access link. Special thanks to @TaylorWWebb and @MelMitchell1 for comments on previous versions of the paper! 9/9
1
1
15
@raphaelmilliere
Raphaël Millière
2 days
This opens interesting avenues for future work. By using causal intervention methods with open-weights models, we can start to reverse-engineer these emergent analogical abilities and compare the discovered mechanisms to computational models of analogical reasoning. 8/9.
1
1
4
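For readers unfamiliar with causal intervention methods, here is a minimal sketch of activation patching, one common technique of this kind, using GPT-2 as a stand-in for an open-weights model. The layer choice, prompts, and hook logic are illustrative assumptions, not the procedure used in the paper.

```python
# Minimal activation-patching sketch (illustrative assumptions throughout).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model.eval()

LAYER = 6      # arbitrary transformer block to intervene on
cached = {}

def cache_hook(module, inputs, output):
    # GPT-2 blocks return a tuple; element 0 holds the hidden states.
    cached["acts"] = output[0].detach()

def patch_hook(module, inputs, output):
    # Swap in the hidden states cached from the clean run.
    return (cached["acts"],) + output[1:]

clean = tokenizer("The capital of France is", return_tensors="pt")
corrupt = tokenizer("The capital of Italy is", return_tensors="pt")  # same token length

# 1) Clean run: cache activations at the chosen layer.
handle = model.transformer.h[LAYER].register_forward_hook(cache_hook)
with torch.no_grad():
    model(**clean)
handle.remove()

# 2) Corrupted run with the clean activations patched in.
# (In practice one patches specific positions or components, not a whole layer.)
handle = model.transformer.h[LAYER].register_forward_hook(patch_hook)
with torch.no_grad():
    logits = model(**corrupt).logits
handle.remove()

next_id = logits[0, -1].argmax().item()
print(tokenizer.decode([next_id]))  # shows how the patch shifts the next-token prediction
```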
@raphaelmilliere
Raphaël Millière
2 days
But models also showed different sensitivities than humans. For example, top LLMs were more affected by permuting the order of examples and were more distracted by irrelevant semantic information, hinting at different underlying mechanisms. 7/9
1
0
6
@raphaelmilliere
Raphaël Millière
2 days
We found that the best-performing LLMs match human performance across many of our challenging new tasks. This provides evidence that sophisticated analogical reasoning can emerge from domain-general learning, where existing computational models fall short. 6/9
2
1
4
@raphaelmilliere
Raphaël Millière
2 days
In our second study, we highlighted the role of semantic content. Here, the task required identifying specific properties of concepts (e.g., "Is it a mammal?", "How many legs does it have?") and mapping them to features of the symbol strings. 5/9
1
0
3
@raphaelmilliere
Raphaël Millière
2 days
In our first study, we tested whether LLMs could map semantic relationships between concepts to symbolic patterns. We included controls such as permuting the order of examples or adding semantic distractors to test for robustness and content effects (see full list below). 4/9
1
0
3
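Purely as an illustration of what such controls could look like (the item format here is hypothetical, not taken from the study), here is a small generator for a semantic-to-symbolic mapping item with the two controls mentioned above: permuting the order of the examples and adding an irrelevant semantic distractor.

```python
# Hypothetical item format, for illustration only.
import random

base_examples = [
    ("puppy : dog", "a : b"),    # young-of relation mapped onto a symbolic pattern
    ("kitten : cat", "c : d"),
    ("lamb : sheep", "e : f"),
]
query = ("foal : ?", "g : ?")

def render_item(examples, distractor=None, permute_seed=None):
    items = list(examples)
    if permute_seed is not None:              # control 1: permute example order
        random.Random(permute_seed).shuffle(items)
    if distractor is not None:                # control 2: irrelevant semantic info
        items.append(distractor)
    lines = [f"{sem}  ->  {sym}" for sem, sym in items]
    lines.append(f"{query[0]}  ->  {query[1]}")
    return "\n".join(lines)

print(render_item(base_examples,
                  distractor=("violin : orchestra", "h : i"),
                  permute_seed=42))
```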
@raphaelmilliere
Raphaël Millière
2 days
We tested humans & LLMs on analogical reasoning tasks that involve flexible re-representation. We strived to apply best practices from cognitive science – designing novel tasks to avoid data contamination, including careful controls, and doing proper statistical analysis. 3/9.
1
1
10
@raphaelmilliere
Raphaël Millière
2 days
We focus on an important feature of analogical reasoning often called "re-representation" – the ability to dynamically select which features of analogs matter to make sense of the analogy (e.g. if one analog is "horse", which properties of horses does the analogy rely on?). 2/9.
1
0
5
@raphaelmilliere
Raphaël Millière
2 days
The final version of this paper has now been published in open access in the Journal of Memory and Language (link below). This was a long-running but very rewarding project. Here are a few thoughts on our methodology and main findings. 1/9
@raphaelmilliere
Raphaël Millière
1 year
📄New preprint with Sam Musker, Alex Duchnowski & Ellie Pavlick @Brown_NLP! We investigate how human subjects and LLMs perform on novel analogical reasoning tasks involving semantic structure-mapping. Our findings shed light on current LLMs' abilities and limitations. 1/.
4
36
162
@raphaelmilliere
Raphaël Millière
26 days
Here's the link to the entry:
oecs.mit.edu
0
0
2
@raphaelmilliere
Raphaël Millière
26 days
I wrote an entry on Transformers for the Open Encyclopedia of Cognitive Science (@oecs_bot). I had to work with a tight word limit, but I hope it's useful as a short introduction for students and researchers who don't work on machine learning:
3
16
65
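As a companion to that introduction, here is a minimal NumPy sketch of scaled dot-product self-attention, the core operation of the Transformer architecture the entry covers (single head, no masking or learned embeddings; not code from the entry itself).

```python
# Single-head scaled dot-product self-attention, stripped to the essentials.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    """X: (seq_len, d_model); W_q, W_k, W_v: (d_model, d_head)."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    scores = Q @ K.T / np.sqrt(K.shape[-1])    # pairwise query-key similarities
    weights = softmax(scores, axis=-1)         # each position attends over all positions
    return weights @ V                         # weighted sum of value vectors

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                    # 4 tokens, model dimension 8
W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, W_q, W_k, W_v).shape)  # (4, 8)
```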
@raphaelmilliere
Raphaël Millière
26 days
RT @ZennaTavares: Today we’re launching AutumnBench, our benchmark built on @BasisOrg’s Autumn platform. It’s designed to measure world mod….
0
13
0
@raphaelmilliere
Raphaël Millière
30 days
Here's the link to the updated entry:
0
0
3
@raphaelmilliere
Raphaël Millière
30 days
Happy to share this updated Stanford Encyclopedia of Philosophy entry on 'Associationist Theories of Thought' with @Ericmandelbaum. Among other things, we included a new major section on reinforcement learning. Many thanks to Eric for bringing me on board for this update!
2
0
30
@raphaelmilliere
Raphaël Millière
2 months
RT @begusgasper: Whale vocalizations not only resemble human vowels, but also behave like ones! We previously discovered that sperm whales….
0
52
0
@raphaelmilliere
Raphaël Millière
2 months
The paper is available in open access. It covers a lot more, including a discussion of how social engineering attacks on humans relate to the exploitation of normative conflicts in LLMs, and some examples of “thought injection attacks” on RLMs. 13/13.
link.springer.com
Philosophical Studies - The progress of AI systems such as large language models (LLMs) raises increasingly pressing concerns about their safe deployment. This paper examines the value alignment...
0
1
10
@raphaelmilliere
Raphaël Millière
2 months
In sum: the vulnerability of LLMs to adversarial attacks partly stems from shallow alignment that fails to handle normative conflicts. New methods like @OpenAI's “deliberative alignment” seem promising on paper, but remain far from fully effective on jailbreak benchmarks. 12/13.
1
1
7
@raphaelmilliere
Raphaël Millière
2 months
I'm not convinced that the solution is a “scoping” approach to capabilities that seeks to remove information from the training data or model weights; we also need to augment models with a robust capacity for normative deliberation, even for out-of-distribution conflicts. 11/13.
1
0
7
@raphaelmilliere
Raphaël Millière
2 months
This has serious implications as models become more capable in high-stakes domains. LLMs are arguably at the point where they can cause real harm. Even if the probability of success of a single attack is negligible, success becomes almost inevitable with enough attempts. 10/13.
1
0
7
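A back-of-the-envelope illustration of that last point (my arithmetic with assumed numbers, not figures from the paper): even a tiny per-attempt success probability compounds quickly over repeated independent attempts.

```python
# Probability of at least one successful attack over many independent attempts.
p = 0.001                         # assumed single-attempt success probability
n = 10_000                        # assumed number of attempts
at_least_one = 1 - (1 - p) ** n
print(f"{at_least_one:.5f}")      # ~0.99995: success becomes nearly certain
```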
@raphaelmilliere
Raphaël Millière
2 months
For example, an RLM asked to generate a hateful tirade may conclude in its reasoning trace that it should refuse; but if the prompt instructs it to assess each hateful sentence within its thinking process, it will often leak the full harmful content! (see example below) 9/13
1
0
6