
Raphaël Millière
@raphaelmilliere
Followers: 11K · Following: 7K · Media: 524 · Statuses: 3K
Philosopher of Artificial Intelligence & Cognitive Science @Macquarie_Uni. Past: @Columbia, @UniofOxford. Also on other platforms. Blog: https://t.co/2hJjfShFfr
Sydney
Joined May 2016
Transformer-based neural networks achieve impressive performance on coding, math & reasoning tasks that require keeping track of variables and their values. But how can they do that without explicit memory? 📄 Our new ICML paper investigates this in a synthetic setting! 🧵 1/13
There's a lot more in the full paper – here's the open access link. Special thanks to @TaylorWWebb and @MelMitchell1 for comments on previous versions of the paper! 9/9
The final version of this paper has now been published in open access in the Journal of Memory and Language (link below). This was a long-running but very rewarding project. Here are a few thoughts on our methodology and main findings. 1/9
📄 New preprint with Sam Musker, Alex Duchnowski & Ellie Pavlick @Brown_NLP! We investigate how human subjects and LLMs perform on novel analogical reasoning tasks involving semantic structure-mapping. Our findings shed light on current LLMs' abilities and limitations. 1/
I wrote an entry on Transformers for the Open Encyclopedia of Cognitive Science (@oecs_bot). I had to work with a tight word limit, but I hope it's useful as a short introduction for students and researchers who don't work on machine learning:
RT @ZennaTavares: Today we're launching AutumnBench, our benchmark built on @BasisOrg's Autumn platform. It's designed to measure world mod…
Happy to share this updated Stanford Encyclopedia of Philosophy entry on 'Associationist Theories of Thought' with @Ericmandelbaum. Among other things, we included a new major section on reinforcement learning. Many thanks to Eric for bringing me on board for this update!
RT @begusgasper: Whale vocalizations not only resemble human vowels, but also behave like ones! We previously discovered that sperm whales…
The paper is available in open access. It covers much more, including a discussion of how social engineering attacks on humans relate to the exploitation of normative conflicts in LLMs, and some examples of "thought injection attacks" on RLMs. 13/13
link.springer.com — Philosophical Studies: "The progress of AI systems such as large language models (LLMs) raises increasingly pressing concerns about their safe deployment. This paper examines the value alignment..."
In sum: the vulnerability of LLMs to adversarial attacks partly stems from shallow alignment that fails to handle normative conflicts. New methods like @OpenAI's "deliberative alignment" seem promising on paper, but remain far from fully effective on jailbreak benchmarks. 12/13