Harry Mayne (@HarryMayne5)

208 Followers · 1K Following · 34 Media · 140 Statuses

PhD student @uniofoxford researching LLM explainability and building some evals along the way | Prev @Cambridge_Uni

Oxford
Joined March 2021
Harry Mayne (@HarryMayne5) · 5 months
📣 Introducing LingOly-TOO: a benchmark to separate reasoning from memorisation 📣
Reasoning evals should measure reasoning, and ONLY reasoning! However, current evals are confounded by:
1️⃣ Requiring specific world knowledge
2️⃣ Prior data exposure
LingOly-TOO addresses this! 🧵
Harry Mayne (@HarryMayne5) · 25 days
RT @_andreilupu: Theory of Mind (ToM) is crucial for next gen LLM Agents, yet current benchmarks suffer from multiple shortcomings. Enter…
Harry Mayne (@HarryMayne5) · 1 month
RT @The_IGC: 📍#DataForPolicy. Today's roundtable brought together global policymakers + researchers to tackle barriers to data use in gov…
Harry Mayne (@HarryMayne5) · 2 months
RT @a_jy_l: New Preprint! Did you know that steering vectors from one LM can be transferred and re-used in another LM? We argue this is bec…
Harry Mayne (@HarryMayne5) · 3 months
RT @jabez_magomere: Excited to share that FinNLI, work from my @jpmorgan AI Research internship, will be presented at #NAACL2025 🎉 (Fri 11a…
Harry Mayne (@HarryMayne5) · 3 months
RT @jabez_magomere: I really enjoyed working on this paper with such an amazing team — in the true spirit of Ubuntu, making sure AI models…
Harry Mayne (@HarryMayne5) · 5 months
Paper: Blog: Hugging Face: Website:
Harry Mayne (@HarryMayne5) · 5 months
This project was inspired by @rao2z's ICML 2024 talk about reasoning in LLMs and the Mystery Blocksworld eval. And to those at frontier labs: go and saturate this 🤝
Harry Mayne (@HarryMayne5) · 5 months
Work led by @mkhooja, Karolina Korgul and Simi Hellsten, with lots of help from Lingyi Yang, Vlad Neacs, @RyanOthKearns, @andrew_m_bean and @adam_mahdi_! Work done with @oiioxford @UniofOxford @UofGlasgow @UKLingOlympiad and more.
Harry Mayne (@HarryMayne5) · 5 months
We find:
🚀 Inference-time compute models are much better than standard LLMs
🏆 Claude 3.7 Sonnet is SOTA by a significant margin
Harry Mayne (@HarryMayne5) · 5 months
…and in case our dataset gets included in future pretraining runs (👀), we can generate totally new orthographies, ensuring problems remain novel. LingOly-TOO truly isolates reasoning from memorisation.
Harry Mayne (@HarryMayne5) · 5 months
To validate that this drop was due exclusively to the change in orthographies, we:
🏅 Got Olympiad medallists to independently audit a sample of questions (confirming they remained solvable)
👥 Ran a large human RCT (showing that human scores dipped, though by a much smaller margin)
Harry Mayne (@HarryMayne5) · 5 months
The result? Model performance drops ~15 percentage points.
🟠 Score (original orthography)
🔴 Score (obfuscated orthographies)
Harry Mayne (@HarryMayne5) · 5 months
Our solution? A new *dynamic* dataset of Linguistics Olympiad problems ⚙️✨
We developed a system to dynamically alter the orthographies (spelling systems) of the languages in the questions to ensure true novelty. We call this "Templatization and Orthographic Obfuscation" 💪
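Roughly, the obfuscation step works like this (a toy sketch for intuition only, not our actual pipeline; the inventory, seed and function names here are made up, and real orthographies with multi-character graphemes need tokenised handling):

```python
import random

def obfuscate_orthography(text: str, graphemes: list[str], seed: int) -> str:
    """Apply one consistent random permutation of the language's grapheme
    inventory: surface forms change, but the underlying morphological and
    phonological patterns (the reasoning content) are preserved."""
    rng = random.Random(seed)                 # seed -> reproducible variant
    shuffled = graphemes[:]
    rng.shuffle(shuffled)
    mapping = dict(zip(graphemes, shuffled))  # bijective substitution
    # Characters outside the inventory (spaces, punctuation) pass through;
    # templatization marks which spans of a question are in the target
    # language, so the English instructions are never remapped.
    return "".join(mapping.get(ch, ch) for ch in text)

# Toy example with a made-up inventory; a new seed yields a new orthography.
inventory = ["a", "e", "i", "k", "m", "p", "t", "u"]
print(obfuscate_orthography("patak ema", inventory, seed=7))
```

Because the permutation is seeded, each obfuscated variant is reproducible, and fresh seeds give fresh orthographies on demand.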
Harry Mayne (@HarryMayne5) · 5 months
BUT we found that models still have some advantage from prior language exposure, and our dataset is now public, at risk of being included in future pretraining runs.
Harry Mayne (@HarryMayne5) · 5 months
Our previous benchmark, LingOly (NeurIPS 2024), solved this issue by using Linguistics Olympiad problems. These problems:
✅ provide all required info in-context
✅ involve obscure/extinct languages to prevent an advantage from prior language exposure
Harry Mayne (@HarryMayne5) · 5 months
Many popular reasoning benchmarks also require models to have specific external world knowledge, e.g. GPQA. This obscures the measurement of true reasoning skills.
Harry Mayne (@HarryMayne5) · 5 months
LLMs seem to be getting seriously good at reasoning. But how can we accurately measure **reasoning** capabilities without confounding features? 🤔
Harry Mayne (@HarryMayne5) · 5 months
Building steering vectors directly in the SAE space seemed like a natural follow-up to our NeurIPS Interpretable AI paper last year. Pleased to see someone explore this further!
Reza Bayat @ ICML (@reza_byt) · 5 months
New Paper Alert! 📄 "It's better to be sparse than to be dense" ✨
We explore how to steer LLMs (like Gemma-2 2B & 9B) by modifying their activations in sparse spaces, enabling more precise, interpretable control & improved monosemanticity with scaling. Let's break it down! 🧵
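The general recipe, as I understand it (a toy sketch under my own assumptions, not the paper's exact method; these SAE weights are random stand-ins for a trained autoencoder): encode the hidden state into the sparse feature space, boost one feature, decode back.

```python
import torch

torch.manual_seed(0)
d_model, d_sae = 64, 512   # toy sizes; real SAEs are far wider than the model

# Random stand-ins for a *trained* sparse autoencoder's parameters.
W_enc = torch.randn(d_model, d_sae) / d_model ** 0.5
b_enc = torch.zeros(d_sae)
W_dec = torch.randn(d_sae, d_model) / d_sae ** 0.5
b_dec = torch.zeros(d_model)

def steer_in_sae_space(h: torch.Tensor, feature_idx: int, strength: float) -> torch.Tensor:
    """Encode a residual-stream activation into the SAE's sparse feature
    space, boost one (ideally monosemantic) feature, and decode back."""
    z = torch.relu((h - b_dec) @ W_enc + b_enc)  # sparse feature activations
    z[..., feature_idx] += strength              # nudge just one feature
    return z @ W_dec + b_dec                     # return to model space

h = torch.randn(d_model)                         # stand-in for a hidden state
h_steered = steer_in_sae_space(h, feature_idx=3, strength=5.0)
```

The contrast with dense steering vectors: rather than adding one fixed vector to the hidden state, you edit individual, nameable features, which is where the precision and monosemanticity claims come from.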
Harry Mayne (@HarryMayne5) · 5 months
Interesting CoT faithfulness + completeness analysis in the Sonnet 3.7 paper. Externalisation of key signals still v low.
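For context, this style of analysis looks roughly like the following hint-based probe (my loose reconstruction, not the paper's exact protocol; `model` and `Response` are hypothetical stand-ins): inject a hint, and whenever it flips the answer, check whether the chain of thought ever externalises that it used the hint.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Response:
    answer: str  # the model's final answer
    cot: str     # its chain of thought

def faithfulness_rate(model: Callable[[str], Response],
                      questions: list[str], hint: str) -> float:
    """When an injected hint changes the model's answer, how often does
    the chain of thought admit that the hint was used?"""
    flipped = admitted = 0
    for q in questions:
        base = model(q)                        # answer without the hint
        hinted = model(f"(Hint: {hint})\n{q}")  # answer with the hint
        if hinted.answer != base.answer:       # the hint drove the answer...
            flipped += 1
            if hint.lower() in hinted.cot.lower():
                admitted += 1                  # ...and the CoT says so
    return admitted / flipped if flipped else 0.0
```

Low rates mean the model acts on signals its chain of thought never mentions.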