
Sian Gooding
@SianGooding
Followers: 1K · Following: 3K · Media: 25 · Statuses: 460
Senior Research Scientist @GoogleDeepMind working on Autonomous Assistants
London · Joined July 2018
New paper alert from @GoogleDeepMind! 🚨 We've put LLMs to the test as writing co-pilots – how good are they really at helping us write? LLMs are increasingly used for open-ended tasks like writing assistance, but how do we assess their effectiveness? 🤔
arxiv.org
Open-ended tasks are particularly challenging for LLMs due to the vast solution space, demanding both expansive exploration and adaptable strategies, especially when success lacks a clear,...
RT @egrefen: Job advert is here: Deadline: EOD Friday 1st August. Apply ASAP as we will look at candidates as they….
job-boards.greenhouse.io
London, UK
RT @egrefen: Do you have a PhD (or equivalent) or will have one in the coming months (i.e. 2-3 months away from graduating)? Do you want to….
RT @whylikethis_: Eye Tracking + NLP = 😍. Attending ACL 2025? Looking for a new multimodal modeling challenge? Interested in cognitive model….
RT @deedydas: Google DeepMind just dropped this new LLM architecture called Mixture-of-Recursions. It gets 2x inference speed, reduc….
RT @AlexGDimakis: Interesting post. However, it seems to be in conflict with the most central problem in theoretical computer science: P vs….
RT @verena_rieser: I'm thrilled to be at @DeepIndaba in Rwanda 🇷🇼 Let's collaborate to ensure AI's benefits reach everyone, everywhere. I l….
RT @aditimavalankar: Excited to share our recent work, AuPair, an inference-time technique that builds on the premise of in-context learnin….
arxiv.org
Scaling up inference-time compute has proven to be a valuable strategy in improving the performance of Large Language Models (LLMs) without fine-tuning. An important task that can benefit from...
RT @aditimavalankar: On my way to #ICML2025 to present our algorithm that strongly scales with inference compute, in both performance and s….
RT @valentina__py: 💡Beyond math/code, instruction following with verifiable constraints is suitable to be learned with RLVR. But the set of….
RT @natashajaques: In our latest paper, we discovered a surprising result: training LLMs with self-play reinforcement learning on zero-sum….
RT @LauraRuis: Highly recommend reading this, or at least the intro and conclusion. Some gems about the future of safety research.
RT @MinqiJiang: Recently, there has been a lot of talk of LLM agents automating ML research itself. If Llama 5 can create Llama 6, then sur….
RT @akbirkhan: here is my thesis “Safe Automated Research”. i worked on 3 approaches to make sure we can trust the output of automated rese….
RT @yanaiela: Check out our take on Chain-of-Thought. I really like this paper as a survey on the current literature on what CoT is, but mo….
RT @MartinKlissarov: As AI agents face increasingly long and complex tasks, decomposing them into subtasks becomes increasingly appealing.….
RT @verena_rieser: I'm looking forward to giving a keynote at #ACL2025NLP! See you in Vienna 🇦🇹
RT @LauraRuis: LLMs can be programmed by backprop 🔎. In our new preprint, we show they can act as fuzzy program interpreters and databases.….
RT @jevakallio: 📢 Announcement! We're building a new type of word processor at @writewithmarker, and we're hiring for ProseMirror hackers….
RT @pietro_lesci: All modern LLMs run on top of a tokeniser, an often overlooked “preprocessing detail”. But what if that tokeniser systema….
RT @whylikethis_: 👀📖 Big news! 📖👀 Happy to announce the release of OneStop Eye Movements! 🍾🍾 The OneStop dataset is the product of over 6 years….
github.com
OneStop: A 360-Participant Eye Tracking Dataset with Different Reading Regimes - lacclab/OneStop-Eye-Movements