EdinburghNLP
@EdinburghNLP
Followers
13K
Following
847
Media
58
Statuses
1K
The Natural Language Processing Group at the University of Edinburgh.
Edinburgh, Scotland
Joined May 2017
Join our PhD programme in Designing Responsible Natural Language Processing at the UKRI AI Centre for Doctoral Training, University of Edinburgh. Applications are now re-opened for Home fee status candidates (past candidates need not re-apply). https://t.co/PkdXiVLEGr
Replies 0 · Reposts 4 · Likes 9
Reasoning models are powerful, but they burn thousands of tokens on potentially wrong interpretations of ambiguous requests! We teach models to think about intent first and provide all interpretations and answers in a single response via RL with a dual reward. 🧵 1/6
Replies 1 · Reposts 12 · Likes 32
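To make "RL with a dual reward" concrete, here is a minimal sketch of one plausible instantiation: one term scores how well the response covers the plausible interpretations, the other how many per-interpretation answers are correct. All function names and the reward shape below are assumptions, not the authors' code.

def coverage_reward(predicted_interps, reference_interps):
    """Fraction of reference interpretations matched by the response."""
    matched = sum(any(ref.lower() in p.lower() for p in predicted_interps)
                  for ref in reference_interps)
    return matched / max(len(reference_interps), 1)

def correctness_reward(answers, gold_answers):
    """Fraction of per-interpretation answers that match the gold answers."""
    correct = sum(a.strip() == g.strip() for a, g in zip(answers, gold_answers))
    return correct / max(len(gold_answers), 1)

def dual_reward(interps, answers, ref_interps, gold_answers, alpha=0.5):
    """Hypothetical dual reward: cover the interpretations AND answer them right."""
    return (alpha * coverage_reward(interps, ref_interps)
            + (1 - alpha) * correctness_reward(answers, gold_answers))

# Toy usage: an ambiguous request with two reference readings.
print(dual_reward(
    interps=["count the letter r", "compare version numbers"],
    answers=["3", "3.9 is higher"],
    ref_interps=["count the letter r", "compare version numbers"],
    gold_answers=["3", "3.9 is higher"]))   # -> 1.0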
Finally, you can count the r's in strawberry and check if 3.11 is higher than 3.9 without tokenisation interfering: Here's Bolmo, a fully open byte-level LLM with latent tokenisation, derived from a SOTA LLM (Olmo 3). Promising on coding and char-level understanding!
Introducing Bolmo, a new family of byte-level language models built by "byteifying" our open Olmo 3, and to our knowledge the first fully open byte-level LM to match or surpass SOTA subword models across a wide range of tasks. 🧵
Replies 2 · Reposts 7 · Likes 39
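For intuition on the strawberry example: at the byte level every character is its own token, so character-level questions become trivial. A tiny plain-Python illustration (not Bolmo internals):

# Illustration only: byte-level "tokens" expose individual characters.
text = "strawberry"
byte_tokens = list(text.encode("utf-8"))        # one token per character here
print(byte_tokens)                              # [115, 116, 114, 97, ...]
print(sum(b == ord("r") for b in byte_tokens))  # 3: trivially countable
# A subword tokeniser might instead emit something like ["straw", "berry"],
# hiding the character structure from the model.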
New Anthropic research! We study how to train models so that high-risk capabilities live in a small, separate set of parameters, allowing clean capability removal when needed, for example in CBRN or cybersecurity domains.
Replies 33 · Reposts 115 · Likes 1K
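A rough sketch of the general recipe, not Anthropic's method: if the high-risk capability is trained into a dedicated, isolated parameter set (here a made-up adapter module), removal reduces to zeroing those weights.

import torch
import torch.nn as nn

class IsolatedAdapterBlock(nn.Module):
    """Hypothetical block: a base pathway plus an isolated adapter that is
    meant to carry the removable (e.g. high-risk) capability."""
    def __init__(self, dim=64):
        super().__init__()
        self.base = nn.Linear(dim, dim)
        self.risky_adapter = nn.Linear(dim, dim)  # the separate parameter set

    def forward(self, x):
        return self.base(x) + self.risky_adapter(x)

    def remove_capability(self):
        """Clean removal: zero the isolated parameters in place."""
        with torch.no_grad():
            for p in self.risky_adapter.parameters():
                p.zero_()

block = IsolatedAdapterBlock()
x = torch.randn(2, 64)
block.remove_capability()
assert torch.allclose(block(x), block.base(x))  # only the base pathway remains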
This was presented today by the neurosymbolic wizard @EmilevanKrieken at @EurIPSConf, and by @tetraduzione and @PontiEdoardo at @NeurIPSConf! We officially achieved quantum superposition
Replies 0 · Reposts 3 · Likes 22
Chatted with the amazing @ElissaWelle from @verge about @rohit_saxena's "Lost in Time" work ( https://t.co/EprLRNKjdE) and much more! You can find the full article here:
arxiv.org
Understanding time from visual representations is a fundamental cognitive skill, yet it remains a challenge for multimodal large language models (MLLMs). In this work, we investigate the...
Replies 0 · Reposts 1 · Likes 8
Almost off to @EurIPSConf in Copenhagen 🇩🇰 🇪🇺! I'll present 3 posters: 🧠 Neurosymbolic Diffusion Models: Thursday's main track poster session. Going to NeurIPS instead? @PontiEdoardo and @tetraduzione will present the paper in San Diego on Thursday 13:00
Replies 1 · Reposts 6 · Likes 20
Giving Gemini 3 a spin -- nearly one minute to read 7:19 on a clock set to 3:37, feels like humanity may be safe for now (more on this class of problems at https://t.co/EprLRNJLo6)
Replies 11 · Reposts 6 · Likes 61
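Clock reading makes a convenient probe because the ground truth is pure geometry: a time maps deterministically to hand angles, so test images are easy to generate and answers easy to score. A quick sketch of that mapping:

def clock_hand_angles(hour, minute):
    """Hand angles in degrees, measured clockwise from 12 o'clock."""
    minute_angle = minute * 6.0                     # 360 degrees / 60 minutes
    hour_angle = (hour % 12) * 30.0 + minute * 0.5  # 360 / 12, plus minute drift
    return hour_angle, minute_angle

print(clock_hand_angles(3, 37))  # (108.5, 222.0)
print(clock_hand_angles(7, 19))  # (219.5, 114.0)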
Very interesting results from Cyrus (@cyruskwan1997) -- training on generated math reasoning problems within an open-ended self-play framework can yield more accurate results than training on "gold" datasets like GSM8K or MATH!
OpenSIR introduces an LLM self-play framework for open-ended reasoning. It empowers models to generate & solve novel problems without external supervision, achieving significant improvements on benchmarks like GSM8K and College Math through adaptive difficulty & diverse
Replies 0 · Reposts 4 · Likes 11
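A very rough sketch of the proposer/solver self-play pattern with an adaptive-difficulty reward; the stubs and the reward shape are illustrative assumptions, not the OpenSIR implementation:

import random

def propose():
    """Toy proposer; a real system would sample problems from an LLM."""
    a, b = random.randint(1, 9), random.randint(1, 9)
    return f"What is {a} + {b}?", a + b

def solve(problem):
    """Toy, deliberately imperfect solver standing in for an LLM."""
    nums = [int(t.rstrip("?")) for t in problem.split() if t.rstrip("?").isdigit()]
    return sum(nums) if random.random() < 0.7 else sum(nums) + 1

def verify(answer, reference):
    return answer == reference

def self_play_round(target_rate=0.5, n=16):
    """One round: the proposer is rewarded for problems that are neither
    trivial nor unsolvable for the current solver (adaptive difficulty)."""
    problem, reference = propose()
    solve_rate = sum(verify(solve(problem), reference) for _ in range(n)) / n
    proposer_reward = 1.0 - abs(solve_rate - target_rate)
    solver_reward = solve_rate
    return proposer_reward, solver_reward

print(self_play_round())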
@iatitov and I are happy to announce a fully-funded 3.5-year PhD studentship at @EdinburghNLP, for September 2026 start, on language-based state representations for time series modelling, with applications in health data monitoring and beyond. {1/5}
Replies 1 · Reposts 15 · Likes 32
Had a great time presenting our work on Faithful and Verifiable #LLM reasoning at @emnlpmeeting and catching up with all the amazing researchers. Be sure to check out: https://t.co/blGtn8T4ta Work done with the amazing @PMinervini @PSH_Lewis
@pat_verga @IAugenstein
Our method for achieving more faithful, verifiable and robust #LLM reasoning (FLARE) has been accepted at #EMNLP2025 @emnlpmeeting! Be sure to check out: https://t.co/cSHn97iLVJ Work done with the amazing @PMinervini @PSH_Lewis
@pat_verga @IAugenstein
Replies 1 · Reposts 5 · Likes 15
Excited to share my first work as a PhD student at @EdinburghNLP that I will be presenting at EMNLP! RQ1: Can we achieve scalable oversight across modalities via debate? Yes! We show that debate between VLMs leads to better-quality answers on reasoning tasks.
Replies 1 · Reposts 7 · Likes 13
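For readers new to debate as a scalable-oversight protocol: two models argue for rival answers over several rounds and a judge picks the more convincing side. A schematic sketch with invented stand-ins for the (V)LM calls:

def debate(question, answer_a, answer_b, debater, judge, rounds=3):
    """Schematic debate: each side defends its answer for a few rounds,
    then a judge reads the transcript and picks a winner."""
    transcript = [f"Question: {question}",
                  f"A claims: {answer_a}",
                  f"B claims: {answer_b}"]
    for r in range(rounds):
        transcript.append(f"A (round {r + 1}): " + debater(answer_a, transcript))
        transcript.append(f"B (round {r + 1}): " + debater(answer_b, transcript))
    return judge(transcript)

# Toy stand-ins so the sketch runs; real debaters and judge would be (V)LMs.
debater = lambda claim, transcript: f"Here is evidence supporting {claim!r}."
judge = lambda transcript: "A" if "blue" in transcript[1] else "B"
print(debate("What colour is the sky?", "blue", "green", debater, judge))  # A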
Featuring @yuzhaouoe's ACL'24 work on identifying the optimal attention masks for pre-training! https://t.co/xttgtOW056
We've cooked another one of these 200+ page practical books on model training that we love to write. This time it's on all pretraining and post-training recipes and how to do a training project hyperparameter exploration. Closing the trilogy of: 1. Building a pretraining
Replies 0 · Reposts 3 · Likes 11
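As background on what an attention-mask choice looks like in packed pre-training, here is a sketch of intra-document causal masking, one common variant such work compares; it is illustrative, not code from the paper:

import numpy as np

def intra_document_causal_mask(doc_ids):
    """True where query position i may attend to key position j:
    causal, and restricted to the same packed document."""
    ids = np.asarray(doc_ids)
    n = len(ids)
    causal = np.tril(np.ones((n, n), dtype=bool))
    same_doc = ids[:, None] == ids[None, :]
    return causal & same_doc

# Two documents packed into one length-6 sequence:
print(intra_document_causal_mask([0, 0, 0, 1, 1, 1]).astype(int))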
Yu (@yuzhaouoe) went for a 3-month internship at MSR Cambridge after working on completely different topics (LLM pre-training, steering, KV cache compression, knowledge augmentation..), and casually improved the state-of-the-art in GUI-using agents
Check out our "Learning GUI Grounding with Spatial Reasoning from Visual Feedback"! We reframe GUI grounding as an interactive search task by learning to move a virtual cursor via RL and using visual feedback! Massive improvements on ScreenSpot-v2 (+5.7%) and ScreenSpot-Pro (+110.8%)!
Replies 1 · Reposts 1 · Likes 10
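A schematic of the interactive-search framing in the tweet: draw the cursor into the screenshot, let a policy predict a move from the image, repeat. The environment and policy below are toy stand-ins, not the paper's setup:

def ground_with_cursor(policy, render, start=(0, 0), max_steps=10):
    """Iteratively move a virtual cursor using visual feedback until the
    policy signals that the cursor sits on the target element."""
    x, y = start
    for _ in range(max_steps):
        screenshot = render(x, y)          # screen with the cursor drawn in
        dx, dy, done = policy(screenshot)  # move predicted from the image
        x, y = x + dx, y + dy
        if done:
            break
    return x, y

# Toy stand-ins: the "screenshot" is just the cursor position, and the
# policy walks toward a fixed target in bounded steps.
TARGET = (120, 80)
render = lambda x, y: (x, y)
def policy(obs):
    x, y = obs
    dx = max(-40, min(40, TARGET[0] - x))
    dy = max(-40, min(40, TARGET[1] - y))
    return dx, dy, (dx, dy) == (0, 0)
print(ground_with_cursor(policy, render))  # (120, 80)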
Check out our new EMNLP paper! Multilingual fairness is tough: bias behaves differently across languages, and most methods don't transfer. We make progress with IMSAE, which removes shared bias subspaces across languages, even without target-language data!
Multilingual fairness is deceptively hard. Bias behaves differently across languages: grammatical gender in Spanish, social bias in English, morphological cues in Russian. You can't just "transfer" debiasing and expect it to work. That's the problem we tackle in our EMNLP paper.
Replies 0 · Reposts 1 · Likes 11
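To make "removing shared bias subspaces" concrete, here is a generic sketch in the spirit of subspace-removal debiasing (not the IMSAE implementation): estimate a bias direction per language, take their shared principal direction(s) via SVD, and project embeddings onto the orthogonal complement.

import numpy as np

def shared_bias_subspace(bias_dirs, k=1):
    """Top-k principal directions shared by per-language bias directions."""
    D = np.stack([d / np.linalg.norm(d) for d in bias_dirs])
    _, _, vt = np.linalg.svd(D, full_matrices=False)
    return vt[:k]                          # shape (k, dim), orthonormal rows

def remove_subspace(X, basis):
    """Project rows of X onto the orthogonal complement of the subspace."""
    return X - X @ basis.T @ basis

rng = np.random.default_rng(0)
dim = 8
dirs = [rng.normal(size=dim) for _ in range(3)]  # e.g. he-she difference vectors
basis = shared_bias_subspace(dirs, k=1)
X = rng.normal(size=(5, dim))                    # toy embeddings
X_clean = remove_subspace(X, basis)
print(np.abs(X_clean @ basis.T).max())           # ~0: shared component removed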