EdinburghNLP

@EdinburghNLP

Followers: 13K
Following: 847
Media: 58
Statuses: 1K

The Natural Language Processing Group at the University of Edinburgh.

Edinburgh, Scotland
Joined May 2017
@EdinburghNLP
EdinburghNLP
9 months
Join our PhD programme in Designing Responsible Natural Language Processing at the UKRI AI Centre for Doctoral Training, University of Edinburgh. Applications are now re-opened for Home fee status candidates (past candidates need not re-apply). https://t.co/PkdXiVLEGr
0
4
9
@irisaparina
Irina Saparina
5 days
Reasoning models are powerful, but they burn thousands of tokens on potentially wrong interpretations for ambiguous requests! 👉 We teach models to think about intent first and provide all interpretations and answers in a single response via RL with dual reward. 🧵1/6
1
12
32
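A minimal sketch of what such a dual reward could look like, assuming one term for covering the distinct interpretations of an ambiguous request and one for answering them correctly; the names, weighting, and data format are illustrative assumptions, not the paper's implementation:

```python
# Hypothetical dual reward for RL on ambiguous requests: reward both
# surfacing every plausible interpretation and answering each one correctly.

def dual_reward(predicted, reference, alpha=0.5):
    """predicted / reference: dicts mapping interpretation -> answer."""
    covered = set(predicted) & set(reference)
    coverage = len(covered) / len(reference)          # did we surface all readings?
    accuracy = sum(predicted[k] == reference[k] for k in covered) / len(reference)
    return alpha * coverage + (1 - alpha) * accuracy

# Example: a request with two valid readings; the model covers and answers one.
reference = {"reading_a": "42", "reading_b": "7"}
predicted = {"reading_a": "42"}
print(dual_reward(predicted, reference))  # 0.5 * 0.5 + 0.5 * 0.5 = 0.5
```

Weighting coverage against per-interpretation correctness is one simple way to discourage the model from committing to a single, possibly wrong reading.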
@PontiEdoardo
Edoardo Ponti
7 days
Finally, you can count the r's in strawberry and check if 3.11 is higher than 3.9 without tokenisation interfering: Here's Bolmo, a fully open byte-level LLM with latent tokenisation, derived from a SOTA LLM (Olmo 3). Promising on coding and char-level understanding!
@allen_ai
Ai2
7 days
Introducing Bolmo, a new family of byte-level language models built by "byteifying" our open Olmo 3—and to our knowledge, the first fully open byte-level LM to match or surpass SOTA subword models across a wide range of tasks. 🧵
2
7
39
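A toy illustration of the tokenisation point, using a hypothetical subword segmentation for contrast; with byte-level inputs, character-level questions become direct operations over the model's own input sequence (illustrative only, not Bolmo code):

```python
# Why byte-level inputs make character questions easy: the model's
# input symbols are the bytes themselves.

text = "strawberry"
byte_ids = list(text.encode("utf-8"))
print(byte_ids)                                  # [115, 116, 114, 97, 119, 98, 101, 114, 114, 121]
print(sum(b == ord("r") for b in byte_ids))      # 3 -- each 'r' is its own input symbol

# A subword tokenizer might instead see opaque chunks (hypothetical split),
# so no single input symbol corresponds to the letter 'r'.
subword_split = ["straw", "berry"]
print(subword_split)

# Digit-level comparisons are similarly transparent once digits are visible:
print(float("3.11") > float("3.9"))              # False -- 3.11 is not higher than 3.9
```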
@allen_ai
Ai2
7 days
Introducing Bolmo, a new family of byte-level language models built by "byteifying" our open Olmo 3—and to our knowledge, the first fully open byte-level LM to match or surpass SOTA subword models across a wide range of tasks. 🧵
22
104
675
@_igorshilov
Igor Shilov
14 days
New Anthropic research! We study how to train models so that high-risk capabilities live in a small, separate set of parameters, allowing clean capability removal when needed – for example in CBRN or cybersecurity domains.
33
115
1K
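A rough sketch of the parameter-isolation idea with a toy PyTorch module: if the high-risk capability only flows through a small, designated set of weights, removal can amount to zeroing that subset. This illustrates the concept under assumed simplifications, not Anthropic's actual training method:

```python
import torch
import torch.nn as nn

class IsolatedCapabilityModel(nn.Module):
    """Toy model with a small 'risky' branch kept separate from the core weights."""
    def __init__(self, d=64):
        super().__init__()
        self.core = nn.Linear(d, d)     # general-purpose parameters
        self.risky = nn.Linear(d, d)    # designated parameters for the high-risk capability

    def forward(self, x):
        # Training would be arranged so the capability relies only on the risky branch.
        return self.core(x) + self.risky(x)

model = IsolatedCapabilityModel()

# "Clean removal": zero the isolated parameters at deployment time,
# leaving the rest of the model untouched and avoiding retraining.
with torch.no_grad():
    for p in model.risky.parameters():
        p.zero_()
```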
@PMinervini
Pasquale Minervini 🇪🇺 🇬🇧 🏴󠁧󠁢󠁳󠁣󠁴󠁿
17 days
🚀🚀🚀🚀🚀
@PMinervini
Pasquale Minervini 🇪🇺 🇬🇧 🏴󠁧󠁢󠁳󠁣󠁴󠁿
17 days
This was presented today by the neurosymbolic wizard @EmilevanKrieken at @EurIPSConf, and by @tetraduzione and @PontiEdoardo at @NeurIPSConf! We officially achieved quantum superposition 🚀🚀🚀🚀🚀
0
3
22
@PMinervini
Pasquale Minervini 🇪🇺 🇬🇧 🏴󠁧󠁢󠁳󠁣󠁴󠁿
17 days
This was presented today by the neurosymbolic wizard @EmilevanKrieken at @EurIPSConf, and by @tetraduzione and @PontiEdoardo at @NeurIPSConf! We officially achieved quantum superposition 🚀🚀🚀🚀🚀
1
13
52
@PMinervini
Pasquale Minervini 🇪🇺 🇬🇧 🏴󠁧󠁢󠁳󠁣󠁴󠁿
23 days
Chatted with the amazing @ElissaWelle from @verge about @rohit_saxena's "Lost in Time" work (https://t.co/EprLRNKjdE) and much more! You can find the full article here 👇
arxiv.org
Understanding time from visual representations is a fundamental cognitive skill, yet it remains a challenge for multimodal large language models (MLLMs). In this work, we investigate the...
@verge
The Verge
25 days
Why can't ChatGPT tell time?
0
1
8
@EmilevanKrieken
Emile van Krieken
24 days
Almost off to @EurIPSConf in Copenhagen 🇩🇰 🇪🇺! I'll present 3 posters: 🧠 Neurosymbolic Diffusion Models: Thursday's main track poster session. Going to NeurIPS instead? @PontiEdoardo and @tetraduzione will present the paper in San Diego on Thursday 13:00
1
6
20
@PMinervini
Pasquale Minervini 🇪🇺 🇬🇧 🏴󠁧󠁢󠁳󠁣󠁴󠁿
1 month
Giving Gemini 3 a spin -- nearly one minute to read 7:19 on a clock set to 3:37; feels like humanity may be safe for now (more on this class of problems at https://t.co/EprLRNJLo6)
11
6
61
@PMinervini
Pasquale Minervini 🇪🇺 🇬🇧 🏴󠁧󠁢󠁳󠁣󠁴󠁿
1 month
Very interesting results from Cyrus (@cyruskwan1997) -- training on generated math reasoning problems within an open-ended self-play framework can yield more accurate results than training on "gold" datasets like GSM8K or MATH!
@HuggingPapers
DailyPapers
1 month
OpenSIR introduces an LLM self-play framework for open-ended reasoning. It empowers models to generate & solve novel problems without external supervision, achieving significant improvements on benchmarks like GSM8K and College Math through adaptive difficulty & diverse
0
4
11
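A hedged sketch of an open-ended self-play loop in this spirit, with stub classes standing in for the LLM proposer and solver and an assumed reward that targets intermediate difficulty; the actual OpenSIR recipe will differ:

```python
import random

class StubProposer:
    """Stand-in for an LLM that generates a novel problem plus a reference answer."""
    def generate(self):
        a, b = random.randint(2, 9), random.randint(2, 9)
        return f"What is {a} * {b}?", a * b

class StubSolver:
    """Stand-in for an LLM solver that sometimes gets the problem wrong."""
    def attempt(self, problem):
        a, b = [int(t.strip("?*")) for t in problem.split() if t.strip("?*").isdigit()]
        return a * b if random.random() < 0.6 else a + b

def self_play_round(proposer, solver, n_attempts=8):
    problem, reference = proposer.generate()            # no external supervision involved
    solve_rate = sum(solver.attempt(problem) == reference
                     for _ in range(n_attempts)) / n_attempts
    solver_reward = solve_rate                           # solver: answer correctly
    proposer_reward = 1.0 - 2 * abs(solve_rate - 0.5)    # proposer: neither trivial nor impossible
    return problem, solver_reward, proposer_reward

print(self_play_round(StubProposer(), StubSolver()))
```

Rewarding the proposer for problems the solver gets right about half the time is one simple way to keep the difficulty adaptive as the solver improves.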
@FelineAutomaton
malkin1729
1 month
@iatitov and I are happy to announce a fully-funded 3.5-year PhD studentship at @EdinburghNLP, for September 2026 start, on language-based state representations for time series modelling, with applications in health data monitoring and beyond. {1/5}
1
15
32
@HuggingPapers
DailyPapers
1 month
OpenSIR introduces an LLM self-play framework for open-ended reasoning. It empowers models to generate & solve novel problems without external supervision, achieving significant improvements on benchmarks like GSM8K and College Math through adaptive difficulty & diverse
4
19
78
@_kire_kara_
Erik Arakelyan
1 month
Had a great time presenting our work on Faithful and Verifiable #LLM reasoning at @emnlpmeeting and catching up with all the amazing researchers. Be sure to check out: https://t.co/blGtn8T4ta Work done with the amazing @PMinervini @PSH_Lewis @pat_verga @IAugenstein
@_kire_kara_
Erik Arakelyan
4 months
Our method for achieving more faithful, verifiable and robust #LLM reasoning (FLARE 💫) has been accepted at #EMNLP2025 @emnlpmeeting! Be sure to check out: https://t.co/cSHn97iLVJ Work done with the amazing @PMinervini @PSH_Lewis @pat_verga @IAugenstein
1
5
15
@EdinburghNLP
EdinburghNLP
1 month
And/or let's grab a bite! 🍲🥘🍒
0
1
9
@EdinburghNLP
EdinburghNLP
1 month
The squad is at EMNLP (@emnlpmeeting), come chat with us! 🚀🚀🚀🚀🚀
2
8
71
@aadhikariii
Ashutosh Adhikari
2 months
Excited to share my first work as a PhD student at @EdinburghNLP that I will be presenting at EMNLP! RQ1: Can we achieve scalable oversight across modalities via debate? Yes! We show that debating VLMs leads to better answer quality on reasoning tasks.
1
7
13
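A minimal sketch of a debate protocol of this kind, with stub functions standing in for the debating VLMs and the judge; the multimodal details and the real judging setup are assumptions here, not the paper's code:

```python
def run_debate(question, answers, debaters, judge, rounds=2):
    """Two debaters defend different answers; the judge decides from the transcript."""
    transcript = []
    for _ in range(rounds):
        for name, debater, answer in zip(("A", "B"), debaters, answers):
            transcript.append((name, debater(question, answer, transcript)))
    return judge(question, answers, transcript)

# Trivial stubs so the protocol runs end to end; real debaters and judge
# would be (V)LM calls, with the judge weaker than the debaters.
debater = lambda q, ans, t: f"I claim the answer is {ans}."
judge = lambda q, answers, t: answers[0]
print(run_debate("How many birds are in the image?", ("3", "4"), (debater, debater), judge))
```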
@PMinervini
Pasquale Minervini 🇪🇺 🇬🇧 🏴󠁧󠁢󠁳󠁣󠁴󠁿
2 months
Featuring @yuzhaouoe's ACL'24 work on identifying the optimal attention masks for pre-training! https://t.co/xttgtOW056
@Thom_Wolf
Thomas Wolf
2 months
We've cooked another one of these 200+ page practical books on model training that we love to write. This time it's on all pretraining and post-training recipes and how to do hyperparameter exploration for a training project. Closing the trilogy of: 1. Building a pretraining
0
3
11
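For context, a toy illustration of the kind of attention masks compared when documents are packed into one pre-training sequence: full causal attention versus causal attention restricted to tokens from the same document. This is an assumption about the setting for illustration, not the paper's code:

```python
import numpy as np

def causal_mask(n):
    """Each position attends to itself and earlier positions."""
    return np.tril(np.ones((n, n), dtype=bool))

def intra_document_mask(doc_ids):
    """Causal attention, additionally restricted to the same packed document."""
    doc = np.asarray(doc_ids)
    same_doc = doc[:, None] == doc[None, :]
    return causal_mask(len(doc)) & same_doc

# Two documents of lengths 3 and 2 packed into a single 5-token sequence.
packed_doc_ids = [0, 0, 0, 1, 1]
print(intra_document_mask(packed_doc_ids).astype(int))
```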
@PMinervini
Pasquale Minervini 🇪🇺 🇬🇧 🏴󠁧󠁢󠁳󠁣󠁴󠁿
2 months
Yu (@yuzhaouoe) went for a 3-month internship at MSR Cambridge after working on completely different topics (LLM pre-training, steering, KV cache compression, knowledge augmentation..), and casually improved the state-of-the-art in GUI-using agents 🚀🚀🚀
@yuzhaouoe
Yu Zhao
2 months
Check out our "Learning GUI Grounding with Spatial Reasoning from Visual Feedback"! We reframe GUI grounding as an interactive search task by learning to move a virtual cursor via RL and using visual feedback! Massive improvements on ScreenSpot-v2 (+5.7%) and -Pro (+110.8%)!
1
1
10
@yuzhaouoe
Yu Zhao
2 months
Check out our "Learning GUI Grounding with Spatial Reasoning from Visual Feedback"! We reframe GUI grounding as an interactive search task by learning to move a virtual cursor via RL and using visual feedback! Massive improvements on ScreenSpot-v2 (+5.7%) and -Pro (+110.8%)!
2
13
17
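A hedged sketch of the interactive-cursor framing, with a stub policy and screenshot object standing in for the model and the rendering pipeline; all interfaces here are hypothetical, not the released code. At each step the model sees the screenshot with the current cursor drawn on it and emits a move or a final click:

```python
from dataclasses import dataclass

@dataclass
class Screenshot:
    width: int
    height: int

def draw_cursor(shot, cursor):
    # Stand-in for rendering the cursor onto the screenshot (the visual feedback).
    return (shot, cursor)

class StubPolicy:
    """Toy policy that walks the cursor toward a fixed target, then clicks."""
    def __init__(self, target):
        self.target = target
    def act(self, view, instruction):
        _, cursor = view
        dx = max(-50, min(50, self.target[0] - cursor[0]))
        dy = max(-50, min(50, self.target[1] - cursor[1]))
        return {"click": True} if (dx, dy) == (0, 0) else {"move": (dx, dy)}

def ground_element(policy, screenshot, instruction, max_steps=30):
    cursor = (screenshot.width // 2, screenshot.height // 2)   # start from the centre
    for _ in range(max_steps):
        view = draw_cursor(screenshot, cursor)                  # feedback the model conditions on
        action = policy.act(view, instruction)
        if action.get("click"):
            return cursor                                       # final grounding prediction
        dx, dy = action["move"]
        cursor = (cursor[0] + dx, cursor[1] + dy)
    return cursor

print(ground_element(StubPolicy(target=(700, 420)), Screenshot(1280, 800), "open settings"))
```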
@YftahZ
Yftah Ziser
2 months
Check out our new EMNLP paper! Multilingual fairness is tough: bias behaves differently across languages, and most methods don't transfer. We make progress with IMSAE, which removes shared bias subspaces across languages, even without target-language data!
@YftahZ
Yftah Ziser
2 months
Multilingual fairness is deceptively hard. Bias behaves differently across languages: grammatical gender in Spanish, social bias in English, morphological cues in Russian. You can't just "transfer" debiasing and expect it to work. That's the problem we tackle in our EMNLP paper.
0
1
11
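A minimal sketch of the general recipe this line of work builds on, removing an estimated bias subspace by projection; IMSAE's estimation of a shared multilingual subspace is more involved than this toy NumPy version:

```python
import numpy as np

def bias_subspace(pair_diffs, k=2):
    """Top-k principal directions of difference vectors between paired words
    (e.g. gendered word pairs pooled across languages)."""
    diffs = np.asarray(pair_diffs, dtype=float)
    diffs -= diffs.mean(axis=0)
    _, _, vt = np.linalg.svd(diffs, full_matrices=False)
    return vt[:k]                                    # (k, d) orthonormal basis

def remove_subspace(embeddings, basis):
    """Project embeddings onto the orthogonal complement of the bias subspace."""
    return embeddings - embeddings @ basis.T @ basis

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(100, 16))
pair_diffs = rng.normal(size=(20, 16))
basis = bias_subspace(pair_diffs)
debiased = remove_subspace(embeddings, basis)
print(np.allclose(debiased @ basis.T, 0))            # True: no component left along the bias basis
```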