Neel Rajani

@NeelRajani_

207 Followers · 10K Following · 23 Media · 275 Statuses

PhD student in responsible NLP @InfAtEd. Passionate about LLM interpretability and alignment

Edinburgh, Scotland
Joined May 2013
@NeelRajani_
Neel Rajani
4 months
🚨New paper alert!🚨 "Scalpel vs. Hammer: GRPO Amplifies Existing Capabilities, SFT Replaces Them" @ActInterp ICML'25 @deepseek_ai popularised RLVR and distillation for 'reasoning training'! But how do they differ under the hood? Details in 🧵: (1/8)
2
22
45
@yuzhaouoe
Yu Zhao
1 month
Check out our “Learning GUI Grounding with Spatial Reasoning from Visual Feedback”! We reframe GUI grounding as an interactive search task by learning to move a virtual cursor via RL and using visual feedback! Massive improvements on ScreenSpot-v2 (+5.7%) and ScreenSpot-Pro (+110.8%)!
2
13
16
@NeelRajani_
Neel Rajani
2 months
This 2024 Google paper on improving Adam via natural gradients/FIM mentions in its acknowledgements that it took inspiration from Math YouTube videos. Finally something relatable! https://t.co/wmir7Vxu1l
0
0
4
@NeelRajani_
Neel Rajani
2 months
How shameful. All the engineers paid 6+ figures to create such a product should seriously question why, instead of using their highly specialised skills to make the world a better place, they are literally building the infinite slop machine (!)
@alexandr_wang
Alexandr Wang
2 months
Excited to share Vibes — a new feed in the Meta AI app for short-form, AI-generated videos.
0
0
5
@Guillemram
Guillem Ramírez
2 months
🚨 Before Sam puts personalized ads in your AI chats… Take our 5 min survey & discover what LLMs actually know about you! 🤖💡 Your responses will help build better AI privacy safeguards.
1
1
4
@NeelRajani_
Neel Rajani
3 months
Extremely impressive paper by @ospanbatyr, still can't believe this works!
@ospanbatyr
Osman Batur İnce
3 months
Multimodal models typically need millions of examples from each modality paired with text for training. With SEMI 🌓, we integrate new low-resource modalities into LLMs with as few as 32 samples — including satellite images, galaxies, sensors, and molecules. (1/6)
0
0
8
@PontiEdoardo
Edoardo Ponti
3 months
I've been awarded a Starting Grant from @ERC_Research! As part of AToM-FM ⚛️, I'll study efficient architectures for foundation models with end-to-end tokenisation and adaptive+permanent memory. Building a greener, more democratic AI
@ERC_Research
European Research Council (ERC)
3 months
📣 The ERC Starting Grant call results are out! Find out which early-career researchers will receive funding, what they will be investigating, where they will be based... plus lots of other #ERCStG facts & figures for 2025! ➡️ https://t.co/cGctMhcJos 🇪🇺 #HorizonEurope
14
18
142
@joshuaongg21
Joshua Ong @ EMNLP2025
3 months
We introduce PiCSAR (Probabilistic Confidence Selection And Ranking)💡: A simple training-free method for scoring samples based on probabilistic confidence, selecting a reasoning chain with the highest confidence from multiple sampled responses. ✏️PiCSAR is generalisable across
2
31
94
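The selection step PiCSAR describes — score each sampled response by its probabilistic confidence, keep the highest-scoring chain — can be sketched in a few lines. This is a hypothetical minimal version: the function names and the mean-token-log-probability score are assumptions for illustration, not the paper's exact formulation.

```python
import math

def sequence_confidence(token_logprobs):
    # Mean token log-probability as a simple probabilistic-confidence score
    return sum(token_logprobs) / len(token_logprobs)

def picsar_select(candidates):
    # candidates: list of (response_text, per-token log-probs);
    # return the response whose chain the model was most confident in
    return max(candidates, key=lambda c: sequence_confidence(c[1]))[0]

samples = [
    ("answer A", [math.log(0.9), math.log(0.8), math.log(0.95)]),
    ("answer B", [math.log(0.5), math.log(0.4), math.log(0.6)]),
]
print(picsar_select(samples))  # prints "answer A"
```

Being training-free, the whole method reduces to scoring already-sampled generations, which is why it generalises so easily.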
@_kire_kara_
Erik Arakelyan
3 months
Our method for achieving more faithful, verifiable and robust #LLM reasoning (FLARE 💫) has been accepted at #EMNLP2025 @emnlpmeeting ! Be sure to check out: https://t.co/cSHn97iLVJ Work done with the amazing @PMinervini @PSH_Lewis @pat_verga @IAugenstein
arxiv.org
Modern Question Answering (QA) and Reasoning approaches based on Large Language Models (LLMs) commonly use prompting techniques, such as Chain-of-Thought (CoT), assuming the resulting generation...
@_kire_kara_
Erik Arakelyan
1 year
👋Psst! Want more faithful, verifiable and robust #LLM reasoning than with CoT, but using external solvers is meh? Our FLARE💫uses Logic Programming with Exhaustive Simulated Search to achieve this.🧵 With @PMinervini @PSH_Lewis @pat_verga @IAugenstein https://t.co/cSHn97iLVJ
0
8
29
@NeelRajani_
Neel Rajani
3 months
Just realised I got my first citation :) really excited about this as an academic milestone
0
0
19
@NeelRajani_
Neel Rajani
3 months
Sort of eerie when models generate gibberish but other parts of the completion recognise that
1
0
6
@ericho_goodfire
Eric Ho
4 months
for the @GoodfireAI hackathon, i built a tool to visualize which experts activate the most in gpt-oss! found that certain experts tend to fire in interpretable contexts, like in business, poems, and code
4
5
55
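The counting behind such a visualization is straightforward: record which top-k experts the router picks per token, then tally. A toy sketch with a random stand-in for the router — all names, shapes, and the top-k value here are assumptions, not gpt-oss internals:

```python
import numpy as np

rng = np.random.default_rng(0)
num_tokens, num_experts, top_k = 1000, 8, 2

# Hypothetical router logits for a batch of tokens
# (a real tool would hook these out of the MoE layers)
logits = rng.normal(size=(num_tokens, num_experts))

# Top-k routing: each token activates its k highest-scoring experts
topk = np.argsort(logits, axis=1)[:, -top_k:]

# Tally how often each expert fires across the batch
counts = np.bincount(topk.ravel(), minlength=num_experts)
print(counts.sum())  # prints 2000, i.e. num_tokens * top_k
```

Running the same tally separately over business text, poetry, and code is what surfaces the context-specialised experts the tweet mentions.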
@NeelRajani_
Neel Rajani
4 months
Recipe is basically the same as that in recent work by @anna_soligo and @EdTurner42
@anna_soligo
Anna Soligo
5 months
New papers with @EdTurner42 @sen_r @NeelNanda5! We find an 'evil vector' which can both induce and ablate misaligned behaviour 👿. We also open source a bunch of datasets and models to accelerate research on emergent misalignment.
1
0
3
@NeelRajani_
Neel Rajani
4 months
Yesterday, OpenAI released an open LLM for the first time since 2019! They co-released a paper on malicious fine-tuning to see if adversaries could fine-tune gpt-oss for biorisk/cyber capabilities. Can we also induce emergent misalignment as in previous studies? Indeed, we can:
1
2
13
@NeelRajani_
Neel Rajani
4 months
@saagnikkk Just realised I never cited your work, my apologies! Have submitted a revision with the citation to arXiv and it's in the queue to be announced.
0
0
1
@NeelRajani_
Neel Rajani
4 months
Super timely work by @aryopg on failure cases of reasoning models, make sure to check it out!
@aryopg
Aryo Pradipta Gema
4 months
New Anthropic Research: “Inverse Scaling in Test-Time Compute” We found cases where longer reasoning leads to lower accuracy. Our findings suggest that naïve scaling of test-time compute may inadvertently reinforce problematic reasoning patterns. 🧵
0
1
5
@NeelRajani_
Neel Rajani
4 months
Catch me at my poster @ActInterp in East Ballroom A if you'd like some free chocolate :)
0
0
24
@NeelRajani_
Neel Rajani
4 months
This is a super nice read by @ZeroyuHuang !
@ZeroyuHuang
Zeyu Huang
4 months
🚀 Introducing Prefix-RFT to blend SFT and RFT! SFT can learn more complex problems by mimicking, but can have poor generalization. RFT has better overall performance but is limited by the initial policy. Our method, Prefix-RFT, makes the best of both worlds!
0
0
7
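A toy illustration of the blending idea as stated in the tweet — supervise on a demonstration prefix (the SFT term), reinforce the model's own continuation (the RFT term). This is a hypothetical sketch under those assumptions, not the paper's actual objective:

```python
import numpy as np

def prefix_rft_loss(demo_logprobs, rollout_reward, rollout_logprob, prefix_frac=0.5):
    # SFT term: negative log-likelihood on a teacher-forced demonstration prefix
    n_prefix = int(len(demo_logprobs) * prefix_frac)
    sft_term = -np.mean(demo_logprobs[:n_prefix])
    # RFT term: REINFORCE-style surrogate on the model's sampled continuation
    rft_term = -rollout_reward * rollout_logprob
    return sft_term + rft_term

demo = np.log(np.full(4, 0.5))            # demo tokens assigned p=0.5 each
loss = prefix_rft_loss(demo, rollout_reward=1.0, rollout_logprob=-1.0)
print(round(float(loss), 4))              # prints 1.6931 (= ln 2 + 1.0)
```

Anchoring the rollout on a demonstration prefix is what lets the RL term escape the initial policy's limitations while the SFT term keeps it close to the demonstrations.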
@zvez11
Mattia Opper
4 months
Transformers struggle with length generalization and long context. What can we do about it? Our new #TMLR paper with @rolandalong, @paul_smolensky and @JianfengGao0217 shows how to handle the issue using a new attention mechanism called TRA. Curious? Read the 🧵 for more 🤓
1
8
12