Neel Rajani
@NeelRajani_
Followers: 207 · Following: 10K · Media: 23 · Statuses: 275
PhD student in responsible NLP @InfAtEd. Passionate about LLM interpretability and alignment
Edinburgh, Scotland
Joined May 2013
🚨New paper alert!🚨 "Scalpel vs. Hammer: GRPO Amplifies Existing Capabilities, SFT Replaces Them" at @ActInterp, ICML'25. @deepseek_ai popularised RLVR and distillation for 'reasoning training'! But how do the two differ under the hood? Details in 🧵: (1/8)
Check out our “Learning GUI Grounding with Spatial Reasoning from Visual Feedback”! We reframe GUI grounding as an interactive search task by learning to move a virtual cursor via RL and using visual feedback! Massive improvements on ScreenSpot-v2 (+5.7%) and ScreenSpot-Pro (+110.8%)!
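A minimal sketch of what this interactive cursor search could look like, purely as illustration: the model repeatedly sees the screenshot with the current cursor overlaid and either moves the cursor or clicks. The `model.predict` and `draw_cursor` interfaces and the action format are hypothetical stand-ins, not the paper's actual API.

```python
# Sketch only: GUI grounding as interactive cursor search with visual feedback.
def locate_element(model, draw_cursor, screenshot, instruction,
                   width, height, max_steps=10):
    """Move a virtual cursor step by step until the model decides to click."""
    x, y = width // 2, height // 2                  # start at the screen centre
    for _ in range(max_steps):
        view = draw_cursor(screenshot, x, y)        # visual feedback: cursor overlay
        action = model.predict(view, instruction)   # {"move": (dx, dy)} or {"click": True}
        if action.get("click"):
            return x, y                             # predicted grounding coordinates
        dx, dy = action["move"]
        x = min(max(x + dx, 0), width - 1)          # clamp the cursor to the screen
        y = min(max(y + dy, 0), height - 1)
    return x, y                                     # fall back to the last position
```

In the RL framing the tweet describes, the reward would presumably come from whether the final click lands on the target element.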
This 2024 Google paper on improving Adam via natural gradients/FIM mentions in its acknowledgements that it took inspiration from Math YouTube videos. Finally something relatable! https://t.co/wmir7Vxu1l
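For context, the natural-gradient view preconditions the gradient with the inverse Fisher information matrix, and diagonal approximations of the Fisher are closely related to Adam's second-moment term. A toy sketch of a diagonal, Fisher-style preconditioned step, offered as my own illustration rather than the paper's algorithm:

```python
# Toy illustration of diagonal Fisher-style preconditioning (not the paper's
# method): keep a running estimate of E[g^2] per parameter and scale each
# gradient coordinate by it, much like Adam's second-moment term.
import numpy as np

def preconditioned_step(theta, grad, fisher_diag, lr=1e-3, beta=0.999, eps=1e-8):
    fisher_diag = beta * fisher_diag + (1 - beta) * grad ** 2  # running diagonal estimate
    theta = theta - lr * grad / (np.sqrt(fisher_diag) + eps)   # precondition the update
    return theta, fisher_diag

theta, fisher = np.zeros(3), np.zeros(3)
for g in ([1.0, 0.1, 0.01], [0.9, 0.2, 0.0]):                  # two fake gradient steps
    theta, fisher = preconditioned_step(theta, np.array(g), fisher)
```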
How shameful. All the engineers paid 6+ figures to create such a product should seriously question why, instead of using their highly specialised skills to make the world a better place, they are literally building the infinite slop machine (!)
🚨 Before Sam puts personalized ads in your AI chats… Take our 5 min survey & discover what LLMs actually know about you! 🤖💡 Your responses will help build better AI privacy safeguards.
Extremely impressive paper by @ospanbatyr, still can't believe this works!
Multimodal models typically need millions of examples from each modality paired with text for training. With SEMI 🌓, we integrate new low-resource modalities into LLMs with as few as 32 samples — including satellite images, galaxies, sensors, and molecules. (1/6)
I've been awarded a Starting Grant from @ERC_Research! As part of AToM-FM ⚛️, I'll study efficient architectures for foundation models with end-to-end tokenisation and adaptive+permanent memory. Building a greener, more democratic AI.
📣 The ERC Starting Grant call results are out! Find out which early-career researchers will receive funding, what they will be investigating, where they will be based... plus lots of other #ERCStG facts & figures for 2025! ➡️ https://t.co/cGctMhcJos 🇪🇺 #HorizonEurope
We introduce PiCSAR (Probabilistic Confidence Selection And Ranking)💡: A simple training-free method for scoring samples based on probabilistic confidence, selecting a reasoning chain with the highest confidence from multiple sampled responses. ✏️PiCSAR is generalisable across
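A rough sketch of the best-of-N selection idea described above, using mean token log-probability as a stand-in confidence score; the paper's exact scoring function may differ, and `generate_with_logprobs` is a hypothetical helper:

```python
# Sketch in the spirit of PiCSAR: sample several reasoning chains, score each
# by a probabilistic confidence (here, mean token log-prob as a stand-in),
# and keep the highest-scoring chain. No training is involved.
def select_by_confidence(generate_with_logprobs, prompt, n_samples=8):
    best_chain, best_score = None, float("-inf")
    for _ in range(n_samples):
        chain, token_logprobs = generate_with_logprobs(prompt)      # one sampled response
        score = sum(token_logprobs) / max(len(token_logprobs), 1)   # mean log-prob confidence
        if score > best_score:
            best_chain, best_score = chain, score
    return best_chain, best_score
```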
Our method for achieving more faithful, verifiable and robust #LLM reasoning (FLARE 💫) has been accepted at #EMNLP2025 @emnlpmeeting! Be sure to check out: https://t.co/cSHn97iLVJ Work done with the amazing @PMinervini @PSH_Lewis @pat_verga @IAugenstein
arxiv.org
Modern Question Answering (QA) and Reasoning approaches based on Large Language Models (LLMs) commonly use prompting techniques, such as Chain-of-Thought (CoT), assuming the resulting generation...
👋Psst! Want more faithful, verifiable and robust #LLM reasoning than with CoT, but using external solvers is meh? Our FLARE 💫 uses Logic Programming with Exhaustive Simulated Search to achieve this. 🧵 With @PMinervini @PSH_Lewis @pat_verga @IAugenstein
https://t.co/cSHn97iLVJ
Just realised I got my first citation :) really excited about this as an academic milestone
Sort of eerie when models generate gibberish but other parts of the completion recognise that
For the @GoodfireAI hackathon, I built a tool to visualize which experts activate the most in gpt-oss! I found that certain experts tend to fire in interpretable contexts, like business, poems, and code.
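As a rough illustration of how such a tool might tally activations, one can count how often each expert appears in the router's top-k choices over a batch of tokens. The shapes and hook setup below are assumptions for the sketch, not the actual gpt-oss internals or the hackathon code:

```python
# Count expert usage in a mixture-of-experts layer from (assumed) router logits
# of shape (n_tokens, n_experts), e.g. captured with a forward hook.
from collections import Counter

import torch

def expert_activation_counts(router_logits: torch.Tensor, top_k: int = 4) -> Counter:
    topk = router_logits.topk(top_k, dim=-1).indices   # (n_tokens, top_k) chosen experts
    return Counter(topk.flatten().tolist())            # how often each expert fired

# Example with fake router logits: 100 tokens routed over 32 experts.
counts = expert_activation_counts(torch.randn(100, 32))
print(counts.most_common(5))                           # the five most-used experts
```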
Recipe is basically the same as that in recent work by @anna_soligo and @EdTurner42
New papers with @EdTurner42 @sen_r @NeelNanda5! We find an 'evil vector' which can both induce and ablate misaligned behaviour 👿. We also open source a bunch of datasets and models to accelerate research on emergent misalignment.
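For readers unfamiliar with the recipe, the generic version of this kind of experiment adds a direction to a layer's activations to induce a behaviour, or subtracts it to ablate it. A minimal PyTorch sketch; the layer path, scale, and vector below are hypothetical, and the papers' actual setup will differ:

```python
# Generic activation-steering sketch: shift one layer's output along a fixed
# direction via a forward hook. Positive scale induces the behaviour tied to
# the vector, negative scale ablates it.
import torch

def add_steering_hook(layer, vector: torch.Tensor, scale: float = 1.0):
    """Register a forward hook that shifts the layer's output along `vector`."""
    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        steered = hidden + scale * vector                              # steer every token position
        return (steered, *output[1:]) if isinstance(output, tuple) else steered
    return layer.register_forward_hook(hook)

# Usage sketch (hypothetical layer path and vector):
# handle = add_steering_hook(model.model.layers[12], evil_vector, scale=-1.0)
# ...generate text...
# handle.remove()
```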
Yesterday, OpenAI released an open LLM for the first time since 2019! They co-released a paper on malicious fine-tuning to see if adversaries could fine-tune gpt-oss for biorisk/cyber capabilities. Can we also induce emergent misalignment as in previous studies? Indeed, we can:
@saagnikkk Just realised I never cited your work, my apologies! I have submitted a revision to arXiv that includes a citation, and it's in the queue to be announced.
Super timely work by @aryopg on failure cases of reasoning models; make sure to check it out!
New Anthropic Research: “Inverse Scaling in Test-Time Compute” We found cases where longer reasoning leads to lower accuracy. Our findings suggest that naïve scaling of test-time compute may inadvertently reinforce problematic reasoning patterns. 🧵
This is a super nice read by @ZeroyuHuang!
🚀 Introducing Prefix-RFT to blend SFT and RFT! SFT can learn more complex problems by mimicking, but can have poor generalization. RFT has better overall performance but is limited by the initial policy. Our method, Prefix-RFT, makes the best of both worlds!
Transformers struggle with length generalization and long context. What can we do about it? Our new #TMLR paper with @rolandalong, @paul_smolensky and @JianfengGao0217 shows how to handle the issue using a new attention mechanism called TRA. Curious? Read the 🧵 for more 🤓
@aryopg @seraphinagt @iatitov @PMinervini @edwardbeeching (that is, on Saturday at the @ActInterp workshop). Finally, here's the arXiv link:
arxiv.org
Training large language models (LLMs) for reasoning via maths and code datasets has become a major new focus in LLM post-training. Two particularly popular approaches are reinforcement learning...