Zheng Zhao @EMNLP🇨🇳
@zhengzhao97
Followers: 472 · Following: 220 · Media: 8 · Statuses: 158
PhD Student @Edin_CDT_NLP @edinburghnlp | former intern @AIatMeta @amazon | working on LLMs
Joined February 2012
At #EMNLP2025 to present the last chapter of my PhD 🐼 Let's talk #HateSpeech detection, generalisation and NLP safety at my poster: 📆tomorrow 🕟4.30pm Look for the circus-themed poster 🎪🤸🏻♀️ Work with @tomsherborne @bjoernross and @mlapata at @EdinburghNLP + @cohere
We’re hiring! Looking for Interns, Research Assistants, and Postdocs to work on Automated Interpretability: building systems that can analyse, explain, and intervene on large models to make them safe! Work with me @Oxford, or remotely. Apply by Nov 15: https://t.co/KEqXwpxgyb
FAIR is hiring interns for 2026! If you're interested in a stint doing fundamental AI research with us @AIatMeta, students enrolled in a PhD program can apply below👇: https://t.co/PrG9L625bY
metacareers.com
Meta's mission is to build the future of human connection and the technology that makes it possible.
Happy to share this work! Turns out mechanistic interpretability tools are useful for debugging chain-of-thought reasoning errors. Awesome work led by @zhengzhao97!
Thrilled to share our latest research on verifying CoT reasoning, completed during my recent internship at FAIR @metaai. In this work, we introduce Circuit-based Reasoning Verification (CRV), a new white-box method to analyse and verify how LLMs reason, step-by-step.
As I have said earlier, @xianjun_agi is one of the most brilliant AI researchers I've had the pleasure of working with. Any team would be lucky to have him! For a glimpse into our work, see the thread below: https://t.co/ZDDYEQ6om4
As a new grad and early-career researcher, I’m truly overwhelmed and grateful for the incredible support from the community. Within 24 hours, I’ve received hundreds of kind messages and job opportunities— a reminder of how warm and vibrant the AI community is. I’ll take time to
I was lucky to work with Xianjun at FAIR; he is one of the most brilliant AI researchers I've known and will be a tremendous asset to his next team. On a related note, I am also on the job market for Research/Applied Scientist roles. Please feel free to reach out to me!
I was laid off by Meta today. As a Research Scientist, my work was just cited by the legendary @johnschulman2 and Nicholas Carlini yesterday. I’m actively looking for new opportunities — please reach out if you have any openings!
Very proud of this work! We are making nice progress towards LLM debugging using mechanistic interpretability tools. Check it out!
arxiv.org
Current Chain-of-Thought (CoT) verification methods predict reasoning correctness based on outputs (black-box) or activations (gray-box), but offer limited insight into why a computation fails. We...
Happy to see this come together! We applied interpretability tools to verify chain-of-thought reasoning steps. Fantastic work led by @zhengzhao97 — check it out!
[8/n] In sum, our work establishes CRV as a powerful proof-of-concept for moving beyond error detection to a causal understanding of LLM reasoning. I'm deeply grateful for the incredible mentorship from @yeskendir_k @xianjun_agi @NailaMurray @nicola_cancedda.
[7/n] Crucially, we show causality, not just correlation. By identifying a single, prematurely activated feature causing an error, we performed a targeted intervention to causally correct the model's reasoning path. This is a vital step toward truly debugging LLMs.
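The intervention described above can be sketched in miniature. This is a purely illustrative toy, not the paper's implementation: the feature names (`carry`, `premature_stop`), the binary feature values, and the `step_output` function are all hypothetical stand-ins for suppressing a prematurely activated transcoder feature and re-running the step.

```python
# Toy sketch of a targeted feature intervention (illustrative only;
# feature names and the step function are hypothetical assumptions).

def step_output(features):
    """Toy 'reasoning step': it succeeds iff the carry feature is
    active and the premature-stop feature is not."""
    return features["carry"] and not features["premature_stop"]

def ablate(features, name):
    """Return a copy of the feature dict with one feature zeroed out,
    leaving the original untouched."""
    patched = dict(features)
    patched[name] = False
    return patched

faulty = {"carry": True, "premature_stop": True}  # erroneous step
fixed = ablate(faulty, "premature_stop")          # the intervention
```

In this toy, ablating the one prematurely active feature flips the step from failing to succeeding, mirroring the causal correction the thread describes.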
[6/n] We also visualised the 'structural fingerprints' of error, projecting the high-dimensional features via PCA: incorrect steps form a dense cluster that is structurally similar to the correct steps yet occupies its own distinct region.
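A minimal sketch of that PCA projection, on synthetic data: the real feature vectors come from CRV's attribution graphs, whereas the two Gaussian clusters below (a broader "correct" cluster and a denser, shifted "incorrect" one) are stand-ins I'm assuming for illustration.

```python
import numpy as np

# Synthetic stand-ins for structural feature vectors (assumed, not real data):
# "incorrect" steps form a denser cluster offset from the "correct" ones.
rng = np.random.default_rng(0)
correct = rng.normal(loc=0.0, scale=0.3, size=(50, 8))
incorrect = rng.normal(loc=1.0, scale=0.15, size=(50, 8))

X = np.vstack([correct, incorrect])
X_centered = X - X.mean(axis=0)

# Project onto the top two principal components via SVD.
_, _, vt = np.linalg.svd(X_centered, full_matrices=False)
proj = X_centered @ vt[:2].T  # shape (100, 2)
```

With this setup the projected incorrect points spread less than the correct ones, reproducing the "dense cluster in its own region" picture from the thread.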
[5/n] One of our key findings is that error signatures are highly domain-specific. A classifier trained to spot errors in arithmetic fails on formal logic, suggesting that different reasoning tasks manifest unique computational failure patterns.
[4/n] We found that these structural signatures are highly predictive of errors. Our method, CRV, outperforms strong baselines across all tested datasets. This demonstrates the verifiable signal present in the computational trace.
[3/n] How does CRV work? Our pipeline involves: 1. Replacing MLP modules with interpretable sparse transcoders. 2. Constructing step-level attribution graphs. 3. Extracting a rich set of structural features. 4. Training a classifier to detect flawed reasoning steps.
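Steps 3 and 4 of the pipeline above can be sketched as follows. Everything here is an assumed simplification: the attribution-graph format (a dict of weighted edges), the feature names, and the thresholding "classifier" are illustrative placeholders, not CRV's actual structural features or trained model.

```python
# Hypothetical sketch of CRV's feature-extraction + classification stages.
# Graph format, feature set, and the classifier are assumptions for
# illustration, not the paper's implementation.

def structural_features(attribution_graph):
    """Extract simple structural features from a step-level attribution
    graph given as {(src_node, dst_node): edge_weight}."""
    weights = list(attribution_graph.values())
    nodes = {n for edge in attribution_graph for n in edge}
    return {
        "n_nodes": len(nodes),
        "n_edges": len(weights),
        "mean_weight": sum(weights) / len(weights),
        "max_weight": max(weights),
    }

def verify_step(attribution_graph, threshold=0.5):
    """Toy stand-in for the trained verifier: flag a reasoning step as
    flawed when its mean attribution weight falls below a threshold."""
    feats = structural_features(attribution_graph)
    return feats["mean_weight"] >= threshold
```

In the real pipeline a learned classifier replaces the threshold, but the shape is the same: per-step graph in, structural feature vector out, correct/flawed label from the features.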
[2/n] Our core hypothesis: correct and incorrect reasoning steps leave distinct "structural fingerprints" on the model's computational graph. We move beyond standard verification to analyse the structure of the computation itself.
Thrilled to share our latest research on verifying CoT reasoning, completed during my recent internship at FAIR @metaai. In this work, we introduce Circuit-based Reasoning Verification (CRV), a new white-box method to analyse and verify how LLMs reason, step-by-step.
Looking for interns to work on AI Research Agents:
docs.google.com
We're looking to hire research engineering / science contractors for a project on building AI Scientists - LLM agents for automating research tasks and machine learning engineering. Qualifications:...