Zheng Zhao @EMNLP🇨🇳

@zhengzhao97

Followers: 472 · Following: 220 · Media: 8 · Statuses: 158

PhD Student @Edin_CDT_NLP @edinburghnlp | former intern @AIatMeta @amazon | working on LLMs

Joined February 2012
@agostina_cal
Agostina Calabrese @EMNLP 🦋
2 days
At #EMNLP2025 to present the last chapter of my PhD 🐼 Let's talk #HateSpeech detection, generalisation and NLP safety at my poster: 📆tomorrow 🕟4.30pm Look for the circus-themed poster 🎪🤸🏻‍♀️ Work with @tomsherborne @bjoernross and @mlapata at @EdinburghNLP + @cohere
0
5
29
@FazlBarez
Fazl Barez @EMNLP 🇨🇳
2 days
We’re hiring! Looking for Interns, Research Assistants, and Postdocs to work on Automated Interpretability: building systems that can analyse, explain, and intervene on large models to make them safe! Work with me @Oxford, or remotely. Apply by Nov 15: https://t.co/KEqXwpxgyb
20
113
843
@adinamwilliams
Adina Williams
2 days
FAIR is hiring interns for 2026! If you're interested in a stint doing fundamental AI research with us @AIatMeta, students enrolled in a PhD program can apply below👇: https://t.co/PrG9L625bY
metacareers.com
Meta's mission is to build the future of human connection and the technology that makes it possible.
15
47
431
@NailaMurray
Naila Murray
10 days
Happy to share this work! Turns out mechanistic interpretability tools are useful for debugging chain-of-thought reasoning errors. Awesome work led by @zhengzhao97!
@zhengzhao97
Zheng Zhao @EMNLP🇨🇳
13 days
Thrilled to share our latest research on verifying CoT reasoning, completed during my recent internship at FAIR @metaai. In this work, we introduce Circuit-based Reasoning Verification (CRV), a new white-box method to analyse and verify how LLMs reason, step by step.
2
6
86
@zhengzhao97
Zheng Zhao @EMNLP🇨🇳
12 days
As I have said earlier, @xianjun_agi is one of the most brilliant AI researchers I've had the pleasure of working with. Any team would be lucky to have him! For a glimpse into our work, see the thread below: https://t.co/ZDDYEQ6om4
@xianjun_agi
Xianjun Yang
13 days
As a new grad and early-career researcher, I’m truly overwhelmed and grateful for the incredible support from the community. Within 24 hours, I’ve received hundreds of kind messages and job opportunities— a reminder of how warm and vibrant the AI community is. I’ll take time to
1
0
17
@zhengzhao97
Zheng Zhao @EMNLP🇨🇳
12 days
I was lucky to work with Xianjun at FAIR; he is one of the most brilliant AI researchers I've known, and he will be a tremendous asset to his next team. On a related note, I am also on the job market for Research/Applied Scientist roles. Please feel free to reach out to me!
@xianjun_agi
Xianjun Yang
14 days
I was laid off by Meta today. As a Research Scientist, my work was just cited by the legendary @johnschulman2 and Nicholas Carlini yesterday. I’m actively looking for new opportunities — please reach out if you have any openings!
5
6
67
@nicola_cancedda
Nicola Cancedda
13 days
Very proud of this work! We are making nice progress towards LLM debugging using mechanistic interpretability tools. Check it out!
@zhengzhao97
Zheng Zhao @EMNLP🇨🇳
13 days
Thrilled to share our latest research on verifying CoT reasoning, completed during my recent internship at FAIR @metaai. In this work, we introduce Circuit-based Reasoning Verification (CRV), a new white-box method to analyse and verify how LLMs reason, step by step.
3
12
96
@xianjun_agi
Xianjun Yang
13 days
As a new grad and early-career researcher, I’m truly overwhelmed and grateful for the incredible support from the community. Within 24 hours, I’ve received hundreds of kind messages and job opportunities— a reminder of how warm and vibrant the AI community is. I’ll take time to
arxiv.org
Current Chain-of-Thought (CoT) verification methods predict reasoning correctness based on outputs (black-box) or activations (gray-box), but offer limited insight into why a computation fails. We...
18
48
685
@yeskendir_k
Yeskendir 🇰🇿 @EMNLP 🇨🇳
13 days
Happy to see this come together! We applied interpretability tools to verify chain-of-thought reasoning steps. Fantastic work led by @zhengzhao97 — check it out!
@zhengzhao97
Zheng Zhao @EMNLP🇨🇳
13 days
Thrilled to share our latest research on verifying CoT reasoning, completed during my recent internship at FAIR @metaai. In this work, we introduce Circuit-based Reasoning Verification (CRV), a new white-box method to analyse and verify how LLMs reason, step by step.
1
3
14
@zhengzhao97
Zheng Zhao @EMNLP🇨🇳
13 days
@yeskendir_k @xianjun_agi @NailaMurray @nicola_cancedda [9/n] You can read the full paper here:
0
1
10
@zhengzhao97
Zheng Zhao @EMNLP🇨🇳
13 days
[8/n] In sum, our work establishes CRV as a powerful proof-of-concept for moving beyond error detection to a causal understanding of LLM reasoning. I'm deeply grateful for the incredible mentorship of @yeskendir_k @xianjun_agi @NailaMurray @nicola_cancedda.
1
0
8
@zhengzhao97
Zheng Zhao @EMNLP🇨🇳
13 days
[7/n] Crucially, we show causality, not just correlation. By identifying a single, prematurely activated feature causing an error, we performed a targeted intervention to causally correct the model's reasoning path. This is a vital step toward truly debugging LLMs.
2
0
7
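The intervention described in this tweet can be sketched in a few lines. This is an illustrative toy, not the paper's code: the decoder matrix, feature indices, and activation values are all made-up stand-ins for a sparse transcoder's learned quantities.

```python
# Sketch: suppress one prematurely active feature in a sparse code and
# decode the patched hidden state. All shapes/values are hypothetical.
import numpy as np

rng = np.random.default_rng(3)
d_model, n_feats = 16, 64
decoder = rng.normal(0, 1, (n_feats, d_model))  # stand-in transcoder decoder

code = np.zeros(n_feats)
code[[3, 17, 42]] = [1.2, 0.8, 2.5]  # active features; say 42 fires prematurely

def decode(c):
    """Map a sparse feature code back to a model hidden state."""
    return c @ decoder

baseline = decode(code)
patched_code = code.copy()
patched_code[42] = 0.0               # targeted ablation of the culprit feature
patched = decode(patched_code)

# The edit changes the hidden state only by feature 42's contribution.
print(np.allclose(baseline - patched, 2.5 * decoder[42]))  # True
```

Because the decode is linear in the code, zeroing one feature removes exactly that feature's contribution, which is what makes the intervention targeted.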
@zhengzhao97
Zheng Zhao @EMNLP🇨🇳
13 days
[6/n] We also visualised the 'structural fingerprints' of error, projecting high-dimensional features via PCA, and found that incorrect steps form a dense cluster that is structurally similar to correct steps yet occupies its own distinct region.
1
0
6
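The projection described in this tweet amounts to fitting PCA on the pooled step features and comparing the two classes in 2D. A minimal sketch with synthetic stand-in features (the real structural features come from attribution graphs; nothing here is from the paper's code):

```python
# Sketch: PCA projection of correct vs. incorrect step features.
# The feature matrices are synthetic placeholders.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
correct_feats = rng.normal(0.0, 1.0, size=(200, 64))    # stand-in features
incorrect_feats = rng.normal(0.5, 1.0, size=(200, 64))  # shifted cluster

pca = PCA(n_components=2)
proj = pca.fit_transform(np.vstack([correct_feats, incorrect_feats]))
correct_2d, incorrect_2d = proj[:200], proj[200:]

# Separated centroids despite similar within-cluster spread would match
# the "distinct region, similar structure" observation.
print(correct_2d.mean(axis=0), incorrect_2d.mean(axis=0))
```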
@zhengzhao97
Zheng Zhao @EMNLP🇨🇳
13 days
[5/n] One of our key findings is that error signatures are highly domain-specific. A classifier trained to spot errors in arithmetic fails on formal logic, suggesting that different reasoning tasks manifest unique computational failure patterns.
2
0
8
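The cross-domain finding in this tweet corresponds to a simple train-on-A, test-on-B protocol. A hedged sketch with synthetic data, where each domain's errors shift features along a different random direction (the feature extraction itself is out of scope and all names are illustrative):

```python
# Sketch: train an error classifier on one domain, evaluate on another.
# Synthetic data only; the domain-specific "error direction" is assumed.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

def make_domain(shift, n=300, d=32):
    """Synthetic (features, labels) for one reasoning domain."""
    X = rng.normal(0, 1, size=(n, d))
    y = rng.integers(0, 2, size=n)
    X[y == 1] += shift  # errors shift along a domain-specific direction
    return X, y

X_arith, y_arith = make_domain(shift=rng.normal(0, 1, 32))
X_logic, y_logic = make_domain(shift=rng.normal(0, 1, 32))

clf = LogisticRegression(max_iter=1000).fit(X_arith, y_arith)
print("in-domain acc:", clf.score(X_arith, y_arith))
print("cross-domain acc:", clf.score(X_logic, y_logic))
```

With uncorrelated error directions, the cross-domain score tends toward chance, mirroring the "arithmetic classifier fails on formal logic" observation.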
@zhengzhao97
Zheng Zhao @EMNLP🇨🇳
13 days
[4/n] We found that these structural signatures are highly predictive of errors. Our method, CRV, outperforms strong baselines across all tested datasets, demonstrating that the computational trace carries a verifiable signal.
1
0
6
@zhengzhao97
Zheng Zhao @EMNLP🇨🇳
13 days
[3/n] How does CRV work? Our pipeline involves:
1. Replacing MLP modules with interpretable sparse transcoders.
2. Constructing step-level attribution graphs.
3. Extracting a rich set of structural features.
4. Training a classifier to detect flawed reasoning steps.
2
0
11
@zhengzhao97
Zheng Zhao @EMNLP🇨🇳
13 days
[2/n] Our core hypothesis: correct and incorrect reasoning steps leave distinct "structural fingerprints" on the model's computational graph. We move beyond standard verification to analyse the structure of the computation itself.
1
0
14
@zhengzhao97
Zheng Zhao @EMNLP🇨🇳
13 days
Thrilled to share our latest research on verifying CoT reasoning, completed during my recent internship at FAIR @metaai. In this work, we introduce Circuit-based Reasoning Verification (CRV), a new white-box method to analyse and verify how LLMs reason, step by step.
7
48
335
@ShunyuYao12
Shunyu Yao
13 days
If you are impacted by layoff, welcome to dm me
6
12
216