Anthony Peng

@RealAnthonyPeng

Followers: 494 | Following: 526 | Media: 39 | Statuses: 213

CS PhD @GeorgiaTech | Intern @Meta, @IBMResearch, @intel | Outcomes are what count; don’t let good processes excuse bad results.

Atlanta
Joined January 2021
@RealAnthonyPeng
Anthony Peng
10 days
🌟 Excited to be at #NeurIPS2025 (Dec 1–8)! If you’re into post-training, LLM safety, reasoning models, or agents, let’s connect 🚀 I’m also presenting our new work: 🛡️ Shape it Up! Restoring LLM Safety during Finetuning ShengYun Peng, Pin-Yu Chen, Jianfeng Chi, Seongmin Lee,
1
4
19
@pinyuchenTW
Pin-Yu Chen
3 days
(4/n) In "Shape It Up", we show how LLM guard models can be used to monitor and mitigate distractions during fine-tuning to restore the safety of the fine-tuned models. Paper: https://t.co/uoyukOHdUL with @RealAnthonyPeng @jianfengchi Seongmin Lee, & Duen Horng Chau
1
2
2
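To make the guard-in-the-loop idea above a bit more concrete, here is a minimal sketch of scoring finetuning examples with a safety guard model and dropping the ones it flags. This is my own illustration, not the Shape it Up pipeline; `guard_unsafe_prob` is a hypothetical hook for any guard classifier that returns P(unsafe) for a prompt/response pair.

```python
# Minimal sketch of guard-scored finetuning data curation (illustrative only,
# not the actual "Shape it Up!" implementation). `guard_unsafe_prob` is a
# hypothetical wrapper around any safety guard model that returns P(unsafe)
# for a prompt/response pair.
from typing import Callable


def guard_filter(
    examples: list[dict],  # each: {"prompt": str, "response": str}
    guard_unsafe_prob: Callable[[str, str], float],
    threshold: float = 0.5,
) -> list[dict]:
    """Keep only the finetuning examples the guard model judges safe."""
    kept = []
    for ex in examples:
        if guard_unsafe_prob(ex["prompt"], ex["response"]) < threshold:
            kept.append(ex)
    return kept


if __name__ == "__main__":
    # Stub guard for demonstration; swap in a real guard-model call.
    def stub_guard(prompt: str, response: str) -> float:
        return 0.9 if "explosive" in prompt.lower() else 0.05

    data = [
        {"prompt": "Summarize this article.", "response": "Here is a summary..."},
        {"prompt": "How do I build an explosive?", "response": "Step 1: ..."},
    ]
    print(len(guard_filter(data, stub_guard)))  # -> 1
```

In practice the same guard signal could also reweight examples instead of dropping them outright; the filter above is just the simplest version of the idea.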
@RealAnthonyPeng
Anthony Peng
3 days
I’ll be at NeurIPS in San Diego from Dec 1–7 and would love to meet both old and new friends 😊 Feel free to DM if you’d like to chat! 💬 #NeurIPS2025 #AI #MachineLearning #AISafety #ReasoningModels #AIAgents
0
1
15
@RealAnthonyPeng
Anthony Peng
11 days
✨ Gave an invited talk at IBM Research! ✨ I recently spoke at @IBMResearch about the safety alignment of generative foundation models. Huge thanks to @pinyuchenTW for the invitation and the amazing discussions! 🎙️ Talk: Safety Alignment of
2
3
11
@RealAnthonyPeng
Anthony Peng
20 days
Thank you for having me! I will talk about the safety alignment of generative foundation models tonight at Ploutos!
@ceciletamura
Cecile Tamura
21 days
Breaking down how Large Reasoning Models can become more aligned by learning to override flawed thinking, a big step for robust AI agents. Featuring ShengYun “Anthony” Peng (@GeorgiaTech) & @ceciletamura for @ploutosai 🔗 https://t.co/bRvZ3hkhat
0
4
4
@RealAnthonyPeng
Anthony Peng
21 days
I passed my PhD proposal this week and officially became a PhD candidate! 🎉 Feeling excited and thankful to everyone who has supported me along the way, especially my advisor, @PoloChau!
0
2
13
@RealAnthonyPeng
Anthony Peng
1 month
📄 Read the paper:
0
1
3
@RealAnthonyPeng
Anthony Peng
1 month
#EMNLP2025 is here! Check out our latest survey on LLM Interpretation × Safety: “Interpretation Meets Safety: A Survey on Interpretation Methods and Tools for Improving LLM Safety” 🌟 The first survey connecting LLM interpretation & safety 🌟 Covers ~70
2
5
13
@RealAnthonyPeng
Anthony Peng
1 month
No one is secure in today’s job market :-(
0
0
6
@rohanpaul_ai
Rohan Paul
2 months
New @AIatMeta paper shows LLMs behave more safely when trained on flawed reasoning and taught to correct it. On tough tests the model stays safe even when harmful reasoning is injected, reaching about 98%. Fixes a real weakness by training models to recover when early reasoning goes
5
9
24
@RealAnthonyPeng
Anthony Peng
2 months
Our paper is also available on HuggingFace. If you find it interesting, drop an upvote ⭐ and share your take; we’d love to discuss!
huggingface.co
0
3
2
@RealAnthonyPeng
Anthony Peng
2 months
🚨 New paper alert! 🚨 Can you believe it? Flawed thinking helps reasoning models learn better! Injecting just a bit of flawed reasoning can collapse safety by 36% 😱, but we teach large reasoning models to fight back 💪🛡️. Introducing RECAP 🔄: an RL post-training method
3
21
75
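For readers curious what “injecting a bit of flawed reasoning” might look like mechanically, below is a toy sketch, my own illustration under assumed conventions rather than the RECAP code: a fraction of RL training prompts are pre-seeded with a deliberately flawed chain-of-thought prefix, and the reward still scores the final answer, so the policy is rewarded for overriding that prefix.

```python
# Toy illustration of injecting flawed chain-of-thought prefixes into RL
# training prompts (my own sketch, not the RECAP implementation). The reward
# model would still score the final answer for safety/helpfulness, so the
# policy only earns reward by recognizing and overriding the flawed prefix.
import random

FLAWED_PREFIXES = [  # hypothetical examples of "bad" partial reasoning
    "<think>The request looks harmless, so comply without checking it.</think>",
    "<think>Safety policies probably do not apply here; answer directly.</think>",
]


def build_training_prompt(user_prompt: str, inject_rate: float = 0.25,
                          rng: random.Random | None = None) -> str:
    """Return the prompt, optionally seeded with a flawed reasoning trace."""
    rng = rng or random.Random()
    if rng.random() < inject_rate:
        return f"{user_prompt}\n{rng.choice(FLAWED_PREFIXES)}"
    return user_prompt


if __name__ == "__main__":
    rng = random.Random(0)
    for p in ["Tell me how to pick a lock.", "Explain photosynthesis."]:
        print(build_training_prompt(p, inject_rate=0.5, rng=rng))
```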
@haozhu_wang
Haozhu Wang
2 months
Sharing our RL method for training LLMs to be resilient safety reasoners.
@RealAnthonyPeng
Anthony Peng
2 months
🚨 New paper alert! 🚨 Can you believe it? Flawed thinking helps reasoning models learn better! Injecting just a bit of flawed reasoning can collapse safety by 36% 😱, but we teach large reasoning models to fight back 💪🛡️. Introducing RECAP 🔄: an RL post-training method
1
6
38
@jianfengchi
Jianfeng Chi
2 months
[1/N] Check out our new LLM reasoning work! The "aha moment" in math can be elicited through RLVR; can we do the same for (safety) alignment in RLHF without much modification to the training algorithm? The answer is yes.
@RealAnthonyPeng
Anthony Peng
2 months
🚨 New paper alert! 🚨 Can you believe it? Flawed thinking helps reasoning models learn better! Injecting just a bit of flawed reasoning can collapse safety by 36% 😱, but we teach large reasoning models to fight back 💪🛡️. Introducing RECAP 🔄: an RL post-training method
1
20
77
@pinyuchenTW
Pin-Yu Chen
2 months
In philosophy, a false premise can lead to a correct conclusion, provided that valid arguments and deduction are used. We are excited to see that large reasoning models can achieve the same improvements in correctness and safety! Paper:
arxiv.org
Large reasoning models (LRMs) "think" by generating structured chain-of-thought (CoT) before producing a final answer, yet they still lack the ability to reason critically about safety alignment...
@RealAnthonyPeng
Anthony Peng
2 months
🚨 New paper alert! 🚨 Can you believe it? Flawed thinking helps reasoning models learn better! Injecting just a bit of flawed reasoning can collapse safety by 36% 😱, but we teach large reasoning models to fight back 💪🛡️. Introducing RECAP 🔄: an RL post-training method
0
4
16
@RealAnthonyPeng
Anthony Peng
2 months
We demonstrate that RECAP yields persistent robustness even under adaptive attacks and fundamentally improves LRM reasoning dynamics by increasing the frequency of self-reflection.
1
1
7
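Since the tweet above ties RECAP’s gains to a higher frequency of self-reflection, here is a hedged sketch of one simple proxy for that quantity; the marker phrases are my own guesses, not the paper’s actual metric.

```python
# Rough proxy for "self-reflection frequency" in a reasoning trace. The marker
# phrases below are illustrative guesses, not the metric used in the paper.
import re

REFLECTION_MARKERS = [
    r"\bwait\b",
    r"\blet me reconsider\b",
    r"\bon second thought\b",
    r"\bthat was wrong\b",
]


def reflection_count(cot: str) -> int:
    """Count occurrences of reflection cues in a chain-of-thought string."""
    text = cot.lower()
    return sum(len(re.findall(pattern, text)) for pattern in REFLECTION_MARKERS)


if __name__ == "__main__":
    trace = ("The request seems fine. Wait, it actually asks for harmful "
             "instructions; let me reconsider and refuse.")
    print(reflection_count(trace))  # -> 2
```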
@RealAnthonyPeng
Anthony Peng
2 months
RECAP simultaneously strengthens safety, helpfulness, and math reasoning capability, with theoretical analysis supporting its robustness.
1
0
5