Pinjia He

@PinjiaHE

Followers
997
Following
811
Media
12
Statuses
246

Assistant Professor at The Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen) @cuhksz.

Shenzhen, China
Joined March 2015
@PinjiaHE
Pinjia He
6 months
📢 Can LLMs locate software service failures? 🤔 My student @SiyuexiH's #ICLR2025 paper introduces OpenRCA, the first benchmark dataset for evaluating LLMs' root cause analysis capabilities in software systems. LLMs/Agents need to analyze system telemetry data to infer results
0
5
11
@daniel_d_kang
Daniel Kang
2 months
SWE-bench Verified is the gold standard for evaluating coding agents: 500 real-world issues + tests by OpenAI. Sounds bullet-proof? Not quite. We show passing its unit tests != matching ground truth. In our ACL paper, we fixed buggy evals: 24% of agents moved up or down the
11
36
200
@PinjiaHE
Pinjia He
3 months
My student Xiaoyuan Liu's @xyliu_cs collaboration work with Tencent. #ACL2025NLP
@tuzhaopeng
Zhaopeng Tu
3 months
When eyes and memory clash, who wins? 👁️🧠 Introducing a comprehensive study on vision-knowledge conflicts in MLLMs, where visual input contradicts the model's internal commonsense knowledge, and the results might surprise you. #ACL2025NLP 📈 We developed an automated framework
0
0
4
@DominikWinterer
Dominik Winterer
4 months
🚀 I'll be launching the Formal Methods Engineering Lab (https://t.co/9pjKYVa89h) – and I am hiring! If you're interested in working with me, feel free to reach out.
@DominikWinterer
Dominik Winterer
4 months
Super excited to share that I will be joining The University of Manchester (@OfficialUoM) as a Lecturer (Assistant Professor) in Cyber Security! The Systems and Software Security group at Manchester is already incredibly impressive, and I'm honored to help further strengthen it.
1
11
28
@PinjiaHE
Pinjia He
4 months
Check out my student Xiaoyuan Liu's @xyliu_cs collaboration work with Tencent: RISE (Reinforcing Reasoning with Self-Verification), enabling LLMs to simultaneously level-up BOTH their problem-solving AND self-checking skills.
@tuzhaopeng
Zhaopeng Tu
4 months
Trust your AI, but can it trust itself? 🤔 Introducing an online reinforcement learning framework, RISE (Reinforcing Reasoning with Self-Verification), enabling LLMs to simultaneously level-up BOTH their problem-solving AND self-checking skills! 🧠 Problems tackled: ✅
0
1
15
@chao_peng_
Chao Peng
4 months
We're proud to bring @Trae_ai to @ICSEconf. Our booth, product showcase, banquet, and workshops were a great success. Huge thanks to everyone who joined our events. Looking forward to deeper collaboration in AI4SE research. See you again at @FSEconf!
2
2
24
@PinjiaHE
Pinjia He
7 months
Truly humbled and honored to receive the IEEE CS TCSE Rising Star Award. Thanks a lot for the help along the way from my supervisors, referees, students, and co-authors. Will continue to focus on impactful projects about AI4SE and SE4AI. 🎯 https://t.co/cUX83bzcoQ
10
6
58
@tan_hwei
Shin Hwei Tan
7 months
We invite you to nominate yourself to serve on the Program Committee for FSE'26. Please use the following link to access the nomination form: https://t.co/wCGjjsdCVv
docs.google.com
Please use this form to nominate yourself for the program committee of the ACM International Conference on the Foundations of Software Engineering (FSE 2026) by March 14, 2025. While we cannot select...
1
12
32
@istoica05
Ion Stoica
7 months
Agree DeepSeek is not as good as o1-pro and o3, but I think we need to look at the trends. This is what happened during the last nine months. What will happen in the next nine months if we do not change anything in the structure of the AI ecosystem in the US?
@NicolasSerna314
Nicolas
7 months
@istoica05 We are definitely not doing a good job in the USA given our resources. However, I am not entirely sure about the last claim. DeepSeek might be as good as o1 if not better, but I don't think it is as good as o1 pro or o3 (based on deep research). Additionally, one little detail
9
29
170
@PinjiaHE
Pinjia He
7 months
🚀 Can LLMs repair programs without known buggy hunks? 🤔 💡 My student @SiyuexiH's #ICSE2025 paper reveals that current infilling approaches constrain LLMs' repair potential. By simply aligning the task objective from fixing buggy hunks to rewriting the entire program with tests,
2
8
30
@Yoshua_Bengio
Yoshua Bengio
8 months
Today, we are publishing the first-ever International AI Safety Report, backed by 30 countries and the OECD, UN, and EU. It summarises the state of the science on AI capabilities and risks, and how to mitigate those risks. 🧵 Link to full Report: https://t.co/k9ggxL7i66 1/16
49
527
1K
@mboehme_
Marcel Böhme 👨‍🔬
9 months
Nominate yourself for the ASE'25 PC! The 40th IEEE/ACM International Conference on Automated Software Engineering (@ASE_conf) is looking for PC nominations to maximize diversity of perspectives. 🖊️ https://t.co/Th6wpRfuX7 🧑‍💻 https://t.co/M0qPBkfwqo w/ @LingmingZhang
docs.google.com
Please indicate your interest to serve on the ASE 2025 program committee through filling out this form. Please include as much information as possible. After submitting the form you will receive a...
0
10
28
@PinjiaHE
Pinjia He
9 months
Dominik's research is solid and highly impactful! He is also very easy to get along with 😃
@DominikWinterer
Dominik Winterer
9 months
🚀🔍🧑‍🏫 I am on the academic job market! My research focuses on advancing Formal Methods, Programming Languages, and Software Engineering. Website: https://t.co/ypqj71vafu Research Statement:
0
0
1
@DominikWinterer
Dominik Winterer
9 months
🚀🔍🧑‍🏫 I am on the academic job market! My research focuses on advancing Formal Methods, Programming Languages, and Software Engineering. Website: https://t.co/ypqj71vafu Research Statement:
3
23
67
@lilianweng
Lilian Weng
10 months
🦃 At the end of Thanksgiving holidays, I finally finished the piece on reward hacking. Not an easy one to write, phew. Reward hacking occurs when an RL agent exploits flaws in the reward function or env to maximize rewards without learning the intended behavior. This is imo a
68
225
2K
@FSEconf
FSE 2025
10 months
FSE'25 will be buzzing with 14 co-located workshops. Congratulations to the organizers for their hard work! More details will be posted in the next few days. #FSE25 #Workshops
0
6
20