Kaijie Zhu
@KaijieZhu07
Followers
69
Following
101
Media
1
Statuses
20
Furthermore, we show that by designing as many mask prompts and ensembling them together, we can derive a stronger defense method which reduces error rates exponentially! 📉
1
0
0
Based on this observation, MELON re-executes the agent trajectories with masked prompts and compares tool call outputs to detect deviations in behavior caused by injected content.
1
0
0
MELON builds on the observation that under a successful attack, the agent’s next action becomes less dependent on user tasks and more on malicious tasks.
1
0
0
Excited to share our paper at #ICML2025! We've developed MELON🍉, a robust defense method against indirect prompt injection attacks on LLM agents that achieves near 0 ASR! Hope you enjoy🍉! Grateful to my incredible collaborators! @WilliamWangNLP @WenboGuo4 @jd92wang @xianjun_agi
2
6
26
Toward Trustworthy Generative Foundation Models (GenFMs) 🚀 🎇After six months of hard work and thanks to the efforts of the entire team, our report on the trustworthiness of generative foundation models (GenFMs) has finally been released. 💡In this work, we: -Developed a
2
34
98
Thanks Wenbo! Looking forward to the journey to KAUST!
Congrats @KaijieZhu07 for being selected as the KAUST Rising Stars in AI ( https://t.co/pUcGYyd9Z0). It is an impressive achievement for a first-year Ph.D. student (co-advising with @WilliamWangNLP). Keijie recently just finished great work on prompt injection defense, which does
1
0
3
1/3 Today, an anecdote shared by an invited speaker at #NeurIPS2024 left many Chinese scholars, myself included, feeling uncomfortable. As a community, I believe we should take a moment to reflect on why such remarks in public discourse can be offensive and harmful.
180
564
4K
Personal update: After 5.5 yrs at @MSFTResearch , I will join @williamandmary in 2025 to be an assistant professor. Welcome to apply for my PhD/interns. Interest: ML with foundation models, LLM understanding, and AI for social sciences. More information: https://t.co/Yhf8L8WOXy
19
24
337
I am on job market for full-time industry positions. My research focuses on text generation evaluation and LLM alignment. If you have relevant positions, I’d love to connect! Here are list of my publications and summary of my research:
1
19
58
🚀 Since its invention, the mouse has been our way to control computers. But what if it didn’t have to be? 🤔 Thrilled to introduce Agent S, a new state-of-the-art GUI agent framework that interacts with computers just like a human and takes on the toughest automation challenges.
8
66
206
Unified Library for Evaluating LLMs Neat project by Microsoft offering a unified library to evaluate LLMs. A lot of the effort when working with LLMs goes into building a robust evaluation pipeline which could involve different models, tasks, and prompts. PrompBench looks
9
201
861
A new preprint to study the competition ⚔️behaviors of LLM-based agents! Observations align with sociological and economic theories:) 💡Title: CompeteAI: Understanding the Competition Behaviors in Large Language Model-based Agents 🎁Paper: https://t.co/F08tYa8f2p
1
12
34
Concerned about broken LLM evaluation facing data contamination? Check our latest paper *DyVal: Graph-informed Dynamic Evaluation of Large Language Models*, dynamic evaluation protocol with flexible difficulty! With @KaijieZhu07 @jiaao_chen @Diyi_Yang
https://t.co/DEpatfg8by
0
11
38