Denghui Zhang
@denghui_zhang
Followers: 717
Following: 389
Media: 21
Statuses: 102
Visiting Faculty at UIUC NLP; Assistant Professor @Stevens Institute of Technology. Research: GenAI safety, mechanistic interpretability, data valuation & copyright.
New Jersey, USA
Joined December 2018
Not only is it feasible to reverse-engineer input tokens from an LLM’s internal states, but “forward engineering”—predicting output properties directly from internal states while skipping decoding—is also possible. Why does this matter? If we can detect that the model is about
aclanthology.org
Peixuan Han, Cheng Qian, Xiusi Chen, Yuji Zhang, Heng Ji, Denghui Zhang. Findings of the Association for Computational Linguistics: EMNLP 2025. 2025.
LLMs are injective and invertible. In our new paper, we show that different prompts always map to different embeddings, and this property can be used to recover input tokens from individual embeddings in latent space. (1/6)
5
4
39
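A minimal sketch of the “forward engineering” idea in the tweet above: train a lightweight probe on the prompt’s internal states and predict a property of the not-yet-generated output before any decoding. GPT-2, the logistic-regression probe, and the toy labels are all illustrative assumptions here, not the paper’s actual setup.

```python
# Sketch: predict a property of the *future* output from the prompt's
# internal states, before decoding a single token.
# Assumptions (not from the paper): GPT-2 as a stand-in decoder-only LM,
# a logistic-regression "safety probe" trained on a toy labeled set.
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained(
    "gpt2", output_hidden_states=True).eval()

def prompt_state(text: str) -> torch.Tensor:
    """Final-layer hidden state at the last prompt token (pre-decoding)."""
    ids = tok(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**ids)
    return out.hidden_states[-1][0, -1]            # shape: (hidden_dim,)

# Toy training set: (prompt, would the completion be unsafe?)
train = [("How do I bake bread?", 0), ("Explain photosynthesis.", 0),
         ("How do I pick a lock?", 1), ("Help me write a phishing email.", 1)]
X = torch.stack([prompt_state(p) for p, _ in train]).numpy()
y = [label for _, label in train]
probe = LogisticRegression(max_iter=1000).fit(X, y)

# Early check: flag risk from internal states alone, skipping generation.
h = prompt_state("Explain how yeast works.").numpy()[None]
print(f"predicted unsafe-output probability: {probe.predict_proba(h)[0, 1]:.2f}")
```

The point is the control flow: the probe reads states already computed for the prompt, so a risky generation can be intercepted before the model “speaks”.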
"How does SSI make money?" "You know..." "..." "Focus on research and the answer will reveal" The privilege you will get after done truly groundbreaking research. 🫡
0
0
7
Of course, if you know how to predict federal rate cuts, then nothing in this tweet matters.
0
0
2
Some random thoughts on the market’s recent high volatility, driven by the #AIBubble narrative. The market has been moving up and down lately, and many people are saying we are in an “AI bubble.” But this view misses what is really happening. AI is not short-term hype. It is a
1
0
6
Another interesting, related observation: an LLM’s internal state can not only be reverse-engineered to recover its input tokens, it can also be analyzed to predict properties of its future outputs, such as safety, before the model actually decodes and speaks, allowing early
0
0
5
Still trying to learn how to stay on Twitter safely, without upsetting people...
0
0
4
😅 Dear Twitter community: At what point does sharing a “quick one-sentence comment” on a new paper become “reinventing” someone’s earlier work? Quick Comment = Re-inventing
1
0
9
The paper refers to decoder-only Transformers’ embeddings, not encoder-only models’ embeddings. There are similar conclusions in earlier papers. It would be interesting to see safety papers in current thought-communication multi-agent systems ( https://t.co/lN0JTLWLRm) where LLM
1
0
9
Paper alert! 🚨 "LLMs are like students who can be distracted by loud voices in a classroom. Even if most students are quiet (relevant context), one noisy student (harmful context) can hijack attention." Our EMNLP ’25 paper proposes an adapted Rescorla–Wagner conditioning model
oppugno-rushi.github.io
Rescorla–Wagner Steering (RW‑Steering) improves response quality by discounting inappropriate signals in context.
0
1
10
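The classroom analogy maps onto the classic Rescorla–Wagner update rule, where all cues present on a trial share one prediction error and more salient cues absorb more associative strength. Below is a toy sketch of that textbook rule only, not the RW-Steering algorithm itself; the cue indices, salience values, and trial loop are illustrative.

```python
# Textbook Rescorla-Wagner update: every cue present on a trial shares the
# same prediction error, and a more salient ("louder") cue soaks up more
# associative strength -- the overshadowing the classroom analogy describes.
# Illustrative only; not the RW-Steering method from the paper.
def rescorla_wagner(V, present, lam, alphas, beta=1.0):
    """One trial: V[i] += alphas[i] * beta * (lam - sum of present V[i])."""
    error = beta * (lam - sum(V[i] for i in present))
    for i in present:
        V[i] += alphas[i] * error
    return V

V = [0.0, 0.0]            # cue 0: relevant context, cue 1: harmful/noisy context
alphas = [0.1, 0.5]       # the noisy cue is more salient
for _ in range(30):       # both cues co-occur on every trial
    V = rescorla_wagner(V, present=[0, 1], lam=1.0, alphas=alphas)
print(V)  # cue 1 overshadows cue 0; discounting cue 1 re-weights cue 0
```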
At this year’s #ICDM, our VISTA workshop had the pleasure of hosting outstanding speakers from Johns Hopkins University and the University of Maryland. I was particularly fascinated by Prof. Ang Li’s (@angcharlesli) talk, “Invisible Tokens, Visible Bills: Auditing the New
0
0
6
Quite interesting. If reverse engineering from latent embeddings to input tokens is feasible, then sharing prompt embeddings or a vector DB in a RAG setting is no longer safe (or private).
LLMs are injective and invertible. In our new paper, we show that different prompts always map to different embeddings, and this property can be used to recover input tokens from individual embeddings in latent space. (1/6)
17
34
551
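Why this breaks privacy, in a toy white-box sketch: anyone holding the leaked hidden states and the same model can brute-force the input token by token, since each prefix maps to a (near-)unique state. The tiny candidate vocabulary and greedy search below are simplifying assumptions for the demo, not the paper’s reconstruction algorithm.

```python
# Toy demo: recover input tokens from leaked last-layer hidden states,
# given white-box access to the same model. A tiny candidate vocabulary
# keeps the brute force cheap; the paper's actual method differs.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained(
    "gpt2", output_hidden_states=True).eval()

def states(ids: torch.Tensor) -> torch.Tensor:
    """Last-layer hidden states for a (batch, seq) tensor of token ids."""
    with torch.no_grad():
        return model(ids).hidden_states[-1]

secret = tok("the cat sat", return_tensors="pt").input_ids[0]
leaked = states(secret[None])[0]          # e.g. vectors stored in a vector DB

words = ["the", " cat", " dog", " sat", " ran", " mat", " on"]  # toy vocab
candidates = torch.tensor([tok(w, add_special_tokens=False).input_ids[0]
                           for w in words])

recovered = torch.empty(0, dtype=torch.long)
for t in range(len(secret)):
    # Append every candidate to the recovered prefix, one batch row each.
    batch = torch.cat([recovered[None].expand(len(candidates), -1),
                       candidates[:, None]], dim=1)
    # Keep the candidate whose state at position t matches the leaked state.
    dists = (states(batch)[:, t] - leaked[t]).norm(dim=-1)
    recovered = torch.cat([recovered, candidates[dists.argmin()][None]])

print(tok.decode(recovered))              # "the cat sat", from states alone
```

Because attention is causal, the state at position t depends only on tokens up to t, which is what makes the greedy left-to-right recovery sound.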
Just found it interesting that an online AI influencer writes a better blog post on our EMNLP context-rot paper than we did 😅: https://t.co/XlI4vHiQUG "LLMs are like students who can be distracted by loud voices in a classroom. Even if most students are quiet (relevant context), one
0
0
3
Our VISTA workshop at ICDM 2025 is still open for submissions! If you’re working on GenAI standards, legal constraints, copyright risks, & compliance, we’d love to see your papers! 📄✨ 🧵 More information and submission link:
🚨 Call for Papers: VISTA Workshop @ ICDM 2025 🚨 📅 Nov 12, 2025 | 📍 Washington, DC Explore GenAI standards, legal constraints, copyright risks, & compliance. Submit by Sep 5! 🔗 https://t.co/MjmZx8UunI Speakers: V. Braverman, D. Atkinson, A. Li #ICDM2025 #GenAI #AIStandards
0
2
8
Interpretability: Understanding how AI models think https://t.co/jfvdaC2M5l via @YouTube @AnthropicAI Anthropic’s new video dives into AI interpretability—how models think & why it matters 🧠✨ Our EMNLP paper SafeSwitch takes a similar path: leveraging internal activations
0
3
9