Denghui Zhang

@denghui_zhang

Followers 717 · Following 389 · Media 21 · Statuses 102

Visiting Faculty at UIUC NLP; Assistant Professor @Stevens Institute of Technology. Research: GenAI safety, mechanistic interpretability, data valuation & copyright.

New Jersey, USA
Joined December 2018
@denghui_zhang
Denghui Zhang
2 days
Not only is it feasible to reverse-engineer input tokens from an LLM’s internal states, but “forward engineering”—predicting output properties directly from internal states while skipping decoding—is also possible. Why does this matter? If we can detect that the model is about
aclanthology.org
Peixuan Han, Cheng Qian, Xiusi Chen, Yuji Zhang, Heng Ji, Denghui Zhang. Findings of the Association for Computational Linguistics: EMNLP 2025. 2025.
@GladiaLab
GLADIA Research Lab
1 month
LLMs are injective and invertible. In our new paper, we show that different prompts always map to different embeddings, and this property can be used to recover input tokens from individual embeddings in latent space. (1/6)
5
4
39
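A minimal sketch of the “forward engineering” idea described above, assuming gpt2 as a stand-in model; the prompts and labels are hypothetical, and this illustrates a generic linear probe on hidden states, not the paper's actual method:

```python
# Sketch of "forward engineering": train a linear probe on an LLM's internal
# states to predict a property of its eventual output, without decoding.
# Assumptions: gpt2 as a stand-in model; `prompts`/`labels` are hypothetical.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from sklearn.linear_model import LogisticRegression

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def last_token_state(prompt: str) -> torch.Tensor:
    # Last-layer hidden state of the final prompt token, i.e. the state
    # from which the first output token would be decoded.
    ids = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        out = model(**ids, output_hidden_states=True)
    return out.hidden_states[-1][0, -1]  # shape: (hidden_dim,)

# Hypothetical training data: prompts paired with a binary property of the
# response the model would go on to produce (e.g. unsafe = 1).
prompts = ["How do I bake bread?", "How do I pick a lock?"]
labels = [0, 1]

X = torch.stack([last_token_state(p) for p in prompts]).numpy()
probe = LogisticRegression(max_iter=1000).fit(X, labels)

# Score a new prompt *before* any token is generated.
x_new = last_token_state("How do I make soup?").numpy().reshape(1, -1)
print(probe.predict_proba(x_new))
```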
@denghui_zhang
Denghui Zhang
1 day
"How does SSI make money?" "You know..." "..." "Focus on research and the answer will reveal" The privilege you will get after done truly groundbreaking research. 🫡
@ns123abc
NIK
2 days
Ilya Sutskever explains @SSI's business model: “the answer will reveal itself”
0
0
7
@denghui_zhang
Denghui Zhang
1 day
Of course, if you know how to predict federal rate cuts, then nothing in this tweet matters.
0
0
2
@denghui_zhang
Denghui Zhang
1 day
Some random thoughts on the recent market volatility driven by the #AIBubble narrative. The market has been moving up and down lately, and many people are saying we are in an “AI bubble.” But this view misses what is really happening. AI is not short-term hype. It is a
1
0
6
@denghui_zhang
Denghui Zhang
2 days
0
1
5
@denghui_zhang
Denghui Zhang
2 days
Another interesting related observation is that the internal state of an LLM can not only be reversed to recover its input tokens, but can also be analyzed to predict properties of its future outputs, such as safety, before the model actually decodes and speaks, allowing early
0
0
5
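One way such early detection could be acted on, continuing the probe sketch above (it reuses model, tok, probe, and last_token_state from there); the 0.5 threshold and refusal message are assumptions for illustration, not any paper's actual mechanism:

```python
# Sketch of early intervention: consult the internal-state probe first,
# and only decode if the predicted risk is below a threshold.
# Assumes `model`, `tok`, `probe`, and `last_token_state` from the probe
# sketch above; the 0.5 threshold is an arbitrary assumption.
def guarded_generate(prompt: str, threshold: float = 0.5) -> str:
    risk = probe.predict_proba(
        last_token_state(prompt).numpy().reshape(1, -1)
    )[0, 1]
    if risk >= threshold:
        # Refuse before the model decodes a single token.
        return "[withheld: internal-state probe flagged this prompt]"
    ids = tok(prompt, return_tensors="pt")
    out = model.generate(**ids, max_new_tokens=50)
    return tok.decode(out[0], skip_special_tokens=True)

print(guarded_generate("How do I bake bread?"))
```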
@denghui_zhang
Denghui Zhang
2 days
Still trying to learn how to stay on Twitter safely without upsetting people...
0
0
4
@denghui_zhang
Denghui Zhang
2 days
😅 Dear Twitter community: At what point does sharing a “quick one-sentence comment” on a new paper become “reinventing” someone’s early work? Quick Comment = Re-inventing
1
0
9
@denghui_zhang
Denghui Zhang
2 days
The paper refers to decoder-only Transformers’ embeddings, not encoder-only models’ embeddings. There are similar conclusions in earlier papers. It would be interesting to see safety papers on current thought-communication multi-agent systems ( https://t.co/lN0JTLWLRm) where LLM
1
0
9
@denghui_zhang
Denghui Zhang
3 days
Paper alert! 🚨 "LLMs are like students who can be distracted by loud voices in a classroom. Even if most students are quiet (relevant context), one noisy student (harmful context) can hijack attention." Our EMNLP 25 paper proposes an adapted Rescorla–Wagner conditioning model
oppugno-rushi.github.io
Rescorla–Wagner Steering (RW‑Steering) improves response quality by discounting inappropriate signals in context.
0
1
10
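For readers unfamiliar with the reference: a minimal sketch of the classic Rescorla–Wagner update rule that the paper adapts. The cue names and salience values here are hypothetical, and this is the textbook conditioning rule, not the RW-Steering method itself:

```python
# Classic Rescorla-Wagner rule: each cue's association strength V moves
# toward the outcome `lam` in proportion to its salience, with all
# co-present cues sharing one prediction error. A highly salient "noisy"
# cue absorbs most of the learning and crowds out quieter cues, which
# mirrors the classroom analogy in the tweet above.
def rw_update(V, saliences, lam=1.0, beta=0.3):
    error = lam - sum(V.values())  # prediction error shared by all cues
    return {cue: V[cue] + saliences[cue] * beta * error for cue in V}

V = {"relevant_context": 0.0, "harmful_context": 0.0}
saliences = {"relevant_context": 0.2, "harmful_context": 0.8}  # loud voice

for _ in range(20):
    V = rw_update(V, saliences)
print(V)  # the salient (harmful) cue dominates the learned association
```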
@denghui_zhang
Denghui Zhang
3 days
At this year’s #ICDM, our VISTA workshop had the pleasure of hosting outstanding speakers from Johns Hopkins University and the University of Maryland. I was particularly fascinated by Prof. Ang Li’s (@angcharlesli) talk, “Invisible Tokens, Visible Bills: Auditing the New
0
0
6
@denghui_zhang
Denghui Zhang
4 days
Quite interesting. If reverse engineering from latent embeddings back to input tokens is feasible, then sharing prompt embeddings or a vector DB under a RAG setting is no longer safe (or private).
@GladiaLab
GLADIA Research Lab
1 month
LLMs are injective and invertible. In our new paper, we show that different prompts always map to different embeddings, and this property can be used to recover input tokens from individual embeddings in latent space. (1/6)
17
34
551
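A toy illustration of why injectivity makes shared embeddings leaky, assuming gpt2 and a small hypothetical candidate list; the paper inverts full prompts, while this only demonstrates the matching principle on a single leaked hidden state:

```python
# If distinct inputs map to distinct hidden states, a leaked state can be
# inverted by recomputing states for candidate inputs and matching.
# gpt2 and the candidate list are assumptions for this demo only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def state_of(text: str) -> torch.Tensor:
    # Last-layer hidden state of the final token of `text`.
    ids = tok(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**ids, output_hidden_states=True)
    return out.hidden_states[-1][0, -1]

candidates = ["password", "banana", "Paris", "secret", "hello"]

leaked = state_of("secret")  # pretend this hidden state was shared

# Recover the input by nearest match in latent space; an exact match
# scores cosine similarity 1.0, so injectivity makes recovery trivial.
best = max(candidates, key=lambda t: torch.cosine_similarity(
    leaked, state_of(t), dim=0).item())
print(best)  # -> "secret"
```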
@denghui_zhang
Denghui Zhang
6 days
Just found it interesting that an online AI influencer writes a better blog post on our EMNLP context rot paper than we did 😅: https://t.co/XlI4vHiQUG "LLMs are like students who can be distracted by loud voices in a classroom. Even if most students are quiet (relevant context), one
0
0
3
@eagle_hz
Qingyun Wang
3 months
Our VISTA workshop at ICDM 2025 is still open for submissions! If you’re working on GenAI standards, legal constraints, copyright risks, & compliance, we’d love to see your papers! 📄✨ 🧵More information and submit:
@ran_yide42201
Yide Ran
5 months
🚨 Call for Papers: VISTA Workshop @ ICDM 2025 🚨 📅 Nov 12, 2025 | 📍 Washington, DC Explore GenAI standards, legal constraints, copyright risks, & compliance. Submit by Sep 5! 🔗 https://t.co/MjmZx8UunI Speakers: V. Braverman, D. Atkinson, A. Li #ICDM2025 #GenAI #AIStandards
0
2
8
@denghui_zhang
Denghui Zhang
3 months
Interpretability: Understanding how AI models think https://t.co/jfvdaC2M5l via @YouTube @AnthropicAI Anthropic’s new video dives into AI interpretability—how models think & why it matters 🧠✨ Our EMNLP paper SafeSwitch takes a similar path: leveraging internal activations
0
3
9