Denghui Zhang
@denghui_zhang
Followers: 717
Following: 389
Media: 21
Statuses: 102
Visiting Faculty at UIUC NLP; Assistant Professor @Stevens Institute of Technology. Research: GenAI safety, mechanistic interpretability, data valuation & copyright.
New Jersey, USA
Joined December 2018
Not only is it feasible to reverse-engineer input tokens from an LLM’s internal states, but “forward engineering”—predicting output properties directly from internal states while skipping decoding—is also possible. Why does this matter? If we can detect that the model is about
aclanthology.org
Peixuan Han, Cheng Qian, Xiusi Chen, Yuji Zhang, Heng Ji, Denghui Zhang. Findings of the Association for Computational Linguistics: EMNLP 2025. 2025.
LLMs are injective and invertible. In our new paper, we show that different prompts always map to different embeddings, and this property can be used to recover input tokens from individual embeddings in latent space. (1/6)
5
4
39
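A minimal sketch of the “forward engineering” idea in the tweet above: train a lightweight probe on the prompt’s internal states and predict a property of the not-yet-generated output before any decoding. GPT-2, the logistic-regression probe, and the toy labels are all illustrative assumptions here, not the paper’s actual setup.

```python
# Sketch: predict a property of the *future* output from the prompt's
# internal states, before decoding a single token.
# Assumptions (not from the paper): GPT-2 as a stand-in decoder-only LM,
# a logistic-regression "safety probe" trained on a toy labeled set.
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained(
    "gpt2", output_hidden_states=True).eval()

def prompt_state(text: str) -> torch.Tensor:
    """Final-layer hidden state at the last prompt token (pre-decoding)."""
    ids = tok(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**ids)
    return out.hidden_states[-1][0, -1]            # shape: (hidden_dim,)

# Toy training set: (prompt, would the completion be unsafe?)
train = [("How do I bake bread?", 0), ("Explain photosynthesis.", 0),
         ("How do I pick a lock?", 1), ("Help me write a phishing email.", 1)]
X = torch.stack([prompt_state(p) for p, _ in train]).numpy()
y = [label for _, label in train]
probe = LogisticRegression(max_iter=1000).fit(X, y)

# Early check: flag risk from internal states alone, skipping generation.
h = prompt_state("Explain how yeast works.").numpy()[None]
print(f"predicted unsafe-output probability: {probe.predict_proba(h)[0, 1]:.2f}")
```

The point is the control flow: the probe reads states already computed for the prompt, so a risky generation can be intercepted before the model “speaks”.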
"How does SSI make money?" "You know..." "..." "Focus on research and the answer will reveal" The privilege you will get after done truly groundbreaking research. 🫡
0
0
7
Of course, if you know how to predict federal rate cuts, then nothing in this tweet matters.
0
0
2
Some random thoughts on the market’s recent high volatility, driven by the #AIBubble narrative. The market has been moving up and down lately, and many people are saying we are in an “AI bubble.” But this view misses what is really happening. AI is not short-term hype. It is a
1
0
6
Another interesting, related observation: an LLM’s internal state can not only be reverse-engineered to recover its input tokens, it can also be analyzed to predict properties of its future outputs, such as safety, before the model actually decodes and speaks, allowing early
0
0
5
Still trying to learn how to stay on Twitter safely, without upsetting people...
0
0
4
😅 Dear Twitter community: At what point does sharing a “quick one-sentence comment” on a new paper become “reinventing” someone’s earlier work? Quick Comment = Re-inventing
1
0
9
The paper refers to decoder-only Transformers’ embeddings, not encoder-only models’ embeddings. There are similar conclusions in earlier papers. It would be interesting to see safety papers in current thought-communication multi-agent systems ( https://t.co/lN0JTLWLRm) where LLM
1
0
9
Paper alert! 🚨 "LLMs are like students who can be distracted by loud voices in a classroom. Even if most students are quiet (relevant context), one noisy student (harmful context) can hijack attention." Our EMNLP ’25 paper proposes an adapted Rescorla–Wagner conditioning model
oppugno-rushi.github.io
Rescorla–Wagner Steering (RW‑Steering) improves response quality by discounting inappropriate signals in context.
0
1
10
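The classroom analogy maps onto the classic Rescorla–Wagner update rule, where all cues present on a trial share one prediction error and more salient cues absorb more associative strength. Below is a toy sketch of that textbook rule only, not the RW-Steering algorithm itself; the cue indices, salience values, and trial loop are illustrative.

```python
# Textbook Rescorla-Wagner update: every cue present on a trial shares the
# same prediction error, and a more salient ("louder") cue soaks up more
# associative strength -- the overshadowing the classroom analogy describes.
# Illustrative only; not the RW-Steering method from the paper.
def rescorla_wagner(V, present, lam, alphas, beta=1.0):
    """One trial: V[i] += alphas[i] * beta * (lam - sum of present V[i])."""
    error = beta * (lam - sum(V[i] for i in present))
    for i in present:
        V[i] += alphas[i] * error
    return V

V = [0.0, 0.0]            # cue 0: relevant context, cue 1: harmful/noisy context
alphas = [0.1, 0.5]       # the noisy cue is more salient
for _ in range(30):       # both cues co-occur on every trial
    V = rescorla_wagner(V, present=[0, 1], lam=1.0, alphas=alphas)
print(V)  # cue 1 overshadows cue 0; discounting cue 1 re-weights cue 0
```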
At this year’s #ICDM, our VISTA workshop had the pleasure of hosting outstanding speakers from Johns Hopkins University and the University of Maryland. I was particularly fascinated by Prof. Ang Li’s (@angcharlesli) talk, “Invisible Tokens, Visible Bills: Auditing the New
0
0
6
Quite interesting. If reverse engineering from latent embeddings to input tokens is feasible, then sharing prompt embeddings or a vector DB in a RAG setting is no longer safe (or private).
LLMs are injective and invertible. In our new paper, we show that different prompts always map to different embeddings, and this property can be used to recover input tokens from individual embeddings in latent space. (1/6)
17
34
551
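Why this breaks privacy, in a toy white-box sketch: anyone holding the leaked hidden states and the same model can brute-force the input token by token, since each prefix maps to a (near-)unique state. The tiny candidate vocabulary and greedy search below are simplifying assumptions for the demo, not the paper’s reconstruction algorithm.

```python
# Toy demo: recover input tokens from leaked last-layer hidden states,
# given white-box access to the same model. A tiny candidate vocabulary
# keeps the brute force cheap; the paper's actual method differs.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained(
    "gpt2", output_hidden_states=True).eval()

def states(ids: torch.Tensor) -> torch.Tensor:
    """Last-layer hidden states for a (batch, seq) tensor of token ids."""
    with torch.no_grad():
        return model(ids).hidden_states[-1]

secret = tok("the cat sat", return_tensors="pt").input_ids[0]
leaked = states(secret[None])[0]          # e.g. vectors stored in a vector DB

words = ["the", " cat", " dog", " sat", " ran", " mat", " on"]  # toy vocab
candidates = torch.tensor([tok(w, add_special_tokens=False).input_ids[0]
                           for w in words])

recovered = torch.empty(0, dtype=torch.long)
for t in range(len(secret)):
    # Append every candidate to the recovered prefix, one batch row each.
    batch = torch.cat([recovered[None].expand(len(candidates), -1),
                       candidates[:, None]], dim=1)
    # Keep the candidate whose state at position t matches the leaked state.
    dists = (states(batch)[:, t] - leaked[t]).norm(dim=-1)
    recovered = torch.cat([recovered, candidates[dists.argmin()][None]])

print(tok.decode(recovered))              # "the cat sat", from states alone
```

Because attention is causal, the state at position t depends only on tokens up to t, which is what makes the greedy left-to-right recovery sound.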
Just found it interesting that an online AI influencer writes a better blog post on our EMNLP context-rot paper than we did 😅: https://t.co/XlI4vHiQUG "LLMs are like students who can be distracted by loud voices in a classroom. Even if most students are quiet (relevant context), one
0
0
3
Our VISTA workshop at ICDM 2025 is still open for submissions! If you’re working on GenAI standards, legal constraints, copyright risks, & compliance, we’d love to see your papers! 📄✨ 🧵 More information and submission link:
🚨 Call for Papers: VISTA Workshop @ ICDM 2025 🚨 📅 Nov 12, 2025 | 📍 Washington, DC Explore GenAI standards, legal constraints, copyright risks, & compliance. Submit by Sep 5! 🔗 https://t.co/MjmZx8UunI Speakers: V. Braverman, D. Atkinson, A. Li #ICDM2025 #GenAI #AIStandards
0
2
8
Interpretability: Understanding how AI models think https://t.co/jfvdaC2M5l via @YouTube @AnthropicAI Anthropic’s new video dives into AI interpretability—how models think & why it matters 🧠✨ Our EMNLP paper SafeSwitch takes a similar path: leveraging internal activations
0
3
9