YichenZW Profile Banner
Yichen (Zach) Wang Profile
Yichen (Zach) Wang

@YichenZW

Followers
270
Following
699
Media
19
Statuses
93

1st-yr Ph.D. student @UChicagoCS @UChicagoCI | Prev. intern @UWNLP @Tsvetshop @BerkeleyNLP | CS B.S. with honors @XJTU1896 '24

Chicago, IL
Joined February 2023
@YichenZW
Yichen (Zach) Wang
2 years
🆕I'm excited to share that I'll start my Ph.D. at @UChicago within @UChicagoCI under the guidance of Prof. @MinaLee__, co-advised by Prof. Ari Holtzman (@universeinanegg)! I hope to bring my work on LLM generation and evaluation to a more human-centered and interactive stage.
5
7
107
@HaokunLiu5280
Haokun Liu
3 days
We're launching a weekly competition where the community decides which research ideas get implemented. Every week, we'll take the top 3 ideas from IdeaHub, run experiments with AI agents, and share everything: code, findings, all the successes and failures. It's completely free
3
19
31
@YichenZW
Yichen (Zach) Wang
8 days
Please check out our EMNLP paper! These results have made me reflect: how much of the model's benchmarked knowledge truly constitutes “known knowns” within a communicative system? Their behavior remains highly susceptible to external (mis)information.
@Yiyang2375
Yiyang Feng
8 days
You may turn to an LLM for reasoning, but what if you are wrong in the first place? LLMs usually propagate misinformation! Even when instructed, LLMs fail to correct misinformation, showing accuracy drops: 📉 10.02%~72.20% (instruction) & 📉 4.30%~19.97% (thinking). 🧵 (1/6)
0
0
8
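The thread above reports accuracy drops when misinformation is injected into the context. As a minimal sketch of that measurement (the data, prompts, and answers below are made-up stand-ins, not the paper's), one scores the same items with and without the injected misinformation and reports the difference:

```python
# Hedged sketch: measure an accuracy drop by scoring the same QA items
# with a clean context vs. a context containing injected misinformation.
def accuracy(preds, golds):
    return sum(p == g for p, g in zip(preds, golds)) / len(golds)

golds = ["Paris", "4", "H2O"]
answers_clean = ["Paris", "4", "H2O"]     # hypothetical model, clean context
answers_misinfo = ["Lyon", "4", "H2O"]    # same model, misinformation injected

drop = accuracy(answers_clean, golds) - accuracy(answers_misinfo, golds)
print(f"accuracy drop: {drop:.2%}")       # here: 33.33%
```

The paper's reported ranges (10.02%–72.20% and 4.30%–19.97%) are differences of exactly this form, computed per model and condition.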
@Elenal3ai
Xiaoyan Bai
17 days
❓ Does an LLM know thyself? 🪞 Humans pass the mirror test at ~18 months 👶 But what about LLMs? Can they recognize their own writing — or even admit authorship at all? In our new paper, we put 10 state-of-the-art models to the test. Read on 👇 1/n 🧵
2
18
45
@ChantalShaib
Chantal
21 days
Syntax that spuriously correlates with safe domains can jailbreak LLMs -- e.g. below with GPT4o mini Our paper (co w/ @VMSuriyakumar) on syntax-domain spurious correlations will appear at #NeurIPS2025 as a ✨spotlight! + @MarzyehGhassemi, @leventsagun, @byron_c_wallace
@elder_plinius
Pliny the Liberator 🐉
1 month
hit me with your favorite jailbreak below ⬇️
4
11
31
@TuhinChakr
Tuhin Chakrabarty
24 days
🚨New paper on AI and copyright Several authors have sued LLM companies for allegedly using their books without permission for model training. 👩‍⚖️Courts, however, require empirical evidence of harm (e.g., market dilution). Our new pre-registered study addresses exactly this
9
173
526
@MingZhong_
Ming Zhong
1 month
Vibe coding with an LLM, but the final vibe is off? 🤔 We analyze why models fail the "vibe check" and what truly matters to users. Key insight: human preference 🧑‍💻 ≈ functional correctness ✅ + instruction following 🎯 Check out our paper: https://t.co/s5gGME5O9I
2
17
69
@XiaochuangHan
Xiaochuang Han
1 month
Our team at Meta FAIR is hiring a PhD research intern for 2026. The topics broadly involve multimodal generative AI (e.g., video/image generation in addition to text), with flexible approaches across architecture/data/algorithms. Please apply via the link below, and feel free to
metacareers.com
Meta's mission is to build the future of human connection and the technology that makes it possible.
3
43
256
@universeinanegg
Ari Holtzman
2 months
For those who missed it, we just released a little LLM-backed game called HR Simulator™ You play an intern ghostwriting emails for your boss. It’s like you’re stuck in corporate email hell…and you’re the devil 😈 link and an initial answer to “WHY WOULD YOU DO THIS?” below
3
22
57
@VioletNPeng
Violet Peng
2 months
One of my most exciting results lately! We identify experts in MoE models for properties like safety and faithfulness, and steer them to improve/hurt model faithfulness and safety. Most shockingly, with SteerMoE, we can jailbreak 100% of safety guardrails for open models. Details 👇
@mohsen_fayyaz
Mohsen Fayyaz
2 months
🚨 You can bypass ALL safety guardrails of GPT-OSS-120B 🚨❗🤯 How? By detecting behavior-associated experts and switching them on/off. 📄 Steering MoE LLMs via Expert (De)Activation 🔗 https://t.co/U2YRyXon4H 🧵👇
5
36
262
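The core move described above (detecting behavior-associated experts and switching them on/off) can be sketched on a toy MoE layer. Everything here is illustrative, assuming a simple top-k softmax router; `ToyMoE` and `deactivate` are invented names, not the paper's API:

```python
# Toy sketch of expert (de)activation in a mixture-of-experts layer:
# masked experts get -inf router logits, so routing redistributes
# probability mass over the surviving experts.
import numpy as np

class ToyMoE:
    def __init__(self, n_experts=4, dim=8, top_k=2, seed=0):
        rng = np.random.default_rng(seed)
        self.router = rng.normal(size=(dim, n_experts))        # gating weights
        self.experts = rng.normal(size=(n_experts, dim, dim))  # one linear map per expert
        self.top_k = top_k
        self.active = np.ones(n_experts, dtype=bool)           # steering mask

    def deactivate(self, idx):
        self.active[idx] = False                               # switch an expert off

    def forward(self, x):
        logits = x @ self.router
        logits[~self.active] = -np.inf                         # masked experts never routed
        top = np.argsort(logits)[-self.top_k:]                 # top-k surviving experts
        gates = np.exp(logits[top] - logits[top].max())
        gates /= gates.sum()                                   # renormalize gate weights
        return sum(g * (x @ self.experts[i]) for g, i in zip(gates, top))

moe = ToyMoE()
x = np.ones(8)
y_full = moe.forward(x)
moe.deactivate(0)           # steering: turn one expert off
y_steered = moe.forward(x)  # same input, possibly different routing
```

The paper's claim is that when the deactivated experts are the ones associated with a behavior (e.g., refusals), this routing change suppresses that behavior without touching the weights.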
@tli104
Tianjian Li
2 months
Language models often produce repetitive responses, and this issue is further amplified by post-training. In this work, we introduce DARLING, a method that explicitly optimizes for both response diversity and quality within online reinforcement learning!
@jaseweston
Jason Weston
2 months
🌀Diversity Aware RL (DARLING)🌀 📝: https://t.co/MH0tui34Cb - Jointly optimizes for quality & diversity using a learned partition function - Outperforms standard RL in quality AND diversity metrics, e.g. higher pass@1/p@k - Works for both non-verifiable & verifiable tasks 🧵1/5
2
24
90
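The DARLING recipe above jointly rewards quality and diversity via a learned partition function over sampled responses. A minimal stand-in (using a naive lexical partition where the paper learns one; all names here are illustrative) shows the reward shaping:

```python
# Toy sketch of a diversity-aware reward: each response's quality score
# is scaled by a bonus that shrinks when many sampled responses fall
# into the same partition cell, so repeated responses split their credit.
from collections import Counter

def partition_key(response: str) -> str:
    # stand-in for a learned semantic partition: sorted lowercase word set
    return " ".join(sorted(set(response.lower().split())))

def diversity_aware_rewards(responses, quality_scores):
    clusters = Counter(partition_key(r) for r in responses)
    return [q / clusters[partition_key(r)]          # rarer cell -> higher bonus
            for r, q in zip(responses, quality_scores)]

resps = ["The cat sat.", "The cat sat.", "A dog ran by."]
print(diversity_aware_rewards(resps, [1.0, 1.0, 0.8]))  # [0.5, 0.5, 0.8]
```

Feeding such rewards into an online RL loop penalizes mode collapse directly, which is the mechanism behind the higher pass@k the thread reports.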
@LorenaYannnnn
Tianyi Lorena Yan
8 months
When answering queries with multiple answers (e.g., listing cities of a country), how do LMs simultaneously recall knowledge and avoid repeating themselves? 🚀 Excited to share our latest work with @robinomial! We uncover a promote-then-suppress mechanism: LMs first recall all
4
24
111
@Elenal3ai
Xiaoyan Bai
4 months
⚡️Ever asked an LLM-as-Marilyn Monroe about the 2020 election? Our paper calls this concept incongruence, common in AI and human creativity. 🧠Read my blog to learn what we found, why it matters for AI safety and creativity, and what's next. https://t.co/QoXmmJPIbK
1
6
13
@liujc1998
Jiacheng Liu
4 months
Happy to present OLMoTrace at #ACL2025NLP next week!! 🤗 If you stop by the demo session on Tuesday, July 29, 10:30am-12pm, @yanaiela and @sewon__min will be sharing how we use OLMoTrace to make LLMs more transparent. Unfortunately I'm unable to attend in-person due to visa 🥹
@liujc1998
Jiacheng Liu
7 months
Today we're unveiling OLMoTrace, a tool that enables everyone to understand the outputs of LLMs by connecting to their training data. We do this on unprecedented scale and in real time: finding matching text between model outputs and 4 trillion training tokens within seconds. ✨
0
15
45
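The matching step OLMoTrace performs (finding verbatim overlaps between a model's output and its training data) can be sketched with a toy n-gram index. The real system indexes trillions of tokens with far more scalable machinery; this is only the shape of the idea:

```python
# Illustrative sketch of output-to-corpus attribution: hash all corpus
# n-grams, then report spans of the output that appear verbatim.
def ngram_index(corpus_tokens, n):
    index = {}
    for i in range(len(corpus_tokens) - n + 1):
        index.setdefault(tuple(corpus_tokens[i:i + n]), []).append(i)
    return index

def matching_spans(output_tokens, corpus_tokens, n=3):
    index = ngram_index(corpus_tokens, n)
    return [(j, j + n) for j in range(len(output_tokens) - n + 1)
            if tuple(output_tokens[j:j + n]) in index]

corpus = "the quick brown fox jumps over the lazy dog".split()
output = "a quick brown fox appeared".split()
print(matching_spans(output, corpus))  # [(1, 4)] -> "quick brown fox"
```

Scaling this from a hash map to 4 trillion tokens with sub-second latency is the engineering contribution the tweet highlights.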
@aryan_shri123
Aryan Shrivastava
4 months
🤫Jailbreak prompts make aligned LMs produce harmful responses.🤔But is that info linearly decodable? ↗️We show many refused concepts are linearly represented, sometimes persist through instruction-tuning, and may also shape downstream behavior❗ https://t.co/75HyeO6eE9 🧵1/
1
8
20
@chrome1996
Chenghao Yang
5 months
Have you noticed… 🔍 Aligned LLM generations feel less diverse? 🎯 Base models are decoding-sensitive? 🤔 Generations get more predictable as they progress? 🌲 Tree search fails mid-generation (esp. for reasoning)? We trace these mysteries to LLM probability concentration, and
1
30
94
@universeinanegg
Ari Holtzman
5 months
New benchmark! LLMs can retrieve bits of information from ridiculously long contexts (needle-in-a-haystack) but they can't tell what's missing from relatively short documents (AbsenceBench). We can't trust LLMs to annotate or judge documents if they can't see negative space!
@harveyiyun
Harvey Yiyun Fu
5 months
LLMs excel at finding surprising “needles” in very long documents, but can they detect when information is conspicuously missing? 🫥AbsenceBench🫥 shows that even SoTA LLMs struggle on this task, suggesting that LLMs have trouble perceiving “negative space” in documents. paper:
3
17
98
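The AbsenceBench probe described above has a simple skeleton: delete known pieces of a document, then score how many of them the model names as missing. A hedged sketch (the construction and scoring below are illustrative, not the benchmark's actual harness):

```python
# Toy AbsenceBench-style probe: remove lines from a document, then score
# a detector's recall over the removed lines. The "model" that would
# produce predicted_missing is hypothetical and not shown here.
import random

def make_absence_example(doc_lines, n_remove=2, seed=0):
    rng = random.Random(seed)
    removed = set(rng.sample(range(len(doc_lines)), n_remove))
    edited = [l for i, l in enumerate(doc_lines) if i not in removed]
    return edited, [doc_lines[i] for i in sorted(removed)]

def absence_recall(predicted_missing, true_missing):
    hits = sum(1 for line in true_missing if line in predicted_missing)
    return hits / len(true_missing)

doc = [f"fact {i}" for i in range(10)]
edited, missing = make_absence_example(doc)
print(absence_recall(missing, missing))  # a perfect detector scores 1.0
```

The finding is that models scoring near-perfectly on needle-in-a-haystack retrieval still do poorly on exactly this kind of recall over removed content.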
@niloofar_mire
Niloofar
6 months
📣Thrilled to announce I’ll join Carnegie Mellon University (@CMU_EPP & @LTIatCMU) as an Assistant Professor starting Fall 2026! Until then, I’ll be a Research Scientist at @AIatMeta FAIR in SF, working with @kamalikac’s amazing team on privacy, security, and reasoning in LLMs!
226
69
1K
@lambdaviking
William Merrill
7 months
Excited to announce I'll be starting as an assistant professor at @TTIC_Connect for fall 2026! In the meantime, I'll be graduating and hanging around Ai2 in Seattle🏔️
59
25
368
@kevinyang41
Kevin Yang
7 months
Will be at NAACL next week, excited to share two of our papers: FACTTRACK: Time-Aware World State Tracking in Story Outlines https://t.co/1KcL0aCWCI THOUGHTSCULPT: Reasoning with Intermediate Revision and Search https://t.co/ZGqvEeReHr Shoutout to first authors @ZhihengLyu and
0
4
10