YichenZW Profile Banner
Yichen (Zach) Wang Profile
Yichen (Zach) Wang

@YichenZW

Followers
270
Following
699
Media
19
Statuses
93

1st-yr Ph.D. student @UChicagoCS @UChicagoCI | Prev. intern @UWNLP @Tsvetshop @BerkeleyNLP | CS B.S. with honors @XJTU1896 '24

Chicago, IL
Joined February 2023
@YichenZW
Yichen (Zach) Wang
2 years
🆕I'm excited to share that I'll start my Ph.D. at @UChicago within @UChicagoCI under the guidance of Prof. @MinaLee__, co-advised by Prof. Ari Holtzman (@universeinanegg)! I hope to bring my work on LLM generation and evaluation to a more human-centered and interactive stage.
5
7
107
@HaokunLiu5280
Haokun Liu
3 days
We're launching a weekly competition where the community decides which research ideas get implemented. Every week, we'll take the top 3 ideas from IdeaHub, run experiments with AI agents, and share everything: code, findings, all the successes and failures. It's completely free
3
19
31
@YichenZW
Yichen (Zach) Wang
8 days
Please check out our EMNLP paper! These results have made me reflect: how much of the model's benchmarked knowledge truly constitutes “known knowns” within a communicative system? Their behavior remains highly susceptible to external (mis)information.
@Yiyang2375
Yiyang Feng
8 days
You may turn to an LLM for reasoning, but what if you are wrong in the first place? LLMs usually propagate misinformation! Even when instructed, LLMs fail to correct misinformation, showing accuracy drops: 📉 10.02%~72.20% (instruction) & 📉 4.30%~19.97% (thinking). 🧵 (1/6)
0
0
8
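The thread above reports accuracy drops when misinformation is injected into the context. As a minimal sketch of that measurement (the data, prompts, and answers below are made-up stand-ins, not the paper's), one scores the same items with and without the injected misinformation and reports the difference:

```python
# Hedged sketch: measure an accuracy drop by scoring the same QA items
# with a clean context vs. a context containing injected misinformation.
def accuracy(preds, golds):
    return sum(p == g for p, g in zip(preds, golds)) / len(golds)

golds = ["Paris", "4", "H2O"]
answers_clean = ["Paris", "4", "H2O"]     # hypothetical model, clean context
answers_misinfo = ["Lyon", "4", "H2O"]    # same model, misinformation injected

drop = accuracy(answers_clean, golds) - accuracy(answers_misinfo, golds)
print(f"accuracy drop: {drop:.2%}")       # here: 33.33%
```

The paper's reported ranges (10.02%–72.20% and 4.30%–19.97%) are differences of exactly this form, computed per model and condition.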
@Elenal3ai
Xiaoyan Bai
17 days
❓ Does an LLM know thyself? 🪞 Humans pass the mirror test at ~18 months 👶 But what about LLMs? Can they recognize their own writing — or even admit authorship at all? In our new paper, we put 10 state-of-the-art models to the test. Read on 👇 1/n 🧵
2
18
45
@ChantalShaib
Chantal
21 days
Syntax that spuriously correlates with safe domains can jailbreak LLMs -- e.g. below with GPT4o mini Our paper (co w/ @VMSuriyakumar) on syntax-domain spurious correlations will appear at #NeurIPS2025 as a ✨spotlight! + @MarzyehGhassemi, @leventsagun, @byron_c_wallace
@elder_plinius
Pliny the Liberator 🐉
1 month
hit me with your favorite jailbreak below ⬇️
4
11
31
@TuhinChakr
Tuhin Chakrabarty
24 days
🚨New paper on AI and copyright Several authors have sued LLM companies for allegedly using their books without permission for model training. 👩‍⚖️Courts, however, require empirical evidence of harm (e.g., market dilution). Our new pre-registered study addresses exactly this
9
173
526
@MingZhong_
Ming Zhong
1 month
Vibe coding with an LLM, but the final vibe is off? 🤔 We analyze why models fail the "vibe check" and what truly matters to users. Key insight: human preference 🧑‍💻 ≈ functional correctness ✅ + instruction following 🎯 Check out our paper: https://t.co/s5gGME5O9I
2
17
69
@XiaochuangHan
Xiaochuang Han
1 month
Our team at Meta FAIR is hiring a PhD research intern for 2026. The topics broadly involve multimodal generative AI (e.g., video/image generation in addition to text), with flexible approaches across architecture/data/algorithms. Please apply via the link below, and feel free to
metacareers.com
Meta's mission is to build the future of human connection and the technology that makes it possible.
3
43
256
@universeinanegg
Ari Holtzman
2 months
For those who missed it, we just released a little LLM-backed game called HR Simulator™ You play an intern ghostwriting emails for your boss. It’s like you’re stuck in corporate email hell…and you’re the devil 😈 link and an initial answer to “WHY WOULD YOU DO THIS?” below
3
22
57
@VioletNPeng
Violet Peng
2 months
One of my most exciting results lately! We identify experts in MoE models for properties like safety and faithfulness, and steer them to improve/hurt model faithfulness and safety. Most shockingly, with SteerMoE, we can jailbreak 100% of safety guardrails for open models. Details 👇
@mohsen_fayyaz
Mohsen Fayyaz
2 months
🚨 You can bypass ALL safety guardrails of GPT-OSS-120B 🚨❗🤯 How? By detecting behavior-associated experts and switching them on/off. 📄 Steering MoE LLMs via Expert (De)Activation 🔗 https://t.co/U2YRyXon4H 🧵👇
5
36
262
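The core move described above (detecting behavior-associated experts and switching them on/off) can be sketched on a toy MoE layer. Everything here is illustrative, assuming a simple top-k softmax router; `ToyMoE` and `deactivate` are invented names, not the paper's API:

```python
# Toy sketch of expert (de)activation in a mixture-of-experts layer:
# masked experts get -inf router logits, so routing redistributes
# probability mass over the surviving experts.
import numpy as np

class ToyMoE:
    def __init__(self, n_experts=4, dim=8, top_k=2, seed=0):
        rng = np.random.default_rng(seed)
        self.router = rng.normal(size=(dim, n_experts))        # gating weights
        self.experts = rng.normal(size=(n_experts, dim, dim))  # one linear map per expert
        self.top_k = top_k
        self.active = np.ones(n_experts, dtype=bool)           # steering mask

    def deactivate(self, idx):
        self.active[idx] = False                               # switch an expert off

    def forward(self, x):
        logits = x @ self.router
        logits[~self.active] = -np.inf                         # masked experts never routed
        top = np.argsort(logits)[-self.top_k:]                 # top-k surviving experts
        gates = np.exp(logits[top] - logits[top].max())
        gates /= gates.sum()                                   # renormalize gate weights
        return sum(g * (x @ self.experts[i]) for g, i in zip(gates, top))

moe = ToyMoE()
x = np.ones(8)
y_full = moe.forward(x)
moe.deactivate(0)           # steering: turn one expert off
y_steered = moe.forward(x)  # same input, possibly different routing
```

The paper's claim is that when the deactivated experts are the ones associated with a behavior (e.g., refusals), this routing change suppresses that behavior without touching the weights.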
@tli104
Tianjian Li
2 months
Language models often produce repetitive responses, and this issue is further amplified by post-training. In this work, we introduce DARLING, a method that explicitly optimizes for both response diversity and quality within online reinforcement learning!
@jaseweston
Jason Weston
2 months
🌀Diversity Aware RL (DARLING)🌀 📝: https://t.co/MH0tui34Cb - Jointly optimizes for quality & diversity using a learned partition function - Outperforms standard RL in quality AND diversity metrics, e.g. higher pass@1/p@k - Works for both non-verifiable & verifiable tasks 🧵1/5
2
24
90
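The DARLING recipe above jointly rewards quality and diversity via a learned partition function over sampled responses. A minimal stand-in (using a naive lexical partition where the paper learns one; all names here are illustrative) shows the reward shaping:

```python
# Toy sketch of a diversity-aware reward: each response's quality score
# is scaled by a bonus that shrinks when many sampled responses fall
# into the same partition cell, so repeated responses split their credit.
from collections import Counter

def partition_key(response: str) -> str:
    # stand-in for a learned semantic partition: sorted lowercase word set
    return " ".join(sorted(set(response.lower().split())))

def diversity_aware_rewards(responses, quality_scores):
    clusters = Counter(partition_key(r) for r in responses)
    return [q / clusters[partition_key(r)]          # rarer cell -> higher bonus
            for r, q in zip(responses, quality_scores)]

resps = ["The cat sat.", "The cat sat.", "A dog ran by."]
print(diversity_aware_rewards(resps, [1.0, 1.0, 0.8]))  # [0.5, 0.5, 0.8]
```

Feeding such rewards into an online RL loop penalizes mode collapse directly, which is the mechanism behind the higher pass@k the thread reports.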
@LorenaYannnnn
Tianyi Lorena Yan
8 months
When answering queries with multiple answers (e.g., listing cities of a country), how do LMs simultaneously recall knowledge and avoid repeating themselves? 🚀 Excited to share our latest work with @robinomial! We uncover a promote-then-suppress mechanism: LMs first recall all
4
24
111
@Elenal3ai
Xiaoyan Bai
4 months
⚡️Ever asked an LLM-as-Marilyn Monroe about the 2020 election? Our paper calls this concept incongruence, common in AI and human creativity. 🧠Read my blog to learn what we found, why it matters for AI safety and creativity, and what's next. https://t.co/QoXmmJPIbK
1
6
13
@liujc1998
Jiacheng Liu
4 months
Happy to present OLMoTrace at #ACL2025NLP next week!! 🤗 If you stop by the demo session on Tuesday, July 29, 10:30am-12pm, @yanaiela and @sewon__min will be sharing how we use OLMoTrace to make LLMs more transparent. Unfortunately I'm unable to attend in-person due to visa 🥹
@liujc1998
Jiacheng Liu
7 months
Today we're unveiling OLMoTrace, a tool that enables everyone to understand the outputs of LLMs by connecting to their training data. We do this on unprecedented scale and in real time: finding matching text between model outputs and 4 trillion training tokens within seconds. ✨
0
15
45
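The matching step OLMoTrace performs (finding verbatim overlaps between a model's output and its training data) can be sketched with a toy n-gram index. The real system indexes trillions of tokens with far more scalable machinery; this is only the shape of the idea:

```python
# Illustrative sketch of output-to-corpus attribution: hash all corpus
# n-grams, then report spans of the output that appear verbatim.
def ngram_index(corpus_tokens, n):
    index = {}
    for i in range(len(corpus_tokens) - n + 1):
        index.setdefault(tuple(corpus_tokens[i:i + n]), []).append(i)
    return index

def matching_spans(output_tokens, corpus_tokens, n=3):
    index = ngram_index(corpus_tokens, n)
    return [(j, j + n) for j in range(len(output_tokens) - n + 1)
            if tuple(output_tokens[j:j + n]) in index]

corpus = "the quick brown fox jumps over the lazy dog".split()
output = "a quick brown fox appeared".split()
print(matching_spans(output, corpus))  # [(1, 4)] -> "quick brown fox"
```

Scaling this from a hash map to 4 trillion tokens with sub-second latency is the engineering contribution the tweet highlights.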
@aryan_shri123
Aryan Shrivastava
4 months
🤫Jailbreak prompts make aligned LMs produce harmful responses.🤔But is that info linearly decodable? ↗️We show many refused concepts are linearly represented, sometimes persist through instruction-tuning, and may also shape downstream behavior❗ https://t.co/75HyeO6eE9 🧵1/
1
8
20
@chrome1996
Chenghao Yang
5 months
Have you noticed… 🔍 Aligned LLM generations feel less diverse? 🎯 Base models are decoding-sensitive? 🤔 Generations get more predictable as they progress? 🌲 Tree search fails mid-generation (esp. for reasoning)? We trace these mysteries to LLM probability concentration, and
1
30
94
@universeinanegg
Ari Holtzman
5 months
New benchmark! LLMs can retrieve bits of information from ridiculously long contexts (needle-in-a-haystack) but they can't tell what's missing from relatively short documents (AbsenceBench). We can't trust LLMs to annotate or judge documents if they can't see negative space!
@harveyiyun
Harvey Yiyun Fu
5 months
LLMs excel at finding surprising “needles” in very long documents, but can they detect when information is conspicuously missing? 🫥AbsenceBench🫥 shows that even SoTA LLMs struggle on this task, suggesting that LLMs have trouble perceiving “negative space” in documents. paper:
3
17
98
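The AbsenceBench probe described above has a simple skeleton: delete known pieces of a document, then score how many of them the model names as missing. A hedged sketch (the construction and scoring below are illustrative, not the benchmark's actual harness):

```python
# Toy AbsenceBench-style probe: remove lines from a document, then score
# a detector's recall over the removed lines. The "model" that would
# produce predicted_missing is hypothetical and not shown here.
import random

def make_absence_example(doc_lines, n_remove=2, seed=0):
    rng = random.Random(seed)
    removed = set(rng.sample(range(len(doc_lines)), n_remove))
    edited = [l for i, l in enumerate(doc_lines) if i not in removed]
    return edited, [doc_lines[i] for i in sorted(removed)]

def absence_recall(predicted_missing, true_missing):
    hits = sum(1 for line in true_missing if line in predicted_missing)
    return hits / len(true_missing)

doc = [f"fact {i}" for i in range(10)]
edited, missing = make_absence_example(doc)
print(absence_recall(missing, missing))  # a perfect detector scores 1.0
```

The finding is that models scoring near-perfectly on needle-in-a-haystack retrieval still do poorly on exactly this kind of recall over removed content.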
@niloofar_mire
Niloofar
6 months
📣Thrilled to announce I’ll join Carnegie Mellon University (@CMU_EPP & @LTIatCMU) as an Assistant Professor starting Fall 2026! Until then, I’ll be a Research Scientist at @AIatMeta FAIR in SF, working with @kamalikac’s amazing team on privacy, security, and reasoning in LLMs!
226
69
1K
@lambdaviking
William Merrill
7 months
Excited to announce I'll be starting as an assistant professor at @TTIC_Connect for fall 2026! In the meantime, I'll be graduating and hanging around Ai2 in Seattle🏔️
59
25
368
@kevinyang41
Kevin Yang
7 months
Will be at NAACL next week, excited to share two of our papers: FACTTRACK: Time-Aware World State Tracking in Story Outlines https://t.co/1KcL0aCWCI THOUGHTSCULPT: Reasoning with Intermediate Revision and Search https://t.co/ZGqvEeReHr Shoutout to first authors @ZhihengLyu and
0
4
10