Joseph Jeesung Suh
@JosephJSSuh
Followers: 57 · Following: 16 · Media: 7 · Statuses: 25

CS Grad student @ BAIR, UC Berkeley

Berkeley, CA
Joined June 2024
@JosephJSSuh
Joseph Jeesung Suh
14 days
(11/11) For people who are interested, here are the links: Paper: https://t.co/WvMRy4DdjR Github: https://t.co/wEYPxMH4TU Huge thanks to my amazing PI @serinachang5 and collaborator @SuhongMoon.
github.com
GEMS: Rethinking LLM Human Simulation, When a Graph is What You Need - schang-lab/gems
@JosephJSSuh
Joseph Jeesung Suh
14 days
(10/11) Takeaway 🥡 If your simulation task is a discrete choice with relational structure, try GEMS 💎 before spinning up a 70B param model. You might get similar (or better!) accuracy with a fraction of the compute and better debuggability!
@JosephJSSuh
Joseph Jeesung Suh
14 days
(9/11) This builds on our earlier work SubPOP 🍭 (ACL 2025 main), where fine-tuning LLMs on scaled survey data reduced human-LLM gaps by up to half and generalized to new subpopulations & topics. Now we ask: when is a graph what you need? SubPOP:
aclanthology.org
Joseph Suh, Erfan Jahanparast, Suhong Moon, Minwoo Kang, Serina Chang. Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2025.
@JosephJSSuh
Joseph Jeesung Suh
14 days
(8/11) Interpretability & transparency matter. Node embeddings from GEMS reveal latent dimensions, from public opinion ideologies to pricing sensitivities 🔍 Unlike LLMs, GEMS is trained in-house from scratch 🪟 removing risks of data leakage and biases from opaque pretraining
@JosephJSSuh
Joseph Jeesung Suh
14 days
(7/11) Efficiency matters. Smaller models mean faster iteration, lower cost, and easier deployment for survey design, policy analysis, and decision support. 🚀 Also, it is much easier to scale up to larger datasets with 1000× fewer params and 100× less compute!
@JosephJSSuh
Joseph Jeesung Suh
14 days
(6/11) Our datasets and settings: We test 3 settings - predicting missing responses (i.e., imputation), new individuals, and new questions - and 3 datasets spanning public opinion, personality traits, economics experiments, and grammar skills.
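To make these three settings concrete, here is a minimal illustrative sketch (not the paper's code; the matrix size and split ratios are invented) of how each one holds out parts of a person × question response matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
R = rng.integers(0, 4, size=(100, 40))  # toy response matrix: 100 people x 40 questions

# Setting 1: imputation -- hide random (person, question) cells and predict them
impute_mask = rng.random(R.shape) < 0.2

# Setting 2: new individuals -- hold out entire people (rows) unseen during training
test_people = rng.choice(100, size=20, replace=False)

# Setting 3: new questions -- hold out entire questions (columns)
test_questions = rng.choice(40, size=8, replace=False)
```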
@JosephJSSuh
Joseph Jeesung Suh
14 days
(5/11) Key finding: A GNN that’s ~1000× smaller than LLMs consistently matches or surpasses them at predicting human behaviors across datasets and settings, while being far more interpretable and transparent. 💡
@JosephJSSuh
Joseph Jeesung Suh
14 days
(4/11) Why graphs? Relational structure is the signal for many human behaviors: for example, a person who is ‘worried’ about ‘health effects of COVID-19’ would likely ‘often’ ‘watch public health news’. GEMS learns from those relations directly on graphs.
@JosephJSSuh
Joseph Jeesung Suh
14 days
(3/11) Meet GEMS 💎 — Graph-basEd Models for human Simulation. We cast human simulation as link prediction on a heterogeneous graph: nodes = individuals, subgroups, choices; edges = individual ↔ subgroup, individual ↔ choice. Simple, transparent, and fast. ⚡
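A minimal sketch of this formulation, assuming PyTorch Geometric (the node/edge type names, feature sizes, and GraphSAGE encoder are illustrative assumptions, not the released GEMS implementation):

```python
import torch
import torch.nn.functional as F
import torch_geometric.transforms as T
from torch_geometric.data import HeteroData
from torch_geometric.nn import SAGEConv, to_hetero

# Toy heterogeneous graph: individuals, subgroups, and answer choices.
data = HeteroData()
data['individual'].x = torch.randn(100, 16)  # 100 respondents
data['subgroup'].x = torch.randn(10, 16)     # 10 demographic subgroups
data['choice'].x = torch.randn(50, 16)       # 50 answer options across questions

# individual -> subgroup membership (one subgroup per person in this toy setup)
data['individual', 'belongs_to', 'subgroup'].edge_index = torch.stack(
    [torch.arange(100), torch.randint(0, 10, (100,))])
# observed individual -> choice responses (300 recorded answers)
data['individual', 'selected', 'choice'].edge_index = torch.stack(
    [torch.randint(0, 100, (300,)), torch.randint(0, 50, (300,))])
data = T.ToUndirected()(data)  # add reverse edges so messages flow both ways

class GNN(torch.nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.conv1 = SAGEConv((-1, -1), hidden)
        self.conv2 = SAGEConv((-1, -1), hidden)

    def forward(self, x, edge_index):
        return self.conv2(self.conv1(x, edge_index).relu(), edge_index)

encoder = to_hetero(GNN(), data.metadata())
emb = encoder(data.x_dict, data.edge_index_dict)

# Link prediction: score (individual, choice) pairs by embedding dot product,
# then softmax over one question's options to get a response distribution.
scores = emb['individual'][0] @ emb['choice'][:4].t()
probs = F.softmax(scores, dim=-1)
```

The final softmax over one question's candidate options turns link scores into a predicted response distribution, matching the discrete-choice framing in (2/11).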
@JosephJSSuh
Joseph Jeesung Suh
14 days
(2/11) Why discrete‑choice? A lot of “human simulation” with LLMs is predicting which choice an individual would pick from a small set:
• Respondents in opinion polls
• Customers choosing one item over another
• Game players with finite next actions
• Students answering MCQs
@JosephJSSuh
Joseph Jeesung Suh
14 days
LLMs have dominated recent work on simulating human behaviors. But do you really need them? In discrete‑choice settings, our answer is: not necessarily. A lightweight graph neural network (GNN) can match or beat strong LLM-based methods. Paper: https://t.co/WvMRy4DdjR 🧵👇
@joshminwookang
Minwoo (Josh) Kang
5 months
🤔 Do LLMs exhibit in-group↔out-group perceptions like us? ❓ Can they serve as faithful virtual subjects of human political partisans? Excited to share our paper on taking LLM virtual personas to the *next level* of depth! 🔗 https://t.co/LzeDAMtrEV 🧵
@rajivmovva
Raj Movva
8 months
💡New preprint & Python package: We use sparse autoencoders to generate hypotheses from large text datasets. Our method, HypotheSAEs, produces interpretable text features that predict a target variable, e.g. features in news headlines that predict engagement. 🧵1/
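For readers new to sparse autoencoders, here is a bare-bones sketch of the core idea (illustrative only; the HypotheSAEs package adds hypothesis generation and feature interpretation on top): learn an overcomplete, sparsity-penalized code of text embeddings so that individual hidden units become candidate interpretable features.

```python
import torch
import torch.nn.functional as F

class SparseAutoencoder(torch.nn.Module):
    def __init__(self, d_in=768, d_hidden=4096):
        super().__init__()
        self.enc = torch.nn.Linear(d_in, d_hidden)  # overcomplete code
        self.dec = torch.nn.Linear(d_hidden, d_in)

    def forward(self, x):
        z = F.relu(self.enc(x))  # nonnegative, (hopefully) sparse activations
        return self.dec(z), z

sae = SparseAutoencoder()
x = torch.randn(32, 768)  # batch of text embeddings
x_hat, z = sae(x)
loss = F.mse_loss(x_hat, x) + 1e-3 * z.abs().mean()  # reconstruction + L1 sparsity
```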
@lexin_zhou
Lexin Zhou
8 months
New Paper: We unlock AI Evaluation with explanatory and predictive power through general ability scales!
- Explains what common benchmarks really measure
- Extracts explainable ability profiles of AI systems
- Predicts performance for new task instances, in & out-of-distribution 🧵
@JosephJSSuh
Joseph Jeesung Suh
9 months
For people who are interested, here are the links: Paper: https://t.co/zQ2klONwCM Github: https://t.co/r1v8QqSw0C This work would not have been possible without our amazing PI @serinachang5 and collaborators @erfan_jp, @SuhongMoon, @joshminwookang, and Prof. John Canny.
github.com
[ACL 2025 Long Main] Language Model Fine-Tuning on Scaled Survey Data for Predicting Distributions of Public Opinions - JosephJeesungSuh/subpop
@JosephJSSuh
Joseph Jeesung Suh
9 months
Why does this matter? Researchers often need to estimate responses for unseen subpopulations or newly formulated questions (or both), especially in the early stages of survey design. Our approach helps fill these gaps when immediate large-scale human polling isn't available.
@JosephJSSuh
Joseph Jeesung Suh
9 months
Beyond accuracy, generalization is crucial. Fine-tuned models exhibit stable prediction improvements for:
• Unseen subpopulations (not in the fine-tuning data)
• New survey topics
• Different survey families (American Trends Panel → General Social Survey)
@JosephJSSuh
Joseph Jeesung Suh
9 months
Key finding: Fine-tuning our LLMs drastically narrows the human-LLM opinion gap—by up to 46%. Even better, every subgroup sees consistent improvement, addressing previous concerns that LLM-based methods might favor certain demographics' opinions over others.
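As a toy illustration of what an "opinion gap" can mean numerically (the paper's exact distance metric may differ, and these response shares are invented), one common choice is the total variation distance between the human and model response distributions:

```python
import numpy as np

# Hypothetical response shares over four answer options
human = np.array([0.45, 0.30, 0.15, 0.10])
model = np.array([0.30, 0.35, 0.20, 0.15])

tv_gap = 0.5 * np.abs(human - model).sum()  # total variation distance
print(f"human-model opinion gap (TV): {tv_gap:.2f}")
```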
@JosephJSSuh
Joseph Jeesung Suh
9 months
Meet SubPOP! 🍭 SubPOP is a dataset of 70K subpopulation-response pairs (6.5× larger than past work), curated from two major opinion survey families. We fine-tune LLMs on SubPOP to match their response distributions to those of human subjects.
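A hedged sketch of what "matching response distributions" can look like in code (not the SubPOP training pipeline; the option shares and the KL objective are assumptions): treat the LM's probabilities over the answer-option tokens as a predicted distribution and push it toward the surveyed human distribution.

```python
import torch
import torch.nn.functional as F

human = torch.tensor([0.45, 0.30, 0.15, 0.10])  # surveyed shares for options A-D
option_logits = torch.randn(4, requires_grad=True)  # stand-in for the LM's logits on "A".."D"

log_model = F.log_softmax(option_logits, dim=-1)
loss = F.kl_div(log_model, human, reduction='sum')  # KL(human || model)
loss.backward()  # in practice, gradients flow into the fine-tuned LM's weights
```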
@JosephJSSuh
Joseph Jeesung Suh
9 months
However, there hasn't been a survey dataset that is:
1. large-scale, with expansive sets of survey data sufficient for fine-tuning LLMs
2. high quality, with careful filtering and curation
3. capable of evaluating model generalization across topics & styles