Matthias Orlikowski

@morlikow

Followers: 629 · Following: 3K · Media: 24 · Statuses: 1K

NLProc, Computational Social Science • Human Label Variation, Disagreement, Subjectivity • PhD candidate @unibielefeld • he/him • EN, DE

Bielefeld, Germany
Joined February 2015
@morlikow
Matthias Orlikowski
4 months
I will be at #acl2025 to present "Beyond Demographics: Fine-tuning Large Language Models to Predict Individuals’ Subjective Text Perceptions" ✨ Heartfelt thank you to my collaborators @jiaxin_pei @paul_rottger @pcimiano @david__jurgens Dirk Hovy. More below:
1 reply · 2 retweets · 12 likes
@tiancheng_hu
Tiancheng Hu
23 days
Can AI simulate human behavior? 🧠 The promise is revolutionary for science & policy. But there’s a huge "IF": Do these simulations actually reflect reality? To find out, we introduce SimBench: The first large-scale benchmark for group-level social simulation. (1/9)
3 replies · 22 retweets · 54 likes
@suvarna_ashima
Ashima Suvarna🌻
3 months
1/ 🧵 New #EMNLP2025 paper!! Toxicity detection is subjective, shaped by norms, identity, & context. Existing models and datasets overlook this nuance. Enter MODELCITIZENS: a new dataset designed to address this. ✔️ 6.8K posts, 40K annotations across diverse groups ✔️
3 replies · 7 retweets · 41 likes
@liweijianglw
Liwei Jiang
4 months
We have put up all slide decks on the tutorial website: https://t.co/xhQfa5wylv 🥳🥳🥳 Although I was only able to deliver the tutorial remotely due to visa constraints, I was really thrilled to learn that it drew a nearly full room of attendees for the entire 3.5 hours!
@liweijianglw
Liwei Jiang
4 months
🥳🥳🥳 Join us at the tutorial Guardrails and Security for LLMs: Safe, Secure, and Controllable Steering of LLM Applications! Time: 14:00-17:30, July 27. Location: Hall CT8
1 reply · 10 retweets · 60 likes
@paul_rottger
Paul Röttger @ EMNLP
4 months
Very excited about all these papers on sociotechnical alignment & the societal impacts of AI at #ACL2025. As is now tradition, I made some timetables to help me find my way around. Sharing here in case others find them useful too :) 🧵
4 replies · 12 retweets · 124 likes
@StellaLisy
Stella Li
4 months
WHY do you prefer something over another? Reward models treat preference as a black box 😶‍🌫️ but human brains 🧠 decompose decisions into hidden attributes. We built the first system to mirror how people really make decisions in our #COLM2025 paper 🎨 PrefPalette ✨ Why it matters 👉🏻🧵
6 replies · 84 retweets · 413 likes
@Schropes
Hope Schroeder
4 months
🗣️ Excited to share our new #ACL2025 Findings paper: “Just Put a Human in the Loop? Investigating LLM-Assisted Annotation for Subjective Tasks” with @jad_kabbara and @dkroy. Arxiv: https://t.co/FeWQLQxt5K Read about our findings ⤵️
arxiv.org
LLM use in annotation is becoming widespread, and given LLMs' overall promising performance and speed, simply "reviewing" LLM annotations in interpretive tasks can be tempting. In subjective...
1 reply · 10 retweets · 56 likes
@morlikow
Matthias Orlikowski
4 months
More detail on the paper in this thread:
@morlikow
Matthias Orlikowski
7 months
Can LLMs learn to simulate individuals' judgments based on their demographics? Not quite! In our new paper, we found that LLMs do not learn information about demographics, but instead learn individual annotators' patterns based on unique combinations of attributes! 🧵
0 replies · 0 retweets · 0 likes
@morlikow
Matthias Orlikowski
4 months
I will present on Monday, July 28, during Poster Session 1 in the Human-Centered NLP track. The session runs 11:00-12:30 in Halls 4 and 5. Looking forward to discussing our work! https://t.co/BoLWVGtcs4
arxiv.org
People naturally vary in their annotations for subjective questions and some of this variation is thought to be due to the person's sociodemographic characteristics. LLMs have also been used to...
1 reply · 0 retweets · 0 likes
@MilaNLProc
MilaNLP
4 months
🎉 The @MilaNLProc lab is excited to present 15 papers and 1 tutorial at #ACL2025 & workshops! Grateful to all our amazing collaborators, see everyone in Vienna! 🚀
0 replies · 6 retweets · 18 likes
@NLLG_lab
NLLG
6 months
📢📢👇 New job openings. Topic: social bias detection + analysis with LLMs across time (1950-now) & languages. There are 2 Post-Doc/PhD positions, supervised by @egere14 (@utn_nuremberg) + Simone Ponzetto (@dwsunima). Fully funded, up to 3 years. More info:
0 replies · 3 retweets · 10 likes
@nicole__meister
Nicole Meister
1 year
Prior work has used LLMs to simulate survey responses, yet their ability to match the distribution of views remains uncertain. Our new paper [ https://t.co/DleesiPbif] introduces a benchmark to evaluate how distributionally aligned LLMs are with human opinions. 🧵
4 replies · 38 retweets · 160 likes
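Not the metric from the linked preprint, but a minimal sketch of what measuring distributional alignment can look like in practice: sample many answers from an LLM for a survey question, then compare the resulting answer distribution to the human one, e.g. with total variation distance. All data below are made up.

```python
from collections import Counter

def answer_distribution(answers, options):
    """Convert a list of categorical answers into a probability distribution over options."""
    counts = Counter(answers)
    total = len(answers)
    return {opt: counts.get(opt, 0) / total for opt in options}

def total_variation(p, q):
    """Total variation distance between two distributions over the same options."""
    return 0.5 * sum(abs(p[o] - q[o]) for o in p)

options = ["agree", "neutral", "disagree"]
human_answers = ["agree"] * 55 + ["neutral"] * 25 + ["disagree"] * 20  # hypothetical survey responses
llm_answers = ["agree"] * 80 + ["neutral"] * 10 + ["disagree"] * 10    # hypothetical sampled LLM answers

p_human = answer_distribution(human_answers, options)
p_llm = answer_distribution(llm_answers, options)
print(f"TV distance: {total_variation(p_human, p_llm):.2f}")  # 0.25; lower means better aligned
```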
@angelhwang6
Angel Hsing-Chi Hwang
7 months
📣Calling all #CHI2025 attendees who work with human participants: Join our panel discussion on #LLM, #simulation, #syntheticdata, and the future of human subjects research on Apr 30 (Wed), 2:10 - 3:40 PM (JP Time) Post your questions for panelists here: https://t.co/FxbwBA3nW0
3 replies · 18 retweets · 107 likes
@morlikow
Matthias Orlikowski
7 months
There is more detail and additional analysis in the paper! You can read it on arXiv; happy to receive any comments or questions! Preprint:
arxiv.org
People naturally vary in their annotations for subjective questions and some of this variation is thought to be due to the person's sociodemographic characteristics. LLMs have also been used to...
0 replies · 4 retweets · 6 likes
@morlikow
Matthias Orlikowski
7 months
Our findings underscore that LLMs can’t be expected to be accurate models of individual variation based on sociodemographics. We should not use LLMs to attempt “simulation”, in particular when we do not have access to examples of individual behaviour.
1 reply · 0 retweets · 0 likes
@morlikow
Matthias Orlikowski
7 months
Attributes help most for annotators with unique sociodemographic profiles. Apparently, LLMs learn to use unique combinations as a proxy ID! Learning from individual-level examples provides richer information than knowing sociodemographics.
1 reply · 0 retweets · 1 like
@morlikow
Matthias Orlikowski
7 months
But why did attributes improve predictions when we tested with known annotators? We started to wonder: Is performance linked to how many annotators our models see for each combination of attributes (sociodemographic profile)? We compare unique and frequent profiles.
1 reply · 0 retweets · 1 like
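A minimal sketch of that comparison, assuming each annotator record carries the four DeMo attributes (the records and field names below are hypothetical): group annotators by their full attribute combination and separate profiles held by exactly one annotator from profiles shared by several.

```python
from collections import defaultdict

# Hypothetical annotator records; the real DeMo attributes are age, gender, race, education.
annotators = [
    {"id": "a1", "age": "18-29", "gender": "woman", "race": "white", "education": "college"},
    {"id": "a2", "age": "18-29", "gender": "woman", "race": "white", "education": "college"},
    {"id": "a3", "age": "60+",   "gender": "man",   "race": "black", "education": "graduate"},
]

# Group annotator IDs by their sociodemographic profile (full attribute combination).
profiles = defaultdict(list)
for a in annotators:
    profiles[(a["age"], a["gender"], a["race"], a["education"])].append(a["id"])

unique_profiles = {p: ids for p, ids in profiles.items() if len(ids) == 1}
shared_profiles = {p: ids for p, ids in profiles.items() if len(ids) > 1}
print(f"{len(unique_profiles)} unique profile(s), {len(shared_profiles)} shared profile(s)")
```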
@morlikow
Matthias Orlikowski
7 months
Ok, but surely attributes are much more useful when transferring to annotators not seen in training? This setting is rarely tested in NLP, so we built a separate partitioning of DeMo for this evaluation. Turns out no model improves over the baseline!
1 reply · 0 retweets · 1 like
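Holding out whole annotators, rather than just items, is what makes this an unseen-annotators evaluation. A minimal sketch using scikit-learn's GroupShuffleSplit with annotator IDs as groups; this illustrates the idea and is not necessarily how the paper's DeMo partition was built.

```python
from sklearn.model_selection import GroupShuffleSplit

# Hypothetical annotation rows: (text, label, annotator_id).
rows = [
    ("post 1", 0, "a1"), ("post 1", 1, "a2"),
    ("post 2", 1, "a1"), ("post 2", 0, "a3"),
    ("post 3", 1, "a3"), ("post 3", 0, "a2"),
]
groups = [annotator_id for _, _, annotator_id in rows]

# Split at the annotator level so test annotators never appear in training.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.33, random_state=0)
train_idx, test_idx = next(splitter.split(rows, groups=groups))

train_annotators = {groups[i] for i in train_idx}
test_annotators = {groups[i] for i in test_idx}
assert train_annotators.isdisjoint(test_annotators)
print("train:", sorted(train_annotators), "test:", sorted(test_annotators))
```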
@morlikow
Matthias Orlikowski
7 months
Sociodemographic prompting and models fine-tuned with only the text content are our baselines. We compare against fine-tuning with annotator attributes or unique annotator identifiers (IDs). Trends are clear: attributes help a bit, but IDs are much more accurate!
1 reply · 0 retweets · 0 likes
@morlikow
Matthias Orlikowski
7 months
We compare models on DeMo, a dataset of subjective classification tasks with annotator attributes: age, gender, race and education. We curate DeMo from existing datasets and normalise attributes to increase comparability. DeMo is available in our repo:
github.com
Data and experiment code for "Beyond Demographics: Fine-tuning Large Language Models to Predict Individuals’ Subjective Text Perceptions" (ACL2025) - morlikowski/beyond-demographics
1 reply · 0 retweets · 1 like
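Curating DeMo from existing datasets implies mapping differently coded attributes onto a shared scheme. A minimal, made-up sketch of that kind of normalisation (the actual mappings used for DeMo are in the linked repo):

```python
# Hypothetical harmonisation of attribute codings from different source datasets.
AGE_BUCKETS = [(0, 29, "18-29"), (30, 44, "30-44"), (45, 59, "45-59"), (60, 200, "60+")]
GENDER_MAP = {"f": "woman", "female": "woman", "woman": "woman",
              "m": "man", "male": "man", "man": "man"}

def normalise_age(raw):
    """Map a raw age (an int, or a string range like '25-34') onto shared buckets."""
    if isinstance(raw, str) and "-" in raw:
        raw = raw.split("-")[0]  # use the lower bound of a reported range
    age = int(raw)
    for low, high, label in AGE_BUCKETS:
        if low <= age <= high:
            return label
    return "unknown"

def normalise_gender(raw):
    """Map raw gender codings onto shared labels, keeping an explicit fallback."""
    return GENDER_MAP.get(str(raw).strip().lower(), "other/unknown")

print(normalise_age("25-34"), normalise_gender("F"))  # -> 18-29 woman
```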