smitha milli
@SmithaMilli
research scientist, FAIR; opinions are my own 🥺 👉👈
nyc · Joined November 2011
3K Followers · 6K Following · 14 Media · 151 Statuses
@SmithaMilli
smitha milli
3 months
Today we're releasing Community Alignment - the largest open-source dataset of human preferences for LLMs, containing ~200k comparisons from >3000 annotators in 5 countries / languages! There was a lot of research that went into this... 🧵
12 replies · 70 reposts · 330 likes
@xuanalogue
xuan (ɕɥɛn / sh-yen)
8 days
sad that they've taken the Cooperative Inverse Reinforcement Learning paper out of most AI alignment syllabi, since assistance games are such a useful formalism for thinking about how to make LLM assistants more rational
1 reply · 3 reposts · 18 likes
@dmshanmugam
Divya Shanmugam
12 days
I am on the job market this year! My research advances methods for reliable machine learning from real-world data, with a focus on healthcare. Happy to chat if this is of interest to you or your department/team.
4 replies · 49 reposts · 247 likes
@2plus2make5
Emma Pierson
11 days
One month left to apply for our postdoc position @berkeley_ai! Apply here: https://t.co/fdGyaB6f8R.
[Link card: aprecruit.berkeley.edu · "University of California, Berkeley is hiring. Apply now!"]
@2plus2make5
Emma Pierson
2 months
🚨 New postdoc position in our lab @Berkeley_EECS! 🚨 (please retweet + share with relevant candidates) We seek applicants with experience in language modeling who are excited about high-impact applications in the health and social sciences! More info in thread 1/3
0 replies · 5 reposts · 19 likes
@manoelribeiro
Manoel
1 month
Social media feeds today are optimized for engagement, often leading to misalignment between users' intentions and technology use. In a new paper, we introduce Bonsai, a tool to create feeds based on stated preferences, rather than predicted engagement.
1 reply · 13 reposts · 39 likes
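For intuition, here is a minimal sketch of the idea behind a tool like Bonsai. This is not Bonsai's actual code; all post data, topic names, and preference weights below are hypothetical. The point is only the swap of ranking key: a user's stated topic preferences instead of a predicted-engagement score.

```python
# Hypothetical posts with topic tags and an engagement-model score.
posts = [
    {"id": 1, "topics": {"politics": 0.9}, "predicted_engagement": 0.95},
    {"id": 2, "topics": {"science": 0.8, "health": 0.2}, "predicted_engagement": 0.40},
    {"id": 3, "topics": {"sports": 1.0}, "predicted_engagement": 0.70},
]

# Weights the user explicitly stated (negative = "show me less of this").
stated_preferences = {"science": 1.0, "health": 0.5, "politics": -0.5}

def stated_score(post):
    """Score a post by how well its topics match the stated preferences."""
    return sum(w * stated_preferences.get(t, 0.0) for t, w in post["topics"].items())

engagement_feed = sorted(posts, key=lambda p: -p["predicted_engagement"])
stated_feed = sorted(posts, key=lambda p: -stated_score(p))

print([p["id"] for p in engagement_feed])  # [1, 3, 2]
print([p["id"] for p in stated_feed])      # [2, 3, 1]
```

The two orderings diverge exactly when engagement-maximizing content is not what the user says they want, which is the misalignment the tweet describes.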
@marksibrahim
Mark Ibrahim
17 days
One can manipulate LLM rankings to put any model in the lead, merely by modifying the single character separating demonstration examples. Learn more in our new paper https://t.co/D8CzSpPxMU w/ Jingtong Su, Jianyu Zhang, @karen_ullrich, and Léon Bottou. 1/3 🧵
1 reply · 3 reposts · 11 likes
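To make "the single character separating demonstration examples" concrete, here is a minimal sketch (hypothetical prompts, not the paper's code) of two few-shot evaluation prompts that are identical except for that one joining character.

```python
# Hypothetical few-shot demonstrations for an evaluation prompt.
demos = [
    ("Translate 'chat' to English.", "cat"),
    ("Translate 'chien' to English.", "dog"),
]
query = "Translate 'oiseau' to English."

def build_prompt(separator: str) -> str:
    """Join the demonstrations with `separator`; the query goes last."""
    blocks = [f"Q: {q}\nA: {a}" for q, a in demos]
    blocks.append(f"Q: {query}\nA:")
    return separator.join(blocks)

# The only difference between these prompts is one character ("\n" vs " "),
# yet per the paper, tuning that choice can reorder model leaderboards.
prompt_newline = build_prompt("\n")
prompt_space = build_prompt(" ")
print(prompt_newline == prompt_space)  # False
```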
@steverathje2
Steve Rathje
25 days
🚨 New preprint 🚨 Across 3 experiments (n = 3,285), we found that interacting with sycophantic (or overly agreeable) AI chatbots entrenched attitudes and led to inflated self-perceptions. Yet, people preferred sycophantic chatbots and viewed them as unbiased! Thread 🧵
4 replies · 37 reposts · 137 likes
@chengmyra1
Myra Cheng
23 days
AI always calling your ideas “fantastic” can feel inauthentic, but what are sycophancy’s deeper harms? We find that in the common use case of seeking AI advice on interpersonal situations—specifically conflicts—sycophancy makes people feel more right & less willing to apologize.
6 replies · 50 reposts · 196 likes
@SadhikaMalladi
Sadhika Malladi
1 month
Excited to share that I will be starting as an Assistant Professor in CSE at UCSD (@ucsd_cse) in Fall 2026! I am currently recruiting PhD students who want to bridge theory and practice in deep learning - see here:
37 replies · 70 reposts · 525 likes
@SmithaMilli
smitha milli
1 month
do you use Letterboxd? would you be willing to participate in a 30-min research study where you use movie recommenders based on your Letterboxd ratings? DM me! (you will receive $20 for participating)
0 replies · 3 reposts · 13 likes
@jugander
Johan Ugander
1 month
📣Yale social algorithms workshop, Oct 16-17!📣 What's new in content ranking? Content moderation? How can platforms promote civility? Hosted by Yale's Institute for Foundations of Data Science. Great speakers! Submit posters by 9/22! Spread the word!
[Link card: yalefds.swoogo.com · "As social media algorithms increasingly mediate social experiences, there has been a rapid increase in research on the effects of how these algorithms are configured, alternatives to engagement-cen..."]
2 replies · 2 reposts · 11 likes
@ryan_t_lowe
Ryan Lowe 🥞
4 months
Introducing: Full-Stack Alignment 🥞 A research program dedicated to co-aligning AI systems *and* institutions with what people value. It's the most ambitious project I've ever undertaken. Here's what we're doing: 🧵
13 replies · 44 reposts · 207 likes
@SmithaMilli
smitha milli
3 months
And this is not the end! 😉 If you want to support us in doing more of these releases, email communityalignment@meta.com (or me) with feedback on what you liked about CA and what you want to see more of.
Paper: https://t.co/0XorBjggtv
Dataset: [Link card: huggingface.co]
0 replies · 0 reposts · 30 likes
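For readers who want to poke at the release, a hedged sketch of loading it with the Hugging Face `datasets` library. The repository ID below is a placeholder, not the real one; follow the link in the tweet for the actual dataset page.

```python
from datasets import load_dataset

# Placeholder repo ID (hypothetical) -- substitute the real one from the tweet's link.
ca = load_dataset("meta/community-alignment")
print(ca)  # splits and features: prompts, candidate responses, choices, explanations
```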
@SmithaMilli
smitha milli
3 months
This was a big project and collective effort -- major thanks to all the collaborators (see image)🙏 @lilyhzhang and I will be presenting it at the ICML MoFA workshop on Friday, say hi if you want to chat more!
1 reply · 4 reposts · 19 likes
@SmithaMilli
smitha milli
3 months
Finally, based on these insights we collect Community Alignment (CA). Features include:
- NC-sampled candidate responses
- Multilingual
- >2500 prompts annotated by >= 10 people
- Natural-language explanations for > 1/4 of choices
and more!
1 reply · 1 repost · 10 likes
@SmithaMilli
smitha milli
3 months
We show that using NC-sampled candidates significantly improves the ability of alignment methods to learn heterogeneous preferences. Win rates jump from random chance to ~0.8 in the settings we tested.
1 reply · 0 reposts · 12 likes
@SmithaMilli
smitha milli
3 months
To produce more diverse candidate sets, rather than sampling them independently, you want some kind of "negatively-correlated (NC) sampling", where sampling one candidate makes other similar ones less likely. Turns out, prompting can implement this decently well 🤡
1 reply · 0 reposts · 14 likes
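A rough sketch of that general recipe, with the caveats that `generate` is a stand-in for any LLM completion call and the prompt wording is mine, not the paper's: instead of drawing K candidates independently, ask the model for all K in one call and instruct it to spread them across different values, so generating one response implicitly suppresses near-duplicates.

```python
import re

def independent_sampling(generate, prompt: str, k: int) -> list[str]:
    """Baseline: k i.i.d. samples, which tend to cluster around one mode."""
    return [generate(prompt) for _ in range(k)]

def nc_sampling_via_prompting(generate, prompt: str, k: int) -> list[str]:
    """Approximate NC sampling with one joint, diversity-seeking request."""
    joint_prompt = (
        f"{prompt}\n\n"
        f"Write {k} candidate responses that reflect substantively different "
        f"values and perspectives. Number them 1 to {k}."
    )
    completion = generate(joint_prompt)
    # Naive parse of the numbered list; real code would be more robust.
    parts = re.split(r"\n\s*\d+\.\s*", "\n" + completion)
    return [p.strip() for p in parts if p.strip()][:k]
```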
@SmithaMilli
smitha milli
3 months
Intuitively, if all the candidate responses only cover one set of values, then you'll never be able to learn preferences outside of those values. it's like if someone asks me to pick between four types of apples... like hello ??? i want a mango, but you won't be measuring that
1 reply · 1 repost · 21 likes
@SmithaMilli
smitha milli
3 months
Standard alignment methods fail to learn common human preferences (as identified from our joint human-model study) from existing preference datasets because the candidate responses that people choose from are too homogeneous, even when they are sampled from multiple models.
1 reply · 2 reposts · 18 likes
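A toy illustration of why homogeneity blocks learning (my own, not from the paper), using a linear reward model r(x) = w . phi(x): in Bradley-Terry-style preference learning, the gradient scales with phi(chosen) - phi(rejected), so any value dimension on which the two candidates agree contributes exactly zero signal, no matter how many such comparisons you collect.

```python
import numpy as np

# Toy features: phi = [formality, directness]. Both candidates are
# equally formal, so the pair is homogeneous along that dimension.
phi_chosen = np.array([0.9, 0.8])
phi_rejected = np.array([0.9, 0.1])

# Preference-learning gradients scale with this difference; the zero
# entry means the "formality" weight can never be updated from this pair.
print(phi_chosen - phi_rejected)  # [0.  0.7]
```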
@SmithaMilli
smitha milli
3 months
We started by conducting a joint human study and model evaluation covering 15,000 nationally representative participants from 5 countries and 21 LLMs. We found that the LLMs exhibited an *algorithmic monoculture*: they were all aligned with the same minority of human preferences.
1 reply · 5 reposts · 21 likes