Saadia Gabriel
@GabrielSaadia
Followers
1K
Following
465
Media
15
Statuses
248
UCLA NLP Prof. Previously UW, MIT and NYU.
Los Angeles, CA
Joined August 2019
Didn't make it to #EMNLP25 but my amazing co-author Sophie and our poster made it to Suzhou!! Paper: https://t.co/AEYrvCBoDj Underline: https://t.co/xs14CSQsfo
1
2
8
Even before @mmitchell_ai recently raised this discussion, I've had conversation after conversation with students & new grads struggling with this exact dilemma. I want to help! Here's a live thread of AI-related opportunities for those looking to do good & make (enough) money:
9
24
125
Also not to be missed: Sophie will be presenting our poster for ModelCitizens the day before (session 2, 11-12:30pm).
0
2
7
I will unfortunately only be at EMNLP virtually, but everyone there should see Genglin's oral presentation of our work on MOSAIC (session 11, 11/6 4:30-6pm)!!
0
0
3
Our latest post explores on-policy distillation, a training approach that unites the error-correcting relevance of RL with the reward density of SFT. When training it for math reasoning and as an internal chat assistant, we find that on-policy distillation can outperform other
62
392
3K
We've been running the UCLA NLP Seminar for a while now and realized it's a waste not to share these amazing talks more broadly. So here's our YouTube channel now! Watch and subscribe to our channel for past and upcoming sessions: https://t.co/dbGIRMAAS4
#AI #UCLANLP
youtube.com
We are a group of researchers working on Natural Language Processing and Large Language Models at UCLA. For more detailed information, please visit our websites: Prof. Kai-Wei Chang: http://kwchang...
3
20
104
PhD Students in GenAI/RL! Our team at FAIR is hiring a Research Intern for Summer 2026 to push the boundaries of multimodal multi-agent social interaction. Learn more and apply: https://t.co/7P66mnEY97
metacareers.com
Meta's mission is to build the future of human connection and the technology that makes it possible.
7
48
319
It was so fun organizing the @WiMLworkshop Lunch social at @COLM_conf today with @nikitasaxena02, @kim__minseon and Zena! We had such an amazing set of speakers and roundtables... loved the energy in the room #COLM2025 #wiml
3
10
50
Proud advisor moment at #COLM2025! Congrats to all the organizers for a wonderful week. I'm ready for COLM 3...but first workshops and then back to the West Coast Monday where I'll be speaking at Tech St Santa Monica for LA Tech Week.
2
3
36
(Thu Oct 9, 11:00am–1:00pm) Poster Session 5 Poster: X-Teaming: Multi-Turn Jailbreaks and Defenses with Adaptive Multi-Agents; w/ amazing co-leads @salman1422571 @jamesnshiffer In this work, we introduce a comprehensive and easy-to-run
Although I can't attend #COLM2025 in person this year, my collaborators and co-organizers are running some exciting sessions. Be sure to check them out! (1/N)
0
4
5
Kicking off another quarter of my 269 NLP Ethics seminar with a new focus on mechanistic interpretability
0
0
19
Looking forward to jumping from the first week of teaching at UCLA to seeing everyone at COLM in Montreal next week! I'll be at the WiML mentoring session Tuesday, then Friday I'm giving a keynote at Social Simulation with LLMs and a talk at Visions of Language Modeling.
1
0
11
⚠️ The #CHI2026 paper I submitted? It almost didn't exist. That's the BTS part academics never post. So I will...to normalize what I call unglamorous persistence. This summer was one of my hardest, mentally. Between global (funding crises in academia, political tension) and
1
2
56
Wondering whether AI debates can drive biased perspectives toward truth? Our answer is YES and this scalable oversight work is now accepted to #NeurIPS2025 ! Finally bringing a large-scale human study into an AI conference! (+++ my first time as a last-ish author is very fun!
NEW work on scalable oversight for controversial claims! TL;DR: AI debates help people with differing prior beliefs better assess the truth in controversial cases, even when their initial beliefs are inaccurate, showing a
1
6
43
One of my most exciting results lately! We identify experts in MoE models for properties like safety and faithfulness, and steer them to improve/hurt model faithfulness and safety. Most shockingly, with SteerMoE, we can jailbreak 100% of safety guardrails for open models. Details
You can bypass ALL safety guardrails of GPT-OSS-120B. How? By detecting behavior-associated experts and switching them on/off. Steering MoE LLMs via Expert (De)Activation https://t.co/U2YRyXon4H
5
36
261
Honored to be back on TIME100 AI for 2025, alongside my longtime heroes @drfeifei and @BarzilayRegina! The recognition goes to my amazing students and colleagues, who strive to find ways to use AI to better humanity, as opposed to making AI for the sake of making AI better
40
39
490
We are hiring in Bespoke Labs for a new role: Member of Technical Staff: AI Data and RL Environments. Work on data curation strategies with the team that created OpenThoughts. Invent novel data recipes, strategies of curating datasets, environments, tasks and verifiers. (My
6
15
142
Huge thanks to the annotators who made this work possible. Done in collaboration w/ the amazing co-authors: @christinachanc, @karolinaranjo, @hamidpalangi, Sophie, @tom_hartvigsen and my incredible advisor @GabrielSaadia!! Data: https://t.co/Ts7WFzwklQ Code :
huggingface.co
0
1
4
Announcing ModelCitizens at EMNLP 2025: very excited for Ashima's new work on participatory and context-aware design for online safety tools, finally aligning them with the communities they're deployed to protect!
1/ New #EMNLP2025 Paper!! Toxicity detection is subjective; shaped by norms, identity, & context. Existing models and datasets overlook this nuance. Enter MODELCITIZENS: a new dataset designed to address this. 6.8K posts, 40K annotations across diverse groups
0
1
4