
Sanmi Koyejo
@sanmikoyejo
Followers: 3K
Following: 717
Media: 3
Statuses: 581
I lead @stai_research at Stanford. Co-founder @VirtueAI_co
Joined September 2014
"Are Emergent Abilities of Large Language Models a Mirage?" is a NeurIPS outstanding paper! 🙌🏿 Congrats especially to the students @RylanSchaeffer @BrandoHablando & other awardees. If you want to learn more, check out the oral & poster 👇🏿 this afternoon (Dec 14). 1/2
**Test of Time**: Distributed Representations of Words and Phrases and their Compositionality. **Outstanding Main Track Papers**: Privacy Auditing with One (1) Training Run; Are Emergent Abilities of Large Language Models a Mirage?
10
60
317
My first tweet! I'm excited to share my recent interview on Metric Elicitation and Robust Distributed Learning with @samcharrington for the @twimlai podcast. Check it out! via @twimlai.
5
14
74
Thanks @RylanSchaeffer. And thanks to my students and collaborators who make the work possible! Congrats to @chelseabfinn, @DorsaSadigh, and @mkwoot!
I couldn't find this shared on Twitter, so congratulations @chelseabfinn @sanmikoyejo @DorsaSadigh @mkwoot!!! 🥳🥳🥳🥳🥳🥳🥳
11
6
68
For those who have requested the video, my HAI seminar “Beyond Benchmarks: Building a Science of AI Measurement” is up! I discuss some of @stai_research’s latest work aimed at improving AI measurement foundations towards real-world impact.
0
7
54
#NeurIPS2020 will be holding a symposium on the COVID-19 response in the @NeurIPSConf community. We ask that you do not submit workshop/symposium proposals that are entirely on the same topic. We are happy to consider workshops with additional and/or complementary themes.
1
11
48
Excited to share 'Shaping AI's Impact on Billions of Lives'! We present 18 concrete milestones for steering AI toward the common good. A blueprint for responsible innovation.
Shaping AI's Impact on Billions of Lives. I’m delighted to have collaborated with an awesome set of co-authors 🎊 on this paper that offers a blueprint of actions that can be taken by practitioners, policymakers, and other stakeholders to maximize the upsides of AI and minimize
1
6
32
(Re-) examining some of the emergence claims in large language models. Turns out the metrics matter! Work with @RylanSchaeffer and @BrandoHablando.
We had meant to keep this under wraps for a few weeks, but it seems that the cat is out of the bag. Excited to announce our newest preprint!! **Are Emergent Abilities of Large Language Models a Mirage?** Joint w/ @sanmikoyejo & @BrandoHablando. 1/12.
1
2
28
Welcome!!!
Excited to share I’ll be joining as an Assistant Professor at @CornellInfoSci @Cornell_Tech in Summer 2025! This coming year I’ll be a postdoc at @StanfordHAI with @SanmiKoyejo and Daniel Ho 🎈. I am so grateful to all of my mentors, friends, family. Come visit!
1
0
22
Location & time for our paper: "Are Emergent Abilities of Large Language Models a Mirage?" #NeurIPS2023. Presentation: 3:20pm CST, Hall C2 (level 1 gate 9 south of food court). Poster: #1108, 5pm CST, Great Hall & Hall B1+B2 (level 1). Paper link: 2/2.
0
4
20
This was published a while ago, but I just noticed it. @theafricaiknow_ Thanks for the interview and profile!
Previously, he held the position of Associate Professor in the Department of Computer Science at the University of Illinois at Urbana-Champaign. Read full article:
0
2
15
Are you interested in human or algorithmic challenges when learning from human feedback? Check out the @StanfordHAI Postdoc with @msbernst and me starting Fall 2023. Information here:
Postdoc position: How should people and communities articulate how AIs should navigate difficult tradeoffs? Prof. @sanmikoyejo and I have a jointly mentored postdoctoral scholar position open at @Stanford CS starting in the fall. Information here:
0
7
18
Great opportunity to learn a bit more about this work, led by @sangttruong! alphaXiv has been a great platform for us to engage deeply with papers.
Author @sangttruong is on alphaXiv this week to answer questions on his latest paper “Crossing Linguistic Horizons”. This paper focuses on expanding LLMs beyond English-speaking communities by introducing the first large-scale open-source Vietnamese.
0
4
16
Exciting work on state-of-the-art guardrails: faster and more effective than baselines!
[1/3] Excited to unveil VirtueGuard-Text-Lite, our State-of-the-art text guardrail model! 🛡️ Offering unparalleled protection against risky inputs/outputs while maintaining high efficiency. 10%+ improvement on OpenAI Moderation benchmarks compared with LlamaGuard v3 and runs 30+
0
3
16
@NeurIPSConf #NeurIPS2020 workshop proposal deadline has been extended by one week. The new deadline is 3 July 2020. We will update the other due dates as soon as we complete the planning of the virtual workshops.
0
9
12
I am excited to help organize and speak at this event with @theNASEM. Our panel will focus on the role of data in harnessing opportunities and mitigating the shortfalls of #AI in #DrugDevelopment.
#AI and #MachineLearning have the potential to transform #ChemicalSciences and #DrugDiscovery. Explore AI's role in expanding research capacity and addressing challenges such as reducing #DrugDevelopment costs at our February 20 – 21 workshop:
0
1
12
It was great to host you. Thanks for the awesome lecture and engagement with students!
I gave an RLHF lecture at Stanford today, here are the slides. The newer figures from other talks I've given: visuals on history of RLHF / related fields; figures on advanced RL methods (CAI / DPO / rejection sampling).
0
1
12
Thank you for hosting me! Great engagement during the talk, and I learned of so much great work from individual meetings.
Monday, April 7th at 11:30am CT: TTIC Colloquium presents @SanmiKoyejo of @stai_research with a talk titled "Beyond Benchmarks: Building a Science of AI Measurement." Please join us in Room 530, 5th floor.
0
0
12
Welcome!!!
Thrilled to share I'll be spending the next few months at @Stanford as a visiting researcher at @sanmikoyejo's lab 🎉. Grateful to @sanmikoyejo, @marcuswallacej and @bcaputo_iit for this opportunity 🙏
0
1
11
Come hang out at @NeurIPSConf!
We'll be at @NeurIPSConf next week in Vancouver! Stop by Booth #44 to: (1) play our Jailbreak Game and test your skills for exciting prizes; (2) hear talks from our AI researchers Dawn Song (@dawnsongtweets), Bo Li (@uiuc_aisecure), Sanmi Koyejo (@sanmikoyejo), Yu Yang (@YUYANG_UCLA).
0
0
11
Red-teaming eval for Claude Sonnet 3.7.
Can Reasoning Improve Safety & Security? Red-Teaming Analysis for Claude 3.7. 🚀 Claude 3.7 Sonnet Thinking: A New Era of Hybrid Reasoning? Anthropic's latest release introduces a Thinking mode, letting users switch between rapid responses and step-by-step reasoning. But does
0
1
10
First @VirtueAI_co webinar! Come learn about our work on enterprise-ready AI safety and security.
Join Virtue AI Co-founder Sanmi Koyejo for a live webinar on why protecting your AI apps isn’t just about safety; it’s the key to faster deployment and growth. 📅 April 24 | 🕙 10 AM PT | 💻 Virtual. In this session, we’ll cover: ✅ Why traditional security tooling falls short
1
2
11
Are you at #wsdm and interested in Trustworthy Large Language Models? Come check out my tutorial with @uiuc_aisecure in Room 22B, starting at 8:30 AM.
0
2
8
Improving Suggestions For Student Feedback Using Direct Preference Optimization (DPO) by @juliettewoodrow presents a method for producing LLM-generated feedback suggestions that align with human preferences through an iterative fine-tuning process. 2/n.
1
1
8
@russpoldrack @tallinzen @glupyan @RylanSchaeffer Some have argued that some improvements in model capabilities are unpredictable (along with a semi-precise definition of emergence). We argue that many claimed emergent capabilities are predictable, either using better statistics or alternative metrics. See thread for more.
We had meant to keep this under wraps for a few weeks, but it seems that the cat is out of the bag. Excited to announce our newest preprint!! **Are Emergent Abilities of Large Language Models a Mirage?** Joint w/ @sanmikoyejo & @BrandoHablando. 1/12.
1
0
7
FedAvg / fine-tuning will fail in federated domain adaptation when the domain shift is large. To address this, we propose FedGP, an effective aggregation rule, and a theoretical framework showing why it works. Exciting work with @enyij2 and @sanmikoyejo.
0
3
6
:) Thanks for shouting out the class. Congrats on completing this fascinating project. P.S. Many CS 329H projects for this year also look promising!
We're indebted to helpful feedback from @xave_rg; @baileyflan; @fierycushman; @PReaulx; @maxhkw; Matthew Cashman; @TobyNewberry; Hilary Greaves; @Ronan_LeBras; @JenaHwang2; @sanmikoyejo, @sangttruong, and Stanford Class of 329H; attendees of @cogsci_soc and SPP 2024; and more.
0
0
6
A friendly introduction to double descent, focusing on building intuition with linear models (see thread and links).
@SAIA_Alignment @AnthropicAI @daniela_witten Joint work with @sanmikoyejo @KhonaMikail @KaterynaPistun1 @FieteGroup Jason, Zach & Akhilan. Comments, questions & feedback are welcome! Paper: Code: 8/8.
0
3
6
Llama 4 red-teaming report!
Llama 4 Security: Progress or Plateau? @Meta recently launched Llama 4, heralding it as a new era of multimodal AI with its Scout (17B active/16 experts) and Maverick (17B active/128 experts) models. Built on a Mixture-of-Experts architecture, they promise cutting-edge
0
0
4
@Ana_koloskova is awesome, and you should apply to work with her if you can! Also, congrats!
I am excited to announce that I will join the University of Zurich as an assistant professor in August this year! I am looking for PhD students and postdocs starting from the fall. My research is on optimization, federated learning, machine learning, privacy, and unlearning.
0
0
4
Information-Theoretic Measures for LLM Output Evaluation by @zwrobertson, Suhana Bedi, and Hansol Lee explores using total variation mutual information to evaluate LLM-based preference learning. 5/n.
1
0
3
Cost and Reward Infused Metric Elicitation by @ChetBhateja, @joseph_c_obrien, @afnaanmhashmi, and @evaishaprakash extends metric elicitation to consider additional factors like monetary cost and latency. 3/n.
1
0
3
Will be discussing the latest work from @stai_research and @VirtueAI_co on measuring and operationalizing AI safety and security.
1
0
2
@roydanroy @ys_alh @iclr_conf We do think we contribute a novel understanding of the in- vs. out-of-distribution effects on unlearning. Would be great to get your thoughts if you get a chance to read the paper.
1
0
1
@roydanroy @ys_alh @iclr_conf Thanks for engaging. In my bubble, we see more heuristic papers (perhaps we have a biased sampling). That said, we cite much of the existing work (those that were available when the paper was written).
1
0
1
@autreche @NeurIPSConf From your title, the workshop proposal sounds like it's broader than COVID-19 only and should be fine. Feel free to contact us directly if you need more details. We will be happy to answer.
0
0
1