Sanmi Koyejo @sanmikoyejo profile

Sanmi Koyejo

@sanmikoyejo

Followers

3K

Following

717

Media

3

Statuses

581

I lead @stai_research at Stanford. Co-founder @VirtueAI_co

Joined September 2014

Sanmi Koyejo

@sanmikoyejo

1 year

"Are Emergent Abilities of Large Language Models a Mirage?" is a NeurIPS outstanding paper! 🙌🏿 Congrats especially to the students @RylanSchaeffer @BrandoHablando & other awardees. If you want to learn more, check out the oral & poster 👇🏿 this afternoon (Dec 14). 1/2

NeurIPS Conference

@NeurIPSConf

1 year

**Test of Time**: Distributed Representations of Words and Phrases and their Compositionality. **Outstanding Main Track Papers**: Privacy Auditing with One (1) Training Run; Are Emergent Abilities of Large Language Models a Mirage?

10

60

317

Sanmi Koyejo

@sanmikoyejo

4 months

📚 Incredible student projects from the 2024 Fall quarter's Machine Learning from Human Preferences course. Our students tackled some fascinating challenges at the intersection of AI alignment and human values. Selected project details follow. 1/n

3

43

252

Sanmi Koyejo

@sanmikoyejo

5 years

My first tweet! I'm excited to share my recent interview on Metric Elicitation and Robust Distributed Learning with @samcharrington for the @twimlai podcast. Check it out! via @twimlai

5

14

74

Sanmi Koyejo

@sanmikoyejo

4 months

Thanks @RylanSchaeffer. And thanks to my students and collaborators who make the work possible! Congrats to @chelseabfinn, @DorsaSadigh, and @mkwoot!

Rylan Schaeffer

@RylanSchaeffer

4 months

I couldn't find this shared on Twitter, so congratulations @chelseabfinn @sanmikoyejo @DorsaSadigh @mkwoot!!! 🥳🥳🥳🥳🥳🥳🥳

11

6

68

Sanmi Koyejo

@sanmikoyejo

2 months

For those who have requested the video, my HAI seminar “Beyond Benchmarks: Building a Science of AI Measurement” is up! I discuss some of @stai_research’s latest work aimed at improving AI measurement foundations towards real-world impact.

0

7

54

Sanmi Koyejo

@sanmikoyejo

5 years

#NeurIPS2020 will be holding a symposium on the COVID-19 response in the @NeurIPSConf community. We ask that you do not submit workshop/symposium proposals that are entirely on the same topic. We are happy to consider workshops with additional and/or complementary themes.

1

11

48

Sanmi Koyejo

@sanmikoyejo

6 months

Excited to share 'Shaping AI's Impact on Billions of Lives'! We present 18 concrete milestones for steering AI toward the common good. A blueprint for responsible innovation.

Jeff Dean

@JeffDean

6 months

Shaping AI's Impact on Billions of Lives.I’m delighted to have collaborated with an awesome set of co-authors 🎊 on this paper that offers a blueprint of actions that can be taken by practitioners, policymakers, and other stakeholders to maximize the upsides of AI and minimize

1

6

32

Sanmi Koyejo

@sanmikoyejo

2 years

(Re-)examining some of the emergence claims in large language models. Turns out the metrics matter! Work with @RylanSchaeffer and @BrandoHablando.

Rylan Schaeffer

@RylanSchaeffer

2 years

We had meant to keep this under wraps for a few weeks, but it seems that the cat is out of the bag. Excited to announce our newest preprint!! **Are Emergent Abilities of Large Language Models a Mirage?** Joint w/ @sanmikoyejo & @BrandoHablando. 1/12

1

2

28

Sanmi Koyejo

@sanmikoyejo

1 year

Welcome!!!

Angelina Wang @angelinawang.bsky.social

@ang3linawang

1 year

Excited to share I’ll be joining as an Assistant Professor at @CornellInfoSci @Cornell_Tech in Summer 2025! This coming year I’ll be a postdoc at @StanfordHAI with @SanmiKoyejo and Daniel Ho 🎈. I am so grateful to all of my mentors, friends, family. Come visit!

1

0

22

Sanmi Koyejo

@sanmikoyejo

1 year

Location & time for our paper: "Are Emergent Abilities of Large Language Models a Mirage?" #NeurIPS2023. Presentation: 3:20pm CST, Hall C2 (level 1, gate 9, south of food court). Poster: #1108, 5pm CST, Great Hall & Hall B1+B2 (level 1). Paper link: 2/2

0

4

20

Sanmi Koyejo

@sanmikoyejo

2 months

Thrilled to return to my PhD alma mater this Friday (Apr 11)! I’ll be visiting IFML at UT Austin to share insights and reconnect with the amazing community that shaped my research journey.

0

1

18

Sanmi Koyejo

@sanmikoyejo

1 year

This was published a while ago, but I just noticed it. @theafricaiknow_ Thanks for the interview and profile!

The Africa I Know®

@theafricaiknow_

1 year

Previously, he held the position of Associate Professor in the Department of Computer Science at the University of Illinois at Urbana-Champaign. Read full article:

0

2

15

Sanmi Koyejo

@sanmikoyejo

2 years

Are you interested in human or algorithmic challenges when learning from human feedback? Check out the @StanfordHAI Postdoc with @msbernst and me starting Fall 2023. Information here:

Michael Bernstein

@msbernst

2 years

Postdoc position: How should people and communities articulate how AIs should navigate difficult tradeoffs? Prof. @sanmikoyejo and I have a jointly mentored postdoctoral scholar position open at @Stanford CS starting in the fall. Information here:

0

7

18

Sanmi Koyejo

@sanmikoyejo

10 months

Excited to see this safety evaluation of the Llama 3.1 405B model. Covers both regulation and use-case perspectives!

Virtue AI

@VirtueAI_co

10 months

We at Virtue AI are excited to announce our recent public effort: 🧵 [1/3] 🧾 Comprehensive Safety Assessment of Llama 3.1 405B:

0

3

18

Sanmi Koyejo

@sanmikoyejo

1 year

Great opportunity to learn a bit more about this work, led by @sangttruong! alphaXiv has been a great platform for us to engage deeply with papers.

alphaXiv

@askalphaxiv

1 year

Author @sangttruong is on alphaXiv this week to answer questions on his latest paper “Crossing Linguistic Horizons”. This paper focuses on expanding LLMs beyond English-speaking communities by introducing the first large-scale open-source Vietnamese.

0

4

16

Sanmi Koyejo

@sanmikoyejo

9 months

Exciting work on state-of-the-art guardrails, faster and more effective than baselines!

Virtue AI

@VirtueAI_co

9 months

[1/3] Excited to unveil VirtueGuard-Text-Lite, our state-of-the-art text guardrail model! 🛡️ Offering unparalleled protection against risky inputs/outputs while maintaining high efficiency. 10%+ improvement on OpenAI Moderation benchmarks compared with LlamaGuard v3 and runs 30+

0

3

16

Sanmi Koyejo

@sanmikoyejo

2 months

If you are at NVIDIA GTC today, check out my session “S73872- From Guardrails to Agents: Navigating Safety and Security at AI's Frontier” at 3pm in Room 212B (L2).

1

0

14

Sanmi Koyejo

@sanmikoyejo

5 years

@NeurIPSConf #NeurIPS2020 workshop proposal deadline has been extended by one week. The new deadline is 3 July 2020. We will update the other due dates as soon as we complete the planning of the virtual workshops.

0

9

12

Sanmi Koyejo

@sanmikoyejo

3 months

I am excited to help organize and speak at this event with @theNASEM. Our panel will focus on the role of data in harnessing opportunities and mitigating the shortfalls of #AI in #DrugDevelopment.

National Academies

@theNASEM

3 months

#AI and #MachineLearning have the potential to transform #ChemicalSciences and #DrugDiscovery. Explore AI's role in expanding research capacity and addressing challenges such as reducing #DrugDevelopment costs at our February 20 – 21 workshop:

0

1

12

Sanmi Koyejo

@sanmikoyejo

4 months

PrefLearn: How Do Advanced Replay Buffers and Online DPO Affect the Performance of RL Tetris with DQNs by Andy Liang, Abhinav Sinha, Jeremy Tian, and Kenny Dao proposes PrefLearn with superior performance and faster convergence. 4/n

1

2

10

Sanmi Koyejo

@sanmikoyejo

8 months

New website!

Black in AI

@black_in_ai

8 months

We’ve created a new digital home for Black in AI! Now you can easily sign up for our new quarterly newsletter, register for programming and more. There's more on the horizon, so watch this space and help us shape the future of Black in AI! 🔗

0

1

12

Sanmi Koyejo

@sanmikoyejo

1 year

It was great to host you. Thanks for the awesome lecture and engagement with students!

Nathan Lambert

@natolambert

1 year

I gave an RLHF lecture at Stanford today, here are the slides. The newer figures from other talks I've given:
* visuals on history of RLHF / related fields
* figures on advanced RL methods (CAI / DPO / rejection sampling)

0

1

12

Sanmi Koyejo

@sanmikoyejo

2 months

Thank you for hosting me! Great engagement during the talk, and I learned of so much great work from individual meetings.

TTIC

@TTIC_Connect

2 months

Monday, April 7th at 11:30am CT: TTIC Colloquium presents @SanmiKoyejo of @stai_research with a talk titled "Beyond Benchmarks: Building a Science of AI Measurement." Please join us in Room 530, 5th floor.

0

12

Sanmi Koyejo

@sanmikoyejo

2 years

Welcome!!!

Debora Caldarola

@debcaldarola

2 years

Thrilled to share I'll be spending the next few months at @Stanford as a visiting researcher at @sanmikoyejo's lab 🎉. Grateful to @sanmikoyejo, @marcuswallacej and @bcaputo_iit for this opportunity 🙏

0

1

11

Sanmi Koyejo

@sanmikoyejo

6 months

Come hang out at @NeurIPSConf!

Virtue AI

@VirtueAI_co

6 months

We'll be at @NeurIPSConf next week in Vancouver! Stop by Booth #44 to:
- Play our Jailbreak Game – test your skills for exciting prizes!
- Hear talks from our AI researchers Dawn Song (@dawnsongtweets), Bo Li (@uiuc_aisecure), Sanmi Koyejo (@sanmikoyejo), Yu Yang (@YUYANG_UCLA).

0

11

Sanmi Koyejo

@sanmikoyejo

3 months

Red-teaming eval for Claude Sonnet 3.7.

Virtue AI

@VirtueAI_co

3 months

Can Reasoning Improve Safety & Security? Red-Teaming Analysis for Claude 3.7. 🚀 Claude 3.7 Sonnet Thinking: A New Era of Hybrid Reasoning? Anthropic's latest release introduces a Thinking mode, letting users switch between rapid responses and step-by-step reasoning. But does

0

1

10

Sanmi Koyejo

@sanmikoyejo

1 month

First @VirtueAI_co webinar! Come learn about our work on enterprise-ready AI safety and security.

Virtue AI

@VirtueAI_co

1 month

Join Virtue AI Co-founder Sanmi Koyejo for a live webinar on why protecting your AI apps isn’t just about safety; it’s the key to faster deployment and growth. 📅 April 24 | 🕙 10 AM PT | 💻 Virtual. In this session, we’ll cover: ✅ Why traditional security tooling falls short

1

2

11

Sanmi Koyejo

@sanmikoyejo

1 year

Are you at #wsdm and interested in Trustworthy Large Language Models? Come check out my tutorial with @uiuc_aisecure in Room 22B, starting at 8:30 AM.

0

2

8

Sanmi Koyejo

@sanmikoyejo

4 months

Improving Suggestions For Student Feedback Using Direct Preference Optimization (DPO) by @juliettewoodrow presents a method for producing LLM-generated feedback suggestions that align with human preferences through an iterative fine-tuning process. 2/n.

1

8

Sanmi Koyejo

@sanmikoyejo

2 years

@russpoldrack @tallinzen @glupyan @RylanSchaeffer Some have argued that some improvements in model capabilities are unpredictable (along with a semi-precise definition of emergence). We argue that many claimed emergent capabilities are predictable, either using better statistics or alternative metrics. See thread for more.

Rylan Schaeffer

@RylanSchaeffer

2 years

We had meant to keep this under wraps for a few weeks, but it seems that the cat is out of the bag. Excited to announce our newest preprint!! **Are Emergent Abilities of Large Language Models a Mirage?** Joint w/ @sanmikoyejo & @BrandoHablando. 1/12

1

0

7

Sanmi Koyejo

@sanmikoyejo

2 years

New work on improving aggregation for federated domain adaptation with @Ybo_Z and @enyij2!

Yibo Jacky Zhang

@Ybo_Z

2 years

FedAvg / fine-tuning will fail in federated domain adaptation when the domain shift is large. To address this, we propose FedGP, an effective aggregation rule, and a theoretical framework showing why it works. Exciting work with @enyij2 and @sanmikoyejo.

0

3

6

Sanmi Koyejo

@sanmikoyejo

4 months

Heterogeneity of Preference Datasets for Pluralistic AI Alignment by Emily Bunnapradist, Niveditha Iyer, Megan Li, and Nikil Selvam proposes objectively quantifying the diversity of a dataset and evaluates preference datasets for pluralistic alignment. n/n.

2

0

6

Sanmi Koyejo

@sanmikoyejo

7 months

:) thanks for shouting out the class. Congrats on completing this fascinating project. P.S. many CS 329H projects for this year also look promising!

Jared Moore

@jaredlcm

7 months

We're indebted to helpful feedback from @xave_rg; @baileyflan; @fierycushman; @PReaulx; @maxhkw; Matthew Cashman; @TobyNewberry; Hilary Greaves; @Ronan_LeBras; @JenaHwang2; @sanmikoyejo, @sangttruong, and Stanford Class of 329H; attendees of @cogsci_soc and SPP 2024; and more.

0

6

Sanmi Koyejo

@sanmikoyejo

2 years

A friendly introduction to double descent, focusing on building intuition with linear models (see thread and links).

Rylan Schaeffer

@RylanSchaeffer

2 years

@SAIA_Alignment @AnthropicAI @daniela_witten Joint work with @sanmikoyejo @KhonaMikail @KaterynaPistun1 @FieteGroup Jason, Zach & Akhilan. Comments, questions & feedback are welcome! Paper: Code: 8/8

0

3

6

Sanmi Koyejo

@sanmikoyejo

1 year

@roydanroy Another potentially relevant ref: (suggested by @walesalaudeen96).

1

0

6

Sanmi Koyejo

@sanmikoyejo

4 months

Can Symbolic Scaffolding and DPO Enhance Solution Quality and Accuracy in Mathematical Problem Solving with LLMs by Shree Reddy and Shubhra Mishra fine-tunes the Qwen-2.5-7B-Instruct model with symbolic-enhanced traces, achieving improved performance on the GSM8K and MathCAMPS benchmarks.

0

5

Sanmi Koyejo

@sanmikoyejo

2 months

Llama 4 red-teaming report!

Virtue AI

@VirtueAI_co

2 months

Llama 4 Security: Progress or Plateau? @Meta recently launched Llama 4, heralding it as a new era of multimodal AI with its Scout (17B active/16 experts) and Maverick (17B active/128 experts) models. Built on a Mixture-of-Experts architecture, they promise cutting-edge

0

4

Sanmi Koyejo

@sanmikoyejo

3 months

@Ana_koloskova is awesome, and you should apply to work with her if you can! Also, congrats!

Anastasiia Koloskova

@Ana_koloskova

3 months

I am excited to announce that I will join the University of Zurich as an assistant professor in August this year! I am looking for PhD students and postdocs starting from the fall. My research is on optimization, federated learning, machine learning, privacy, and unlearning.

0

4

Sanmi Koyejo

@sanmikoyejo

4 months

Information-Theoretic Measures for LLM Output Evaluation by @zwrobertson, Suhana Bedi, and Hansol Lee explores using total variation mutual information to evaluate LLM-based preference learning. 5/n

1

0

3

Sanmi Koyejo

@sanmikoyejo

1 year

Congrats!!!

Farnaz Jahanbakhsh

@FarnazJ_

1 year

Got my hood!

1

0

3

Sanmi Koyejo

@sanmikoyejo

4 months

HP-GS: Human-Preference Next Best View Selection for 3D Gaussian Splatting by Matt Strong and Aditya Dutt presents a simple method for guiding the next best view in 3D Gaussian Splatting. Link: Video link: 6/n

1

2

3

Sanmi Koyejo

@sanmikoyejo

4 months

Cost and Reward Infused Metric Elicitation by @ChetBhateja, @joseph_c_obrien, @afnaanmhashmi, and @evaishaprakash extends metric elicitation to consider additional factors like monetary cost and latency. 3/n.

1

0

3

Sanmi Koyejo

@sanmikoyejo

2 months

Will be discussing the latest work from @stai_research and @VirtueAI_co on measuring and operationalizing AI safety and security.

1

0

2

Sanmi Koyejo

@sanmikoyejo

4 months

@roydanroy @ys_alh @iclr_conf We do think we contribute a novel understanding of the in- vs. out-of-distribution effects on unlearning. Would be great to get your thoughts if you get a chance to read the paper.

1

0

1

Sanmi Koyejo

@sanmikoyejo

2 years

Generative AI adoption is growing fast, but computational resources are not keeping up. Can adaptive pricing help, and how does one implement auctions for Generative AI? See some of our early work on this (led by Zachary Robertson).

Stanford Trustworthy AI Research (STAIR) Lab

@stai_research

2 years

🚀 Thrilled to share some work out of our lab researching how to better price AI content using auction design theory! We consider both consumer and data worker payment in this work. Paper: #OpenAI #AI #Stanford. Thread 🧵

0

1

Sanmi Koyejo

@sanmikoyejo

4 months

@roydanroy @ys_alh @iclr_conf Thanks for engaging. In my bubble, we see more heuristic papers (perhaps we have a biased sampling). That said, we cite much of the existing work (those that were available when the paper was written).

1

0

1

Sanmi Koyejo

@sanmikoyejo

5 years

@autreche @NeurIPSConf From your title, the workshop proposal sounds like it's broader than COVID-19 only and should be fine. Feel free to contact us directly if you need more details. We will be happy to answer.

0

1