Zico Kolter @zicokolter X Profile

Zico Kolter

@zicokolter

Followers

24K

Following

829

Media

38

Statuses

643

Professor and Head of Machine Learning Department at @CarnegieMellon. Board member @OpenAI and @Qualcomm. Chief Technical Advisor @GraySwanAI.

https://t.co/dmnxCg9Ptc

Pittsburgh, PA

Joined March 2017

Don't wanna be here? Send us removal request.

Gray Swan AI

@GraySwanAI

4 days

Gray Swan AI Arena sponsored by @hackthebox_eu present the Machine-in-the-Middle Challenge, a $100K competition exploring how humans & AI perform together in real offensive security scenarios.

7

23

151

Jeremy Cohen

@deepcohen

18 days

Even with full-batch gradients, DL optimizers defy classical optimization theory, as they operate at the *edge of stability.* With @alex_damian_, we introduce "central flows": a theoretical tool to analyze these dynamics that makes accurate quantitative predictions on real NNs.

19

204

1K

Dylan Sam

@dylanjsam

1 month

🚨Excited to introduce a major development in building safer language models: Safety Pretraining! Instead of post-hoc alignment, we take a step back and embed safety directly into pretraining. 🧵(1/n)

7

88

342

Priya L. Donti

@priyald17

2 months

Beyond humbled to be on this year's #TIME100AI AI can be an asset for climate & energy - but only if its development is guided by actual climate needs & planetary limits. Shoutout to those in the community working to shape a responsible, equitable, climate-aligned AI future 🌍💪

24

17

261

Aran Nayebi

@aran_nayebi

2 months

This semester, Matt Gormley & I are co-teaching CMU's Generative AI course! Today we discussed the Transformer architecture & Multi-Headed Attention. Follow along 👇 if you want to learn more about the tech that's powering today's AI, from ChatGPT to reasoning models to agents!

5

8

135

Eric Wallace

@Eric_Wallace_

2 months

Today we release gpt-oss-120b and gpt-oss-20b—two open-weight LLMs that deliver strong performance and agentic tool use. Before release, we ran a first of its kind safety analysis where we fine-tuned the models to intentionally maximize their bio and cyber capabilities 🧵

110

365

3K

Johannes Heidecke

@JoHeidecke

2 months

Open models can unlock huge benefits, and like any powerful technology, they carry misuse risks. Once the weights are released, there’s no pulling them back. This is why safety testing matters even more here. 1/

Eric Wallace

@Eric_Wallace_

2 months

Today we release gpt-oss-120b and gpt-oss-20b—two open-weight LLMs that deliver strong performance and agentic tool use. Before release, we ran a first of its kind safety analysis where we fine-tuned the models to intentionally maximize their bio and cyber capabilities 🧵

4

80

OpenAI

@OpenAI

2 months

Our open models are here. Both of them. https://t.co/9tFxefOXcg

openai.com

Advanced open-weight reasoning models to customize for any use case and run anywhere.

1K

3K

20K

Aran Nayebi

@aran_nayebi

3 months

1/ Updated now with nearly tight lower bounds—i.e., proofs showing when alignment becomes intractable, even for ideal agents. Key AI safety takeaways: 🧠 Too many values ⇒ makes alignment intractable 👁 Task-space growth ⇒ oversight failure 🤖 Bounded agents need the right

Aran Nayebi

@aran_nayebi

8 months

Are there fundamental barriers to AI alignment once we develop generally-capable AI agents? We mathematically prove the answer is *yes*, and outline key properties for a "safe yet capable" agent. 🧵👇 Paper: https://t.co/6ogluaAQCm

1

2

14

Andy Zou

@andyzou_jiaming

3 months

We deployed 44 AI agents and offered the internet $170K to attack them. 1.8M attempts, 62K breaches, including data leakage and financial loss. 🚨 Concerningly, the same exploits transfer to live production agents… (example: exfiltrating emails through calendar event) 🧵

71

393

2K

Yiding Jiang

@yidingjiang

4 months

A mental model I find useful: all data acquisition (web scrapes, synthetic data, RL rollouts, etc.) is really an exploration problem 🔍. This perspective has some interesting implications for where AI is heading. Wrote down some thoughts: https://t.co/VQLrYuJVAR

yidingjiang.github.io

This post explores the idea that the next breakthroughs in AI may hinge more on how we collect experience through exploration, and less on how many parameters and data points we have.

5

59

429

Zhengyang Geng

@ZhengyangGeng

4 months

now the code is up here:

github.com

JAX implementation of MeanFlow. Contribute to Gsunshine/meanflow development by creating an account on GitHub.

Zhengyang Geng

@ZhengyangGeng

5 months

Excited to share our work with my amazing collaborators, @Goodeat258, @SimulatedAnneal, @zicokolter, and Kaiming. In a word, we show an “identity learning” approach for generative modeling, by relating the instantaneous/average velocity in an identity. The resulting model,

2

17

71

Maksym Andriushchenko

@maksym_andr

4 months

🚨Excited to release OS-Harm! 🚨 The safety of computer use agents has been largely overlooked. We created a new safety benchmark based on OSWorld for measuring 3 broad categories of harm: 1. deliberate user misuse, 2. prompt injections, 3. model misbehavior.

3

31

110

Vaishnavh Nagarajan

@_vaishnavh

4 months

Wrote my first blog post! I wanted to share a powerful yet under-recognized way to develop emotional maturity as a researcher: making it a habit to read about the ✨past ✨ and learn from it to make sense of the present

2

14

122

YixuanEvenXu

@YixuanEvenXu

4 months

✨ Did you know that NOT using all generated rollouts in GRPO can boost your reasoning LLM? Meet PODS! We down-sample rollouts and train on just a fraction, delivering notable gains over vanilla GRPO. (1/7)

6

16

137

Hao Kang

@haok1402

5 months

Introducing FLAME-MoE: a fully open platform for Mixture-of-Experts (MoE) research. All code, data, checkpoints, training logs, and evaluation results are public—across 7 different scales. Paper: https://t.co/NsSk603rPi Code: https://t.co/pLgXfWkJnB

2

22

61

Zhengyang Geng

@ZhengyangGeng

5 months

Excited to share our work with my amazing collaborators, @Goodeat258, @SimulatedAnneal, @zicokolter, and Kaiming. In a word, we show an “identity learning” approach for generative modeling, by relating the instantaneous/average velocity in an identity. The resulting model,

5

39

154

Pratyush Maini

@pratyushmaini

5 months

Excited to be talking today about how research into memorization provides a fundamentally different lens on safety!

Stanford NLP Group

@stanfordnlp

5 months

For this week’s NLP Seminar, we are thrilled to host @pratyushmaini to talk about “What Memorization Research Taught Me About Safety” When: 5/8 Thurs 11am PT Non-Stanford affiliates registration form: https://t.co/G3IoKOFey7

3

9

100

Runtian Zhai

@RuntianZhai

6 months

A shorter version of the first three chapters of my thesis is accepted by ICML 2025. It provides a quick start for those interested in learning about the contexture theory. Check it out:

arxiv.org

Despite the empirical success of foundation models, we do not have a systematic characterization of the representations that these models learn. In this paper, we establish the contexture theory....

Runtian Zhai

@RuntianZhai

6 months

Why can foundation models transfer to so many downstream tasks? Will the scaling law end? Will pretraining end like Ilya Sutskever predicted? My PhD thesis builds the contexture theory to answer the above. Blog: https://t.co/MCIJifkU1Z Paper: https://t.co/RXVF7n7mHR 🧵1/12

1

2

37

Pratyush Maini

@pratyushmaini

6 months

Looking forward to giving a talk this Friday @OpenAI with @zhilifeng on some of our privacy & memorization research + how it applies to production LLMs! We've been gaining momentum on detecting, quantifying & erasing memorization; excited to explore its real-world impact!

0

12

104