Zico Kolter Profile
Zico Kolter

@zicokolter

Followers
24K
Following
829
Media
38
Statuses
643

Professor and Head of Machine Learning Department at @CarnegieMellon. Board member @OpenAI and @Qualcomm. Chief Technical Advisor @GraySwanAI.

Pittsburgh, PA
Joined March 2017
Don't wanna be here? Send us removal request.
@GraySwanAI
Gray Swan AI
4 days
Gray Swan AI Arena sponsored by @hackthebox_eu present the Machine-in-the-Middle Challenge, a $100K competition exploring how humans & AI perform together in real offensive security scenarios.
7
23
151
@deepcohen
Jeremy Cohen
18 days
Even with full-batch gradients, DL optimizers defy classical optimization theory, as they operate at the *edge of stability.* With @alex_damian_, we introduce "central flows": a theoretical tool to analyze these dynamics that makes accurate quantitative predictions on real NNs.
19
204
1K
@dylanjsam
Dylan Sam
1 month
🚨Excited to introduce a major development in building safer language models: Safety Pretraining! Instead of post-hoc alignment, we take a step back and embed safety directly into pretraining. 🧵(1/n)
7
88
342
@priyald17
Priya L. Donti
2 months
Beyond humbled to be on this year's #TIME100AI AI can be an asset for climate & energy - but only if its development is guided by actual climate needs & planetary limits. Shoutout to those in the community working to shape a responsible, equitable, climate-aligned AI future 🌍💪
24
17
261
@aran_nayebi
Aran Nayebi
2 months
This semester, Matt Gormley & I are co-teaching CMU's Generative AI course! Today we discussed the Transformer architecture & Multi-Headed Attention. Follow along 👇 if you want to learn more about the tech that's powering today's AI, from ChatGPT to reasoning models to agents!
5
8
135
@Eric_Wallace_
Eric Wallace
2 months
Today we release gpt-oss-120b and gpt-oss-20b—two open-weight LLMs that deliver strong performance and agentic tool use. Before release, we ran a first of its kind safety analysis where we fine-tuned the models to intentionally maximize their bio and cyber capabilities 🧵
110
365
3K
@JoHeidecke
Johannes Heidecke
2 months
Open models can unlock huge benefits, and like any powerful technology, they carry misuse risks. Once the weights are released, there’s no pulling them back. This is why safety testing matters even more here. 1/
@Eric_Wallace_
Eric Wallace
2 months
Today we release gpt-oss-120b and gpt-oss-20b—two open-weight LLMs that deliver strong performance and agentic tool use. Before release, we ran a first of its kind safety analysis where we fine-tuned the models to intentionally maximize their bio and cyber capabilities 🧵
4
4
80
@aran_nayebi
Aran Nayebi
3 months
1/ Updated now with nearly tight lower bounds—i.e., proofs showing when alignment becomes intractable, even for ideal agents. Key AI safety takeaways: 🧠 Too many values ⇒ makes alignment intractable 👁 Task-space growth ⇒ oversight failure 🤖 Bounded agents need the right
@aran_nayebi
Aran Nayebi
8 months
Are there fundamental barriers to AI alignment once we develop generally-capable AI agents? We mathematically prove the answer is *yes*, and outline key properties for a "safe yet capable" agent. 🧵👇 Paper: https://t.co/6ogluaAQCm
1
2
14
@andyzou_jiaming
Andy Zou
3 months
We deployed 44 AI agents and offered the internet $170K to attack them. 1.8M attempts, 62K breaches, including data leakage and financial loss. 🚨 Concerningly, the same exploits transfer to live production agents… (example: exfiltrating emails through calendar event) 🧵
71
393
2K
@yidingjiang
Yiding Jiang
4 months
A mental model I find useful: all data acquisition (web scrapes, synthetic data, RL rollouts, etc.) is really an exploration problem 🔍. This perspective has some interesting implications for where AI is heading. Wrote down some thoughts: https://t.co/VQLrYuJVAR
yidingjiang.github.io
This post explores the idea that the next breakthroughs in AI may hinge more on how we collect experience through exploration, and less on how many parameters and data points we have.
5
59
429
@ZhengyangGeng
Zhengyang Geng
4 months
now the code is up here:
Tweet card summary image
github.com
JAX implementation of MeanFlow. Contribute to Gsunshine/meanflow development by creating an account on GitHub.
@ZhengyangGeng
Zhengyang Geng
5 months
Excited to share our work with my amazing collaborators, @Goodeat258, @SimulatedAnneal, @zicokolter, and Kaiming. In a word, we show an “identity learning” approach for generative modeling, by relating the instantaneous/average velocity in an identity. The resulting model,
2
17
71
@maksym_andr
Maksym Andriushchenko
4 months
🚨Excited to release OS-Harm! 🚨 The safety of computer use agents has been largely overlooked. We created a new safety benchmark based on OSWorld for measuring 3 broad categories of harm: 1. deliberate user misuse, 2. prompt injections, 3. model misbehavior.
3
31
110
@_vaishnavh
Vaishnavh Nagarajan
4 months
Wrote my first blog post! I wanted to share a powerful yet under-recognized way to develop emotional maturity as a researcher: making it a habit to read about the ✨past ✨ and learn from it to make sense of the present
2
14
122
@YixuanEvenXu
YixuanEvenXu
4 months
✨ Did you know that NOT using all generated rollouts in GRPO can boost your reasoning LLM? Meet PODS! We down-sample rollouts and train on just a fraction, delivering notable gains over vanilla GRPO. (1/7)
6
16
137
@haok1402
Hao Kang
5 months
Introducing FLAME-MoE: a fully open platform for Mixture-of-Experts (MoE) research. All code, data, checkpoints, training logs, and evaluation results are public—across 7 different scales. Paper: https://t.co/NsSk603rPi Code: https://t.co/pLgXfWkJnB
2
22
61
@ZhengyangGeng
Zhengyang Geng
5 months
Excited to share our work with my amazing collaborators, @Goodeat258, @SimulatedAnneal, @zicokolter, and Kaiming. In a word, we show an “identity learning” approach for generative modeling, by relating the instantaneous/average velocity in an identity. The resulting model,
5
39
154
@pratyushmaini
Pratyush Maini
5 months
Excited to be talking today about how research into memorization provides a fundamentally different lens on safety!
@stanfordnlp
Stanford NLP Group
5 months
For this week’s NLP Seminar, we are thrilled to host @pratyushmaini to talk about “What Memorization Research Taught Me About Safety” When: 5/8 Thurs 11am PT Non-Stanford affiliates registration form: https://t.co/G3IoKOFey7
3
9
100
@RuntianZhai
Runtian Zhai
6 months
A shorter version of the first three chapters of my thesis is accepted by ICML 2025. It provides a quick start for those interested in learning about the contexture theory. Check it out:
Tweet card summary image
arxiv.org
Despite the empirical success of foundation models, we do not have a systematic characterization of the representations that these models learn. In this paper, we establish the contexture theory....
@RuntianZhai
Runtian Zhai
6 months
Why can foundation models transfer to so many downstream tasks? Will the scaling law end? Will pretraining end like Ilya Sutskever predicted? My PhD thesis builds the contexture theory to answer the above. Blog: https://t.co/MCIJifkU1Z Paper: https://t.co/RXVF7n7mHR 🧵1/12
1
2
37
@pratyushmaini
Pratyush Maini
6 months
Looking forward to giving a talk this Friday @OpenAI with @zhilifeng on some of our privacy & memorization research + how it applies to production LLMs! We've been gaining momentum on detecting, quantifying & erasing memorization; excited to explore its real-world impact!
0
12
104