Charles Sutton
@RandomlyWalking
Followers: 17K
Following: 10K
Media: 88
Statuses: 4K
Research scientist @GoogleAI / Previously academic @InfAtEd / Deep learning to help people write code. / @[email protected] / ❤️s:🐱🐶☕️🍕
Mountain View, CA
Joined February 2009
New blog on how to manage stress as a researcher: If you really are an impostor, it's not a syndrome.
theexclusive.org
A natural way to end this series of posts would be to talk about impostor syndrome. Instead, let me say something more personal about how I experience self-doubt.
17
178
639
Excited to launch Google Antigravity, our next generation agentic IDE, now powered by Gemini 3!
339
519
8K
Gemini 3 is finally out. 🚀 The numbers on the hardest benchmarks are wild. Seeing MathArena Apex go from <2% to 23.4% and ARC-AGI-2 hit 31% feels like a real turning point for reasoning. We're starting to crack problems that used to look impossible. Huge congrats to the team.
4
22
226
We’re excited to see the security and OSS communities engage on vulnerability disclosure in light of new AI technologies that we believe will enable both defenders and attackers alike. Existing and emerging norms around disclosure are important debates, and we’ve noted the
7
36
114
I am pleased to announce our new paper, which provides an extremely sample-efficient way to create an agent that can perform well in multi-agent, partially-observed, symbolic environments. The key idea is to use LLM-powered code synthesis to learn a code world model (in the form
17
103
824
Initial results from a large-scale run of @Google Big Sleep are here! Our AI agent found a series of vulnerabilities in widely used & reviewed software, demonstrating a new frontier in automated vulnerability discovery. Full details once the issues are fixed:
1
4
28
Today as part of our commitment to transparency in this space, we are proud to announce that we have reported the first 20 vulnerabilities discovered using our AI-based "Big Sleep" system powered by Gemini —
17
74
281
Back in grad school, when I realized how the “marketplace of ideas” actually works, it felt like I’d found the cheat codes to a research career. Today, this is the most important stuff I teach students, more than anything related to the substance of our research. A quick
9
58
443
I’m excited to share the news of Gemini Deep Think’s gold-medal level performance 🥇 at the International Math Olympiad! It has been an absolute blast building Deep Think this year and then scaling it to the IMO.
An advanced version of Gemini with Deep Think has officially achieved gold medal-level performance at the International Mathematical Olympiad. 🥇 It solved 5️⃣ out of 6️⃣ exceptionally difficult problems, involving algebra, combinatorics, geometry and number theory. Here’s how 🧵
11
10
108
New blog! Advice about career and creativity for researchers and engineers. This time: What my PhD in computer science taught me about gardening.
theexclusive.org
Several years ago, I moved into a house with a garden. My PhD was in computer science, so it did not have much to do with gardens, or plants, or even sunlight. It still helped me in the garden....
6
13
221
New from our security teams: Our AI agent Big Sleep helped us detect and foil an imminent exploit. We believe this is a first for an AI agent - definitely not the last - giving cybersecurity defenders new tools to stop threats before they’re widespread.
254
829
10K
Very excited to share this update about our team's work on AI for security! Joint work with @miltos1 @xennygrimmato_ @dancherp and many others from GDM and Google Project Zero! To learn more about our agent Big Sleep, check out this blog post:
0
4
42
Together with the Big Sleep team, I discovered the first in-the-wild 0-day using an AI agent. Go patch your boxes!
0
2
11
🔔 Announcing our paper on Natural Language Outlines for Code! Our vision 🔮 - NL Outlines empower human developers with new forms of AI assistance throughout the software development process 🚀 Paper: https://t.co/2jMPKzXdyW FSE'25 presentation: https://t.co/Yu7WinLhS4 🧵👇
1
8
23
🚀 Really excited to launch #AgentX competition hosted by @BerkeleyRDI @UCBerkeley alongside our LLM Agents MOOC series (a global community of 22k+ learners & growing fast). Whether you're building the next disruptive AI startup or pushing the research frontier, AgentX is your
20
110
417
📣 Join us for the 5th Advanced LLM Agents MOOC lecture on Coding Agents and AI for Vulnerability Detection, @RandomlyWalking @GoogleDeepMind, 4:10 pm PT today March 3. 🚀 Join the thriving community of the LLM Agents MOOC series, with 21K+ registered learners & ~9K members on
9
18
119
[1/x] can we scale small, open LMs to o1 level? Using classical probabilistic inference methods, YES! Joint @MIT_CSAIL / @RedHat AI Innovation Team work introduces a particle filtering approach to scaling inference w/o any training! check out https://t.co/Iz8zoVbZPn
2
65
234
Google just released Gemma Embeddings! "GemmaEmbed is a dense-vector embedding model, trained especially for retrieval. As of December 12, 2024, GemmaEmbed achieves the #1 position overall on the MTEB leaderboard, with a score of 72.72."
23
136
1K
Excited to share our prompt tuning playbook! (Not an official product. Just the authors' tips & tricks for better prompting). I'm most excited about the first half on mental models for post-training & prompting. Feedback/forks welcome! #LLM #PromptEngineering
https://t.co/TrVhKVJc64
13
131
610