Joey Gonzalez
@profjoeyg
Followers
5K
Following
1K
Media
48
Statuses
727
Professor @UCBerkeley, co-director of @LMSysorg, and co-founder @RunLLM
Berkeley, CA
Joined June 2011
Turn your laptop into a powerful RAG system! LEANN can index and search through millions of documents while using 97% less storage than traditional solutions without accuracy loss. LEANN achieves this through graph-based selective recomputation with high-degree preserving
17
137
891
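A minimal sketch of the selective-recomputation idea described above, assuming a plain neighbor-graph index: only high-degree hub embeddings are kept cached, and every other node is re-embedded from its raw text during traversal. The `embed_text` helper and the dict-based graph/text stores are placeholders, not LEANN's actual API.

```python
import heapq
import numpy as np

def embed_text(text: str) -> np.ndarray:
    """Placeholder for an embedding-model call (e.g. a sentence transformer)."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(384)
    return v / np.linalg.norm(v)

def search(query, graph, texts, hub_cache, entry, beam=8, steps=50):
    """Best-first search over a neighbor graph, recomputing embeddings on the fly.

    graph: node -> list of neighbor nodes; texts: node -> raw text;
    hub_cache: cached embeddings for high-degree hub nodes only.
    """
    q = embed_text(query)

    def score(node):
        vec = hub_cache.get(node)          # hit only for cached hub nodes
        if vec is None:
            vec = embed_text(texts[node])  # selective recomputation for the rest
        return float(q @ vec)

    visited = {entry}
    frontier = [(-score(entry), entry)]    # min-heap over negated similarity
    best = []
    for _ in range(steps):
        if not frontier:
            break
        neg_sim, node = heapq.heappop(frontier)
        heapq.heappush(best, (neg_sim, node))
        for nbr in graph[node]:
            if nbr not in visited:
                visited.add(nbr)
                heapq.heappush(frontier, (-score(nbr), nbr))
        frontier = heapq.nsmallest(beam, frontier)  # keep only the best beam
        heapq.heapify(frontier)
    return [node for _, node in heapq.nsmallest(5, best)]
```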
Meet Slingshots // One. This inaugural batch includes leading-edge researchers advancing the science and practice of AI - with benchmarks, frameworks, and agents that ship real impact into the world. We're honored to support research from: @alexgshaw @Mike_A_Merrill
2
17
62
Trillion dollar data center buildouts are all the rage. Why is all of this kicking off at once? The infrastructure investment we're seeing tells us a lot about the future of inference and the economics of intelligence. @profjoeyg and I break down why intelligence might not be
1
3
5
@nicoalbanese10 yeah it works beautifully in @Letta_AI, since it's basically post-training of Claude to be better at "MemGPT"/Letta-style context engineering. Great example of better post-training (Claude) lifting the performance of an existing harness (@Letta_AI) https://t.co/s8OVJ7uT8p
3
4
20
Risk paralysis run amok. "We are concerned that a culture of risk aversion limits creative problem solving, inhibits collaboration and interferes with the systemic change needed to reduce bureaucracy" -- UC Berkeley Task force on Reducing Bureaucratic Burden
Unbelievable: the famed Berkeley Math Circle is being forced to shut down due to a bureaucratic requirement where a guest lecturer giving an hour-long lesson needs to be officially fingerprinted. How is fingerprinting even still a thing in the 21st century? Chancellor Lyons
6
14
87
The Sky's Fun Committee, representing the ppl of Sky, just dropped the new lab theme: Black Pink x Halloween. We have: - Gru & the minions - kpop ???
8
8
52
Humans handle dynamic situations easily, what about models? Turns out, they break in three distinct ways: Force Stop → Reasoning leakage (won't stop); Speedup → Panic (rushed answers); Info Updates → Self-doubt (reject updates). Check out https://t.co/wKrnsMkiFY
5
21
68
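A hypothetical harness sketch for the three interventions listed above (force stop, speedup, info updates). The `chat` helper is a stand-in for any reasoning-model API call; this is not the paper's evaluation code.

```python
def chat(messages, max_tokens=None):
    """Stand-in for a reasoning-model API call; a real harness would query an LLM here."""
    return "<model response>"

def force_stop(question, partial_reasoning):
    # Interrupt mid-thought and demand an immediate answer from a partial trace.
    return chat([
        {"role": "user", "content": question},
        {"role": "assistant", "content": partial_reasoning},
        {"role": "user", "content": "Stop thinking now. Give your final answer only."},
    ])

def speedup(question, token_budget=128):
    # Impose a tight token budget to probe for rushed, "panicked" answers.
    return chat([{"role": "user", "content": question}], max_tokens=token_budget)

def info_update(question, partial_reasoning, new_fact):
    # Inject an updated fact mid-reasoning and check whether the model accepts it.
    return chat([
        {"role": "user", "content": question},
        {"role": "assistant", "content": partial_reasoning},
        {"role": "user", "content": f"Update: {new_fact}. Revise your answer accordingly."},
    ])
```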
AI coding tools are all the rage, but very few people are thinking about Day 2: What happens when the code generated by Cursor/Claude Code/etc. goes into production? Maintaining code is often costlier than generating it; here's what you'll need to consider before you drive
Your SRE team is about to go bankrupt, and AI coding tools are why. Every CTO celebrates the productivity gains: 2× throughput, 50% faster development. But AI-generated code enters production with zero ownership. Reading code is not the same as writing it. The hidden costs:
0
1
4
What are some good examples of opinionated AI products where the opinion helps to meaningfully define the product?
"Build opinionated products" is not new advice, but it's more important than ever. If you're not careful, your agents can be everything to everyone. That might sound wonderful at first, but it's going to cause you headaches later. Here's why π
1
0
0
For the most part, everyone's use of AI today is synchronous and interactive... but it doesn't have to be that way. As agents proliferate, we'll see more and more agents working in the background, doing things for us that we didn't want to bother doing ourselves. The most obvious
1
4
5
What's wrong with this picture? We are still managing GPUs like _old_ mainframes. It's time to start sharing!
End the GPU Cost Crisis Today!!! Headaches with LLMs locking a whole GPU but leaving capacity idle? Frustrated by your cluster's low utilization? We're launching kvcached, the first library for elastic GPU sharing across LLMs. https://t.co/3BC7B6s2EX Why it matters:
1
1
15
End the GPU Cost Crisis Today!!! Headaches with LLMs locking a whole GPU but leaving capacity idle? Frustrated by your cluster's low utilization? We're launching kvcached, the first library for elastic GPU sharing across LLMs. https://t.co/3BC7B6s2EX Why it matters:
9
53
196
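A toy sketch of the elastic-sharing idea described in the kvcached announcement above: a shared pool of fixed-size KV-cache blocks that co-located models borrow on demand and return when requests finish, instead of each model pre-reserving a whole GPU. The class and names here are illustrative, not kvcached's actual API.

```python
class SharedKVPool:
    """Shared pool of fixed-size KV-cache blocks for models co-located on one GPU."""

    def __init__(self, total_blocks: int):
        self.free = list(range(total_blocks))   # physical block ids not in use
        self.owned = {}                         # model name -> set of block ids

    def allocate(self, model: str, n_blocks: int):
        if n_blocks > len(self.free):
            raise MemoryError("pool exhausted: queue or preempt the request")
        blocks = [self.free.pop() for _ in range(n_blocks)]
        self.owned.setdefault(model, set()).update(blocks)
        return blocks

    def release(self, model: str, blocks):
        # Freed blocks return to the shared pool, immediately usable by other models.
        self.owned[model].difference_update(blocks)
        self.free.extend(blocks)

pool = SharedKVPool(total_blocks=4096)
a = pool.allocate("llama-8b", 256)   # burst of traffic to one model
b = pool.allocate("qwen-7b", 64)     # a second model shares the same GPU
pool.release("llama-8b", a)          # idle capacity goes back to the pool
```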
AI coding tools are enabling anyone to create and modify applications faster than ever. I fear we are about to see an explosion of poorly scoped and rapidly "improving" applications running and interacting on infrastructure designed in a bygone era where software was
AI coding tools are all the rage, but very few people are thinking about Day 2: What happens when the code generated by Cursor/Claude Code/etc. goes into production? Maintaining code is often costlier than generating it; here's what you'll need to consider before you drive
2
3
12
Pair programming with your coding agent would be cool, right? But are the current models ready for this challenge? Not quite. In our recent work, we evaluate reasoning models under "dynamic" world settings. Check it out and reach out to chat!
Humans handle dynamic situations easily, what about models? Turns out, they break in three distinct ways: Force Stop → Reasoning leakage (won't stop); Speedup → Panic (rushed answers); Info Updates → Self-doubt (reject updates). Check out https://t.co/wKrnsMkiFY
0
1
9
... but wait, maybe this is an interesting tweet. </think> Have you ever wondered what happens if you force a model to stop thinking? It turns out, models are pretty good at answering with partial thoughts but occasionally they will cleverly return to contemplating in the
Humans handle dynamic situations easily, what about models? Turns out, they break in three distinct ways: Force Stop → Reasoning leakage (won't stop); Speedup → Panic (rushed answers); Info Updates → Self-doubt (reject updates). Check out https://t.co/wKrnsMkiFY
0
2
17
Excited to share our new research: vAttention - Verified Sparse Attention. Sparse attention with provable quality guarantees for LLMs. Full paper: https://t.co/pvOSEI8E7J GitHub: xAlg-ai/sparse-attention-hub A thread:
arxiv.org
State-of-the-art sparse attention methods for reducing decoding latency fall into two main categories: approximate top-$k$ (and its extension, top-$p$) and recently introduced sampling-based...
1
9
15
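For reference, a bare-bones top-k sparse decoding attention in NumPy, the baseline family the abstract mentions. vAttention's contribution is the verified quality guarantee layered on top of methods like this; the sketch below does not implement that guarantee.

```python
import numpy as np

def topk_sparse_attention(q, K, V, k=64):
    """q: (d,); K, V: (seq, d). Attend only to the k keys with the highest scores."""
    scores = K @ q / np.sqrt(q.shape[-1])        # (seq,) attention logits
    k = min(k, scores.shape[0])
    idx = np.argpartition(scores, -k)[-k:]       # indices of the top-k logits
    w = np.exp(scores[idx] - scores[idx].max())
    w /= w.sum()                                 # softmax over the selected keys
    return w @ V[idx]                            # (d,) attention output

q = np.random.randn(128)
K = np.random.randn(4096, 128)
V = np.random.randn(4096, 128)
out = topk_sparse_attention(q, K, V, k=64)
```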
At @Berkeley_EECS we always work to keep our curriculum fresh. Our intro ML course CS 189 just got a drastic makeover this semester (thanks @profjoeyg @NargesNorouzi!) and now includes ~12 lectures on e.g. Adam, PyTorch, various NN architectures, LLMs, and more (see
eecs189.org
A week-to-week description of the content covered in the course.
Harvard and Stanford students tell me their professors don't understand AI and the courses are outdated. If elite schools can't keep up, the credential arms race is over. Self-learning is the only way now.
21
95
845
Over the last year, AI companies have either moved towards building broader products or narrower ones. Which one's better? @profjoeyg and I have some opinions: The post this week explores why narrowly-scoped agents are beginning to deliver more value than generic platforms.
1
3
6
Paper: https://t.co/NU2mNypAzz Code: https://t.co/SZYOxcd9M4 This project was co-led with @aczhu1326 and advised by @matei_zaharia, @AlexGDimakis, and @profjoeyg. Reach out to @aczhu1326 and me if you want to chat about interesting applications! (8/n)
github.com
How to Train Your Advisor: Steering Black-Box LLMs with Advisor Models - az1326/advisor-models
0
2
23