Joey Gonzalez Profile
Joey Gonzalez

@profjoeyg

Followers
5K
Following
1K
Media
48
Statuses
727

Professor @UCBerkeley, co-director of @LMSysorg, and co-founder @RunLLM

Berkeley, CA
Joined June 2011
@Sumanth_077
Sumanth
5 days
Turn your laptop into a powerful RAG system! LEANN can index and search through millions of documents while using 97% less storage than traditional solutions without accuracy loss. LEANN achieves this through graph-based selective recomputation with high-degree preserving
17
137
891
@LaudeInstitute
Laude Institute
7 days
Meet Slingshots // One. This inaugural batch includes leading-edge researchers advancing the science and practice of AI - with benchmarks, frameworks, and agents that ship real impact into the world. We're honored to support research from: @alexgshaw @Mike_A_Merrill
2
17
62
@vsreekanti
Vikram Sreekanti
7 days
Trillion dollar data center buildouts are all the rage. Why is all of this kicking off at once? The infrastructure investment we're seeing tells us a lot about the future of inference and the economics of intelligence. @profjoeyg and I break down why intelligence might not be
1
3
5
@charlespacker
Charles Packer
10 days
@nicoalbanese10 yeah it works beautifully in @Letta_AI, since it's basically post-training of claude to be better at "MemGPT"/Letta-style context engineering great example of better post-training (claude) lifting the performance of an existing harness (@Letta_AI) https://t.co/s8OVJ7uT8p
3
4
20
@minilek
Jelani Nelson
11 days
Risk paralysis run amok. "We are concerned that a culture of risk aversion limits creative problem solving, inhibits collaboration and interferes with the systemic change needed to reduce bureaucracy" -- UC Berkeley Task force on Reducing Bureaucratic Burden
@doristsao
Doris Tsao
11 days
Unbelievable: the famed Berkeley Math Circle is being forced to shut down due to a bureaucratic requirement where a guest lecturer giving an hour long lesson needs to be officially fingerprinted. How is fingerprinting even still a thing in the 21st century? Chancellor Lyons
6
14
87
@melissapan
Melissa Pan
13 days
The Sky’s Fun Committee, representing the ppl of sky, just dropped the new lab theme: ⚫💖 Black Pink x Halloween 🎃🦇 We have: - Gru & the minions - kpop ??? 🫰😉
8
8
52
@tsunghan_wu
Tsung-Han (Patrick) Wu
23 days
Humans handle dynamic situations easily, what about models? Turns out, they break in three distinct ways: ⛔ Force Stop → Reasoning leakage (won’t stop) ⚡ Speedup → Panic (rushed answers) ❓ Info Updates → Self-doubt (reject updates) 👉 Check out https://t.co/wKrnsMkiFY
5
21
68
@vsreekanti
Vikram Sreekanti
23 days
AI coding tools are all the rage, but very few people are thinking about Day 2: What happens when the code generated by Cursor/Claude Code/etc. goes into production? Maintaining code is often costlier than generating it - here's what you'll need to consider before you drive
@RunLLM
RunLLM
23 days
Your SRE team is about to go bankrupt - and AI coding tools are why. Every CTO celebrates the productivity gains: 2× throughput, 50% faster development. But AI-generated code enters production with zero ownership. Reading code is not the same as writing it. The hidden costs: ✅
0
1
4
@profjoeyg
Joey Gonzalez
14 days
What are some good examples of opinionated AI products where the opinion helps to meaningfully define the product?
@vsreekanti
Vikram Sreekanti
14 days
"Build opinionated products" is not new advice, but it's more important than ever. If you're not careful, your agents can be everything to everyone. That might sound wonderful at first, but it's going to cause you headaches later. Here's why 👇
1
0
0
@vsreekanti
Vikram Sreekanti
21 days
For the most part, everyone's use of AI today is synchronous and interactive... but it doesn't have to be that. As agents proliferate, we'll see more and more agents working in the background, doing things for us that we didn't want to bother doing ourselves. The most obvious
1
4
5
@profjoeyg
Joey Gonzalez
23 days
What's wrong with this picture? We are still managing GPUs like _old_ mainframes. It's time to start sharing!
@yifandotqiao
Yifan Qiao
23 days
🚀 End the GPU Cost Crisis Today!!! Headache with LLMs lock a whole GPU but leave capacity idle? Frustrated by your cluster's low utilization? We launch kvcached, the first library for elastic GPU sharing across LLMs. 🔗 https://t.co/3BC7B6s2EX 🧵👇 Why it matters:
1
1
15
@yifandotqiao
Yifan Qiao
23 days
🚀 End the GPU Cost Crisis Today!!! Headache with LLMs lock a whole GPU but leave capacity idle? Frustrated by your cluster's low utilization? We launch kvcached, the first library for elastic GPU sharing across LLMs. 🔗 https://t.co/3BC7B6s2EX 🧵👇 Why it matters:
9
53
196
@profjoeyg
Joey Gonzalez
23 days
AI coding tools are enabling anyone to create and modify applications faster than ever. I fear we are about to see an explosion of poorly scoped and rapidly "improving" applications running and interacting on infrastructure designed in a bygone era where software was
@vsreekanti
Vikram Sreekanti
23 days
AI coding tools are all the rage, but very few people are thinking about Day 2: What happens when the code generated by Cursor/Claude Code/etc. goes into production? Maintaining code is often costlier than generating it - here's what you'll need to consider before you drive
2
3
12
@mirmiroyan
Mir Miroyan
23 days
Pair programming with your coding agent would be cool, right? But are the current models ready for this challenge? Not quite. In our recent work, we evaluate reasoning models under "dynamic" world settings. Check it out and reach out to chat!
@tsunghan_wu
Tsung-Han (Patrick) Wu
23 days
Humans handle dynamic situations easily, what about models? Turns out, they break in three distinct ways: ⛔ Force Stop → Reasoning leakage (won’t stop) ⚡ Speedup → Panic (rushed answers) ❓ Info Updates → Self-doubt (reject updates) 👉 Check out https://t.co/wKrnsMkiFY
0
1
9
@profjoeyg
Joey Gonzalez
23 days
... but wait, maybe this is an interesting tweet. </think> Have you ever wondered what happens if you force a model to stop thinking? It turns out, models are pretty good at answering with partial thoughts but occasionally they will cleverly return to contemplating in the
@tsunghan_wu
Tsung-Han (Patrick) Wu
23 days
Humans handle dynamic situations easily, what about models? Turns out, they break in three distinct ways: ⛔ Force Stop → Reasoning leakage (won’t stop) ⚡ Speedup → Panic (rushed answers) ❓ Info Updates → Self-doubt (reject updates) 👉 Check out https://t.co/wKrnsMkiFY
0
2
17
@xalg_ai
xAlg-ai
1 month
Excited to share our new research: vAttention - Verified Sparse Attention. Sparse attention with provable quality guarantees for LLMs. Full paper: https://t.co/pvOSEI8E7J GitHub: xAlg-ai/sparse-attention-hub 🧵 A thread 👇
arxiv.org
State-of-the-art sparse attention methods for reducing decoding latency fall into two main categories: approximate top-$k$ (and its extension, top-$p$) and recently introduced sampling-based...
1
9
15
@minilek
Jelani Nelson
28 days
At @Berkeley_EECS we always work to keep our curriculum fresh. Our intro ML course CS 189 just got a drastic makeover this semester (thanks @profjoeyg @NargesNorouzi!) and now includes ~12 lectures on e.g. Adam, PyTorch, various NN architectures, LLMs, and more (see
eecs189.org
A week-to-week description of the content covered in the course.
@zarazhangrui
Zara Zhang
1 month
Harvard and Stanford students tell me their professors don't understand AI and the courses are outdated. If elite schools can't keep up, the credential arms race is over. Self-learning is the only way now.
21
95
845
@vsreekanti
Vikram Sreekanti
28 days
Over the last year, AI companies have either moved towards building broader products or narrower ones. Which one's better? @profjoeyg and I have some opinions: The post this week explores why narrowly-scoped agents are beginning to deliver more value than generic platforms 👇
1
3
6
@pgasawa
Parth Asawa
1 month
📜 Paper: https://t.co/NU2mNypAzz 💻 Code: https://t.co/SZYOxcd9M4 This project was co-led with @aczhu1326 and advised by @matei_zaharia, @AlexGDimakis, and @profjoeyg. Reach out to @aczhu1326 and me if you want to chat about interesting applications! (8/n)
github.com
How to Train Your Advisor: Steering Black-Box LLMs with Advisor Models - az1326/advisor-models
0
2
23