Jon Saad-Falcon
@JonSaadFalcon
Followers: 1K · Following: 604 · Media: 26 · Statuses: 233
AI PhD @hazyresearch @StanfordAILab @stanfordnlp Previously @databricks @allen_ai @GeorgiaTech
Palo Alto, CA
Joined January 2021
Data centers dominate AI, but they're hitting physical limits. What if the future of AI isn't just bigger data centers, but local intelligence in our hands? The viability of local AI depends on intelligence efficiency. To measure this, we propose intelligence per watt (IPW):
42 · 124 · 378
I bet in 2022 that large LMs would end up on the edge — not for privacy (most don't care), but because on-device inference feels free, while the cloud always feels metered. Although it's less efficient, the cost comes out of the user's phone battery (practically free and diffused).
0 · 1 · 12
AI has given venture capital a new way to repeat an old mistake: kingmaking. The pattern from 2021 is back: a category becomes "obvious," a top-tier firm anoints its winner, and everyone else acts like the decision is final. Sierra for support. Harvey for legal. Applied Compute
35 · 40 · 499
AI has been built on one vendor’s stack for too long. AMD’s GPUs now offer state-of-the-art peak compute and memory bandwidth — but the lack of mature software / the “CUDA moat” keeps that power locked away. Time to break it and ride into our multi-silicon future. 🌊 It's been a
12 · 98 · 564
This week, @StanfordAILab shared findings showing AI is entering a new era: from large models in the cloud to smaller, efficient ones running locally. In tests with IBM Granite 4.0, compact models handled 88.7% of queries, cutting energy and latency. ➡️ https://t.co/yGbvuG0BIo
2 · 16 · 71
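[Editor's sketch] A hedged sketch of the kind of local-first routing such a coverage number implies: try the compact on-device model first and fall back to the cloud only when it is not confident. The confidence signal, the 0.7 threshold, and all function names are hypothetical illustrations, not the method used in the study.

# Hypothetical local-first routing policy; the threshold and interfaces are illustrative only.
from typing import Callable, Tuple

def route_query(query: str,
                local_model: Callable[[str], Tuple[str, float]],
                cloud_model: Callable[[str], str],
                confidence_threshold: float = 0.7) -> str:
    """Answer locally when the compact model is confident; otherwise fall back to the cloud."""
    answer, confidence = local_model(query)   # compact on-device model returns (answer, confidence)
    if confidence >= confidence_threshold:
        return answer                         # cheap path: no network hop, battery-scale energy cost
    return cloud_model(query)                 # expensive path: frontier model in a data center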
This is exactly the type of systems-level thinking we need to get as much AI bang as we can for our electron buck. As China continues to outpace the US in energy deployment, capitalizing on such efficiencies will become increasingly important to continued US AI leadership.
2 · 2 · 3
Totally. SGLang's design choices directly improve IPW:
1. RadixAttention for KV cache reuse
2. Fast structured generation
3. Optimized scheduling
Every efficiency gain = more intelligence per watt = more AI running locally
0 · 2 · 14
A very important and incredibly exciting research project: Intelligence Per Watt: A Study of Local Intelligence Efficiency Local AI models are getting dramatically more efficient, and the Hazy Research team proposes Intelligence-per-Watt (IPW) as the key metric to measure how
14 · 9 · 158
Local AI, measured by intelligence-per-watt (IPW), improved 5.3× in just 2 years! And it feels like the slope is still quite steep. Imagine another 5-10x in the next 2 years. Exciting.
8 · 11 · 129
Great theory proposed in this paper. Intelligence per watt (IPW): intelligence delivered (capabilities) per unit of power consumed (efficiency). The paper asks how much work local AI on laptops can take over from cloud systems. Argues local AI can handle most queries, and
10 · 11 · 46
Introducing intelligence per watt, and predicting a distribution shift from cloud to edge inference!
3 · 9 · 67
A shift from cloud to edge? We took a closer look at “Local LMs” (≤20B active parameters) and found that they are:
- Surprisingly capable, with 3.1× improvement since 2023
- Increasingly efficient, with 5.3× improvement since 2023
This suggests a shift from mainframe inference
4 · 17 · 76
Intelligence per watt (IPW) is a really clean and simple framing. Great way to push this metric forward. Really cool work by @HazyResearch
2 · 1 · 25
@JonSaadFalcon and @Avanika15 strike again! IPW has the potential to be THE rallying cry for MLSys in the same way PPL was for the NLP community. Really great work and a great read!
1 · 1 · 3
the kind of work that bridges ai research and energy systems!
1 · 1 · 7
While scaling compute might enable smarter models, it’s equally important to ensure everyday intelligence remains practical and accessible. IPW defines a tangible metric to measure progress toward smarter and more efficient models. Great work @jonSaadFalcon @Avanika15!
0 · 1 · 7
Intelligence per watt: the metric that matters for getting AI out of data centers and into every device. Absolute privilege working on this with @Avanika15 and @JonSaadFalcon
5 · 1 · 13
our manifesto (maximizing intelligence per watt):
0 · 2 · 10
Intelligence per watt (🧠/⚡️) is such a cool metric to capture the trend shifting intelligence from cloud to local! Kudos to @JonSaadFalcon @Avanika15 and team for uncovering this insight, after many many hours of profiling on so many workloads, models, and hardware.
1 · 3 · 19
Great work by @JonSaadFalcon @Avanika15 @Azaliamirh @etash_guha and others at @HazyResearch! Power is one of the biggest constraints to adding more compute for AI! Glad to see they have validated what we have been saying at @SambaNovaAI for a while. We are the most power
2 · 2 · 8