
Prof. Anima Anandkumar
@AnimaAnandkumar
Followers
33K
Following
6K
Media
171
Statuses
2K
Bren Professor @caltech, Time100, Fmr Sr Director of #AI research @nvidia Fmr Principal Scientist @awscloud
Joined May 2021
There is a wrong notion that precision is sacrificed in UE8M0. That is not the case. You can retain the original accuracy even when you train directly in that format, provided you use our Madam algorithm.
Looking into DeepSeek X UE8M0, a radical FP8 format (all exponent, no mantissa). It trades precision for range & simpler hardware. This "software-first" move pushes domestic chips (Huawei, Cambricon) to adapt, accelerating China's integrated AI-semiconductor ecosystem strategy.
1
3
15
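A minimal sketch of the format discussed in the post above: UE8M0 stores only an 8-bit unsigned exponent, so every representable value is a power of two and multiplication reduces to integer addition of codes. The bias of 127 and the NaN code 0xFF are assumptions taken from the OCP Microscaling E8M0 scale type, not from the posts, and the function names are hypothetical. This is not DeepSeek's or the Madam authors' implementation.

```python
import math

BIAS = 127       # assumed exponent bias (OCP Microscaling E8M0)
NAN_CODE = 0xFF  # assumed NaN encoding (OCP Microscaling E8M0)


def ue8m0_encode(x: float) -> int:
    """Return the 8-bit code of the power of two nearest to x in log2 space."""
    if math.isnan(x):
        return NAN_CODE
    if x <= 0:
        raise ValueError("UE8M0 is unsigned and cannot represent zero")
    exponent = round(math.log2(x))
    # Clamp to the finite code range 0..254.
    return max(0, min(254, exponent + BIAS))


def ue8m0_decode(code: int) -> float:
    """Return the power of two that an 8-bit code stands for."""
    if code == NAN_CODE:
        return math.nan
    return 2.0 ** (code - BIAS)


def ue8m0_mul(a: int, b: int) -> int:
    """Multiply two encoded values by adding their exponents."""
    if NAN_CODE in (a, b):
        return NAN_CODE
    # (a - BIAS) + (b - BIAS) + BIAS, clamped to the finite range.
    return max(0, min(254, a + b - BIAS))


if __name__ == "__main__":
    four, eight = ue8m0_encode(4.0), ue8m0_encode(8.0)
    print(ue8m0_decode(ue8m0_mul(four, eight)))  # product via integer addition
    print(ue8m0_decode(ue8m0_encode(6.0)))       # 6.0 snaps to a power of two
```

With no mantissa bits, a value such as 6.0 is snapped to a neighboring power of two, which is the precision-for-range trade the quoted post describes.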
RT @jiawzhao: Excited to see Logarithmic format (LNS, UE8M0 FP8) used in production by @deepseek_ai! LNS enables efficient multi (just addi….
0
7
0
It is interesting that the new @deepseek_ai v3.1 is trained using the UE8M0 FP8 scale data format, which is a logarithmic number system. Our multiplicative weights update (Madam) for training in that format was done several years ago while at @nvidia. It yields maximum hardware
14
106
608
RT @Caltech_LHC: Excellent keynote lecture closing the first day of #ml4jets2025 @caltech by Prof. @AnimaAnandkumar 👏🏽 👏🏽 https://t.co/….
0
1
0
RT @JeffDean: My longtime collaborator Dave Patterson (long-time faculty at @UCBerkeley, @TheOfficialACM Turing Award winner, and fellow @L….
0
124
0
RT @ericschmidt: when I was in graduate school i was on an NSF fellowship for $15,000 per year, and i needed the mo….
thehill.com
Virtually every smartphone on the planet runs on a chip paid for by American taxpayers — a chip that I helped invent. Now Congress is moving to cut funding for the National Science Foundation that …
0
47
0
RT @Azizzadenesheli: #NeuralOperators learn physics through data. We study long term prediction capability of #NeuralOperator on a hard t….
0
2
0
Excited to share our recently published paper in @WileyGlobal on "Ocean Emulation With Fourier Neural Operators: Double Gyre". We used Fourier Neural Operators to build the first high-resolution weather model, FourCastNet. Since it works so well for
0
2
17
My @MLSysConf keynote is now online. The scaling of large language models has led to impressive gains in language understanding, but at a cost of insatiable memory and bandwidth requirements. I advocated a principled approach of designing optimization.
4
20
101
RT @Azizzadenesheli: FALCON: built on centuries of knowledge from fluid dynamics, turbulent flows, control, RL, and ML, to deliver foundati….
0
1
0
Thank you @BBCNews for featuring our work using AI to overcome turbulence
bbc.com
Flights are getting bumpier, thanks in part to climate change. But new studies are looking into innovative potential ways to turbulence-proof wings - using AI and owls
0
2
14
RT @Caltech: To help capture the impact of Caltech research and provide information on the federal funding that helps support it, the Offic….
researchimpact.caltech.edu
0
7
0
RT @guohao_li: 🚨 [Call for Papers] SEA Workshop @ NeurIPS 2025 🚨 📅 December 6, 2025 | 📍 San Diego, USA 🌐: Environm….
0
19
0
RT @ShuiwangJi: Our 500+ page AI4Science paper is finally published: Artificial Intelligence for Science in Quantum, Atomistic, and Contin….
0
21
0
.@shoyer We admire neuralGCM and all the contributions you are making for AI+climate modeling. Social media doesn't allow for too much nuance - what I meant to say was FourCastNet 3 is unprecedented in offering competitive skill at 6-hour resolution, with probabilistic estimates.
@AnimaAnandkumar @nvidia @Caltech FourCastNet3 is very impressive, great work! It's certainly not unprecedented in terms of speed for probabilistic AI-weather prediction, though. E.g., NeuralGCM makes a 15 day probabilistic weather forecast in under 20 seconds.
1
0
9
RT @BTolooshams: I am giving a talk on "Neural Operators and Biologically-informed Latent Embeddings for Foundation Models in NeuroAI" at t….
0
13
0