Nick McGreivy

@NMcGreivy

968 Followers · 2K Following · 26 Media · 132 Statuses

physics & ml phd @princeton | seeking roles in the bay area

Stanford, CA
Joined June 2020
@NMcGreivy
Nick McGreivy
1 year
Our new paper in @NatMachIntell tells a story about how, and why, ML methods for solving PDEs do not work as well as advertised. We find that two reproducibility issues are widespread. As a result, we conclude that the field of ML-for-PDE solving has reached overly optimistic conclusions.
@NMcGreivy
Nick McGreivy
2 months
You can read more at the link below. https://t.co/wRFdNxNXja
@NMcGreivy
Nick McGreivy
2 months
A simple way of understanding why tanh leads to exploding gradients is that as we increase the size of the initial weights, the weight matrices W become larger but the derivative of tanh, σ'(x), decreases towards zero. As W and σ'(x) are repeatedly multiplied in the backward pass, the balance between the two determines whether the gradients explode or vanish.
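A minimal numpy sketch of this effect (my own illustration, not from the thread; the depth, width, and gain of 4 are arbitrary choices): with large initial weights, tanh keeps every activation bounded in (-1, 1), yet the backpropagated gradient norm grows by orders of magnitude across the network.

```python
import numpy as np

rng = np.random.default_rng(0)
depth, width, gain = 20, 256, 4.0   # gain of 4: deliberately large initial weights

# Weight matrices scaled by gain / sqrt(width)
Ws = [rng.normal(size=(width, width)) * gain / np.sqrt(width) for _ in range(depth)]

# Forward pass: tanh keeps every activation bounded in (-1, 1)
hs, zs = [rng.normal(size=width)], []
for W in Ws:
    z = W @ hs[-1]
    zs.append(z)
    hs.append(np.tanh(z))

# Backward pass: each layer multiplies the gradient by W^T diag(tanh'(z)),
# and with large W the W factor wins out on average
g = np.ones(width)
norms = [np.linalg.norm(g)]
for W, z in zip(reversed(Ws), reversed(zs)):
    g = W.T @ (g * (1.0 - np.tanh(z) ** 2))   # tanh'(z) = 1 - tanh(z)^2
    norms.append(np.linalg.norm(g))

print(f"max |activation| at the last layer: {np.abs(hs[-1]).max():.3f}")
print(f"gradient norm: {norms[0]:.1e} at the output, {norms[-1]:.1e} at the input")
```

The activations saturate near ±1, while the gradient norm grows by a roughly constant factor per layer.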
@NMcGreivy
Nick McGreivy
2 months
If the weights are small, tanh behaves like a linear activation function. So just as before, the activations vanish but the weight gradients remain constant. So tanh gives exploding weight gradients for large initial weights, but constant gradients for small initial weights.
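A companion sketch (again my own, with arbitrary sizes): with a gain of 0.5 the pre-activations stay small, so tanh is effectively linear. The activations shrink layer by layer, yet the per-layer weight-gradient norms stay roughly constant.

```python
import numpy as np

rng = np.random.default_rng(0)
depth, width, gain = 20, 256, 0.5   # gain of 0.5: deliberately small initial weights

Ws = [rng.normal(size=(width, width)) * gain / np.sqrt(width) for _ in range(depth)]

# Forward pass: pre-activations are small, so tanh(z) ≈ z and activations shrink
hs, zs = [rng.normal(size=width)], []
for W in Ws:
    z = W @ hs[-1]
    zs.append(z)
    hs.append(np.tanh(z))

# Backward pass, recording the norm of each weight gradient dL/dW_l = delta_l h_l^T
delta = np.ones(width)
wgrad_norms = [0.0] * depth
for l in reversed(range(depth)):
    delta = delta * (1.0 - np.tanh(zs[l]) ** 2)        # through the nonlinearity
    wgrad_norms[l] = np.linalg.norm(np.outer(delta, hs[l]))
    delta = Ws[l].T @ delta                            # through the weights

print(f"activation norm shrinks: {np.linalg.norm(hs[0]):.1e} -> {np.linalg.norm(hs[-1]):.1e}")
print(f"weight-gradient norms, first vs last layer: {wgrad_norms[0]:.1e} vs {wgrad_norms[-1]:.1e}")
```

The vanishing of the activations going forward is matched by the growth of the backpropagated error going backward, so their product (the weight gradient) barely changes from layer to layer.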
@NMcGreivy
Nick McGreivy
2 months
The intuition is that exploding/vanishing activations actually *prevent* the weight gradients from exploding/vanishing. By contrast, tanh activation functions prevent exploding *activations*. And by doing so, they allow the weight gradients to explode if the weights are large.
@NMcGreivy
Nick McGreivy
2 months
The result is that with linear (or ReLU) activations, the weight gradients neither explode nor vanish, but are constant in magnitude across the network. Xavier Glorot and @Yoshua_Bengio, who discovered this in a 2010 paper, described it as "really surprising".
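To illustrate the Glorot–Bengio observation numerically (my own sketch; the gain of 2 is an arbitrary "too large" choice for ReLU): the activations blow up by roughly a thousandfold across the network, while the per-layer weight-gradient norms stay within a small factor of each other.

```python
import numpy as np

rng = np.random.default_rng(0)
depth, width, gain = 20, 256, 2.0   # gain 2 > sqrt(2), so ReLU activations grow each layer

Ws = [rng.normal(size=(width, width)) * gain / np.sqrt(width) for _ in range(depth)]

# Forward pass: ReLU activations explode roughly like sqrt(2)^depth
hs = [np.abs(rng.normal(size=width))]
for W in Ws:
    hs.append(np.maximum(W @ hs[-1], 0.0))

# Backward pass: delta grows going backward at the same rate the activations
# grow going forward, so the product |delta_l| * |h_l| is the same at every layer
delta = np.ones(width)
wgrad_norms = [0.0] * depth
for l in reversed(range(depth)):
    delta = delta * (hs[l + 1] > 0)                    # ReLU derivative: 0/1 mask
    wgrad_norms[l] = np.linalg.norm(np.outer(delta, hs[l]))
    delta = Ws[l].T @ delta                            # through the weights

print(f"activation norm explodes: {np.linalg.norm(hs[0]):.1e} -> {np.linalg.norm(hs[-1]):.1e}")
print(f"spread of weight-gradient norms: {max(wgrad_norms) / min(wgrad_norms):.1f}x")
```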
@NMcGreivy
Nick McGreivy
2 months
To start, consider a simple neural network (MLP) with linear activation functions. The weight gradient equals the product of the weights times the activation. But the activations *also* equal the product of weight matrices.
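The cancellation is easiest to see in a scalar "network" (a toy sketch of my own, not from the thread): with every weight set to 2, the activations double each layer, but the gradient with respect to *every* weight is the product of the other nine weights, and so is identical in magnitude across the network.

```python
def forward_backward(w, depth, x=1.0):
    """Scalar deep linear net y = w * w * ... * w * x, with manual backprop."""
    h = [x]
    for _ in range(depth):
        h.append(w * h[-1])            # forward: activations h_l = w^l * x

    g, grads = 1.0, []                 # backward: dL/dy = 1 for the loss L = y
    for l in reversed(range(depth)):
        grads.append(g * h[l])         # dL/dw_l = upstream grad * input activation
        g *= w                         # propagate the gradient through layer l
    return h, grads[::-1]

h, grads = forward_backward(w=2.0, depth=10)
print(h[-1])   # 1024.0 -- the activations explode (2^10)
print(grads)   # every entry is 512.0 (2^9): constant across all ten layers
```

The activation at layer l contributes a factor w^l and the upstream gradient a factor w^(depth-1-l); their product, w^(depth-1), does not depend on the layer.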
@NMcGreivy
Nick McGreivy
2 months
If you've taken a class on deep learning, you know that the tanh activation function causes vanishing gradients because its derivative saturates. Surprisingly, however, tanh activation functions cause exploding, not vanishing, gradients. Here's why: 🧵
@ModeledBehavior
Adam Ozimek
2 months
This is the Robert Gordon debate all over again. Yes, GDP undercounts progress. But it *always has*. The value of Tylenol and toilets was undermeasured too. If measured growth can't increase by more than it did in the past, the same is probably true of unmeasured growth, and progress has probably slowed
@dwarkesh_sp
Dwarkesh Patel
2 months
@tszzl What if cancer cures and life extension cocktails were $20 a pop? I don't see how that would necessarily lead to some huge downstream explosion in GDP, despite the fact that astronomical consumer surplus is created. I agree GDP will grow. But potentially by much less than how
@NMcGreivy
Nick McGreivy
3 months
if you keep pressing jump, you never fall to the ground. pretty obviously a bug.
@NMcGreivy
Nick McGreivy
3 months
has anyone else noticed that the *very first* demo in the GPT-5 release just... doesn't work?
@NicolasRasmont
Nicolas Rasmont
5 months
We just published a post-mortem on a now-retracted viral AI-materials paper from MIT. Graduate student Aidan Toner-Rodgers made big claims on the impact of AI on materials science research productivity and was endorsed by Nobel laureate Daron Acemoglu in the WSJ. 1/6
@pli_cachete
Rota
5 months
American funding for hard sciences has fallen 2/3 this year. In physics, researchers are receiving 15% of what they did last year. What the fuck are we doing?
@KordingLab
Kording Lab 🦖
5 months
AI for science appears hard. Here is my stance on AI in science: AI is a great side-kick. I am unconvinced it is time to make it a hero. But maybe @FutureHouseSF will change that and then they could be a truly important company. This exchange is interesting and I think folks
@SGRodriques
Sam Rodriques
5 months
@sethbannon @KordingLab (Just linking here for the record that these Elicit findings appear to have been hallucinated. But if anyone has found papers that do appear to show this, please let us know. https://t.co/7tDxWgVHg3)
@NMcGreivy
Nick McGreivy
5 months
In a guest post for Understanding AI (@binarybits), I write about how I got fooled by AI-for-science hype, and what it taught me. I argue that AI is unlikely to revolutionize science, and much more likely to be a normal tool of incremental, uneven scientific progress.
@NMcGreivy
Nick McGreivy
7 months
@ja3k_
ja3k
7 months
What are good examples of long term trends that abruptly stopped?
@NMcGreivy
Nick McGreivy
7 months
I'm looking forward to speaking at the AI summit in Tokyo in 2 weeks.
@thought_channel
ThAT (Thinking about Thinking)
7 months
🌸 Spring in Tokyo Just Got Smarter! 🌸 📅 April 9-11, 2025 📍 National Museum of Emerging Science and Innovation, Tokyo #ArtificialIntelligence #AIResearch #TokyoTech #FutureTech #TokyoAI #AIFuture #TechSummit
@NMcGreivy
Nick McGreivy
11 months
In other words, if a scientist tries using machine learning for a "real scientific problem" similar to the ones explored here (i.e., spatiotemporal data), most of the time they'll find that ML is worse than useless! And even in the 29% of cases where the ML model does better
@NMcGreivy
Nick McGreivy
11 months
As the authors readily admit, these models aren't state of the art. With enough effort and tuning, ML *could* do better than this. But they use "time-tested models that are widely used in applications", and "reflect reasonable compute budgets and off-the-shelf choices that might
@NMcGreivy
Nick McGreivy
11 months
The most interesting part of this paper is how poorly ML does at scientific problems. See table 3. Four "popular models" are trained on 17 datasets. Out of 128 total evaluations, ML does worse than the weakest possible baseline (outputting a constant value) 71% of the time.
@oharub
Ruben Ohana
11 months
Generating cat videos is nice, but what if you could tackle real scientific problems with the same methods? 🧪🌌 Introducing The Well: 16 datasets (15TB) for Machine Learning, from astrophysics to fluid dynamics and biology. 🐙: https://t.co/PMAHK7i2lG 📜: https://t.co/6XLJA5lJnI