I think neural network potentials are the most important scientific tool of the next decade. The ability to simulate systems at the molecular scale starting from nothing but quantum mechanics will be transformative for a vast range of problems throughout biology and chemistry 1/n
I want to explain a statistical mechanical concept known as coarse graining which I think might be useful for thinking about things like AF3. Especially a special case known as continuum or implicit solvent models.
Thrilled to announce AlphaFold 3 which can predict the structures and interactions of nearly all of life’s molecules with state-of-the-art accuracy including proteins, DNA and RNA. Biology is a complex dynamical system so modeling interactions is crucial
Ok so the new AlphaFold model relies in large part on a "relatively standard diffusion approach". Turns out you can think of this as just a special case of a neural network potential; it just uses experimental data, not quantum chemistry, to train on. 1/n
Ok so what is a neural network potential concretely? It's just a very flexible function with many adjustable parameters that you fit to the 'potential energy surface.' This is just the energy as function of the position of the atoms in your system. 1/n
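A toy sketch of that idea, with a polynomial standing in for the neural network and a Morse potential standing in for the quantum-chemistry reference data (all names and numbers here are illustrative, not any real NNP):

```python
import numpy as np

# Toy stand-in for a neural network potential: fit a flexible function
# (a polynomial here; a real NNP is a deep network) to a 1D
# "potential energy surface" -- energy as a function of position.
# A Morse potential stands in for the quantum-chemistry reference data.
def morse(r, D=1.0, a=2.0, r0=1.0):
    return D * (1.0 - np.exp(-a * (r - r0)))**2

r_train = np.linspace(0.7, 2.5, 200)      # sampled geometries
e_train = morse(r_train)                  # reference energies

model = np.poly1d(np.polyfit(r_train, e_train, deg=8))

# The fitted surface gives energies AND forces (negative gradient).
force = -model.deriv()
r_test = np.linspace(0.8, 2.3, 50)        # held-out geometries
err = np.max(np.abs(model(r_test) - morse(r_test)))
print(f"max energy error on held-out geometries: {err:.1e}")
print(f"force at the minimum: {force(1.0):.3f}")
```

Same shape as the real thing: positions in, energies out, forces from the gradient of the fitted surface.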
Is there a machine/deep learning textbook anywhere that teaches the Boltzmann/Gibbs distribution? Is it called something else? Have looked at three so far with no mention of it. It is the entropy maximising distribution! Surely it is important to know?
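For what it's worth, the entropy-maximising claim is easy to check numerically on a made-up 4-state system: among random distributions pinned to the same mean energy, none beats the Gibbs distribution's entropy (the energies and temperature below are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
E = np.array([0.0, 1.0, 2.0, 3.0])   # energies of a toy 4-state system
beta = 1.0                           # inverse temperature

# Boltzmann/Gibbs distribution: p_i proportional to exp(-beta * E_i)
p = np.exp(-beta * E)
p /= p.sum()
U = p @ E                            # the mean energy it fixes

def S(q):                            # Shannon/Gibbs entropy
    q = q[q > 0]
    return -(q * np.log(q)).sum()

# Random trial distributions with (approximately) the same mean energy
# should all have lower entropy: Gibbs is the entropy maximiser.
trials = rng.dirichlet(np.ones(4), size=50000)
same_U = trials[np.abs(trials @ E - U) < 0.05]
assert len(same_U) > 100
gap = S(p) + 0.06 - max(S(q) for q in same_U)
print(f"entropy headroom over {len(same_U)} constrained trials: {gap:.3f}")
```

(The +0.06 slack only accounts for the finite width of the mean-energy window.)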
Interesting how deep learning for generating equilibrium distributions seems to be converging back to molecular dynamics. Like this is just langevin dynamics with a learnt score. So just NNP-MD with many runs in parallel right? Or am I missing something?
So pleased to get this preprint out. Feel like we’ve finally worked out how to do something I’ve been trying to do for 13 years since the start of my PhD: Build an accurate continuum solvent model of ion-ion interactions in solution.
I want to record a prediction: ML acceleration of molecular simulation will transform all of physical science. From quantum scale all the way up to climate. Justification: 1/n
Yes, this is the ultimate way ML will help accelerate physical sciences. By constructing custom MCMC operators (eg proposal distributions) to accelerate traditional MD/MCMC simulations in combination with existing tools. This can be done while preserving all error bars.
Quantum computing experts claim computing the properties of FeMoco is impossible with classical computing, and that if you could do it you could revolutionize fertilizer synthesis. Turns out you can do it fine with DFT, but almost no one cares.
Love the flow of ideas back and forth between molecular simulation and deep learning. Diffusion models originally inspired by molecular dynamics algorithms (langevin dynamics) now inspiring new approaches to accelerate MD.
OK, the AlphaFlow paper is awesome: AlphaFold Meets Flow Matching for Generating Protein Ensembles
Just watch how AlphaFlow's ensemble reproduces details of MD.
Weights + code
We have it in the reading group on Mon 11am EST!
1/2
Notice how similar to MD this is conceptually. It is actually mathematically essentially the same as well. The only difference is the force field is learnt from the PDB, where you know the forces are 0 because they are equilibrium states. Really it's an implicit solvent force field.
RFdiffusionAA generating a small molecule binding protein against an experimental FXIa inhibitor (OQO), a ligand which is significantly different than any in its training dataset.
Some people are not impressed by this. Maybe I'm just incompetent, but I spent literally years trying to build continuum solvent models of this exact thing and couldn't do much better. It's really hard to model without explicit water! 🤣
We see exactly the same thing for simple electrolytes. If you cannot get sodium chloride pairing free energy right you are not going to get protein folding right. I often don’t point this out because I don’t want to offend senior researchers.
Take a look at this #OpenAccess paper 📝 from the latest issue of Journal of Chemical Theory and Computation #JCTC
🔎 The Role of Force Fields and Water Models in Protein Folding and Unfolding Dynamics 💦🔬 🔓 #thermodynamics
This is a beautiful clear explanation of diffusion models. The cool thing is they are actually really easy to understand if you know molecular simulation. There is a direct analog for almost every concept. 1/n
New blog post about the geometry of diffusion guidance:
This complements my previous blog post on the topic of guidance, but it has a lot of diagrams which I was too lazy to draw back then! Guest-starring Bundle, the cutest bunny in ML 🐇
Good take as always. I don't think this axis really makes sense though. I would argue a diffusion model is more physics-based than a Lennard-Jones force field. The harmonic approximation about the minima is in every physics textbook, but I've never seen a 1/r^12 repulsion.
Awesome paper. Shows how we can train on many different levels of theory simultaneously. This will be very important as we make DFT databases bigger and bigger. We need to build a PDB equivalent but for quantum chemistry.
Another nice ion pairing paper on NaCl with NNPs. Look at the spread on those classical force fields in comparison! This is the fundamental medium in which all of biology occurs and we haven't been able to predict even its most basic properties until now!
So ByteDance have entered the universal machine-learned force field race with a very impressive paper, starting with the right problem imo: liquid electrolytes. I think this could be a critically important technology.
Just imagine one day we will be able to go to a website like this and run accurate dynamics on any system of atoms we want. This will transform all of science and society. We will finally be masters of the molecular scale.
So cool! I assume this is the same thing that goes on at phase transition boundaries in stat mech: ‘Schramm-Loewner curves appear as domain boundaries between phases at second-order critical points like the critical Ising model’
Have you ever done a dense grid search over neural network hyperparameters? Like a *really dense* grid search? It looks like this (!!). Bluish colors correspond to hyperparameters for which training converges, reddish colors to hyperparameters for which training diverges.
Very cool. Diffusion models use a molecular simulation algorithm (thermally annealed langevin dynamics) so of course you see phase transitions directly analogous to the sudden changes that occur when you cool/heat a system of molecules, i.e., crystallization.
This means that Google's claim that they have "surpassed physics based tools" is kind of strange. In fact there is a ton of physics baked into how diffusion models work!
Check out this really nice collaboration with @alisterpage and two awesome students, where we show you can resample from DFTB MD, compute forces at a higher level of theory, and run stable MD with equivariant neural network potentials.
Fascinating summary of recent work on geometric/graph neural networks. I'm catching up on this field, but I'm now convinced it will change the way we do a lot of science. In particular I really like @HannesStaerk's point re. the application to learning quantum interactions. 1/7
An annual round of predictions in Geometric and Graph ML, coauthored with @PetarV_93, based on input from the leading experts in the field. Longread on @TDataScience
If you train on the true equilibrium distribution of structures extracted from a simulation, you get the true forces (if your noise is sufficiently low). This paper first showed this, and we have validated for a simple system that it is precise.
Neural network potentials trained on high level quantum chemistry calculations is the only plausible solution to this problem imo. Is there an alternative?
despite advances in protein folding models like AlphaFold, we haven't actually discovered that much new about the underlying principles of protein folding.
these models accurately predict the structure a given protein sequence will fold into, without knowing that much about the
So Mg and Ca look plausible, but in reality they don't actually form pairs with chloride in water at all, so this is not really physically correct. On the other hand, classical simulation approaches can also fail to get that right. The fact they're all roughly the same peak height indicates
This is what I was looking for. I derived this independently and couldn’t work out why I hadn’t seen it before. I think this is a profoundly beautiful and important result. Learning is isomorphic to statistical mechanics!
Next I'll do a thread on where to go now that the PDB has been tapped out. Google say they're going to wait for 'cryo electron microscopy and tomography' to give them more data. This will take too long and is inherently limited to equilibria; there is a faster way in my opinion.
Wow, very cool thread. Amazing to me that it can get this stuff right when it doesn't even know about electrostatics. The thread shows some interesting ways it breaks too, though.
I read a lovely post on how AF3 can predict electrolyte RDFs. So I digress from protein-NA complexes to just NA complexes. (Will come back to it)
Case 3: 1JRN oxytricha bimolecular G4T4G4. My fav since I solved it.
Perfect prediction with loops and 5 ions.
MLPs are so foundational, but are there alternatives? MLPs place activation functions on neurons, but can we instead place (learnable) activation functions on weights? Yes, we KAN! We propose Kolmogorov-Arnold Networks (KAN), which are more accurate and interpretable than MLPs.🧵
Lots of exciting new machine learning for molecular simulation papers coming out: one arguing that equivariant features are key and should become standard:
These are already being used today to design new drugs. And everyday they get much better. They work by predicting the solution of the Schrödinger equation much faster than it’s possible to directly solve it.
Instead of finding the perfect prompt for an LLM (let's think step by step), you can ask LLMs to critique their outputs and immediately fix their own mistakes. Here's a fun example:
I think this is more or less correct. The only limitation is training data. Where will that data come from?: AI accelerated first principles molecular simulation will be a big source in my opinion.
"Where do I think the next amazing revolution is going to come? And this is going to be flat out one of the biggest ones ever. There's no question that digital biology is going to be it."
Jensen Huang, founder & CEO of NVIDIA.
@StasBekman This is a phase transition. We see it in molecular simulations, which are directly analogous. Loss: energy; entropy: ∫ρ log ρ. Neural nets minimise the free energy, so you can see states with similar free energy but different losses and jump between them.
They predict the forces on atoms allowing us to simulate how atoms and molecules move. They therefore connect the quantum scale to the classical scale. But as important an achievement as that is they are even more useful than that.
Amazing how something as well studied as protein folding can actually take up to 1000 times longer than we previously thought. Crazy to me how much we still don't know about the molecular scale.
In contrast, diffusion models learn the "score", which is the gradient of the log of a probability distribution. For AF3 this is just the probability of the atoms having a particular position. In stat mech, log probs are free energies or potentials of mean force, and their grads are
Love all the AI guys talking about thinking from 'first principles.' They should follow that thinking through and pick up a textbook on the real first principles: the principles of quantum mechanics.
This is because the problem of connecting scales is much more general than just the quantum to the classical: there's also connecting the scale of molecules to proteins, proteins to cells, cells to organs, and so on. Same in chemical engineering and climate simulations.
It should be possible to automate this process as we know from renormalisation group theory that there are recurring mathematical features involved in connecting scales.
Excited to share our perspective paper in @JPhysChem: "#MachineLearning Interatomic Potentials and Long-Range Physics" #compchem. It's focused on the methodologies & models used where nonlocal physics & chemistry phenomena are present in molecular properties.
Not only are denoising diffusion models incredibly powerful, the maths behind them is very cool! They are effectively learning to do gradient ascent using gradient descent, kind of a meta gradient descent. There's also a great connection with comp. chem.
Neural network potentials are already enabling this. (Diffusion models and Alphafold can also be interpreted as more general examples of neural network potentials) We should soon be able to accurately simulate the intermediate scale processes smaller than we can observe directly.
I'm incredibly thankful and excited to say that I've been awarded an @arc_gov_au #DECRA fellowship based at @UQ_News to work on discovering new electrolyte solutions for energy storage applications!
This is related to diffusion models: you can show that if you train a diffusion model on equilibrium structures (not just minima) with low noise, the score you learn corresponds to the true mean forces, and you're therefore implicitly learning the actual free energy surface.
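A minimal numerical illustration of that claim, on a harmonic well where everything is known analytically (the linear denoiser and the specific noise level are illustrative choices, not anyone's actual training setup):

```python
import numpy as np

rng = np.random.default_rng(1)
sigma, n = 0.1, 1_000_000              # small noise level, many samples

# "Equilibrium structures": samples from a harmonic well U(x) = x^2/2,
# whose Boltzmann distribution (kT = 1) is a standard Gaussian.
x = rng.normal(size=n)
y = x + sigma * rng.normal(size=n)     # noised/diffused samples

# Denoising with a linear model: fit E[x|y] ~ a*y by least squares,
# then Tweedie's formula turns the denoiser into a score estimate:
#   score(y) = (E[x|y] - y) / sigma^2
a = (x @ y) / (y @ y)
y_test = np.linspace(-2, 2, 9)
score_hat = (a * y_test - y_test) / sigma**2

# At low noise the learnt score matches the true mean force -dU/dx = -x
err = np.max(np.abs(score_hat - (-y_test)))
print(f"max |learnt score - true mean force|: {err:.3f}")
```

No forces or energies ever appear in the training data, yet the mean force comes out of the denoising objective.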
Wow, the source coding theorem is awesome. Why can you store a crystal structure in a small file vs a liquid which needs a very big data file? One has much higher physical entropy, and this needs more data! That intertwining of physical and information theoretic entropy is beautiful.
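You can see this directly with a general-purpose compressor standing in for an ideal source code (the "crystal" and "liquid" byte streams below are toy stand-ins, not real structure files):

```python
import zlib
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# "Crystal": perfectly ordered, repeating positions (low entropy) vs
# "liquid": disordered positions (high entropy). The source coding
# theorem says compressed size tracks the entropy of the source.
crystal = np.tile(np.arange(16, dtype=np.uint8), n // 16).tobytes()
liquid = rng.integers(0, 256, size=n, dtype=np.uint8).tobytes()

crystal_size = len(zlib.compress(crystal))
liquid_size = len(zlib.compress(liquid))
print(f"crystal: {crystal_size} bytes, liquid: {liquid_size} bytes compressed")
```

Same number of raw bytes in, wildly different file sizes out, purely because of entropy.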
Just read this excellent paper from @ixfoduap. This is a really useful tool! Developing optimizable DFT functionals is absolutely critical for simulating important realistic systems.
I think this is a profound paper … This is what ‘grokking’ is right? A sharp jump downward in energy/loss? It’s just a phase transition right? Stat. mech. must have the tools to explain the success of deep neural networks.
Seems like ML is playing a similar role in climate physics as it is in molecular scale physics. My dream is for someone to build a first principles based global climate simulation using iterative coarse graining. Is that ridiculous?
Finally finished a new review paper with @turbulentjet: "Machine learning for climate physics and simulations". We highlight the distinct yet complementary goals of ML: accelerating simulations vs learning physics. Share with us your favorite ML4climate papers!
This approximation of linear forces back to minima has a long history in physics; it's called the harmonic approximation, and physicists famously use it everywhere. It results in a Gaussian probability distribution and a nice smooth surface to optimise on. This paper outlines this
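A quick check of the Gaussian claim, on a toy 1D well (the spring constant, temperature, and minimum location are arbitrary):

```python
import numpy as np

# Harmonic approximation: expand U(x) about a minimum,
#   U(x) ~ U0 + 0.5 * k * (x - x0)^2.
# The Boltzmann density exp(-U/kT) is then exactly Gaussian,
# centred on the minimum with variance kT/k.
k, kT, x0 = 4.0, 1.0, 0.5
x = np.linspace(x0 - 6, x0 + 6, 20001)
dx = x[1] - x[0]
U = 0.5 * k * (x - x0)**2

p = np.exp(-U / kT)
p /= p.sum() * dx                    # normalise the Boltzmann density

mean = (x * p).sum() * dx
var = ((x - mean)**2 * p).sum() * dx
print(f"mean: {mean:.4f} (expect {x0}), variance: {var:.4f} (expect {kT/k})")
```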
Another incredibly impressive electrolyte simulation paper with machine learning potentials. Still many potential improvements to come, though. Soon we will be able to predict almost everything you could want to know about a given electrolyte with no experiment necessary.
In particular, at each scale there is the problem of discarding the fast dynamics that can be ignored or approximated with Gaussian noise, while keeping track of the important features that are useful for prediction. Machine learning is the perfect tool for this.
This is why I think simulating liquids is a perfect application of machine learning to science. It's a problem where you need to do a ton of inference, mostly in-domain, very fast, i.e. the energies of every frame of a simulation.
💯
Scientific discovery is about at the tails of existing knowledge/data, machine learning is about the bulk of the existing data.
Doesn't mean the latter cannot assist with the former, but it's highly non trivial.
Half of my timeline is people saying AI is just curve-fitting anti-science. The other half is rightly freaking out because it can do stuff like this. I'm so confused.
Exciting times: we have the accuracy with quantum chemistry and the speed with ML that we need to really start cooking with computational chemistry for many condensed-phase applications. Industry needs to invest in this.
First off a neural network potential is just a very flexible function with a ton of parameters that takes in positions and outputs energies/forces. More detail in this thread:
Ok so what is a neural network potential concretely? It's just a very flexible function with many adjustable parameters that you fit to the 'potential energy surface.' This is just the energy as function of the position of the atoms in your system. 1/n
I actually think this is a great example of why AI is so important. This is a relatively simple combination of ions and solvents that has this remarkable ability and yet it has taken us decades to discover it.
Machine learning seems to have had a much less dramatic impact on direct quantum chemistry than I expected. Maybe it’s still early days but maybe also it’s because a lot of the algorithms are already so similar to ml algorithms that there is not much to be gained.
Another trick diffusion models use? They start with a high noise level and then gradually reduce it to refine the distribution. This is essentially thermal annealing another tool from stat mech.
"The pressures facing today's young research scientists makes it hard for them to find the time simply to think" - Roger Penrose, in this first-rate profile from @philipcball
So when the diffusion model learns the score it is implicitly learning a free energy and its gradient, essentially making it a type of NNP. Now you may object that AF3 doesn't know anything about the forces or the energies, as it is trained on the PDB, so how can it possibly be
And a final trick? They use Langevin dynamics for inference, a standard molecular simulation algorithm invented by a physicist. When you run Langevin dynamics you get Boltzmann probabilities, i.e., the exponential of the free energy. So we end up back with the original probabilities.
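Here's what that looks like in miniature: overdamped Langevin dynamics in a toy harmonic well, run with many walkers in parallel MD-style, relaxing to the Boltzmann distribution (the step size, walker count, and potential are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(7)
kT, dt, n_steps = 1.0, 0.01, 2000

# Overdamped Langevin dynamics in a harmonic well U(x) = x^2/2:
#   x <- x - grad U(x) * dt + sqrt(2 kT dt) * noise
# Its stationary distribution is the Boltzmann distribution exp(-U/kT),
# here a standard Gaussian with variance kT.
x = np.zeros(5000)                   # 5000 parallel walkers, all at x = 0
for _ in range(n_steps):
    grad_U = x                       # dU/dx for U = x^2/2
    x += -grad_U * dt + np.sqrt(2 * kT * dt) * rng.normal(size=x.shape)

print(f"sample variance: {x.var():.3f} (Boltzmann predicts kT = {kT})")
```

Swap the analytic gradient for a learnt score and this loop is, structurally, diffusion-model inference.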
This is a fascinating and excellent piece. IMO this part might be a bit pessimistic though. Extending quantum chemistry to larger and longer scales is actually a problem perfectly suited to ML using exactly the same tool that makes alphafold work: equivariant NNPs.
Predicting energies from coordinates is much more general than just simulating atoms directly though. We often want to ignore parts of the system (marginalise), and this means we want to calculate free energies, which again determine the probabilities of particular arrangements.
New preprint where I show a simple exponential potential added to the hydrogen bond significantly improves the description of water with the SCAN functional. No need to run it at 330 K anymore. #compchem #theochem
One I’m working on now is simply the prediction of the thermodynamic properties of electrolyte solutions. We can easily generate large high quality data sets of properties of these solutions from QM and it should be possible to predict their properties from that with GNN. 5/7
This is an essentially identical problem but can work for much bigger particles. You just train them to learn the average forces of a subset of your particles. (Free energies are just determined by the average forces which is a very nice stat mech trick).
For a single particle in 2D you can visualise this concretely as a real surface where the height corresponds to the potential energy. Mostly we care about hugely high dimensional versions of this though where you have many particles, so you have:
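Concretely, a sketch of such a function, using a Lennard-Jones pair potential as a stand-in for a learnt one (the three-atom geometry is chosen to sit near the minimum; everything here is illustrative):

```python
import numpy as np

# The "surface" in 3N dimensions: a potential energy function mapping
# the positions of N particles to one scalar energy. A Lennard-Jones
# pair potential stands in here; an NNP has the same signature.
def lj_energy(positions, eps=1.0, sigma=1.0):
    diff = positions[:, None, :] - positions[None, :, :]
    r = np.linalg.norm(diff, axis=-1)
    iu = np.triu_indices(len(positions), k=1)   # count each pair once
    r = r[iu]
    return np.sum(4 * eps * ((sigma / r)**12 - (sigma / r)**6))

# Near-equilateral trimer with sides at the LJ minimum distance 2^(1/6):
pos = np.array([[0.0, 0.0, 0.0],
                [1.122, 0.0, 0.0],
                [0.561, 0.972, 0.0]])
print(f"energy of 3 atoms (one point on a 9-D surface): {lj_energy(pos):.3f}")
```

Three atoms already make the input 9-dimensional; real systems make it tens of thousands.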
Yeah very important work. This is the key issue preventing progress now: Force field dependence. We can overcome this with machine learning potentials and good DFT now.
To all budding compbio & ML folks interested in bio: don't just run after the latest ML model hype train. The greatest long-run impact will come from really assimilating the prior bio/compbio literature with the goal of really understanding strategies for how to model biology. 1/
Why is it called a PMF? Because you can show that its derivative with respect to position is equal to the average force in a given configuration. (Nice proof to try yourself.) This allows you to connect microscopic forces to marginalised probabilities, which is very useful.
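The identity is easy to check numerically on a toy 2D potential (the coupled quadratic potential below is made up purely for illustration):

```python
import numpy as np

kT = 1.0
y = np.linspace(-10, 10, 4001)
dy = y[1] - y[0]

def U(x, y):                     # a toy coupled 2D potential
    return 0.5 * x**2 + x * y + y**2

def dU_dx(x, y):                 # microscopic force component along x
    return x + y

# PMF: F(x) = -kT log Integral exp(-U(x,y)/kT) dy  (marginalise out y)
def F(x):
    return -kT * np.log(np.sum(np.exp(-U(x, y) / kT)) * dy)

x0, h = 0.7, 1e-4
dF = (F(x0 + h) - F(x0 - h)) / (2 * h)          # numerical dF/dx

# Mean force: average dU/dx over the conditional Boltzmann p(y | x0)
w = np.exp(-U(x0, y) / kT)
mean_force = np.sum(dU_dx(x0, y) * w) / np.sum(w)
print(f"dF/dx = {dF:.4f}, <dU/dx> = {mean_force:.4f}")
```

The derivative of the marginalised free energy and the conditional average of the microscopic force agree, exactly as the proof says they must.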
Ha, literally just gave a talk where I made this same point. Diffusion models are best understood from an equilibrium stat mech POV even though they are inspired by non-equilibrium. Was a bit weird because it was a NE stat mech workshop.
Equivariance is very cool and draws on some deep mathematics that has already revolutionised theoretical physics. But the intuition is just to use neural networks that can keep track of, and compare, directions not just raw numbers.
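That intuition can be checked in a few lines: build a toy energy from pairwise distances only, rotate the atoms, and confirm the energy is unchanged (invariant) while the forces rotate with the frame (equivariant). The spring potential and rotation are arbitrary illustrative choices:

```python
import numpy as np

# A model built only from interatomic distances is rotation invariant
# (energy unchanged) and its forces are equivariant (they rotate along
# with the atoms).
def energy(pos):                       # toy distance-based "potential"
    d = np.linalg.norm(pos[:, None] - pos[None, :], axis=-1)
    iu = np.triu_indices(len(pos), k=1)
    return np.sum((d[iu] - 1.0)**2)    # springs with rest length 1

def forces(pos, h=1e-6):               # -grad E by central differences
    f = np.zeros_like(pos)
    for i in range(pos.shape[0]):
        for j in range(3):
            dp = np.zeros_like(pos)
            dp[i, j] = h
            f[i, j] = -(energy(pos + dp) - energy(pos - dp)) / (2 * h)
    return f

rng = np.random.default_rng(3)
pos = rng.normal(size=(4, 3))
theta = 0.8                            # rotation about the z axis
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])

e_err = abs(energy(pos @ R.T) - energy(pos))                   # invariance
f_err = np.max(np.abs(forces(pos @ R.T) - forces(pos) @ R.T))  # equivariance
print(f"invariance error: {e_err:.2e}, equivariance error: {f_err:.2e}")
```

Equivariant NNPs bake this symmetry into the architecture instead of hoping the network learns it from data.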