I'm super excited to share a new initiative I am a part of!
Announcing: Polymathic AI 🎉
We are developing foundation models for scientific *data*, such that they can leverage shared concepts across disciplines.
1/6
John von Neumann: "with four parameters I can fit an elephant"
Meanwhile, this paper: "How to fit any dataset with a single parameter"
Here's a function with a *single* parameter. Even worse: it's differentiable and continuous!
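The gist of the construction, sketched here with plain integers rather than the paper's differentiable sin-based formula (so this is my own toy version, not the paper's): serialize the whole dataset into the digits of one number, and let the "model" shift them back out.

```python
# Toy integer version of "fit any dataset with a single parameter":
# pack each data value into successive bit-blocks of one big number.
# (The paper's construction is a continuous, differentiable analogue.)
BITS = 8  # quantization per data point

def encode(values):
    alpha = 0
    for v in values:              # each value must fit in BITS bits
        alpha = (alpha << BITS) | v
    return alpha                  # the single "parameter"

def decode(alpha, i, n):
    shift = BITS * (n - 1 - i)    # bit position of the i-th value
    return (alpha >> shift) & ((1 << BITS) - 1)

data = [42, 7, 255, 0]
alpha = encode(data)
assert [decode(alpha, i, len(data)) for i in range(len(data))] == data
```

The "fit" is perfect because all of the information lives in the precision of the parameter, not in the model structure.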
Very excited to share our new paper "Discovering Symbolic Models from Deep Learning with Inductive Biases"!
We describe an approach to convert a deep model into an equivalent symbolic equation.
Blog/code:
Paper:
Thread👇
1/n
Three years ago, I started working on an easy-to-use tool for interpretable machine learning in science. I wanted it to do for symbolic regression what Theano did for deep learning.
Today, I am beyond excited to share with you the paper describing it!
1.
Here's a condensed version of the matplotlib cheatsheets so it can fit a desktop background
()
Full image:
and vectorized .svg, with the non-standard fonts outlined:
Thanks
@NPRougier
et al for making it!
It's crazy how over time I have slowly replaced all of my command line tools with Rust equivalents 🦀
- cat → bat
- pip → uv
- grep → ripgrep
- htop → zenith
- fswatch → watchexec
Any other good ones?
If you’ve never tried it, this is the single best explanatory tool for neural networks. An essential demo for any deep learning course!
I still notice improvements in my intuition just by tinkering with it.
From
@dsmilkov
@shancarter
.
Life update: this fall I will be joining the University of Cambridge as Assistant Professor!
I will be appointed as joint faculty between DAMTP and the Institute of Astronomy 🚀
Today I learned you can write numbers like this in Python (!!)
Makes it easier to read long numbers by separating digits into groups, just like 1,000,000.
It’s so esoteric that Google Colab doesn’t even color it correctly!
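For reference, the feature is PEP 515 digit separators (Python 3.6+):

```python
# PEP 515: underscores in numeric literals are purely cosmetic
million = 1_000_000
assert million == 1000000
assert 0xDEAD_BEEF == 0xDEADBEEF  # also works in hex/binary/octal
```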
A matplotlib trick that I wish I learned a long time ago:
To adjust the resolution of figures, rather than using
plt.figure(figsize=(8, 8))
followed by tweaking every font size, you can just increase the resolution with:
plt.figure(dpi=300)
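The reason this works: matplotlib measures figures in inches and text in points, and dpi converts both to pixels uniformly, so nothing shifts relative to anything else. The arithmetic, no matplotlib required:

```python
# matplotlib sizes figures in inches and text in points; dpi maps
# both to pixels at once, so fonts stay proportionate to the figure.
def pixels(figsize, dpi):
    w, h = figsize
    return (w * dpi, h * dpi)

assert pixels((8, 8), 100) == (800, 800)    # default dpi is 100
assert pixels((8, 8), 300) == (2400, 2400)  # 3x resolution, same layout
```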
I'm starting a curated list of interactive machine learning demos:
. Looking for more suggestions!
My plan is to incorporate some into the ML modules of Cambridge's new MPhil in Data Intensive Science, as a way to hone students' intuition.
I am blown away by . Such a useful tool for research.
This in-browser graphical LaTeX tool gives you free-form drawing (tikz export), WYSIWYG rendering, symbol shortcuts, and even picture-based symbol search. I might even write full papers in this...
TabNine is awesome: .
It suggests code completions in real-time using deep learning conditioned on your existing code. Free plugins for Jupyter, vim, emacs, Sublime, and VS Code.
Really enjoying it so far. Thanks
@ykilcher
for pointing it out!
My favorite way to explain a normalizing flow:
- There's a crowd of people; each is a sample of the data distribution.
- Everybody takes a step in some direction according to a neural net
- Over a series of steps, the net tries to direct the crowd to form a Gaussian without anyone bumping into each other
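A minimal numeric sketch of that picture, assuming a 1-D Gaussian target and a single affine "step" (mu and sigma stand in for what the net would learn; a real flow stacks many such steps):

```python
import math

# One affine "step": z = (x - mu) / sigma maps the data crowd onto a
# standard Gaussian; the log-density uses the change-of-variables rule.
mu, sigma = 3.0, 2.0

def forward(x):                   # data -> latent
    return (x - mu) / sigma

def log_prob(x):                  # Gaussian base log-density + log|dz/dx|
    z = forward(x)
    return -0.5 * (z * z + math.log(2 * math.pi)) - math.log(sigma)
```

The "no bumping" part is the invertibility: each sample's destination is a deterministic, invertible function of where it started, so density is conserved rather than collapsed.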
If you use PyTorch, I highly recommend checking out
@huggingface
's Accelerate: . It's as minimal as it is powerful: multi-GPU/TPU training, while still preserving your original training loop!
+ you can even run multi-device from a Jupyter notebook:
The more I use Julia, the more Python and its numeric libraries look like a Victorian-era stagecoach with jet engines duct-taped to it, each pointing in a different direction (i.e., mutually incompatible).
It's such a weird ecosystem, and makes it so much harder for users to contribute.
Just learned about Python Fire, and wish I had heard about it years ago. Seems like an amazing library for productivity!
Fire turns any Python object (function, class, etc.) into a command line interface:
Gone are the days of argparse.ArgumentParser and sys.argv...
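A toy, stdlib-only imitation of what Fire automates (real usage is just `import fire; fire.Fire(hello)`; `hello` and `mini_fire` here are hypothetical names for illustration):

```python
import inspect

# Toy version of the idea: expose a function's signature as a CLI
# without hand-writing argparse. Real Fire does far more than this.
def hello(name="World", count=1):
    return " ".join([f"Hello {name}!"] * int(count))

def mini_fire(func, argv):
    # Map positional CLI args (e.g. sys.argv[1:]) onto parameters.
    params = list(inspect.signature(func).parameters)
    return func(**dict(zip(params, argv)))

assert mini_fire(hello, ["Ada", "2"]) == "Hello Ada! Hello Ada!"
assert mini_fire(hello, []) == "Hello World!"
```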
Wow,
@sagemath
's LaTeX package is amazing.
It bridges the gap between symbolic math software and LaTeX presentation. Uses SymPy as the algebra backend and formats the output into the pdf.
Wish I knew about this in undergrad!
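A sketch of what a document using it looks like (assuming the sagetex package and the extra Sage pass between LaTeX runs):

```latex
% Sketch of sagetex usage: Sage computes, LaTeX typesets.
\documentclass{article}
\usepackage{sagetex}
\begin{document}
  The derivative is $\sage{diff(sin(x^2), x)}$.
\end{document}
```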
Excited to share Penzai, a JAX research toolkit from
@GoogleDeepMind
for building, editing, and visualizing neural networks! Penzai makes it easy to see model internals and lets you inject custom logic anywhere.
Check it out on GitHub:
Just-in-time compiled languages are not supposed to be this fast at startup... The speed of the upcoming Julia update is ridiculous.
(Julia 1.10-alpha vs. Python 3.11)
Our paper demonstrating the power of Bayesian Neural Networks for planetary dynamics comes out in PNAS today!
(open access)
This paper explores a match made in heaven: chaotic systems and Bayesian neural networks.
Thread:
Wow, JAX is amazing. Thanks for introducing me
@shoyer
. It's essentially numpy on steroids: parallel functions, GPU support, autodiff, JIT compilation, deep learning.
#NeurIPS2019
Happy to announce SymbolicRegression.jl, a Julia package for learning equations via evolution! It supports distributed computing, allows user-defined operators (even discontinuous!), and exports to SymbolicUtils.jl.
v0.4+ of PySR uses this as backend.
So 1) Lagrangian/Hamiltonian NNs enforce time symmetry, 2) Graph Nets enforce translational symmetry, and 3) Group-CNNs enforce rotational symmetry.
But are there any NNs that can enforce an arbitrary learned symmetry?
@wellingmax
@DaniloJRezende
@KyleCranmer
?
Very excited to present our new work: we adapt Bayesian neural networks to predict the dissolution of compact planetary systems, a variant of the three-body problem!
Blogpost/code:
Paper:
API:
Thread: 👇
The forced hash collision idea from InstantNGP () remains one of the most creative ideas I've ever seen in deep learning.
I tried to explain it to someone today and had no idea where to start... it's too unconventional (in a good way!). And it works well!
This paper distills neural networks onto FPGAs with symbolic regression, obtaining a 5 NANOSECOND inference time!!
Super cool application of PySR and awesome work by the lead authors 🙌
I packed up a full-text paper scraper, vector database, and LLM into a CLI to answer questions from only highly-cited peer-reviewed papers. Feels unreal to be able to instantly get answers by an LLM "reading" dozens of papers. 1/2
Here's a thread on lesser-known tools and packages that I could not live without, starting with Python.
(suggestions are very welcome!)
einops:
-
- Easily-interpretable reshapes + tiling + aggregations for numpy/torch/tf/etc
1/n
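For a taste of why the named-axis patterns read so well, here's a stdlib-only toy of one fixed pattern; the real library handles arbitrary patterns on numpy/torch/tf tensors:

```python
# Stdlib toy of the einops pattern "h w -> w h" (a transpose),
# written to show why naming the axes makes the intent obvious.
def rearrange_h_w_to_w_h(x):
    return [list(col) for col in zip(*x)]

assert rearrange_h_w_to_w_h([[1, 2, 3],
                             [4, 5, 6]]) == [[1, 4], [2, 5], [3, 6]]
```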
I regret not reading through the documentation of the LaTeX physics package earlier; so many more features than I realized. Many commands that I usually define by hand...
e.g., some macros for partial derivatives:
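For example (these are standard physics-package commands):

```latex
\usepackage{physics}
% ...
\pdv{f}{x}      % \partial f / \partial x
\pdv[2]{f}{x}   % second partial derivative
\pdv{f}{x}{y}   % mixed partial, \partial^2 f / \partial x \partial y
```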
I'm late, but weight averaging seems like a great trick for improving DL generalization (
@Pavel_Izmailov
et al).
Take a pretrained model, continue SGD around the minimum, and average the weights along the trajectory. Thanks
@andrewgwils
for the recommendation!
Found a big improvement in my tuned model at zero cost:
Okay, Pluto.jl is the best part of Julia I've seen so far. Absolutely game-changing.
It's Jupyter, but reactive: change a variable, and the entire notebook updates.
This means you can do things like use a slider to vary a parameter in some cell... and see all your plots change!
My lectures this week include 'Best practices' and I will be assigning
@karpathy
's neural net training blog for reading material :)
Really an *essential* read for every practitioner!
Very excited to start teaching my deep learning course at Cambridge this week, as part of our Data Intensive Science MPhil!
Teaching the first part from
@SimonPrinceAI
's "Understanding Deep Learning" book, which has quickly become one of my favorite textbooks in *any* field.
I'm really starting to like
@michael_nielsen
's strategy of reading papers. Write down a question about the background or results, find the answer, distill, repeat.
It feels like test-driven development. Write a test, make it work, refactor, repeat.
ChatGPT has almost completely replaced StackOverflow for me at this point.
Getting context-specific answers with detailed explanations that I can iterate on in a pair programming-like fashion is incredible.
The crazy part is this is only GPT-3.5...
In a neural network, is there a type of regularization which encourages one learned feature to be independent, **including nonlinearly,** of other features in the same layer?
I can’t use a bottleneck or sparsity constraint—I actually want to maximize the dimensionality!
Using paperqa, I fed GPT every paper in my Zotero library and asked: "What are some ways machine learning can be used in observational astronomy?"
It generated the entire literature review below. Not bad at all!
with
@andrewwhite01
's
Very excited to share our new paper "Discovering Symbolic Models from Deep Learning with Inductive Biases"!
We describe an approach to convert a deep model into an equivalent symbolic equation.
Blog/code:
Paper:
Thread👇
1/n
After two months of studying, I have just passed my comprehensive exam at Princeton 🎉 Officially a PhD candidate!
Excited to get back to doing research!
Made a functional SymPy->JAX converter equivalent :)
Works with grad, vmap, jit, etc. PySR/SymbolicRegression.jl can automatically convert discovered expressions to vectorized JAX models now; will add PyTorch soon...
Put together a micro-library for turning SymPy expressions into PyTorch Modules.
Symbols become inputs, and floats become trainable parameters. Train your SymPy expressions by gradient descent!
Are there any review articles which study the importance of open-source software for the sciences?
Relatedly, here's a great quote from Freeman Dyson, which I think also underlines the importance of free and open-source software.
Happy to share I will be doing a research internship at
@DeepMind
from July-November with
@PeterWBattaglia
and
@DaniloJRezende
. Excited to work on some new approaches to AI for Physics!
PyTorch Lightning’s greatest strength is that it implements a vast amount of deep learning tips and tricks which would typically take years to pick up.
e.g., previously I'd never heard of gradient clipping. I turned it on and my model's NaNs vanished!
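For anyone else who hadn't met it: gradient clipping just rescales the gradient when its norm exceeds a threshold. A framework-free sketch:

```python
import math

# Clip-by-norm: if ||g|| exceeds max_norm, rescale g onto the sphere
# of radius max_norm; exploding updates shrink, small ones pass through.
def clip_by_norm(grads, max_norm):
    norm = math.sqrt(sum(g * g for g in grads))
    if norm > max_norm:
        grads = [g * (max_norm / norm) for g in grads]
    return grads

clipped = clip_by_norm([3.0, 4.0], 1.0)              # norm 5 -> norm 1
assert all(abs(c - e) < 1e-12 for c, e in zip(clipped, [0.6, 0.8]))
assert clip_by_norm([0.3, 0.4], 1.0) == [0.3, 0.4]   # small grads untouched
```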
I have been playing around with
@PyTorchLightnin
and I am pleasantly surprised!
Very good level of abstraction if you want full control over the model & some production-level tools, e.g., many loggers and quick debug iterations.
Kudos to the team! Looking forward to 1.0. 💪
Our new paper on "The Bayesian Learning Rule" is now on arXiv, where we provide a common learning-principle behind a variety of learning algorithms (optimization, deep learning, and graphical models).
Guess what, the principle is Bayesian. A very long🧵
The more I use XLA and JAX, the more I see the true potential of its Python API: you can do all the crazy pure-Python metaprogramming you want, so long as the moving parts depend on static arguments, and the optimizer boils it down to the actual tensor operations. So nice!
Regarding Dalle and Imagen:
These systems are *amazing*. However,
I (selfishly) wish that all of that ML expertise and compute was focused on solving scientific problems, rather than generating panda art!
Yes, it advances the field, but why not solve science simultaneously?
Giving mock general exams today at Princeton Astro (oral), and reviewing my favorite tricks:
km/s ≈ pc/Myr
year ≈ 10^7.5 seconds
1" ≈ 5 μrad
R_earth ≈ R_jup/10 ≈ R_sun/100
G ≈ 40 AU^3/(Msun year^2)
m_e ≈ 0.5 MeV/c^2 ≈ m_p/2000
1200 nm => 1 eV
What are other good ones?
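Three of these check out numerically to within a few percent:

```python
import math

# km/s vs pc/Myr
pc_km = 3.0857e13                       # parsec in km
myr_s = 1e6 * 3.156e7                   # megayear in seconds
assert abs(pc_km / myr_s - 1.0) < 0.03  # agree to ~2%

# year vs 10^7.5 seconds
assert abs(math.log10(3.156e7) - 7.5) < 0.01

# 1 arcsec vs 5 microradians
arcsec = math.pi / (180 * 3600)
assert abs(arcsec / 5e-6 - 1.0) < 0.05  # agree to ~3%
```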
Interested in doing a PhD on AI for the physical sciences at Cambridge? I am taking PhD students for 2024!! Please find information below, including a list of projects:
(Deadline typically early December or January, depending on program)
Required reading for anybody using PINNs:
I think PINNs are an exciting idea but many use cases are perhaps better suited to learned NN prediction (for unresolved scales), or just standard numerical integrators (resolved scales).
(1/2)
My Simons Presidential Lecture is up on YouTube!
In this talk I make the argument that 'The Next Great Scientific Theory is Hiding Inside a Neural Network'.
Giving the Presidential Lecture tomorrow at
@SimonsFdn
@FlatironInst
:
"The Next Great Scientific Theory is Hiding Inside a Neural Network"
Will be in NYC until the 10th – please get in touch if you would like to chat!
Words cannot express the perfection of
@TuringLang
for probabilistic inference. It's somehow both intuitive and concise without sacrificing any expressiveness. (Also blazingly fast, of course)
Doing my first real project with it and having a blast.
ML-accelerated scientific discovery in action!
This new paper in ApJ Letters uses PySR to discover a new relation between supermassive black hole mass and properties of its host spiral galaxy:
Extremely cool work!!
1/10 This was a phenomenal discussion. I have many more questions than answers now but I think that's a good thing.
Here's a list of some interesting papers mentioned.
So 1) Lagrangian/Hamiltonian NNs enforce time symmetry, 2) Graph Nets enforce translational symmetry, and 3) Group-CNNs enforce rotational symmetry.
But are there any NNs that can enforce an arbitrary learned symmetry?
@wellingmax
@DaniloJRezende
@KyleCranmer
?
Essential Overleaf trick: you can have Overleaf run *arbitrary* code before each compilation of the PDF!
Write the following code into a file called .latexmkrc in your project, replacing "custom_command" with whatever (e.g., latexdiff).
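A minimal sketch of such a .latexmkrc (latexmk evaluates the file as Perl, so a system() call runs on every compile; "custom_command" is the placeholder from above):

```perl
# .latexmkrc -- evaluated as Perl by latexmk before each build
system("custom_command");   # replace with e.g. a latexdiff invocation
```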
Very excited to start teaching my deep learning course at Cambridge this week, as part of our Data Intensive Science MPhil!
Teaching the first part from
@SimonPrinceAI
's "Understanding Deep Learning" book, which has quickly become one of my favorite textbooks in *any* field.
Excited to attend JuliaCon for the first time this year!
Will be giving a talk on SymbolicRegression.jl, plus its uses in science.
This will be the first SR talk where I dive into low-level engineering details. Looking forward to learning from other attendees!
Are you a PhD student who is (1) interested in working on foundation models for science, and (2) experienced with deep learning software?
There is a 1-year internship at Flatiron Institute (NYC) to work on
@PolymathicAI
!
(deadline: Nov 30!)
I am entering the faculty job market for 2023! Very eager to find a position at the intersection of astro/physics and machine learning/data science.
If you happen to see something relevant, please forward to mcranmer
@princeton
.edu - thanks!
Deep learning research seems to suffer from periods of frenzied activity on niche topics.
I think social media worsens the collapse into targeted research problems because it makes FOMO so much stronger. But long-term it is terrible for creativity in the field...
(1/3)
1/2 Why isn't it more common to do explicit Hamiltonian MCMC on a Bayesian Neural Network's weights, with, e.g., the initial condition set to the loss minimum found via SGD? I'm playing around with one in JAX and it seems to be working reasonably even with 5 chains:
Wish I found this a while ago:
mamba is a much faster drop-in replacement for conda, with an identical set of commands, the same package servers, etc.
My 30-min environment build is now <1 min with zero changes to the yml file...
PyTorch-style deep learning in Julia!
As a long-term PyTorch user I am really happy to see this is possible in
@FluxML
.
The key advantage is that Julia *itself* is autodiff-ready, so you can compute gradients through a complex library without needing a rewrite in a DL framework.
ChatGPT is trained on ~500 GB of text.
~1 byte per character = 5e11 characters
~2000 characters per page = 2.5e8 pages
~0.1 mm thickness per page = 25,000 meters.
So ChatGPT is trained on a book that is 25 km high... (more than double the cruising altitude of commercial planes)
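The arithmetic, taking the round numbers at face value:

```python
# Checking the stack-of-pages estimate step by step
chars  = 500e9           # ~500 GB at ~1 byte per character
pages  = chars / 2000    # ~2000 characters per page
height = pages * 0.1e-3  # ~0.1 mm per page, in meters
assert pages == 2.5e8
assert abs(height - 25_000) < 1e-6   # 25 km
```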
It's amazing how Enzyme is this much faster than JAX for even simple operations!
(Am I doing something wrong, or is differentiating through optimized assembly code really that much faster??)
The idea behind Enzyme differentiation is so cool. It literally performs autodiff through optimized assembly code*, which gives faster derivatives!
Q: Would this let you differentiate in-place array operations?
*(LLVM IR, not machine code)
Okay, here is a function for doing this (modulo shading) in LaTeX, without external illustration tools:
This is what
$$\labmat{2}{3}{X} \cdot \exp(\labmat{3}{2}{Y})$$
looks like:
Thanks
@AgolEric
@rmpnegrinho
for pointers!
PySR paper is coming out tonight.
I'm wondering... should I do a science-themed announcement today, given that ML people are at ICLR (and then an ML-themed announcement next week)?
Free project idea that I'm too busy to try:
There are a bunch of different preprocessing transformations for ML that try to make non-Gaussian data look more Gaussian (e.g., Yeo-Johnson).
Could you learn a better one with symbolic regression?
1/n
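For context, the kind of fixed transform such a learned one would compete with, e.g. Yeo-Johnson (standard formula, stdlib only):

```python
import math

# Yeo-Johnson: a fixed, invertible family of maps (indexed by lambda)
# used to Gaussianize skewed data before downstream ML.
def yeo_johnson(x, lam):
    if x >= 0:
        return math.log1p(x) if lam == 0 else ((x + 1) ** lam - 1) / lam
    return -math.log1p(-x) if lam == 2 else -(((1 - x) ** (2 - lam) - 1) / (2 - lam))

assert yeo_johnson(3.0, 1.0) == 3.0     # lambda = 1 is the identity...
assert yeo_johnson(-3.0, 1.0) == -3.0   # ...on both sides of zero
```

Symbolic regression could in principle search over expressions like this directly, instead of over a one-parameter family.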
Happy to share our paper on AI for observational astronomy via our new resource allocation algorithm!
"Unsupervised Resource Allocation with Graph Neural Networks"
Blog/code:
Paper:
w/
@peter_melchior
@iamstarnord
Thread 👇
1/n
New PySR release!
The new Python↔Julia interface is massively improved thanks to PythonCall.jl.
Julia can now be used seamlessly as a general backend for writing fast Python libraries!
Excited to give the following talk today at
@YaleAstronomy
's data science seminar: I will play Devil's advocate against my own research area!
(Although in fairness, I will argue that the answer lies in symbolic learning/inductive biases, which I work on)
I feel obligated to retweet this after seeing the 1000th post on stable diffusion…
There are grand challenges of science which are ripe for solving with ML! Don’t get distracted by the latest trendy topic; basic science is a *far* more rewarding application than generative art.
Regarding Dalle and Imagen:
These systems are *amazing*. However,
I (selfishly) wish that all of that ML expertise and compute was focused on solving scientific problems, rather than generating panda art!
Yes, it advances the field, but why not solve science simultaneously?
PySR 0.6.0 released!
This brings efficient *multi-output* symbolic expression searches, as well as ability to export to JAX, PyTorch, and numpy.
JAX/PyTorch expressions have trainable parameters, so you can tune discovered expressions in some deep model!
This is really nice work. Though for research purposes, keep in mind that use of dimensional analysis in symbolic regression is often too strong a prior. It works well for re-discovery, because prior knowledge of the physical constants greatly shrinks the search space. But .../
After 1.5 years of hard work, I am thrilled to share with you Φ-SO, a Physical Symbolic Optimization package that uses deep reinforcement learning to discover physical laws from data. Here is Φ-SO discovering the analytical expression of a damped harmonic oscillator👇
[1/6]
Extremely cool economics paper applying PySR + GNNs to learn symbolic models for international trade!
By Sergiy Verstyuk and Michael R. Douglas (
@HarvardCMSA
)
Interpretable ML on steroids: just launched a 512-worker symbolic regression search with PySR/SymbolicRegression.jl.
It's amazing how stable the pipeline is from IPython=>PyJulia=>Julia=>ClusterManagers.jl. I've never had a single hiccup for one of these massive searches.
🔥Announcing Python Symbolic Regression v0.10!🔥
New features:
1. LaTeX tables (h/t
@SymPy
)
This makes it *really easy* to include discovered analytic models in a research paper. Example 👇
(bonus: which law is this? [P/days])
Someone asked if PySR can rediscover the Mandelbrot set's recursive definition from random elements of the set.
Turns out that you can totally do that!
See the code for examples in both Python and Julia.