New Paper Out.
Evolutionary Optimization of Model Merging Recipes
Our goal is not to train any one particular foundation model. Instead, we think it makes more sense to build the machinery that automatically generates foundation models for us!
Introducing Evolutionary Model Merge: A new approach bringing us closer to automating foundation model development. We use evolution to find great ways of combining open-source models, building new powerful foundation models with user-specified abilities!
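A minimal sketch of the idea, using a toy fitness function and hypothetical two-layer weight dicts in place of real open-source checkpoints (the layer names, target values, and simple (1+1) evolution loop below are illustrative assumptions, not the actual recipe):

```python
import numpy as np

# Toy stand-in weights for two source models with identical architectures.
# (Hypothetical values; a real recipe would merge actual checkpoints.)
model_a = {"layer1": np.array([1.0, 2.0]), "layer2": np.array([3.0])}
model_b = {"layer1": np.array([5.0, 0.0]), "layer2": np.array([-1.0])}

def merge(alphas):
    """Interpolate each layer: w = (1 - a) * w_a + a * w_b."""
    return {k: (1 - a) * model_a[k] + a * model_b[k]
            for k, a in zip(model_a, alphas)}

def fitness(weights):
    # Stand-in for a real benchmark score of the merged model.
    target = {"layer1": np.array([3.0, 1.0]), "layer2": np.array([1.0])}
    return -sum(np.sum((weights[k] - target[k]) ** 2) for k in weights)

# Simple (1+1) evolution over the per-layer mixing coefficients.
rng = np.random.default_rng(0)
best = rng.uniform(0.0, 1.0, size=len(model_a))
for _ in range(200):
    child = np.clip(best + rng.normal(0.0, 0.1, size=best.shape), 0.0, 1.0)
    if fitness(merge(child)) > fitness(merge(best)):
        best = child
```

Real merging recipes search over much richer spaces (per-layer coefficients across many models, even the data flow between layers), but the loop structure is the same: propose a recipe, evaluate the merged model, keep the better one.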
Pushing around these little robot soccer players, from DeepMind’s “Learning Agile Soccer Skills for a Bipedal Robot with Deep Reinforcement Learning” paper.
A #StableDiffusion model trained on images of Japanese Kanji characters came up with “Fake Kanji” for novel concepts like Skyscraper, Pikachu, Elon Musk, Deep Learning, YouTube, Gundam, Singularity, etc.
They kind of make sense. Not bad!
Interesting physics analogy of ML from the viewpoint of compression (@elonmusk)
“Physics formulas are compression algorithms for reality…If you ran physics simulation of the universe, eventually you will have sentience…At what point from hydrogen to us did it become sentient?”
A fun way to learn about neural networks and AI is to implement a simulation game giving your agents little neural net brains, and training them using a simple method like evolution.
This demo trains a small neural network to drive around the track after only a few generations:
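A toy version of this setup can be sketched in a few lines: a four-parameter “brain” evolved on a stand-in steering task (the sensor observations, steering targets, and hill-climbing loop below are illustrative assumptions, not the demo’s actual code):

```python
import numpy as np

# Toy stand-in for track sensors: each row is (left sensor, right
# sensor, speed); TARGET is a hypothetical ideal steering output.
OBS = np.array([[1.0, 0.0, 0.5],
                [0.0, 1.0, 0.5],
                [0.3, 0.9, 1.0],
                [0.9, 0.2, 1.0]])
TARGET = np.array([-1.0, 1.0, 1.0, -1.0])

def policy(params, obs):
    """Tiny one-neuron 'brain': steering = tanh(obs @ W + b)."""
    W, b = params[:3], params[3]
    return np.tanh(obs @ W + b)

def fitness(params):
    return -np.mean((policy(params, OBS) - TARGET) ** 2)

# Simple generational evolution: mutate the current best agent,
# keep any child that steers better.
rng = np.random.default_rng(0)
best = rng.normal(0.0, 0.1, size=4)
for generation in range(100):
    pop = best + rng.normal(0.0, 0.2, size=(32, 4))
    scores = np.array([fitness(p) for p in pop])
    if scores.max() > fitness(best):
        best = pop[scores.argmax()]
```

No gradients, no backprop: just mutate, evaluate, and keep the best, which is why evolution is such an approachable first training method for a simulation game.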
One of the most well-known pieces of software for downloading YouTube videos, “youtube-dl”, was removed from GitHub following a takedown notice from the Recording Industry Association of America (RIAA).
Someone encoded the source code into two images and put it on Twitter:
Personal Announcement! I’m launching @SakanaAILabs together with my friend, Llion Jones (@YesThisIsLion). Sakana AI is a new R&D-focused company based in Tokyo, Japan.
We’re on a quest to create a new kind of foundation model based on nature-inspired intelligence!
Some personal news: After six years at Google, I decided it was time for me to leave and try something new again.
I had a fantastic time at Google Brain, and I’ll miss my friends, collaborators, and hanging out at the microkitchens!
Artificial lifeforms are super fascinating to watch.
These self-organizing, self-replicating “lifeforms” emerged from a continuous-time cellular automaton system called Flow-Lenia.
Lenia is a family of CAs generalizing Conway’s Game of Life to continuous space, time, and states.
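A minimal Lenia-style update can be sketched in a few lines (the grid size, ring kernel, and growth-function parameters below are illustrative choices, not Flow-Lenia’s actual settings):

```python
import numpy as np

# Minimal Lenia-style step on a 64x64 grid (a sketch only).
N = 64
rng = np.random.default_rng(0)
A = rng.uniform(0.0, 1.0, (N, N))  # continuous cell states in [0, 1]

# Ring-shaped neighborhood kernel, normalized to sum to 1.
y, x = np.mgrid[-N // 2:N // 2, -N // 2:N // 2]
r = np.hypot(x, y) / 12.0  # distances scaled by a kernel radius of 12
K = np.exp(-((r - 0.5) ** 2) / 0.02) * (r < 1.0)
K /= K.sum()

def step(A, dt=0.1, mu=0.15, sigma=0.015):
    """One update: convolve the states with K (via FFT), apply a
    bell-shaped growth function, and take a small time step."""
    U = np.real(np.fft.ifft2(np.fft.fft2(A) * np.fft.fft2(np.fft.ifftshift(K))))
    G = 2.0 * np.exp(-((U - mu) ** 2) / (2.0 * sigma ** 2)) - 1.0
    return np.clip(A + dt * G, 0.0, 1.0)

A = step(A)  # iterate this to watch patterns self-organize
```

Game of Life is the discrete corner case: binary states, a 3x3 kernel, and a step function for growth. Making all three continuous is what gives Lenia its smooth, creature-like dynamics.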
New blog post: Collective Intelligence for Deep Learning
Recently, @yujin_tang and I published a paper about how ideas like swarm behavior, self-organization, and emergence are gaining traction in deep learning.
I wrote a blog post summarizing the key ideas:
Unlike other weeding technologies, this #robot utilizes high-power lasers to eradicate weeds without disturbing the soil... and it avoids the use of herbicides!
It leverages #AI to instantly identify and target weeds while rolling, day and night.
By Carbon Robotics #green
Weight Agnostic Neural Networks 🦎
Inspired by precocial species in biology, we set out to search for neural net architectures that can already (sort of) perform various tasks even when they use random weight values.
Article:
PDF:
MIT offers an excellent course on Deep Learning for Art, Aesthetics, and Creativity. All of the lecture videos are available on YouTube, with a fantastic list of speakers:
How do you skim a research paper?
I usually read (in order):
1) abstract
2) 1st paragraph of the intro
3) last paragraph of intro (for contributions)
4) 1st paragraph of the conclusion (it's usually one paragraph anyways)
5) figures / tables of results, and read their captions.
If Google doesn’t get their act together and start shipping, they will go down in history as the company that nurtured and trained an entire generation of machine learning researchers and engineers who went on to deploy the technology at other companies… the modern-day Bell Labs.
Neural network video streaming SDK from @NVIDIAAI can compress video conference data like these at ~0.1KB / frame, roughly 1000x better than H.264 (MPEG-4) compression on the same data (~100KB / frame).
This amazing book on the foundations of machine learning is now available for free from Microsoft as a PDF download. I learned so much from this book over the years, and I feel that much of the material is still relevant. The solutions to the exercises also seem to be available!
"Pattern Recognition and Machine Learning" by @ChrisBishopMSFT is now available as a free download. Download your copy today for an introduction to the fields of pattern recognition & machine learning: #ML #Insights
Papers with Code: A searchable site that links machine learning papers on ArXiv with code on GitHub. They also tag any framework libraries used, along with other info like GitHub stars. I think such a feature would be a nice addition to ArXiv-Sanity.
Fooling Facial Detection with Fashion
Nice article surveying common face detection methods and testing practical implementations of adversarial patches on a face mask for fooling them. h/t @MelMitchell1
High resolution inpainting experiment with #StableDiffusion2
Transporting the famous Futaba Sushi restaurant in Ginza, Tokyo, to other cities, countries, planets, and finally, to a galaxy far far away…
#StableDiffusion
#AIart
OpenAI, Google & Anthropic ban the use of the generated output content from their AI models to train other AI models, under their terms-of-service. However, they’ve been using other online content for their own model training. They can’t have it both ways.
Reinforcement Learning for Improving Agent Design: What happens when we let an agent learn a better body design together with learning its task?
article:
pdf:
LIMA, a 65B LLaMA fine-tuned only with supervised learning on 1,000 curated examples, without any RLHF, demonstrates remarkably strong performance and generalizes well to unseen tasks not in the training data. Comparable to GPT-4, Bard, and DaVinci003 in human studies.
I’m super excited to see ideas from complex systems such as swarm intelligence, self-organization, and emergent behavior gain traction again in AI research. We wrote a survey of recent developments that combine ideas from deep learning and complex systems:
Jupyter notebooks with Python code for reproducing examples from each chapter of Christopher Bishop's “Pattern Recognition and Machine Learning” textbook (also available for free in the link above).
The map of the brain, created by an aerospace engineer.
These are the result of six years of research. It’s always interesting to me to view the perspective of one challenging scientific field through the lens of an expert from another field. 🧠
Source:
The self-attention mechanism can be viewed as the update rule of a Hopfield network with continuous states.
Deep learning models can take advantage of Hopfield networks as a powerful concept comprising pooling, memory, and attention.
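The correspondence can be sketched numerically: one continuous-Hopfield update of a state against stored patterns is softmax attention with keys and values tied to those patterns (the patterns and beta below are arbitrary toy values):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Stored patterns, one per column (arbitrary toy values).
X = np.array([[1.0, -1.0, 0.5],
              [0.5,  1.0, -1.0]])
beta = 4.0  # inverse temperature: higher beta means sharper retrieval

def hopfield_update(xi):
    """One continuous-Hopfield step: softmax attention where the query
    is the current state and keys = values = the stored patterns."""
    return X @ softmax(beta * (X.T @ xi))

# A noisy version of the first pattern is retrieved in a few updates.
xi = np.array([0.8, 0.3])
for _ in range(3):
    xi = hopfield_update(xi)
```

Give queries, keys, and values their own learned projections instead of tying them to the stored patterns, and this update recovers the transformer's scaled-dot-product attention formula.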
Dive into Deep Learning: An interactive deep learning book with code, math, and discussions, based on the NumPy interface. I really like the format of the textbook!
TinyML and Efficient Deep Learning Computing
MIT 6.5940 ()
“This course will introduce efficient AI computing techniques that enable powerful deep learning applications on resource-constrained devices. Topics include model compression, pruning,
Conventional thinking: Build a robot to solve the problem.
Out-of-the-box thinking: Get the problem to solve itself.
Example: Self-solving Rubik's Cube by @takashikaburagi
Japan recently reaffirmed that it will not enforce copyrights on data used in AI training.
The policy allows AI to use any data “regardless of whether it is for non-profit or commercial purposes, whether it is an act other than reproduction, or whether it is content obtained
Falcon-40B, a state-of-the-art large language model trained on a massive English web dataset called RefinedWeb, has now been released by @TIIuae under a truly permissive open-source license (Apache 2.0)‼️
This is a big step forward for Open-Source AI. 🎊
Recently I’ve been tweeting more about the #HongKongProtests and less about my views on machine learning and artificial intelligence.
I apologise for the inconvenience ...
Really striking how every AI hype tweet is explicitly trying to induce FOMO. "You're getting left behind. All your competitors are using this. Everyone else is making more money than you. Everyone else is more productive. If you're not using the latest XYZ you're missing out."
In other news, someone crashed the Chanel show on the final day of Paris Fashion Week. She blended in so well the security had a difficult time finding her.