Introducing Stable Audio 2.0 – a new model capable of producing high-quality, full tracks with coherent musical structure up to three minutes long at 44.1 kHz stereo from a single prompt.
Explore the model and start creating for free at:
Read the
Audio diffusion is powerful. Progress is fast. STOKED to hear what the music community will do. Beware! You will be handed the easy route. Don't be stifled. STEP UP and (1) invent new subgenres & build communities around them (2) break the code, make it do things no one imagined
Our most powerful 3-minute music-generating model used 32GB of VRAM. Just did some optimizations to fit it under 8GB. Insane. It will run even if your GPU is mid.
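Most of a saving like that comes from lower-precision weights and smaller working buffers. A back-of-the-envelope sketch of the weight-memory part (the parameter count below is hypothetical, not the actual model's):

```python
def weight_memory_gb(n_params, bytes_per_param):
    """Memory needed just to hold the weights, in GiB."""
    return n_params * bytes_per_param / 1024**3

n = 1_000_000_000  # hypothetical parameter count for illustration
fp32 = weight_memory_gb(n, 4)  # 4 bytes per float32 weight
fp16 = weight_memory_gb(n, 2)  # 2 bytes per float16 weight
print(f"fp32: {fp32:.2f} GiB, fp16: {fp16:.2f} GiB")
```

Halving the dtype halves the weight footprint; the rest of the budget (activations, attention buffers) is where the other optimizations go.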
World is about to witness cambrian explosion of new music styles (b/c of neural synthesis) and I couldn't be more stoked. Secondary driver: synthesis finally sounds organic (e.g. full bands & instruments in any genre). Primary driver: it can do weird combinations & stuff beyond
Today Zack Z officially joins Harmonai! 🥳🎉Both Dadabots are in. We're unstoppable. Get ready for the awesome sauce, get your . .get your snorkels ready, or scuba gear, cuz. . we swim w/ the best Zacks in town and the best Zacks all around! 🌊🌊🌊🤿
Just finished assembling
#DadaGP
v1.0 --- a tokenized symbolic music dataset of 26,181 GuitarPro songs, totaling 115M tokens (about as big as WikiText-103). Includes a GuitarPro5 encoder/decoder. Who wants to train a generator?
#nlp
#mir
#languagemodel
#transformer
@huggingface
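The core idea of a dataset like this is flattening symbolic note events into a token stream a language model can learn. A toy sketch of that encoding step (the token names here are made up for illustration, not DadaGP's actual vocabulary):

```python
# Toy event-to-token encoder for symbolic guitar music.
# Token strings are hypothetical, not the real DadaGP format.

def encode_note(string, fret, duration):
    """Flatten one guitar note event into a short token sequence."""
    return [f"note:s{string}", f"note:f{fret}", f"wait:{duration}"]

def encode_track(events):
    """Wrap a list of (string, fret, duration) events in start/end tokens."""
    tokens = ["start"]
    for string, fret, duration in events:
        tokens.extend(encode_note(string, fret, duration))
    tokens.append("end")
    return tokens

riff = [(6, 0, 480), (6, 3, 240), (5, 2, 240)]
print(encode_track(riff))
```

A decoder just inverts the mapping, which is what lets generated token streams round-trip back into playable GuitarPro files.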
Announcing Stable Audio 2.0 paper!
- DiT beats U-net
- Autoencoder compression ftw
- Achieved 4:45 long window (better at song structure than LMs)
- It's fast
We've released our paper on the model behind Stable Audio 2.0!
Our model can generate high-fidelity music with lengths up to 4 minutes 45 seconds.
Paper:
Demos:
SoundCloud:
Just realized how much I resonate with this Miles Davis quote, in terms of using ML. From bands, to electronic music, to algorithms, to AI, to me it's obvious this is the trajectory of sounding more like myself. Poignant because it seems ridiculous to some. Anyone else feel this way?
✨ Introducing Maroofy
Search for any song & it'll use the song's audio to find similar-sounding music.
🧠 Powered by an AI model trained on 120M+ songs, for 🔥 recommendations.
@_buildspace
@fdotinc
@FarzaTV
🔊 Demo + link below!
Meta releases a music generator. It's a similar class of model to MusicLM: transformer makes a sequence of VQVAE tokens. It's 32kHz mono.
Samples:
Code:
Paper:
Code is MIT-licensed. The model weights are CC BY-NC.
Sad to announce, due to economic and US regulatory conditions, Dadabots will be shutting down. Our runway dried up Q4 and we’ve been running on fumes. After a long discussion with the board and investors, we decided it was best to sunset the company. 😭😭
It's official. We're PhD advisors at Queen Mary University! Stoked to make fricking awesome AI music projects with
@umpedronosapato
at
@c4dm
@CDT_AI_Music
for
@ismir2021
& beyond -- Check out the program here; they have fully funded PhD programs >>>
while you guys were in Marfa or Lost Lands I was here in this corner at an empty rave in an abandoned hotel, listening to harsh noise, deep in the liminal spaces
JK zack and I are a band. We’re not a company. We make music out of compulsion. It’s the only thing that makes us happy. We’ll play until we die. And we’ll find a way after we die too 💀
Grimes: "The most important issue with AI music isn't who gets paid (which is what most music producers have been vocally concerned about) ... but the atrophy of human learning"
Our latest paper just got accepted via peer review. It details a new scientific procedure combining neurofeedback, RLHF, raw audio neural synthesis, and Rick Rubin's brain. Here's a clip from the preprint on arXiv
Two years ago
@EMostaque
rented a huge GPU cluster & pulled the DIY ML community onto it, boosting the efforts of self-motivated, resource-constrained creative coders like us working on open source. Legend.
🎉🥳 Woooooooooo 🎉🥳
We made it into Time magazine's Best Inventions of 2023 🙏 There were 14 AI inventions, 3 of which were music, along w/
@AudioshakeAI
(congrats Jessica!) and So-VITS-SVC (singing voice deep fakes)
One of the most annoying parts of popular press AI narratives is the belief that corporations are leading the culture
. . yuck! 🤢
have you not paid attention to how the culture evolves out from open source communities, github repos, hackathons, artist communities, hackerspaces?
🧵
I've learned many instruments, but voice is the most immediately expressive. In this sick video, Ummet Ozcan uses Musicfy's couple dozen (RVC?) instrument models.
Lately I've been beatboxing into our prompt models that generate full bands/subgenres. Can't wait to share more!!
voice to music
we just launched a feature that allows you to sing and turn your notes into any instrument you want
pretty cool to see how AI is giving humans the ability to do things they never could before
Free to remix, Public Domain, 12 hours of ai FUNK
For every music producer that got started sampling funk records, but couldn't afford licensing fees, and had to change their style out of fear of getting sued
#PublicEnemy
@keyonchrist
THIS IS AWESOME. Nendo is like the Echonest Remix API, but updated to modern neural nets. It's what it would've become if it weren't shut down in the Spotify acquisition. I highly recommend playing w/ the Nendo colab and digging into automatic-remix plugin design. BRING ON THE BOTS.
Introducing Remix - a tool made with Nendo that generates remixes of any song in any style. Upload a song or YT video & have fun! (For research purposes only. Usage is at your own risk)
Colab:
Repo:
Examples in this thread
I chatted w/
@Grimezsz
in the kitchen at the AGI House hackathon. She has thought more deeply about AI/art/music than almost anyone I've talked to.
The thoughts of grimes:
👇
oh fuck i just invented a new kind of audio synthesis.
it's a program that generates music, but for a unique kind of microprocessor
and the music that comes out is unlike most music (yet), catchy in its own way.
made some beautiful pieces already! stay tuned
My Artist Brain is now available for download. 🧠
Combobulator by DataMind Audio is a real-time neural audio synthesis plugin. Input any audio signal, and the AI, ethically sourced and trained on artist-created datasets, will recreate the timbre in real-time.
DON'T prompt for what you want. A secret to getting a unique sound in a subgenre (one that doesn't sound like others in the subgenre) is to NOT prompt for that subgenre, but instead triangulate it from the flanks: mix the opposing influences it can be derived from.
Special Week One Announcement ❤️🔥
Bonus Drop from audio visionary
@keyonchrist
The first sounds made with quantum machine learning are Shōyu's first audio-based NFTs.
Listen to history, next week 🎧🎚
Falling asleep to a neverending stream of new music generated from my own GPU — in any weird fusion genre i think of — 🥲 I found my happy place, I can drop out of society, I don’t need the internet anymore
I used to think pop music was too boring, simple, uninteresting. Until I tried producing it myself at the same level of quality. The uninitiated ear isn't yet tuned to appreciate the audio engineering. Big Dunning-Kruger effect. (In that way it's just like jazz & classical :P)
The judges voted us
#1
, we won the Jury Vote for AI Song Contest 2023! 🥳 But congrats to "How would you touch me?" for surpassing us on points w/ the televote🎉
Proud of everyone who worked hard to make awesome music. Fun hanging w/ the teams, hope to see you in meatspace!!
📢 New model! 📢
As a festive treat, we’re giving pro users access to the beta version of our improved model. 🎄🎵
Generate your best music tracks yet from short and long descriptive prompts.
Currently outputs are 45 seconds. Much longer soon...
🧵
The advances in audio models are crazy, a real quantum leap (also quantum music models exist..).
I think 2022 was year of image, 2023 text, 2024 3D & audio, 2025 all of them come together.
Since 2019, it's possibly the longest continuous livestream on YouTube. I just ssh'ed into the server to render some WAVs, accidentally rendered an INFINITE amount, filled the disk to 100%, almost f^&*ed up the whole experiment 💀 Saved it tho. All good!
audio ml scene these days
> we won’t release the model checkpoints because users could clone highly realistic voices
> here’s an API that does the same, with no checks involved, for less than a dollar
ffs.
The Creative Adversarial Network is kinda like the opposite of a conditional GAN -- it tries to generate realistic examples that don't fall into known classes/styles/genres -- here Nao Tokui applies it to beats -- and they sound crazy
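The CAN trick is an extra style-ambiguity term in the generator's loss: it is rewarded when the style classifier can't place the output in any known genre, i.e. when the predicted style distribution is near uniform. A minimal sketch of that term (class count and probabilities are made up for illustration):

```python
import math

def style_ambiguity_loss(probs):
    """Cross-entropy between a uniform target and the classifier's
    predicted style distribution; minimized when probs is uniform."""
    k = len(probs)
    return -sum((1.0 / k) * math.log(p) for p in probs)

confident = [0.90, 0.05, 0.05]  # classifier is sure of one known style
ambiguous = [1/3, 1/3, 1/3]     # classifier can't place the style
print(style_ambiguity_loss(confident) > style_ambiguity_loss(ambiguous))  # True
```

Pair that with the usual realism loss and the generator is pushed toward outputs that are convincing but genre-less, which is roughly why the beats sound so crazy.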
Most ppl listen to generative music outputs through lens of expectation w/ what they already know -- but the new breed of artists are hearing it for what it is and running with it. Understanding the feeling, matching visuals, context, story. Artists who are based I salute you🫡
Writing neural network code using ChatGPT, asking it to explain the code, asking it to add features, and having the code it writes actually work -- giving me a "this is spooky good" moment
Made a script for automating
@udiomusic
from python.
- download the py file here
- paste in your login cookie (instructions in the file)
- in a terminal window:
python3
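The cookie-paste step is the whole trick: the script just replays your logged-in browser session. A sketch of that pattern (the URL below is a placeholder, not udio's real API, and the cookie name is hypothetical):

```python
# Sketch of cookie-based session auth for an automation script.
# The endpoint URL and cookie string are placeholders, not udio's real API.
import urllib.request

def build_request(url, session_cookie):
    """Attach a pasted-in login cookie so requests act as your session."""
    req = urllib.request.Request(url)
    req.add_header("Cookie", session_cookie)
    req.add_header("User-Agent", "my-automation-script/0.1")
    return req

req = build_request("https://example.com/api/generate", "session=PASTE_COOKIE_HERE")
print(req.get_header("Cookie"))
```

Anything more (the actual endpoints, payloads, polling) lives in the py file itself.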
I ruptured my ACL in an Archspire moshpit. Just received an allograft to replace it with donor tissue from a dead body. I am now part cadaver. Also part metal. Thank you 🙏
Dedicated to anyone afraid of generative AI, dreading a degenerate dystopia of fast food AI art, hating on musicians using machine learning in their production, or anxious about all-powerful machines surpassing the human capacity for imagination
"fine-tuning MusicGen without prompts to generate music with a specific style .... thanks to Dadabots for the inspiration." thanks for the cool repo Jonas Massa!
genai +apps+ are only temporary. the inevitable equilibrium of AI is open source. you will run them on your own computer or phone. you will not even need internet.
you will run open LLMs not chatgpt.
you will tune bespoke ai music models and listen to their infinite streams.
ChatGPT is awesome. From matlab to numpy to pytorch, impenetrably written DSP functions like this are everywhere. What does it do and why? I have no f^%$ing idea. Paste it into ChatGPT and ask it to rewrite it clearly with comments. It can explain the PURPOSE of each line. Voila
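The kind of rewrite I mean, on a toy example of my own (not the function in the screenshot): a cryptic one-liner versus the same logic with its purpose stated.

```python
# Before: what does this do?
def f(x, n):
    return [sum(x[max(0, i - n + 1):i + 1]) / min(i + 1, n) for i in range(len(x))]

# After: same behavior, but readable.
def moving_average(x, n):
    """Causal moving average: each output sample is the mean of the
    last n input samples (fewer at the start, before n samples exist)."""
    out = []
    for i in range(len(x)):
        window = x[max(0, i - n + 1):i + 1]  # up to n most recent samples
        out.append(sum(window) / len(window))
    return out

print(moving_average([1, 2, 3, 4], 2))  # [1.0, 1.5, 2.5, 3.5]
```

Same output, but now the next person (including future you) knows it's a smoothing filter.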
We just used neural nets to fuse the production of deathcore with the rhythm of funk -- and more -- results are fucking amazing -- over a dozen albums coming soon
I'm surprised the Stable Diffusion / ChatGPT / Generative AI projects haven't come hard for (at least instrumental) music, yet. If they can cop Caravaggio, surely they can remix chiptunes.
I like AI music tracks that include a healthy amount of "AI weirdness". Because SURE it's a benchmark to create something indistinguishable from real, and it's fun to revel in the mischief, but it's beautiful to take music somewhere it hasn't gone yet. Keep ai music weird.