Chris Donahue
@chrisdonahuey
Followers: 5K · Following: 6K · Media: 100 · Statuses: 1K
GenAI for *human* creativity in music + more. Assistant prof at CMU CSD, G-CLef lab. Part time Google DeepMind, Magenta (views my own)
Pittsburgh, PA
Joined January 2012
Excited to share SingSong, a system which can generate instrumental accompaniments to pair with input vocals! https://t.co/1mRUaXvqVy https://t.co/8RGezPu5YQ Work co-led by myself, @antoine_caillon, and @ada_rob as part of @GoogleMagenta and the broader MusicLM project 🧵
See https://t.co/Xj92707LZU to 🗳️ vote on your favorite, or to download a comprehensive data release of the preferences we've collected so far. Thanks @yonghyunk1m @SonyAI_global @arena!
🎵 Music Arena was accepted to the NeurIPS 2025 Creativity Track, and we've released a big update to celebrate! Includes new models from @SonautoAI and @elevenlabsio. Also, Music Arena is now available as a 🤗 @huggingface space and dataset!
For more information, we've prepared a 📰 blog post with all of these findings and more: https://t.co/k8SN2pShW0 Music Arena: https://t.co/1sTVjMfH09 Paper: https://t.co/HqTg5PHxZm Data:
arxiv.org
We present Music Arena, an open platform for scalable human preference evaluation of text-to-music (TTM) models. Soliciting human preferences via listening studies is the gold standard for...
Thanks to blog post co-authors (@yonghyunk1m, Nathan Pruyne), all our collaborators (@iamwaynechi @ml_angelopoulos @infwinston @koichi__saito @shinjiw_at_cmu @mittu1204), and others (@sonyai_global, @lmarena_ai, @riffusionai_)!
We will be continuously updating Music Arena with new systems! Please reach out if you are interested in evaluating your music generation model on our platform
We collect natural language feedback in addition to binary prefs. Users tend to comment on both generation quality and prompt adherence. Sentiment analysis on this feedback is correlated with win rates, though it also reveals new system-specific strengths and weaknesses!
Most of our users write their own prompts, as opposed to using one of our built-in suggestions. User prompts emphasize genres, instruments, and modes, and most are very short (median length 7).
We also observe a weak (ρ=0.082) but significant (p=0.012) correlation between the amount of time a user spends listening to a pair of outputs and the overall "difficulty" of the comparison (codified by negative absolute difference in Arena score).
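To make the "difficulty" definition concrete, here is a minimal sketch of how such a correlation could be computed. It assumes a Spearman rank correlation (the post does not say which coefficient was used), and the battle records, score values, and function names are hypothetical illustrations, not the Music Arena data or code. The significance test (p-value) is omitted.

```python
from math import sqrt


def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def difficulty(score_a, score_b):
    """Negative absolute score gap: closer scores give a higher value."""
    return -abs(score_a - score_b)


# Hypothetical battles: (Arena score of A, Arena score of B, seconds listened).
battles = [
    (1100, 1090, 95.0),
    (1200, 1000, 40.0),
    (1050, 1020, 70.0),
    (1150, 900, 35.0),
]
rho = spearman(
    [difficulty(a, b) for a, b, _ in battles],
    [seconds for _, _, seconds in battles],
)
```

In this toy data, closer matchups always get more listening time, so the two rankings agree perfectly; real data would be far noisier, as the reported ρ=0.082 indicates.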
We collect nuanced listening behavior on Music Arena, revealing new insights. E.g., listening behavior differs dramatically between the 1st and 2nd tracks a user observes. Users listen to the 1st at length, then decide their preference after only the first few seconds of the 2nd.
We aim for *comprehensive* transparency of the Music Arena platform. To this end, this first update comes paired with a comprehensive data release, to be updated on a rolling basis. https://t.co/00HvKj3zPx ⭐️ https://t.co/9f3mvsfco2
github.com
Contribute to gclef-cmu/music-arena development by creating an account on GitHub.
Music Arena went into public beta on July 28. In the first ~month of use, we collected 1051 votes on 1420 battles. We compile two leaderboards for instrumental-only (2/3 of votes, first tweet) and w/ vocals (1/3 of votes, below)
Sharing our initial leaderboard and open data release for 🎶 Music Arena! Music is subjective and multi-dimensional. A key goal of Music Arena is to provide insights beyond binary preferences! 🧵
Thanks to our team Yichen (Will) Huang, @zacknovack @Koichi__Saito @jiatongshi @_shinjiwatanabe @mittu1204 @jwthickstun and @SonyAI_global for the support! So don't be sad about your music gen evals, get MAD! https://t.co/2U2Oz4sHpS
github.com
Contribute to i-need-sleep/mad development by creating an account on GitHub.
To facilitate robust + reliable music gen eval, we release MAD as a drop-in replacement (w/MIT license!) for metrics like FAD/MMD, and MusicPrefs on HF for further eval research and preference modeling! https://t.co/2U2Oz4sHpS
https://t.co/p9GJpLZbOl
We find that MAD has much stronger rank correlation (0.62) with human preferences than FAD (0.14)!
Second, we measure correlation between automatic metrics and MusicPrefs: a dataset of pairwise human prefs for text-to-music that we collect via MTurk. https://t.co/p9GJpLZJDT
We find that FAD lacks sensitivity to important desiderata like musicality. We propose an alternative based on our meta evaluation findings. FAD: *VGGish* embeddings + *Fréchet Distance* divergence. MAD: *MERT* embeddings + *MAUVE* divergence.
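For readers unfamiliar with the "embeddings + divergence" recipe, the sketch below illustrates the divergence half of FAD: the squared Fréchet distance between two Gaussians fitted to reference and generated embeddings, ||μ_r − μ_g||² + Tr(Σ_r + Σ_g − 2(Σ_r Σ_g)^(1/2)). As a loud simplification it assumes diagonal covariances, which reduces the trace term to a per-dimension sum; FAD as commonly implemented uses full covariance matrices. The toy vectors stand in for real embeddings, the function names are hypothetical, and MAUVE (the divergence used by MAD) is not shown.

```python
from math import sqrt


def mean_and_var(embeddings):
    """Per-dimension mean and unbiased variance of a list of vectors."""
    n = len(embeddings)
    dim = len(embeddings[0])
    mean = [sum(e[d] for e in embeddings) / n for d in range(dim)]
    var = [
        sum((e[d] - mean[d]) ** 2 for e in embeddings) / (n - 1)
        for d in range(dim)
    ]
    return mean, var


def frechet_distance_diag(reference, generated):
    """Squared Frechet distance between Gaussians fitted to two embedding
    sets, ASSUMING diagonal covariances (a simplification of FAD)."""
    mu_r, var_r = mean_and_var(reference)
    mu_g, var_g = mean_and_var(generated)
    # Squared distance between the two means.
    mean_term = sum((a - b) ** 2 for a, b in zip(mu_r, mu_g))
    # Diagonal case of Tr(S_r + S_g - 2 (S_r S_g)^(1/2)).
    cov_term = sum(
        vr + vg - 2 * sqrt(vr * vg) for vr, vg in zip(var_r, var_g)
    )
    return mean_term + cov_term


# Toy 2-D "embeddings"; real ones would come from a model such as VGGish.
reference = [[0.0, 0.0], [2.0, 2.0]]
generated = [[1.0, 0.0], [3.0, 2.0]]
distance = frechet_distance_diag(reference, generated)
```

Because this divergence only compares means and covariances of the embedding distributions, its sensitivity depends heavily on which embedding model is plugged in, which is the design space the posts describe searching over.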
We meta evaluate reference-based music evaluation metrics in two stages. First, we systematically search over a design space of embeddings and divergences (inclusive of FAD), using synthetic datasets capturing sensitivity to specific desiderata (e.g., musicality, diversity).
The capabilities of music generation models continue to improve, but progress on evaluation lags behind. Popular automatic metrics such as FAD were developed for different tasks (music enhancement) and their relevance to music gen is unclear. So how can we do better?
Eval for music generation is notoriously ill-defined, but no fear! Presenting MAD, a new metric for music quality with stronger alignment to human preferences. Appearing at ISMIR this week! ⭐: https://t.co/2U2Oz4sHpS https://t.co/GxMDGmbSt1 https://t.co/gpw7OrEfz0 🧵