Zachary Novack

@zacknovack

Followers
649
Following
3K
Media
40
Statuses
282

efficient + controllable music gen | phd @ucsd_cse | research intern @sonyai_global | prev @adoberesearch @stabilityai @mldcmu | teaching drums @pulsepercussion

Joined June 2022
@zacknovack
Zachary Novack
2 months
Releasing Stable Audio Open Small! 75ms GPU latency! 7s *mobile* CPU latency! How? w/Adversarial Relativistic Contrastive (ARC) Post-Training! 📘: 🥁: 🤗: Here’s how we made the fastest TTA out there🧵
2
15
84
@zacknovack
Zachary Novack
6 days
RT @xunhuang1995: We should have called it "scaling up rollout", not RL. RL is a necessary evil for the discrete nature of language. My int…
0
16
0
@zacknovack
Zachary Novack
7 days
RT @thepatch_kev: made a @huggingface space for custom sample generation using stable-audio-open-small. already had an api in my backend,…
0
3
0
@zacknovack
Zachary Novack
13 days
We're organizing the AI for Music workshop at @NeurIPSConf in San Diego! We'll be accepting both papers + demos w/an initial deadline of August 22, well timed for early visibility on your ICASSP/ICLR drafts 👀 Check out the website for more:
aiformusicworkshop.github.io
NeurIPS 2025 Workshop on AI for Music
@hermanhwdong
Hao-Wen (Herman) Dong 董皓文
13 days
🔥Happy to announce that the AI for Music Workshop is coming to #NeurIPS2025! We have an amazing lineup of speakers! We call for papers & demos (due on August 22)! See you in San Diego!🏖️ @chrisdonahuey @Ilaria__Manco @zawazaw @huangcza @McAuleyLabUCSD @zacknovack @NeurIPSConf
3
12
53
@zacknovack
Zachary Novack
18 days
Stable Audio Open Small is accepted at #WASPAA2025 @IEEE_WASPAA! Can't wait to share the latest in blazingly fast, on-device text-to-audio in Lake Tahoe 🏞️
@_akhaliq
AK
2 months
Stability AI just dropped Stable Audio Open Small on Hugging Face. Fast Text-to-Audio Generation with Adversarial Post-Training
4
10
66
@zacknovack
Zachary Novack
18 days
RT @yupenghou97: Did you know tokenization for generative recommendation today looks a lot like LLM tokenization did *10 years* ago? Meet…
0
29
0
@zacknovack
Zachary Novack
21 days
RT @MardaniMorteza: 📢📢 Elucidated Rolling Diffusion Models (ERDM). How can we stably roll out diffusion models for sequence generation in d…
0
23
0
@zacknovack
Zachary Novack
23 days
I always like those paper/author visualizations for other conferences, so I ~vibe coded~ up an interactive one for #ISMIR2025 @ISMIRConf! Go check it out at: Will hopefully add paper links and other metadata in the coming weeks :)
0
5
29
@zacknovack
Zachary Novack
26 days
RT @wuyusongwys: It’s been a thrilling journey building FLAM! 🚀 Super proud of what we achieved open‑vocabulary audio event detection using…
arxiv.org
We present BraWl, a Fortran package implementing a range of conventional and enhanced sampling algorithms for exploration of the phase space of the Bragg-Williams model, facilitating study of...
0
11
0
@zacknovack
Zachary Novack
1 month
RT @thepatch_kev: stable audio open small is great for stacking multiple generations. @zacknovack @_lyraaaa_ the ux speriments continue. c…
0
3
0
@zacknovack
Zachary Novack
1 month
RT @thepatch_kev: live coding with stable audio open small? let the vibes begin lol. i love having a bunch endpoints already functioning…
0
1
0
@zacknovack
Zachary Novack
1 month
RT @ArxivSound: ``Video-Guided Text-to-Music Generation Using Public Domain Movie Collections,'' Haven Kim, Zachary Novack, Weihan Xu, Juli…
arxiv.org
Despite recent advancements in music generation systems, their application in film production remains limited, as they struggle to capture the nuances of real-world filmmaking, where filmmakers...
0
4
0
@zacknovack
Zachary Novack
1 month
OSSL Dataset is out and accepted at #ISMIR2025 🇰🇷! High quality soundtrack+movie paired data, all public domain, perfect for your V2M tasks 📽️🎶. Led by the titan @havenpersona, check out the full thread below for more info!
@havenpersona
Haven Kim
1 month
🎼 Open Screen Sound Library Version 1 Released 🎥 Hi folks, we've just released a music-video dataset, sourced from public domain films, introduced in our paper "Video-guided text-to-music generation using public domain movie collections" accepted at #ISMIR2025.
0
1
10
@zacknovack
Zachary Novack
1 month
Presenting RUListening! we edit Music-QA benchmarks to *actually* assess audio perception, using text-only LLMs to generate unimodally-hard distractors. Been super excited about this one (led by the beast @yongyi_zang), check out the full thread below! And at ISMIR 2025!🇰🇷
@yongyi_zang
Yongyi Zang
1 month
🚨New Audio Benchmark 🚨We find standard LLMs can solve Music-QA benchmarks by just guessing from text only, + LALMs can still answer well when given noise instead of music! Presenting RUListening: A fully automated pipeline for making Audio-QA benchmarks *actually* assess
0
6
26
@zacknovack
Zachary Novack
2 months
RT @dadabots: yup, just compiled it & tested. Stable Audio Open Small runs faster than realtime on a mac **CPU**. on a m1 chip you have thr…
0
8
0
@zacknovack
Zachary Novack
2 months
RT @niloofar_mire: We (w @zacknovack @JaechulRoh et al.) are working on #memorization in #audio models & are conducting a human study on ge…
0
7
0
@zacknovack
Zachary Novack
2 months
RT @rajammanabrolu: The checklist bureaucracy creep is real.
0
2
0
@zacknovack
Zachary Novack
2 months
RT @StabilityAI: Today we’re open-sourcing Stable Audio Open Small, a 341M-parameter text-to-audio model optimized to run entirely on @Arm…
0
196
0
@zacknovack
Zachary Novack
2 months
RT @ArxivSound: ``Fast Text-to-Audio Generation with Adversarial Post-Training,'' Zachary Novack, Zach Evans, Zack Zukowski, Josiah Taylor,…
arxiv.org
Text-to-audio systems, while increasingly performant, are slow at inference time, thus making their latency unpractical for many creative applications. We present Adversarial...
0
4
0