
Yusong Wu
@wuyusongwys
Followers
628
Following
217
Media
15
Statuses
58
PhD student at Mila & University of Montreal.
Montréal, Québec
Joined December 2016
We are presenting our poster soon at West Exhibition Hall B2-B3 W-502, 3:30-4:30 PM! Check it out online:
It’s been a thrilling journey building FLAM! 🚀 Super proud of what we achieved: open‑vocabulary audio event detection using calibrated frame‑wise modeling. FLAM will be presented at ICML 2025, come check it out! 📄 Paper: 🎧 Demo:
2
0
14
I think we finally cracked it? FLAM can detect *any* sound via text prompts. arXiv (ICML'25): demos: @AdobeResearch+@MIT+@Mila_Quebec led by @wuyusongwys w/@tsirigoc @Kotentorothy @huangcza @AaronCourville @urinieto @pseetharaman
4
11
66
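The frame-wise, open-vocabulary detection described in the FLAM tweets above amounts to scoring every audio frame against a text-prompt embedding. Below is a minimal numpy sketch, not the released FLAM code: the embeddings are random placeholders standing in for the audio and text encoders, and the threshold is illustrative rather than the calibrated operating point from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def l2_normalize(x, axis=-1):
    """Scale vectors to unit length so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

# Placeholder embeddings: in the real system these would come from the
# audio encoder (one vector per frame) and the text encoder (one per prompt).
frame_embeds = l2_normalize(rng.normal(size=(100, 512)))  # 100 audio frames
prompt_embed = l2_normalize(rng.normal(size=512))         # one text prompt

# Frame-wise similarity between the prompt and each audio frame.
scores = frame_embeds @ prompt_embed  # shape (100,), values in [-1, 1]

# Thresholding at a fixed operating point flags the frames where the
# prompted event is detected (the threshold value here is illustrative).
threshold = 0.1
detected_frames = np.flatnonzero(scores > threshold)
```

Because any text prompt yields an embedding, the same scoring loop covers sounds never seen at training time, which is what makes the detection open-vocabulary.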
RT @justin_salamon: I think we finally cracked it? FLAM can detect *any* sound via text prompts. arXiv (ICML'25): d….
0
39
0
RT @sunjiao123sun_: Mitigating racial bias from LLMs is a lot easier than removing it from humans! Can’t believe this happened at the bes….
0
813
0
Our poster is happening today at 1:30 PM at poster session 2!
0
0
23
@koning_robot @kastnerkyle @ada_rob @iansimon @chrisdonahuey @pcastr @natashajaques @huangcza Also work done with @ajscarlatos. The interface and real-time interactive system were built by @ajscarlatos!
0
0
6
Work done with @koning_robot @kastnerkyle @ada_rob @iansimon @chrisdonahuey @pcastr @natashajaques @huangcza and others!
1
0
5
We finetune the MLE model with RL, allowing it to adapt to user input and recover from errors. If you want to play with the first RL-powered real-time jamming model, don't miss our poster! Also check out the full video of @pcastr playing with our system!
1
0
5
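The finetuning recipe in the tweet above, start from an MLE-trained model and nudge it with reward, can be illustrated with a toy REINFORCE loop. This is a hedged sketch, not the system from the paper: the "model" is a 4-way softmax over notes and the reward is hand-made to favor one note.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Pretrained" (MLE) parameters: a softmax policy over 4 toy notes.
logits = rng.normal(size=4)

# Hand-made reward: pretend note 1 is the one that fits the user's input.
reward = np.array([0.0, 1.0, 0.0, 0.0])

def softmax(z):
    z = z - z.max()          # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

lr = 0.5
for _ in range(200):
    probs = softmax(logits)
    a = rng.choice(4, p=probs)      # sample a note from the policy
    # REINFORCE: grad of log pi(a) w.r.t. logits is one_hot(a) - probs.
    grad = -probs
    grad[a] += 1.0
    logits += lr * reward[a] * grad  # reward-weighted policy-gradient step
```

After the loop the policy concentrates on the rewarded note; in the real system the reward would instead come from how well the generated continuation fits the live user input.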
We built a real-time music jamming system using RL and generative models -- you can play along with this model and learn more about our work at #ICML2024 🎶! 📄 paper: 🌐 website: 🕐 Tue 23 Jul 1:30 - 3 p.m. CEST. 📍 Hall C 4-9. 🧵
13
36
126
RT @tianyu_zh: [1/n] We are happy to announce our new VLM task: Visual Caption Restoration along with datasets: ht….
0
12
0
RT @huangcza: I’m excited I’ll be joining MIT next fall, for a shared interdisciplinary faculty position between Music (@MIT_MTA @MIT_SHASS….
0
78
0
RT @LiuHaohe: Can't wait to share a sneak peek of AudioLDM 2! .🔊AudioLDM 2 is a versatile framework that can generate sound effects/music/i….
0
73
0
@dengyi0307 and I developed this project to elevate the frontend experience for music interaction with generative models. PianorollVis.js features: ✅ User-friendly API ✅ Support for different piano layouts & animation types. May this library bring harmony to your application! 🎶
0
1
1
🎵 Introducing PianorollVis.js 🎹🖥️ - A simple JavaScript library to visualize MIDI notes in a piano roll! Perfect for frontends of symbolic music interaction. Check it out & play the demos: #WebDevelopment #MusicTech #MIDI #JavaScript
1
4
44
Throwback to my submission for the AI Song Contest in 2022, where we made the finalist list! Registration is now open at: I highly recommend you join. Submissions are due in 1 month (Sept 4)! @aisongcontest #AI #music #generativeAI
0
0
10
Excited to be at #ICASSP2023! @Kotentorothy and I will be presenting CLAP at the poster session on the morning of the 7th, 8:15 AM. Feel free to DM me for a chat!
0
0
14
RT @art_zucker: CLAP is to Audio what CLIP is to Images 🎙️.Here's how easy it is to run zero-shot audio classification in transformers 🤗 :….
0
67
0
We just released three new model checkpoints of CLAP trained on additional music and/or speech datasets! Hope they will be useful for everyone who is using our model! Details:
Our text-audio contrastive model (CLAP) now supports pip install and inference through an API! You can compute audio and text embeddings in just a few lines of code. For more info:
2
11
92
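Once CLAP returns text and audio embeddings (the "few lines of code" API in the tweet above), matching a clip against candidate captions reduces to cosine similarity between vectors. The sketch below uses random placeholder embeddings rather than the real model, and the embedding dimension is illustrative; with the actual library the placeholders would be replaced by its text- and audio-embedding calls.

```python
import numpy as np

rng = np.random.default_rng(1)

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Placeholders for what the model's embedding calls would return:
# one embedding for the audio clip, one per candidate text caption.
audio_embed = rng.normal(size=512)
text_embeds = rng.normal(size=(3, 512))   # e.g. 3 candidate captions

# Score each caption against the clip; highest cosine wins.
scores = [cosine(audio_embed, t) for t in text_embeds]
best = int(np.argmax(scores))             # index of the best-matching caption
```

The same ranking, run over class-name prompts instead of captions, is what zero-shot audio classification with a contrastive model boils down to.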