Pooneh Mousavi
@MousaviPooneh
Followers
166
Following
435
Media
3
Statuses
123
Montreal, Canada
Joined January 2019
“Ever tried. Ever failed. No matter. Try again. Fail again. Fail better.” Samuel Beckett
0
0
1
📢 Join our Conversational AI Reading Group!
Thursday, Nov 6th | 11 AM - 12 PM EST | Speaker: Emmanouil Benetos (@emmanouilb) - Queen Mary University of London | Topic: "Machine learning paradigms for music and audio understanding" | Details: (https://t.co/oxEZtla7O1)
0
1
2
📢 Join our Conversational AI Reading Group to learn more about Google Gemini 2.5, a natively multimodal audio model developed over the past year.
Thursday, Oct 30th | 11 AM - 12 PM EST | Speaker: Michael Han - Google DeepMind | Details: (https://t.co/oxEZtla7O1)
0
1
5
📢 Join our Conversational AI Reading Group!
Thursday, Oct 23rd | 11 AM - 12 PM EST | Speaker: Joan Serrà (@serrjoa) - Sony AI | Topic: "Supervised contrastive learning from weakly-labeled audio segments for musical version matching" | Details: (https://t.co/oxEZtla7O1)
0
2
4
📢 Schedule Update! The Oct 16th session will start at 12 PM. Please make sure to mark this change in your calendar so you don’t miss this great talk!
📢 This week, our Conversational AI Reading Group is excited to have Jinyu Li from Microsoft. Please note: This week’s session will start one hour later than usual, at 12:00 PM instead of 11:00 AM.
Thursday, Oct 16th | 12:00 - 13:00 EST | Topic: The development of spoken LM
0
0
0
This Thursday, Oct 2nd, our Conversational AI RG is honored to host @Yoshua_Bengio, one of the world’s leading pioneers in AI. He will present: “A Safety Case for the Scientist AI.” Don’t miss this unique opportunity to join us online! Details:
0
1
1
📢 Join our Conversational AI Reading Group!
Thurs, Sep 25th | 11 AM - 12 PM EST | Speaker: Themos Stafylakis (@themosst) | Topic: "Advances in Speaker Recognition: Pruning, Deepfake Detection, and Learning without Temporal Labels" | Details: (https://t.co/oxEZtl9zYt)
0
2
1
If you missed my session presenting our recent work “Discrete Audio Tokens: More Than a Survey!”, you can now find the recording on our YouTube channel and the slides on our website: ▶️ YouTube: https://t.co/E4Mqn1sjiS | Website:
📢 Our Conversational AI Reading Group is back! Join the first Fall 2025 session!
Thursday, Sept 18 | 11 AM–12 PM EST | Speaker: Pooneh Mousavi (Mila) @MousaviPooneh | Topic: “Discrete Audio Tokens: More Than a Survey!” | Details:
0
0
2
I’ll be presenting our survey paper “Discrete Audio Tokens: More Than a Survey!” at the first Fall 2025 session of the Conversational AI Reading Group. Looking forward to seeing you there and discussing ideas!
📢 Our Conversational AI Reading Group is back! Join the first Fall 2025 session!
Thursday, Sept 18 | 11 AM–12 PM EST | Speaker: Pooneh Mousavi (Mila) @MousaviPooneh | Topic: “Discrete Audio Tokens: More Than a Survey!” | Details:
0
1
8
We’re back with a new series of Conversational AI Talks. Everyone’s invited! Feel free to share with your network. Every Thursday, 11:00 AM – 12:00 PM EDT. Kicking off on September 18th with an exciting lineup of speakers. More details:
0
3
11
I’m happy to share that our paper, "Discrete Audio Tokens: More Than a Survey!", has been accepted at TMLR. Read: https://t.co/QogseMicwf | Explore our tokenizer database & submit yours:
🎉🥳 I am thrilled to share that our work on audio tokenisers has been accepted to #TMLR. The tokeniser DB is ever updating, so submit your new tokenisers 💪 https://t.co/g5RU2jtSRZ
0
4
20
📢 Presenting our paper “LiSTEN: Learning Soft Token Embeddings for Neural Audio LLMs”, an interpretable fine-tuning method for spoken language understanding. Wed, Aug 20 | 08:30–10:30 | A11-P2B-03. Hope to see you there! https://t.co/GEhsFFqeLy
@ISCAInterspeech
1
2
6
Our pick of the week by @beomseok_lee_: "ALAS: Measuring Latent Speech-Text Alignment For Spoken Language Understanding In Multimodal LLMs" by Pooneh Mousavi, @yingzhi_wang, @mirco_ravanelli, and @CemSubakan (2025) https://t.co/8UsRFvV83N
#SLU #speech #multimodal #LLM
arxiv.org
Large Language Models (LLMs) are increasingly used in Spoken Language Understanding (SLU), where effective multimodal learning depends on the alignment between audio and text. Despite various...
Speech-language models show promise in multimodal tasks, but how well are speech & text actually aligned? This paper https://t.co/B9z1j7L4IO proposes a new metric to measure layer-wise correlation between the two, with a focus on SLU tasks.
0
7
11
📢 Join our Conversational AI Reading Group!
Thursday, June 19th | 11 AM - 12 PM EST | Speaker: Yuki Mitsufuji (@mittu1204) - SonyAI | Topic: "AI for Creators: Pushing Creative Abilities to the Next Level" | Details: (https://t.co/oxEZtl9zYt)
0
3
4
“Discrete Audio Tokens: More Than a Survey!,” Pooneh Mousavi, Gallil Maimon, Adel Moumen, Darius Petermann, Jiatong Shi, Haibin Wu, Haici Yang, Anastasia Kuznetsova, Artem Ploujnikov, Ricard Marxer, Bhuvana Ramabhadran, Benjamin Elizalde, Loren Lugosch…
arxiv.org
Discrete audio tokens are compact representations that aim to preserve perceptual quality, phonetic content, and speaker characteristics while enabling efficient storage and inference, as well as...
0
9
63
🎵💬 If you are interested in Audio Tokenisers, you should check out our new work! We empirically analysed existing tokenisers from every angle: reconstruction, downstream, LMs and more. Grab yourself a ☕/🍺 and sit down for a read!
1
26
103
Great collaboration, with a diverse all-star team led by @MousaviPooneh. Check it out! Paper - https://t.co/BUVLYLNGoe | Website (+updating tokeniser DB!) -
0
1
9
We're excited to announce our latest work: "Discrete Audio Tokens: More Than a Survey!" It presents a comprehensive survey and benchmark of audio tokenizers across speech, music, and general audio. Preprint: https://t.co/QogseMhEGH | Website:
1
14
34
📢 Join our Conversational AI Reading Group!
Thursday, June 12th | 11 AM - 12 PM EST | Speaker: Andros Tjandra | Topic: "Meta Audiobox Aesthetics: Unified Automatic Quality Assessment for Speech, Music, and Sound" | Details: (https://t.co/oxEZtla7O1)
0
3
9
📢 Join our Conversational AI Reading Group!
Thursday, May 29th | 11 AM - 12 PM EST | Speaker: Yossi Adi (@adiyossLC) | Topic: "On The Landscape of Spoken Language Models" | Details: (https://t.co/oxEZtl9zYt)
0
2
10
Learn about speaker diarization, the science behind it, and the future of diarization at @pyannoteAI research labs
0
3
11