
Joan Serrà
@serrjoa
Followers: 2K
Following: 9K
Media: 83
Statuses: 4K
Does research on machine learning at Sony AI, Barcelona. Works on audio analysis, synthesis, and retrieval. Likes tennis, music, and wine.
Barcelona, Catalonia
Joined September 2015
Got a "too familiar" tune from your generative model? Try checking for musical version matching (MVM)! But MVM works with full tracks, and your tune is just a segment. Well, in our latest work we tackle precisely this issue, and achieve SOTA results even on full tracks! 1/4
1
13
62
RT @HubertSiuzdak: Cool work and a nice read. But why we rebranded masked language models to diffusion language models 😭
0
1
0
RT @docmilanfar: some of the most hyped use cases of AI aren’t use cases at all - they're parlor tricks that serve no purpose.
0
5
0
RT @kenneth0stanley: Those who intuit something is lacking in LLMs struggle to pinpoint the gap beyond inadequate metaphors like “stochasti…
0
103
0
RT @HannesStaerk: Tomorrow we discuss diffusion models for sampling unnormalized densities "Adjoint Sampling: Highly Scalable Diffusion Sam…
0
31
0
RT @hungchiayu123: 🚨 New model alert! Meet 🎵JAM🎵: our tiny AI song generator that turns your lyrics into songs with precise word-level timi…
0
9
0
RT @docmilanfar: MIT has a long tradition of hacks. One of the best was in 1985 when Ted Larkin, a freshman, returned to his dorm room one…
0
3
0
RT @CVC_UAB: 🔈 Keynote Session with Joan Serrà (@SonyAI_global) on 'Supervised contrastive learning from weakly-labeled audio segments for…
0
1
0
RT @ArxivSound: Nao Tokui, Tom Baker, "Latent Granular Resynthesis using Neural Audio Codecs,"
arxiv.org
We introduce a novel technique for creative audio resynthesis that operates by reworking the concept of granular synthesis at the latent vector level. Our approach creates a "granular codebook" by...
0
7
0
RT @chrisdonahuey: Excited to share our beta release of Music Arena, a live evaluation platform for state-of-the-art AI music generation mo…
0
57
0
RT @makingAGI: 🚀Introducing Hierarchical Reasoning Model🧠🤖. Inspired by brain's hierarchical processing, HRM delivers unprecedented reasoni…
0
646
0
RT @s_scardapane: *Diffusion Models are Evolutionary Algorithms* by @YanboZhang3 @drmichaellevin et al. They develop novel evolutionary al…
0
100
0
RT @rasbt: From GPT to MoE: I reviewed & compared the main LLMs of 2025 in terms of their architectural design from DeepSeek-V3 to Kimi 2.…
magazine.sebastianraschka.com
From DeepSeek-V3 to Kimi K2: A Look At Modern LLM Architecture Design
0
417
0