Joan Serrà

@serrjoa

Followers
2K
Following
9K
Media
83
Statuses
4K

Does research on machine learning at Sony AI, Barcelona. Works on audio analysis, synthesis, and retrieval. Likes tennis, music, and wine.

Barcelona, Catalonia
Joined September 2015
@serrjoa
Joan Serrà
6 months
Got a "too familiar" tune from your generative model? Try checking for musical version matching (MVM)! But MVM works with full tracks, and your tune is just a segment. Well, in our latest work we tackle precisely this issue, and achieve SOTA results even on full tracks! 1/4
1
13
62
@serrjoa
Joan Serrà
16 hours
Just embrace luck. You submit and some random people will either like it or not. Then finished, either resubmit somewhere else or leave it on ArXiv. Time will judge you and them. Let it flow.
0
0
5
@serrjoa
Joan Serrà
16 hours
The answer to an overcomplicated, hyper-cumbersome, dysfunctional peer-review system is not adding things to it, but removing them.
1
0
4
@serrjoa
Joan Serrà
2 days
RT @zhaisf: Unlike an RNN, one attention block alone cannot model anything interesting. And it’s the stacking of it that does wonders. Unde….
0
67
0
@serrjoa
Joan Serrà
2 days
RT @HubertSiuzdak: Cool work and a nice read. But why we rebranded masked language models to diffusion language models 😭.
0
1
0
@serrjoa
Joan Serrà
3 days
RT @docmilanfar: some of the most hyped use cases of AI aren’t use cases at all - they're parlor tricks that serve no purpose.
0
5
0
@serrjoa
Joan Serrà
5 days
RT @kenneth0stanley: Those who intuit something is lacking in LLMs struggle to pinpoint the gap beyond inadequate metaphors like “stochasti….
0
103
0
@serrjoa
Joan Serrà
7 days
RT @HannesStaerk: Tomorrow we discuss diffusion models for sampling unnormalized densities "Adjoint Sampling: Highly Scalable Diffusion Sam….
0
31
0
@serrjoa
Joan Serrà
8 days
RT @dlbcnai: Call for presenters is out ! This year, we replace the orals by spotlight + posters. We will also have more invited talks. ht….
0
4
0
@serrjoa
Joan Serrà
10 days
RT @hungchiayu123: 🚨 New model alert! Meet 🎵JAM🎵: our tiny AI song generator that turns your lyrics into songs with precise word-level timi….
0
9
0
@serrjoa
Joan Serrà
13 days
RT @drscotthawley: .@elonmusk apparently missed my talk to @xianityplus in 2018.
0
3
0
@serrjoa
Joan Serrà
13 days
RT @docmilanfar: MIT has a long tradition of hacks. One of the best was in 1985 when Ted Larkin, a freshman, returned to his dorm room one….
0
3
0
@serrjoa
Joan Serrà
13 days
RT @CVC_UAB: 🔈 Keynote Session with Joan Serrà (@SonyAI_global ) on 'Supervised contrastive learning from weakly-labeled audio segments for….
0
1
0
@serrjoa
Joan Serrà
14 days
RT @jaseweston: 🌿Introducing MetaCLIP 2 🌿.📝: code, model: After four years of advancements….
0
67
0
@serrjoa
Joan Serrà
15 days
RT @chrisdonahuey: Excited to share our beta release of Music Arena, a live evaluation platform for state-of-the-art AI music generation mo….
0
57
0
@serrjoa
Joan Serrà
15 days
RT @makingAGI: 🚀Introducing Hierarchical Reasoning Model🧠🤖. Inspired by brain's hierarchical processing, HRM delivers unprecedented reasoni….
0
646
0
@serrjoa
Joan Serrà
15 days
RT @s_scardapane: *Diffusion Models are Evolutionary Algorithms* by @YanboZhang3 @drmichaellevin et al. They develop novel evolutionary al….
0
100
0
@serrjoa
Joan Serrà
15 days
RT @cloneofsimo: Very nice blogpost on RoPE variants by @jerryx314
0
92
0
@serrjoa
Joan Serrà
17 days
RT @2prime_PKU: Anyone knows adam?
0
463
0
@serrjoa
Joan Serrà
18 days
RT @rasbt: From GPT to MoE: I reviewed & compared the main LLMs of 2025 in terms of their architectural design from DeepSeek-V3 to Kimi 2.….
magazine.sebastianraschka.com
From DeepSeek-V3 to Kimi K2: A Look At Modern LLM Architecture Design
0
417
0