Joan Serrà @serrjoa X Profile

Joan Serrà

@serrjoa

Followers

2K

Following

9K

Media

83

Statuses

4K

Does research on machine learning at Sony AI, Barcelona. Works on audio analysis, synthesis, and retrieval. Likes tennis, music, and wine.

Barcelona, Catalonia

Joined September 2015

Joan Serrà

@serrjoa

6 months

Got a "too familiar" tune from your generative model? Try checking for musical version matching (MVM)! But MVM works with full tracks, and your tune is just a segment. Well, in our latest work we tackle precisely this issue, and achieve SOTA results even on full tracks! 1/4

1

13

62

Joan Serrà

@serrjoa

16 hours

Just embrace luck. You submit and some random people will either like it or not. Once that's finished, either resubmit somewhere else or leave it on arXiv. Time will judge you and them. Let it flow.

0

5

Joan Serrà

@serrjoa

16 hours

The answer to an overcomplicated, hyper-cumbersome, dysfunctional peer-review system is not adding things to it, but removing them.

1

0

4

Joan Serrà

@serrjoa

2 days

RT @zhaisf: Unlike an RNN, one attention block alone cannot model anything interesting. And it’s the stacking of it that does wonders. Unde….

0

67

0

Joan Serrà

@serrjoa

2 days

RT @HubertSiuzdak: Cool work and a nice read. But why we rebranded masked language models to diffusion language models 😭.

0

1

0

Joan Serrà

@serrjoa

3 days

RT @docmilanfar: some of the most hyped use cases of AI aren’t use cases at all - they’re parlor tricks that serve no purpose.

0

5

0

Joan Serrà

@serrjoa

5 days

RT @kenneth0stanley: Those who intuit something is lacking in LLMs struggle to pinpoint the gap beyond inadequate metaphors like “stochasti….

0

103

0

Joan Serrà

@serrjoa

7 days

RT @HannesStaerk: Tomorrow we discuss diffusion models for sampling unnormalized densities "Adjoint Sampling: Highly Scalable Diffusion Sam….

0

31

0

Joan Serrà

@serrjoa

8 days

RT @dlbcnai: Call for presenters is out! This year, we replace the orals by spotlight + posters. We will also have more invited talks. ht….

0

4

0

Joan Serrà

@serrjoa

10 days

RT @hungchiayu123: 🚨 New model alert! Meet 🎵JAM🎵: our tiny AI song generator that turns your lyrics into songs with precise word-level timi….

0

9

0

Joan Serrà

@serrjoa

13 days

RT @drscotthawley: .@elonmusk apparently missed my talk to @xianityplus in 2018.

0

3

0

Joan Serrà

@serrjoa

13 days

RT @docmilanfar: MIT has a long tradition of hacks. One of the best was in 1985 when Ted Larkin, a freshman, returned to his dorm room one….

0

3

0

Joan Serrà

@serrjoa

13 days

RT @CVC_UAB: 🔈 Keynote Session with Joan Serrà (@SonyAI_global ) on 'Supervised contrastive learning from weakly-labeled audio segments for….

0

1

0

Joan Serrà

@serrjoa

14 days

RT @jaseweston: 🌿Introducing MetaCLIP 2 🌿. 📝: code, model: After four years of advancements….

0

67

0

Joan Serrà

@serrjoa

14 days

RT @ArxivSound: Nao Tokui, Tom Baker, "Latent Granular Resynthesis using Neural Audio Codecs,"

arxiv.org

We introduce a novel technique for creative audio resynthesis that operates by reworking the concept of granular synthesis at the latent vector level. Our approach creates a "granular codebook" by...

0

7

0

Joan Serrà

@serrjoa

15 days

RT @chrisdonahuey: Excited to share our beta release of Music Arena, a live evaluation platform for state-of-the-art AI music generation mo….

0

57

0

Joan Serrà

@serrjoa

15 days

RT @makingAGI: 🚀Introducing Hierarchical Reasoning Model🧠🤖. Inspired by brain's hierarchical processing, HRM delivers unprecedented reasoni….

0

646

0

Joan Serrà

@serrjoa

15 days

RT @s_scardapane: *Diffusion Models are Evolutionary Algorithms* by @YanboZhang3 @drmichaellevin et al. They develop novel evolutionary al….

0

100

0

Joan Serrà

@serrjoa

15 days

RT @cloneofsimo: Very nice blogpost on RoPE variants by @jerryx314

0

92

0

Joan Serrà

@serrjoa

17 days

RT @2prime_PKU: Anyone knows adam?

0

463

0

Joan Serrà

@serrjoa

18 days

RT @rasbt: From GPT to MoE: I reviewed & compared the main LLMs of 2025 in terms of their architectural design from DeepSeek-V3 to Kimi 2.….

magazine.sebastianraschka.com

From DeepSeek-V3 to Kimi K2: A Look At Modern LLM Architecture Design

0

417

0