Eduardo Fonseca
@edfonseca_
Followers: 1K · Following: 1K · Media: 23 · Statuses: 183
Research Scientist @GoogleDeepMind. Sound Understanding. Previously @GoogleAI and @mtg_upf. He/him.
NYC
Joined October 2017
🔊New paper! Recomposer allows editing sound events within complex scenes based on textual descriptions and event roll representations. And we discuss the details that matter! Work led by Dan Ellis w/ a bunch of Sound Understanding folks @GoogleDeepMind
https://t.co/J6F57BqSMn
Excited to share our work from the Sound Understanding team at @GoogleDeepMind! Ever wanted to remove a single cough from a recording or make a faint doorbell louder? Recomposer makes editing complex audio scenes possible! Paper: https://t.co/bd6lx5b938
#AudioEditing #GenerativeAI
arxiv.org
Editing complex real-world sound scenes is difficult because individual sound sources overlap in time. Generative models can fill in missing or corrupted details based on their strong prior...
Daniel P. W. Ellis, Eduardo Fonseca, Ron J. Weiss, Kevin Wilson, Scott Wisdom, Hakan Erdogan, John R. Hershey, Aren Jansen, R. Channing Moore, Manoj Plakal, "Recomposer: Event-roll-guided generative audio editing."
New multilingual speech restoration paper out Miipher-2 🚀! The RTF on a TPU is 0.0078: 1 million hours of data can be cleaned in 3 days using just 100 TPUs! Paper: https://t.co/lohyU54t4a Demo: https://t.co/LQuVMgJChJ
arxiv.org
Training data cleaning is a new application for generative model-based speech restoration (SR). This paper introduces Miipher-2, an SR model designed for million-hour scale data, for training data...
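The throughput claim in the Miipher-2 tweet can be sanity-checked with quick arithmetic. A minimal sketch, assuming RTF (real-time factor) means processing time divided by audio duration per TPU, and that the 100 TPUs run in parallel:

```python
# Back-of-the-envelope check of the "1M hours in 3 days on 100 TPUs" claim.
RTF = 0.0078
audio_hours = 1_000_000
num_tpus = 100

tpu_hours = audio_hours * RTF            # total compute: 7,800 TPU-hours
wall_clock_hours = tpu_hours / num_tpus  # 78 hours when spread over 100 TPUs
wall_clock_days = wall_clock_hours / 24  # 3.25 days

print(f"{wall_clock_days:.2f} days")     # ~3 days, consistent with the tweet
```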
I'm looking for a PhD student to work on continual learning for audio. Funding is available for 2 years to start with, to be extended to 4 later. Contact me by email if interested! If you participated in @DCASE_Challenge or are coming to @DCASE_Workshop, even better!
🔊 We've released pre-trained models & code for our ICCV23 paper, Audiovisual Masked Autoencoders!! GitHub: https://t.co/NDbPZgefCo Paper: https://t.co/rtLhWOq872 Work led by Lili Georgescu and @anuragarnab, with Radu Ionescu, @MarioLucic_ and @CordeliaSchmid
arxiv.org
Can we leverage the audiovisual information already present in video to improve self-supervised representation learning? To answer this question, we study various pretraining architectures and...
It's so awesome to see the impact of the computational audio capabilities we developed featured in @madebygoogle 🎉 🎉 🎉 Congrats to John Hershey, @ScottTWisdom, @PGetreuer & everyone who contributed, for pioneering new computational audio capabilities in Pixel 8 #MadeByGoogle
Check out the 4 new Google Photos features coming first to Pixel 8 and 8 Pro ↓ Whether it’s noise from wind, traffic, or barking dogs, Audio Magic Eraser in Google Photos reduces distracting sounds in your video in just a few taps! 🪄
🔊New paper out: Do you use data balancing in your AudioSet experiments? Does it get you a little mAP boost? It might work differently than you think...😅 You might want to check our latest paper, led by @ChannningMoore
Our 2023 ICASSP paper is now up on arXiv: Dataset balancing can hurt model performance https://t.co/TyawmB1OWK Dataset balancing works differently than you might assume:
- can cause overfitting;
- doesn't improve performance on rare classes;
- speeds up training convergence.
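For context, "dataset balancing" here usually means inverse-frequency sampling: examples from rare classes are drawn more often so every class contributes roughly equally per epoch. A minimal sketch with made-up, AudioSet-like class counts (the numbers and variable names are illustrative, not from the paper):

```python
import numpy as np

# Toy class frequencies with the heavy skew typical of AudioSet.
class_counts = np.array([10_000, 500, 20])
per_class_weight = 1.0 / class_counts      # rare classes get upweighted

# Each example inherits its class's weight; normalize to a sampling distribution.
example_labels = np.array([0, 0, 1, 2])    # toy dataset of 4 clips
weights = per_class_weight[example_labels]
probs = weights / weights.sum()            # rare-class clip dominates sampling
```

The paper's point is that this oversampling of rare-class clips can cause overfitting on exactly those classes, rather than improving them.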
Not sure my reviews were that "outstanding"... 😅, but the recognition is nice... Thanks to the @ieeeICASSP committee. #ICASSP2023
📢 The countdown to DCASE Challenge 2023 deadline is on! 🗓️ Deadlines:
* System submission: 15/5 - 23.59 AoE
* Technical reports: 22/5 - 23.59 AoE
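AoE ("Anywhere on Earth") is UTC-12: the deadline holds as long as it is still that date somewhere on Earth. A quick check of what the 15/5 system-submission deadline means in UTC:

```python
from datetime import datetime, timedelta, timezone

# AoE is the UTC-12 timezone.
AOE = timezone(timedelta(hours=-12))

deadline_aoe = datetime(2023, 5, 15, 23, 59, tzinfo=AOE)
deadline_utc = deadline_aoe.astimezone(timezone.utc)
print(deadline_utc)  # 2023-05-16 11:59:00+00:00
```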
📣 The results for @DCASE_Challenge task 4 are finally out! 🥳 https://t.co/KSgFwNq4K0
@Nicoturpo @FraRonchini @edfonseca_ @SamueleCornell
Aaand that was after defending my PhD thesis some months ago "Training Sound Event Classifiers Using Different Types of Supervision" & taking some time off :) Thesis/video/slides & a quick summary available here: https://t.co/HrmsUURYv9 SUPER thankful to all @mtg_upf folks!!🙌
🔊 A bit late, but happy to announce that I recently joined Google Research! I’m working in the Sound Understanding Group based out of NYC! https://t.co/kbD1U0okFc
HEAR PMLR journal submissions are open until 2022-06-30. https://t.co/URhf1PPgrY Besides that, people have asked if they can run the HEAR benchmarks, get on the leaderboard, and cite us in the future. Yes! HEAR is here to stay. See our updated website: https://t.co/VuSnPYF095
📣 We have a (super cool) PhD position in speech enhancement for patients with auditory neuropathy spectrum disorders. If you're interested in Audio Signal Processing/Machine Learning/Audiology, contact us! More info ⤵️ https://t.co/VVXWsIz4Fb
Looking forward to seeing you all here! 🥳
📢The #DCASE2022 workshop call for papers is out 🥳🎉 https://t.co/2rU1IzJkMG The abstract submission deadline is on the 7th of July, and the workshop will be held in person from the 3rd to the 4th of November in Nancy. Looking forward to seeing you there! 😉
Our new paper is out! We explored simple masked patch modeling w/o augmentation to learn a latent that describes the input spectrogram as it is. “Masked Spectrogram Modeling using Masked Autoencoders for Learning General-purpose Audio Representation” https://t.co/kWIEMsGzNZ
arxiv.org
Recent general-purpose audio representations show state-of-the-art performance on various audio tasks. These representations are pre-trained by self-supervised learning methods that create...
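The masked-patch idea this tweet describes can be sketched in a few lines. This is a hypothetical illustration (shapes, names, and the 75% mask ratio are assumptions, not the paper's code): tile a spectrogram into non-overlapping patches, hide a large random fraction, and train the autoencoder to reconstruct the hidden ones from the visible ones.

```python
import numpy as np

rng = np.random.default_rng(0)
spec = rng.standard_normal((80, 208))   # toy log-mel spectrogram: (mel bins, frames)
ph, pw = 16, 16                          # patch height/width

# Tile into non-overlapping 16x16 patches, flattened to vectors.
patches = spec.reshape(80 // ph, ph, 208 // pw, pw).transpose(0, 2, 1, 3)
patches = patches.reshape(-1, ph * pw)   # (65 patches, 256 values each)

# Mask a large random subset; the encoder only sees the visible patches,
# and the decoder is trained to reconstruct the masked ones.
mask_ratio = 0.75
n_masked = int(len(patches) * mask_ratio)
masked_idx = rng.permutation(len(patches))[:n_masked]
visible = np.delete(patches, masked_idx, axis=0)
```

Because no augmentation is applied, the reconstruction target is the input spectrogram exactly as it is, which is the point the tweet emphasizes.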
https://t.co/voGOpmu0pe For BYOL for Audio, an updated paper is out (submitted last year, still under review). It extends the initial BYOL-A in network architecture and data augmentation. We compare against 8 models (11 representations) on a benchmark of 10 tasks.
arxiv.org
Pre-trained models are essential as feature extractors in modern machine learning systems in various domains. In this study, we hypothesize that representations effective for general audio tasks...
"BYOL for Audio: Exploring Pre-trained General-purpose Audio Representations" (arXiv:2204.07402v1, https://t.co/3pcQCkeyAA), Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi, Noboru Harada, Kunio Kashino.
#DCASE2022 Challenge is officially open! You can now check the task descriptions and development data. The baseline systems for some tasks are delayed, but those will be ready soon too. https://t.co/0aOOLZIt9B
@DCASE_Challenge #machinelistening #DCASE
📢 DCASE challenge 2022 task descriptions are out!! Enjoy ➡️