
took
@wataru9871
Followers: 1K
Following: 19K
Media: 254
Statuses: 6K
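The past tense of "take". National Institute of Technology, Nagaoka College ➡︎ UTokyo Department of Systems Innovation ➡︎ UTokyo Graduate School of Information Science and Technology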
麺屋 松
Joined November 2012
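Huff, huff, huff, huff. Gotta tell everyone that speech restoration that preserves reverberation is possible. Huff, huff, huff, huff. Gotta tell everyone that the reverberation can be controlled, too. Huff, gotta tell everyone. paper: demo: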
Here are some interesting results with Sidon.
sarulab-speech-sidon-demo-beta.hf.space
🚀 We just released Sidon, a multilingual speech restoration model built on the Miipher & Miipher-2 resynthesis framework! Trained on 103 languages and robust to real-world artifacts like wind noise & packet loss 🌍 🔧 Try Sidon with your speech samples!
huggingface.co
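RT @hyama5_: I will be presenting at next month's Speech Synthesis Workshop 2025 (SSW13)! For TTS with prosody labels, when acoustic models such as HuBERT and Whisper are used together with language models such as PnG BERT, the accuracy of estimating accents and boundary strength in speech…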
🚀 We just released MSR-UTMOS, a powerful model for speech quality prediction that supports 16 kHz, 24 kHz, and 48 kHz audio! 🔍 Powered by a sampling-frequency-independent convolutional layer on top of SSL models. 🎧 Upload your own samples and try it now:
huggingface.co
RT @ArxivSound: Wataru Nakata, Yuma Koizumi, Shigeki Karita, Robin Scheibler, Haruko Ishikawa, Adriana Guevara-Rukoz, Heiga Zen, Michiel Ba…
arxiv.org
Reverberation encodes spatial information regarding the acoustic source environment, yet traditional Speech Restoration (SR) usually completely removes reverberation. We propose ReverbMiipher, an...
RT @ArxivSound: Kentaro Seki, Shinnosuke Takamichi, Takaaki Saeki, Hiroshi Saruwatari, "Active Learning for Text-to-Speech Synthesis with I…
arxiv.org
The construction of high-quality datasets is a cornerstone of modern text-to-speech (TTS) systems. However, the increasing scale of available data poses significant challenges, including storage...
RT @yuma_koizumi: All three papers from our project have been accepted to WASPAA⛰️!! Miipher-2, ReverbMiipher, https…
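RT @ysaito_human: The presentation by Asai-san (second-year master's student), "A Study of Self-Supervised Learning Models for Feature Extraction from Speaker-Overlapped Speech," received the Student Excellent Presentation Award at the Acoustical Society of Japan 2025 Spring Meeting. Congratulations! 👏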
RT @trgkpc: Our paper is now available on arXiv! We propose TTSOps, a closed-loop framework for building multi-speaker TTS from noisy web d…
RT @hsaruwatari727: Our paper titled "Language-Queried Target Speech Extraction Using Para-linguistic and Non-linguistic Prompts" has been…
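RT @trgkpc: Our paper on TSE (target speech extraction) based on LASS (language-queried audio source separation) has been accepted! We will present this work at the autumn ASJ meeting and would be glad to discuss it with you there.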