William Chen

@chenwanch1

Followers: 783 · Following: 873 · Media: 38 · Statuses: 282

PhD Student @LTIatCMU @SCSatCMU | Masters @LTIatCMU | Formerly @TXInstruments | @UCF ‘21

Joined June 2021
@chenwanch1
William Chen
2 months
What happens if you scale Whisper to billions of parameters? Our #ICML2025 paper develops scaling laws for ASR/ST models, training models with up to 18B params and 360K hours of data across 100+ languages. Joint work b/w @LTIatCMU and @nvidia.
4 replies · 32 retweets · 104 likes
@chenwanch1
William Chen
4 hours
RT @cterdam: The week ahead: 20250721-20250727. [Tweet] OpenAI’s new model achieves gold medal-level performance in IMO. [Blog] Calvin’s t…
llz.info
Personal page for Liangze Li (李良澤).
0 replies · 1 retweet · 0 likes
@chenwanch1
William Chen
3 days
One of my favorite moments at #ICML2025 was being able to witness @_albertgu and the @cartesia_ai team’s reaction to Mamba being on the coffee sign. Felt surreal seeing someone realize their cultural impact.
1 reply · 8 retweets · 87 likes
@chenwanch1
William Chen
4 days
RT @chenwanch1: I’ll be presenting this Thursday 4:30pm at the West hall, poster 418. Drop by to learn more about our latest experience i…
0 replies · 5 retweets · 0 likes
@chenwanch1
William Chen
4 days
RT @chenwanch1: What happens if you scale Whisper to billions of parameters? Our #ICML2025 paper develops scaling laws for ASR/ST models,…
0 replies · 32 retweets · 0 likes
@chenwanch1
William Chen
5 days
RT @liweiche77: Presenting our #ICML2025 poster today! Discover our continuous, end-to-end approach that helps speech language models proc…
0 replies · 2 retweets · 0 likes
@chenwanch1
William Chen
5 days
I’ll be presenting this Thursday 4:30pm at the West hall, poster 418. Drop by to learn more about our latest experience in burning compute!
@chenwanch1
William Chen
2 months
What happens if you scale Whisper to billions of parameters? Our #ICML2025 paper develops scaling laws for ASR/ST models, training models with up to 18B params and 360K hours of data across 100+ languages. Joint work b/w @LTIatCMU and @nvidia.
0 replies · 5 retweets · 8 likes
@chenwanch1
William Chen
6 days
Not advertised yet, but we figured out how to do this too. And we’re releasing exactly how you can do it 👀. With the right training techniques, you can inject audio understanding and generation into an LLM with almost no loss in text perf. Details at
arxiv.org
This paper presents Open Unified Speech Language Models (OpusLMs), a family of open foundational speech language models (SpeechLMs) up to 7B. Initialized from decoder-only text language models,...
@reach_vb
Vaibhav (VB) Srivastav
6 days
the best part about the mistral release is that the models don't lose as much on text - this has been the biggest pain point for audioLMs for a long while
1 reply · 5 retweets · 31 likes
@chenwanch1
William Chen
16 days
RT @awawawhoami: how do yall think current day google translate works?? everyone's just stupid now i guess.
0 replies · 384 retweets · 0 likes
@chenwanch1
William Chen
17 days
What is it with speech reviewers on OpenReview? In my past 3 submissions (EMNLP 24, ICML 25, EMNLP 25), I have gotten only 1 reply to a rebuttal, out of a total of 11 reviews. Very frustrating, especially since they ask for more results and analyses that take a lot of time/compute.
2 replies · 0 retweets · 33 likes
@chenwanch1
William Chen
1 month
RT @jiatongshi: 🔊 New release: #ARECHO -> Autoregressive Evaluation via Chain-based Hypothesis Optimization. • 87-metric coverage in one mo…
0 replies · 3 retweets · 0 likes
@chenwanch1
William Chen
2 months
RT @jiatongshi: 🚀 Introducing Uni-VERSA: a unified model for multi-dimensional speech evaluation: naturalness, intelligibility, noise, proso…
huggingface.co
0 replies · 9 retweets · 0 likes
@chenwanch1
William Chen
2 months
I’ll be interning at Adobe Research in San Francisco this summer, working on audio generation. HMU if you’re in the area and want to chat about speech / audio AI!
2 replies · 2 retweets · 90 likes
@chenwanch1
William Chen
2 months
7/7 papers accepted to #Interspeech2025 🎉 Lots of interesting work from my fantastic co-authors on long-form processing, multilingualism, and multi-modal foundation models. See y’all in Rotterdam 🇳🇱.
4 replies · 7 retweets · 79 likes
@chenwanch1
William Chen
2 months
RT @cromz22: Excited to share our survey paper accepted to #ACL2025NLP Findings: When Large Language Models Meet Speech: A Survey on Integr…
0 replies · 7 retweets · 0 likes
@chenwanch1
William Chen
2 months
RT @arouditchenko: Do you really need audio to fine-tune your Audio LLM? 🤔 Answer below: Introducing Omni-R1, a simple GRPO fine‑tuning me…
arxiv.org
We propose Omni-R1 which fine-tunes a recent multi-modal LLM, Qwen2.5-Omni, on an audio question answering dataset with the reinforcement learning method GRPO. This leads to new State-of-the-Art...
0 replies · 36 retweets · 0 likes
@chenwanch1
William Chen
2 months
RT @huckiyang: We are happy that 🦉 OWLS, 18B to 0.25B open ASR/AST limited data scaling laws, has been accepted to @icmlconf 2025 led by @c…
0 replies · 10 retweets · 0 likes
@chenwanch1
William Chen
2 months
More analyses can be found in our pre-print: All models will be released on @huggingface: Many thanks to my wonderful co-authors and mentors: @shinjiw_at_cmu, @huckiyang, @brianyan918, @MXzBFhjFpS1jyMI. See y'all in Vancouver!
huggingface.co
0 replies · 0 retweets · 5 likes
@chenwanch1
William Chen
2 months
We combine these lessons and scale a single model to 18B params and 360K hours of training data. It's competitive with or outperforms SOTA models like Seamless, Whisper, SenseVoice, and Qwen2 Audio.
1 reply · 0 retweets · 3 likes