ZD1908
@ZDi____
Followers
231
Following
34K
Media
370
Statuses
3K
(mostly) Audio/TTS ML research & LSTM enjoyer; by myself | 🇦🇷 25M | DMs open
Latent space
Joined June 2024
Well, I can't improve my thing further, so I'm releasing it just to document the process. I tried making an efficient neural audio codec by combining a 16KHz STFT-VQGAN with a Wave U-Net that corrects artifacts and upsamples to 44.1KHz. (Substack link in replies)
1
0
5
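Not the released code, just a toy sketch of the two-stage idea: the 16KHz STFT-VQGAN does the compression, and a separate waveform network cleans artifacts after naive resampling to 44.1KHz. The stand-in below is a flat conv stack rather than a real Wave U-Net, and the codec itself is stubbed out with an identity.

```python
import torch
import torch.nn as nn
import torchaudio

class PostNet(nn.Module):
    """Toy stand-in for the artifact-correction / upsampling stage (not a real Wave U-Net)."""
    def __init__(self, channels=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, channels, kernel_size=15, padding=7),
            nn.GELU(),
            nn.Conv1d(channels, channels, kernel_size=15, padding=7),
            nn.GELU(),
            nn.Conv1d(channels, 1, kernel_size=15, padding=7),
        )

    def forward(self, wav_16k: torch.Tensor) -> torch.Tensor:
        # Naive resample to 44.1KHz, then predict a residual correction on top.
        wav_44k = torchaudio.functional.resample(wav_16k, 16_000, 44_100)
        return wav_44k + self.net(wav_44k)

codec = nn.Identity()                  # placeholder for the STFT-VQGAN encode/decode round trip
postnet = PostNet()
wav_16k = torch.randn(1, 1, 16_000)    # 1 s of fake 16KHz audio, (batch, channel, time)
recon_44k = postnet(codec(wav_16k))    # -> (1, 1, 44100)
```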
It's funny how the paradigm in seq2seq went from encoder <- cross-attention -> decoder to just tokenizing both the input and output sequence, concatenating them, and training a pure self-attention decoder, and it works. The decoder-only transformer is truly something.
0
0
0
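A minimal sketch of that recipe with stock PyTorch modules (toy vocab and shapes, not from any real model): source and target tokens become one sequence, a causal self-attention stack predicts next tokens, and the loss is masked so only the target half is scored.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab_size, d_model = 1000, 128
embed = nn.Embedding(vocab_size, d_model)
layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True,
                                   norm_first=True)   # self-attention only, pre-norm
decoder = nn.TransformerEncoder(layer, num_layers=2)
head = nn.Linear(d_model, vocab_size)

src = torch.randint(0, vocab_size, (2, 16))   # "input" tokens
tgt = torch.randint(0, vocab_size, (2, 8))    # "output" tokens
seq = torch.cat([src, tgt], dim=1)            # one concatenated sequence, no cross-attention

causal = nn.Transformer.generate_square_subsequent_mask(seq.size(1))
h = decoder(embed(seq), mask=causal)          # the causal mask is what makes it a decoder
logits = head(h[:, :-1])                      # predict token t+1 from the prefix up to t
labels = seq[:, 1:].clone()
labels[:, : src.size(1) - 1] = -100           # ignore positions that predict input tokens
loss = F.cross_entropy(logits.reshape(-1, vocab_size), labels.reshape(-1),
                       ignore_index=-100)
```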
AI bros be like "(fire emoji) (fire emoji) Hollywood is FINISHED! AI X is the future!" and it's the sloppiest slop in the history of slop. Like, come on.
0
0
2
Seems to be hipBLASLt shitting the bed on a BF16 matmul. Turning mixed precision off removes the crash. Maybe it's the Tensile backend? Will have to retry with ROCBLAS_USE_HIPBLASLT=1. Thank God for TensorFloat32 tho.
0
0
1
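For anyone hitting the same thing, a hedged sketch of the two workarounds: keep the offending matmul out of autocast, and set ROCBLAS_USE_HIPBLASLT before the libraries load. No promises that either avoids the crash on other ROCm/hipBLASLt versions.

```python
import os
os.environ.setdefault("ROCBLAS_USE_HIPBLASLT", "1")  # set before torch initializes the BLAS libs

import torch

a = torch.randn(4096, 4096, device="cuda")  # ROCm GPUs still show up as "cuda" in PyTorch
b = torch.randn(4096, 4096, device="cuda")

with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
    # ... rest of the mixed-precision forward pass ...
    with torch.autocast(device_type="cuda", enabled=False):
        c = a @ b  # keep this matmul in full precision instead of BF16
```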
@AnushElangovan TF32 seems stable enough on MI300X, why is it not on by default?
1
0
1
This has been preventing me from achieving anything in the last 2 days. It doesn't go away no matter what I try, and is completely random.
0
0
0
I was wondering why my decoder was miserably failing to reduce loss. I just realized I forgot to tell Qwen Code to make my transformer pre-norm.
0
0
0
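With stock PyTorch layers the fix is a single flag, nn.TransformerEncoderLayer(..., norm_first=True). A hand-rolled version of the pre-norm block looks roughly like this (illustrative names, not the actual decoder):

```python
import torch.nn as nn

class PreNormBlock(nn.Module):
    def __init__(self, d_model=512, nhead=8):
        super().__init__()
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, nhead, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                 nn.Linear(4 * d_model, d_model))

    def forward(self, x, attn_mask=None):
        # Pre-norm: normalize *before* each sublayer so the residual path stays clean,
        # which is what makes deep stacks trainable without a careful warmup schedule.
        h = self.norm1(x)
        x = x + self.attn(h, h, h, attn_mask=attn_mask, need_weights=False)[0]
        x = x + self.mlp(self.norm2(x))
        return x
```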
Lazy way to make a dataloader efficient: just load the entire dataset into CPU RAM.
0
0
1
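A sketch of what that looks like (path and tensor layout are made up): one torch.load up front, then __getitem__ is pure indexing, so num_workers can stay at 0 because there's no per-item I/O left to hide.

```python
import torch
from torch.utils.data import Dataset, DataLoader

class InMemoryTokens(Dataset):
    def __init__(self, path="tokens.pt"):       # hypothetical pre-tokenized dataset file
        self.tokens = torch.load(path)           # e.g. (num_clips, seq_len) int64, all in RAM

    def __len__(self):
        return self.tokens.size(0)

    def __getitem__(self, idx):
        return self.tokens[idx]                  # no disk access, just slicing

# loader = DataLoader(InMemoryTokens(), batch_size=64, shuffle=True,
#                     pin_memory=True, num_workers=0)
```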
This October 19 we commemorate the 111th anniversary of the passing into immortality of Julio Argentino Roca, national hero, twice President of the Nation, and a key figure in the consolidation of the Argentine State. Under his leadership the Campaña del Desierto was carried out, a decisive milestone
423
2K
11K
I can't ever look at graphs like this the same again
3
1
54
Pretraining both encoder and decoder to build a rich prior over text and audio before fine-tuning. It also lets me take advantage of fixed-length training. Container is rocm/pytorch-training:v25.8; everything works out of the box.
0
0
1
Pretraining the decoder on unconditional AR modeling of 4B audio tokens and the encoder on char-level masked language modeling. 28% MFU on 355M params after fused AdamW, torch.compile, and Flash Attention 2 on 1x @HotAisle MI300X. Later I'll connect the two on a small amount of paired data for TTS.
1
0
1
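Roughly what those three knobs look like in plain PyTorch; the model and shapes below are placeholders, not the actual training script, and the flash backend will complain if it can't handle the given dtypes/shapes.

```python
import torch
import torch.nn as nn
from torch.nn.attention import SDPBackend, sdpa_kernel

model = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=1024, nhead=16, batch_first=True, norm_first=True),
    num_layers=4,
).cuda()

# Fused AdamW: the parameter update runs as fused kernels instead of many small ops.
opt = torch.optim.AdamW(model.parameters(), lr=3e-4, fused=True)

# torch.compile: graph capture and kernel fusion around forward/backward.
model = torch.compile(model)

x = torch.randn(4, 1024, 1024, device="cuda")

# Force the Flash Attention backend of scaled_dot_product_attention;
# autocast to BF16 since the flash kernels want half-precision inputs.
with sdpa_kernel(SDPBackend.FLASH_ATTENTION), torch.autocast("cuda", dtype=torch.bfloat16):
    out = model(x)
loss = out.float().pow(2).mean()   # dummy loss just to exercise the backward pass
loss.backward()
opt.step()
```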
The official PyTorch documentation says TensorFloat32 is not available on ROCm, but this is a lie: it's disabled unless HIPBLASLT_ALLOW_TF32=1, hiding a 2.8x speedup in FP32 matmuls. HIPBLASLT_ALLOW_TF32 should be on by default.
1
0
1
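In one place for anyone who wants to try it: the env var has to be in the environment before the BLAS libraries initialize (so set it before importing torch, or in the shell), and the usual TF32 flags still need flipping on the PyTorch side.

```python
import os
os.environ.setdefault("HIPBLASLT_ALLOW_TF32", "1")   # the ROCm-side switch from the post

import torch

torch.backends.cuda.matmul.allow_tf32 = True   # let FP32 matmuls run as TF32
torch.backends.cudnn.allow_tf32 = True         # same for convolutions (MIOpen on ROCm)
# Equivalent newer knob: torch.set_float32_matmul_precision("high")

a = torch.randn(8192, 8192, device="cuda")
b = torch.randn(8192, 8192, device="cuda")
c = a @ b   # runs as TF32 if the backend honors the flags above
```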