ZD1908 Profile
ZD1908

@ZDi____

Followers
231
Following
34K
Media
370
Statuses
3K

(mostly) Audio/TTS ML research & LSTM enjoyer; by myself | 🇦🇷 25M | DMs open

Latent space
Joined June 2024
@ZDi____
ZD1908
15 days
Well, I can't improve my thing further, so I'm releasing it just to document the process. I tried making an efficient neural audio codec by combining a 16 kHz STFT-VQGAN with a Wave U-Net that corrects artifacts and upsamples to 44.1 kHz. (Substack link in replies)
1
0
5
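A minimal sketch of the two-stage design described above, under my own assumptions (the STFT settings, the VQ bottleneck shape, and the single-level Wave U-Net are all placeholders, and the adversarial loss of a real VQGAN is omitted):

```python
import torch
import torch.nn as nn

class VQ(nn.Module):
    """Nearest-neighbour vector quantizer with straight-through gradients."""
    def __init__(self, codes=1024, dim=128):
        super().__init__()
        self.emb = nn.Embedding(codes, dim)

    def forward(self, z):                        # z: (B, T, dim)
        idx = torch.cdist(z, self.emb.weight).argmin(-1)
        q = self.emb(idx)
        return z + (q - z).detach(), idx         # straight-through estimator

class Stage1STFTCodec(nn.Module):
    """Autoencode 16 kHz audio in the STFT-magnitude domain through a VQ bottleneck."""
    def __init__(self, n_fft=1024, dim=128):
        super().__init__()
        freq = n_fft // 2 + 1
        self.n_fft = n_fft
        self.enc = nn.Sequential(nn.Linear(freq, dim), nn.GELU(), nn.Linear(dim, dim))
        self.vq = VQ(dim=dim)
        self.dec = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, freq))

    def forward(self, wav16k):                   # (B, samples) at 16 kHz
        win = torch.hann_window(self.n_fft, device=wav16k.device)
        spec = torch.stft(wav16k, self.n_fft, window=win, return_complex=True)
        mag = spec.abs().transpose(1, 2)         # (B, frames, freq)
        q, _ = self.vq(self.enc(mag))
        return self.dec(q)                       # reconstructed magnitudes

class Stage2WaveUNet(nn.Module):
    """Correct codec artifacts in the waveform domain. One down/up level here;
    a real Wave U-Net stacks several. The 16 -> 44.1 kHz resampling itself
    would happen separately (e.g. torchaudio) before this cleanup pass."""
    def __init__(self, ch=32):
        super().__init__()
        self.down = nn.Conv1d(1, ch, 15, stride=4, padding=7)
        self.up = nn.ConvTranspose1d(ch, 1, 15, stride=4, padding=7, output_padding=3)

    def forward(self, x):                        # (B, 1, samples), length % 4 == 0
        return x + self.up(torch.relu(self.down(x)))   # residual correction
```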
@ZDi____
ZD1908
18 minutes
I downloaded Soulseek and it's music communism.
0
0
0
@ZDi____
ZD1908
6 hours
It's funny how the paradigm in seq2seq went from encoder <- cross-attention -> decoder to just tokenizing both input and output sequences together, concatenating them, and training a pure self-attention decoder, and it works. The decoder-only transformer is truly something.
0
0
0
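A sketch of that recipe, with made-up sizes and a made-up separator token; the loss is masked to the target half, so the source tokens act purely as context:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

V, SEP = 32000, 1                                # vocab size and separator id (assumed)

class DecoderOnly(nn.Module):
    def __init__(self, d=256, layers=4, heads=4):
        super().__init__()
        self.emb = nn.Embedding(V, d)
        layer = nn.TransformerEncoderLayer(d, heads, 4 * d, batch_first=True,
                                           norm_first=True)   # pre-norm
        self.blocks = nn.TransformerEncoder(layer, layers)
        self.head = nn.Linear(d, V)

    def forward(self, ids):                      # ids: (B, T)
        T = ids.size(1)
        causal = torch.triu(torch.full((T, T), float("-inf"), device=ids.device), 1)
        return self.head(self.blocks(self.emb(ids), mask=causal))

def seq2seq_step(model, src, tgt):
    """Concatenate [src, SEP, tgt]; train next-token prediction on tgt only."""
    ids = torch.cat([src, torch.full_like(src[:, :1], SEP), tgt], dim=1)
    logits = model(ids[:, :-1])
    labels = ids[:, 1:].clone()
    labels[:, :src.size(1)] = -100               # no loss on predicting the source
    return F.cross_entropy(logits.reshape(-1, V), labels.reshape(-1),
                           ignore_index=-100)
```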
@ZDi____
ZD1908
15 hours
Paired training. Forward sum loss my beloved.
0
0
0
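For context: the forward-sum loss marginalizes over every monotonic text-to-mel alignment via the forward algorithm. A common implementation trick (assumed here, not confirmed as the author's) is to reuse F.ctc_loss with the text positions 1..N as the target sequence:

```python
import torch
import torch.nn.functional as F

def forward_sum_loss(log_attn, text_lens, mel_lens):
    """log_attn: (B, T_mel, N_text) unnormalized log attention scores.
    Sums P(alignment) over all monotonic paths via the CTC forward pass."""
    losses = []
    for b in range(len(log_attn)):
        T, N = int(mel_lens[b]), int(text_lens[b])
        lp = log_attn[b, :T, :N]
        lp = F.pad(lp, (1, 0), value=-1e4)            # column 0 becomes the CTC "blank"
        lp = F.log_softmax(lp, dim=-1).unsqueeze(1)   # (T, 1, N+1)
        target = torch.arange(1, N + 1, device=lp.device).unsqueeze(0)  # 1..N in order
        losses.append(F.ctc_loss(lp, target,
                                 input_lengths=torch.tensor([T]),
                                 target_lengths=torch.tensor([N]),
                                 blank=0, zero_infinity=True))
    return torch.stack(losses).mean()
```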
@ZDi____
ZD1908
2 days
AI bros be like "(fire emoji) (fire emoji) Hollywood is FINISHED! AI X is the future!" and it's the sloppiest slop in the history of slop. Like, come on.
@mattshumer_
Matt Shumer
2 days
AI games are going to be amazing (sound on)
0
0
2
@ZDi____
ZD1908
2 days
Seems to be hipBLASLt shitting the bed on a BF16 matmul. Turning mixed precision off removes the crash. Maybe it's the Tensile backend? Will have to retry with ROCBLAS_USE_HIPBLASLT=1. Thank God for TensorFloat32 tho.
@ZDi____
ZD1908
3 days
This has been preventing me from achieving anything for the last 2 days. It doesn't go away no matter what I try, and it's completely random.
0
0
1
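The knobs involved, sketched (the env var is read at library init, so it must be set before torch initializes HIP; the matmul is a stand-in for the real training step):

```python
import os
os.environ["ROCBLAS_USE_HIPBLASLT"] = "1"    # route rocBLAS through hipBLASLt; set before torch init

import torch

use_amp = False                              # mixed precision off to dodge the BF16 crash
a = torch.randn(2048, 2048, device="cuda")   # ROCm GPUs appear under the "cuda" device
b = torch.randn(2048, 2048, device="cuda")
with torch.autocast("cuda", dtype=torch.bfloat16, enabled=use_amp):
    c = a @ b                                # plain FP32 matmul while autocast is off
# (TF32 keeps that FP32 fallback fast -- see the HIPBLASLT_ALLOW_TF32 tweet further down)
```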
@ZDi____
ZD1908
3 days
HEVC moment for ViTs.
@kwangmoo_yi
Kwang Moo Yi
3 days
Choudhury and Kim et al., "Accelerating Vision Transformers With Adaptive Patch Sizes" Transformer patches don't need to be of uniform size -- choose sizes based on entropy --> faster training/inference. Are scale-spaces gonna make a comeback?
0
0
0
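A toy rendition of the quoted idea, with made-up cell sizes and threshold: estimate per-region intensity entropy and keep big patches where the image is flat, subdividing only the busy cells:

```python
import torch

def cell_entropy(img, cell=32, bins=16):
    """img: (H, W) grayscale in [0, 1]; returns Shannon entropy per cell."""
    H, W = img.shape
    cells = img.unfold(0, cell, cell).unfold(1, cell, cell).reshape(-1, cell * cell)
    hist = torch.stack([torch.histc(c, bins=bins, min=0.0, max=1.0) for c in cells])
    p = hist / hist.sum(-1, keepdim=True)
    ent = -(p * (p + 1e-12).log()).sum(-1)
    return ent.reshape(H // cell, W // cell)

def patch_sizes(img, thresh=1.5):
    """High-entropy cells get split into 16x16 patches; flat ones stay 32x32."""
    ent = cell_entropy(img)
    return torch.where(ent > thresh, torch.full_like(ent, 16), torch.full_like(ent, 32))

# sizes = patch_sizes(torch.rand(224, 224))  # a (7, 7) grid of 16s and 32s
```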
@ZDi____
ZD1908
3 days
@AnushElangovan TF32 seems stable enough on MI300X, why is it not on by default?
1
0
1
@ZDi____
ZD1908
4 days
how a diffusion researcher sees the world
@oprydai
Mustafa
7 days
how a mathematician sees the world
0
0
5
@ZDi____
ZD1908
4 days
Let's see the new tweet from Cristina Kirchner
@zxroprograma
zero
6 days
A year ago my ex left me; this was the last message I sent her. She left me on read and blocked me 😓
0
0
0
@ZDi____
ZD1908
5 days
Yeah, now it's fixed.
@ZDi____
ZD1908
5 days
I was wondering why my decoder was miserably failing to reduce loss. I just realized I forgot to tell Qwen Code to make my transformer pre-norm.
0
0
1
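The distinction that bit here, sketched: pre-norm applies LayerNorm before each sublayer (and trains stably without warmup fiddling), post-norm after the residual add. In stock PyTorch it's just norm_first=True on nn.TransformerEncoderLayer; spelled out:

```python
import torch.nn as nn

class PreNormBlock(nn.Module):
    def __init__(self, d=512, heads=8):
        super().__init__()
        self.ln1, self.ln2 = nn.LayerNorm(d), nn.LayerNorm(d)
        self.attn = nn.MultiheadAttention(d, heads, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(d, 4 * d), nn.GELU(), nn.Linear(4 * d, d))

    def forward(self, x):
        h = self.ln1(x)                                    # norm BEFORE attention
        x = x + self.attn(h, h, h, need_weights=False)[0]
        x = x + self.mlp(self.ln2(x))                      # norm BEFORE the MLP
        return x
        # post-norm would instead be: x = self.ln1(x + attn(x, x, x)[0]), etc.
```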
@ZDi____
ZD1908
6 days
Lazy way to make a dataloader efficient: just load the entire dataset into CPU RAM.
0
0
1
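The pattern, sketched with an assumed tensor-per-file layout: pay the IO/decode cost once at construction, after which __getitem__ is a plain list lookup:

```python
import torch
from torch.utils.data import Dataset, DataLoader

class InMemoryDataset(Dataset):
    def __init__(self, paths):
        self.items = [torch.load(p) for p in paths]   # everything into CPU RAM, once

    def __len__(self):
        return len(self.items)

    def __getitem__(self, i):
        return self.items[i]                          # no disk or decode in the hot loop

# num_workers=0 is enough since indexing is free; pin_memory speeds the H2D copy:
# loader = DataLoader(InMemoryDataset(paths), batch_size=32, shuffle=True,
#                     pin_memory=True)
```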
@CasaRosada
Casa Rosada
6 days
This October 19 we commemorate 111 years since the Paso a la Inmortalidad (passing into immortality) of Julio Argentino Roca, national hero, twice President of the Nation, and a key figure in the consolidation of the Argentine State. Under his leadership the Campaña del Desierto was carried out, a decisive milestone
423
2K
11K
@ZDi____
ZD1908
7 days
SUCCESSFULLY WASTED 25 YEARS OF MY LIFE AWARD
1
0
1
@vin_acct
vin
7 days
i can't ever look at graphs like this the same again
@_martinsit
martin
17 days
4 months is all you need decided to nyc/sf or bust for my 2nd coop
3
1
54
@ylecun
Yann LeCun
7 days
@SebastienBubeck Hoisted by their own GPTards
69
78
2K
@ZDi____
ZD1908
7 days
Pretraining both encoder and decoder to build a rich prior for text and audio for later finetuning. It also allows me to take advantage of fixed-length training. Container is rocm/pytorch-training:v25.8, everything works out of the box.
0
0
1
@ZDi____
ZD1908
7 days
Pretraining decoder on unconditional AR modeling of 4B audio tokens and encoder on char-level masked language modeling. 28% MFU at 355M params after fused AdamW, torch.compile, and FlashAttention 2 on 1x @HotAisle MI300X. Later I'll connect the two on a small amount of paired data for TTS.
1
0
1
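The three speedups named above, as they look in PyTorch (the model is a placeholder, and pinning the SDPA backend is one way to get FlashAttention 2, not necessarily the author's; MFU here means achieved FLOP/s over the GPU's peak):

```python
import torch
import torch.nn.functional as F
from torch.nn.attention import sdpa_kernel, SDPBackend

model = torch.nn.Linear(1024, 1024, device="cuda")      # stand-in for the 355M model
opt = torch.optim.AdamW(model.parameters(), lr=3e-4,
                        fused=True)                     # 1) fused AdamW step
model = torch.compile(model)                            # 2) kernel fusion / graph capture

q = k = v = torch.randn(1, 8, 2048, 64, device="cuda", dtype=torch.bfloat16)
with sdpa_kernel(SDPBackend.FLASH_ATTENTION):           # 3) force the FlashAttention backend
    out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
```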
@ZDi____
ZD1908
8 days
The official PyTorch documentation says TensorFloat32 is not available on ROCm, but this is a lie: it's disabled unless HIPBLASLT_ALLOW_TF32=1, hiding a 2.8x speedup in FP32 matmuls. HIPBLASLT_ALLOW_TF32 should be on by default.
1
0
1
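What enabling it looks like, as a sketch; hipBLASLt checks the env var at init, so export it before the process starts (or set it before importing torch):

```python
import os
os.environ["HIPBLASLT_ALLOW_TF32"] = "1"       # the gate from the tweet; set pre-import

import torch
torch.backends.cuda.matmul.allow_tf32 = True   # PyTorch's own TF32 opt-in (covers ROCm)

a = torch.randn(8192, 8192, device="cuda")     # HIP devices are exposed as "cuda"
b = torch.randn(8192, 8192, device="cuda")
c = a @ b                                      # FP32 matmul, now eligible for TF32 kernels
```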