
Mathurin Videau
@mathuvu_
Followers: 113
Following: 24
Media: 12
Statuses: 30
Joined October 2024
RT @ni_jovanovic: There's a lot of work now on LLM watermarking. But can we extend this to transformers trained for autoregressive image ge…
RT @iScienceLuvr: From Bytes to Ideas: Language Modeling with Autoregressive U-Nets. "Byte Pair Encoding (BPE) and similar schemes split te…
RT @arankomatsuzaki: From Bytes to Ideas: Language Modeling with Autoregressive U-Nets. Presents an autoregressive U-Net that processes raw…
We present an Autoregressive U-Net that incorporates tokenization inside the model, pooling raw bytes into words, then word-groups. AU-Net focuses most of its compute on building latent vectors that correspond to larger units of meaning. Joint work with @byoubii. 1/8
RT @KrunoLehman: 1/ Happy to share my first accepted paper as a PhD student at @Meta and @ENS_ULM which I will present at @iclr_conf: 📚 O…
RT @TimDarcet: Want strong SSL, but not the complexity of DINOv2? CAPI: Cluster and Predict Latents Patches for Improved Masked Image Mode…
RT @cloneofsimo: Goddamn, this repo is true beauty. simple (not bloated). effective, scalable. elegant, just the right amount of abstraction.…
RT @andrew_n_carr: A great example of FlexAttention used in a reasonably modern code base is Lingua. Which is designed to reproduce Llama 2…