Mark

@_M_Weber

Followers
362
Following
900
Media
27
Statuses
291

PhD student at the Dynamic Vision and Learning Group, and the Computer Vision Group, TU Munich; Computer Vision research

Joined November 2020
@_M_Weber
Mark
28 days
➡️ For workshop specifics, including the evaluation package, baseline code, and more, visit: ➡️ Eval server (sign up now!): #ICCV2025
0
1
0
@_M_Weber
Mark
28 days
Exciting news! We're happy to announce our challenge / workshop at this year's @ICCVConference, focusing on Spatiotemporal Action Grounding in Videos. Here are the details: 🔷 Watch the video below for a demo. 🔷 The eval server is open until 09/19! 🔷 Links incl. code below. #ICCV
1
2
12
@_M_Weber
Mark
1 month
RT @hannan_tanveer: 🎯 Challenge Launch Announcement. We are pleased to announce the launch of the MOT25 Challenge, to be held in conjunction…
0
1
0
@_M_Weber
Mark
4 months
RT @NandoDF: A beautiful article by D. Graham Burnett. 'The A.I. is huge. A tsunami. But it's not me. It can't touc…
0
4
0
@_M_Weber
Mark
4 months
Heading off to ICLR to present our work on image generation and latent spaces! If you're interested in tokenization or generation, drop by our poster. Also, if you'd like to chat about any of these topics, feel free to ping me! #ICLR2025
1
0
14
@_M_Weber
Mark
6 months
RT @bopanc: Our exclusive with @JDVance ahead of @MunSecConf: -On Ukraine, he says there will be a good peace deal that will guarantee the…
0
164
0
@_M_Weber
Mark
8 months
RT @zhou_xian_: Everything you love about generative models — now powered by real physics! Announcing the Genesis project — after a 24-mon…
0
3K
0
@_M_Weber
Mark
8 months
9/9 If you like this research and are hiring, I will be on the job market next summer! Big thanks to my collaborators: @yucornetto1 @xueqingdeng77 @lcchen_jay @tumcvg @LijunYu0.
0
0
2
@_M_Weber
Mark
8 months
7/9 MaskBit achieves state-of-the-art performance with an FID as low as 1.52 on ImageNet 256×256, using just 305M parameters. That's better than prior diffusion and autoregressive models! 🔥
1
0
0
@_M_Weber
Mark
8 months
6/9 For our generation model MaskBit, the key innovation is that we are the first to utilise the SAME (bit) token representation for both tokenizer and generator, unlike prior methods that require separate embedding tables! No need to (re-)learn codebooks anymore!
1
0
0
@_M_Weber
Mark
8 months
5/9 Our analysis reveals fascinating properties of bit tokens: Most channels appear to capture different visual concepts, making the representation more interpretable! Flipping individual bits leads to systematic changes in attributes like texture, color, and style. 🎨
1
0
0
@_M_Weber
Mark
8 months
4/9 Our tokenizer learns a semantically structured latent space! Check out what happens when we flip bits in each channel. Having a consistent, visually interpretable latent space could be a game changer for control!
1
0
1
@_M_Weber
Mark
8 months
3/9 We carefully revisit the VQGAN design and provide a complete, reproducible recipe for building a modern tokenizer. Our VQGAN+ improves reconstruction FID from 7.94 to 1.66 in the low-vocabulary, low-resolution setting, a huge improvement of 6.28 points! 📈 See chapter 2 for details.
1
0
0
@_M_Weber
Mark
8 months
2/9 We introduce three main contributions: 1️⃣ A systematic study modernizing VQGAN (VQGAN+). 2️⃣ A latent space based on bit tokens, shared between generator and tokenizer, with disentangled visual concepts. 3️⃣ A new embedding-free image generation approach based solely on bit tokens.
1
0
0
@_M_Weber
Mark
8 months
🧵1/9 Happy to share our paper "MaskBit: Embedding-free Image Generation via Bit Tokens" got published in TMLR with featured (aka spotlight) & reproducibility certifications! I'm especially excited about the disentangled visual concepts in our shared latent space. Details below!
1
4
61
@_M_Weber
Mark
9 months
Project page:
0
0
0
@_M_Weber
Mark
9 months
I'm happy to present our work "An Image is Worth 32 Tokens for Reconstruction and Generation" at #NeurIPS2024. Would you like to learn more about efficient image tokenization and generation? Then join us on Friday 11am-2pm and add poster #1604 to your schedule!
5
9
71
@_M_Weber
Mark
10 months
Tweet media one
0
0
2
@_M_Weber
Mark
11 months
If you want to learn even more about efficient image tokenization, also check out our work "An Image is Worth 32 Tokens for Reconstruction and Generation"! ➡️ Project page:
0
0
2