Bingchen Zhao

@BingchenZhao

Followers
354
Following
945
Media
16
Statuses
95

PhD student at the University of Edinburgh @ancAtEd @EdinburghVision. https://t.co/WDUG64sGBu

Joined April 2022
@BingchenZhao
Bingchen Zhao
6 days
Great work! Our concurrent work on Semanticist ( gives a proof of the PCA structure and also some insight into why semantically meaningful structure seems to emerge (it's all about the diffusion-based decoder!).
@zamir_ar
Amir Zamir
15 days
We open-sourced the codebase of Flextok. Flextok is an image tokenizer that produces flexible-length token sequences and represents image content in a compressed coarse-to-fine way. Like in PCA: the 1st token captures the most compressed representation of the image, the 2nd
Tweet media one
0
0
8
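The PCA analogy drawn above (the first token captures the most, later ones refine) can be seen in a few lines of NumPy. The sketch below is illustrative only and is unrelated to FlexTok's actual API: it builds a synthetic low-rank "image", then reconstructs it from an increasing number of leading principal components and prints the shrinking error.

```python
# Coarse-to-fine reconstruction from leading principal components.
# Synthetic data only; this is NOT FlexTok's API, just the PCA idea.
import numpy as np

rng = np.random.default_rng(0)

# Fake "image": 64 rows x 128 cols with low-rank structure plus noise.
basis = rng.normal(size=(64, 4)) @ rng.normal(size=(4, 128))
img = basis + 0.1 * rng.normal(size=(64, 128))

# PCA via SVD on mean-centered rows.
mean = img.mean(axis=0, keepdims=True)
U, S, Vt = np.linalg.svd(img - mean, full_matrices=False)

for k in (1, 2, 4, 8, 16):
    # Keep only the first k components: the "coarse" part of the representation.
    recon = mean + (U[:, :k] * S[:k]) @ Vt[:k]
    err = np.linalg.norm(img - recon) / np.linalg.norm(img)
    print(f"components={k:2d}  relative reconstruction error={err:.3f}")
```

The first few components already remove most of the error, which is the ordering property the tweet compares flexible-length tokens to.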
@BingchenZhao
Bingchen Zhao
8 days
RT @yorambac: AI Research Agents are becoming proficient at machine learning tasks, but how can we help them search the space of candidate….
0
62
0
@BingchenZhao
Bingchen Zhao
10 days
RT @SamuelAlbanie: TLDR: LLM agents don't yet crush the NanoGPT speedrun. interesting benchmarking study from @MinqiJiang and others https….
0
20
0
@BingchenZhao
Bingchen Zhao
15 days
RT @karpathy: Love this project: nanoGPT -> recursive self-improvement benchmark. Good old nanoGPT keeps on giving and surprising :). - Fi….
0
689
0
@BingchenZhao
Bingchen Zhao
15 days
RT @iScienceLuvr: The Automated LLM Speedrunning Benchmark: Reproducing NanoGPT Improvements. "To evaluate the ability of AI agents to repr….
0
34
0
@BingchenZhao
Bingchen Zhao
15 days
RT @j_foerst: The AIRA team @metaai has the ambitious goal of building/training an agent that can do frontier AI research to help the open-….
0
10
0
@BingchenZhao
Bingchen Zhao
15 days
RT @MinqiJiang: Recently, there has been a lot of talk of LLM agents automating ML research itself. If Llama 5 can create Llama 6, then sur….
0
190
0
@BingchenZhao
Bingchen Zhao
2 months
RT @HaoqiFan: 🚀 BAGEL — the Unified Multimodal Model with emergent capabilities and production-ready performance — is finally live!. Dive i….
0
25
0
@BingchenZhao
Bingchen Zhao
2 months
RT @j_foerst: Hello World: My team at FAIR / @metaai (AI Research Agent) is looking to hire contractors across software engineering and ML.….
0
23
0
@BingchenZhao
Bingchen Zhao
2 months
RT @xuandongzhao: 🚀 Excited to share the most inspiring work I’ve been part of this year:. "Learning to Reason without External Rewards"….
0
511
0
@BingchenZhao
Bingchen Zhao
2 months
RT @AI4VAWorkshop: 🎨 The AI for Visual Arts (AI4VA) Workshop is back for its 2nd edition at #ICCV2025 in Honolulu, HI, USA! 📢 Now accepting….
0
5
0
@BingchenZhao
Bingchen Zhao
3 months
RT @natanielruizg: I'm sharing the product of an exciting collaboration between UC Santa Cruz, Google and others. The first of its kind: a….
0
20
0
@BingchenZhao
Bingchen Zhao
4 months
7️⃣ Try It Yourself! 📄 Paper: 💻 Code: ⚡ Demos: 🔹 Tokenizer: 🔹 Autoregressive Gen: 💻 Webpage: If you're interested in efficient & interpretable image.
0
0
2
@BingchenZhao
Bingchen Zhao
4 months
5️⃣ Results: State-of-the-Art Performance. 🏆 Lowest reconstruction FID among visual tokenizers. 🎨 Better generation quality with far fewer tokens, thanks to our PCA-like hierarchical structure. 📈 Higher linear probing accuracy due to structured, interpretable tokenization. 💡 Why
Tweet media one
Tweet media two
1
0
2
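For context on the linear-probing metric cited above: a linear probe freezes the representation and fits only a linear classifier on top of it. The sketch below uses random stand-in features (not the actual Semanticist tokens) purely to show the evaluation recipe.

```python
# Generic linear-probe recipe: freeze some representation, fit a linear
# classifier on top, report held-out accuracy. Random features stand in
# for the (hypothetical) tokenizer embeddings here.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n, d, n_classes = 2000, 256, 10

labels = rng.integers(0, n_classes, size=n)
# Toy "frozen features": class-dependent means plus noise.
class_means = rng.normal(size=(n_classes, d))
features = class_means[labels] + rng.normal(size=(n, d))

X_tr, X_te, y_tr, y_te = train_test_split(
    features, labels, test_size=0.2, random_state=0
)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print(f"linear-probe accuracy: {probe.score(X_te, y_te):.3f}")
```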
@BingchenZhao
Bingchen Zhao
4 months
4️⃣ Why This Matters. 🎯 Semantic-Spectrum Coupling Limits Existing Tokenizers. In VQ-VAE and TiTok, increasing the number of tokens simultaneously affects both: 🔹 Power spectrum (low-level intensity details) 🔹 Semantic content (high-level image meaning). This means early tokens
Tweet media one
Tweet media two
1
0
1
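The "power spectrum" side of the coupling described above can be measured with a radially averaged Fourier power spectrum. The sketch below is a generic diagnostic, not code from the paper: comparing this curve for reconstructions at different token counts shows how much of the added capacity goes into low-level spectral detail rather than semantics.

```python
# Radially averaged power spectrum of an image: a way to quantify the
# low-level "spectral" content of a reconstruction, separate from semantics.
import numpy as np

def radial_power_spectrum(img: np.ndarray) -> np.ndarray:
    """Return Fourier power averaged within integer radial-frequency bins."""
    f = np.fft.fftshift(np.fft.fft2(img))
    power = np.abs(f) ** 2
    h, w = img.shape
    yy, xx = np.indices((h, w))
    r = np.hypot(yy - h // 2, xx - w // 2).astype(int)
    sums = np.bincount(r.ravel(), weights=power.ravel())
    counts = np.bincount(r.ravel())
    return sums / np.maximum(counts, 1)

# Smooth synthetic test image; power concentrates in the low-frequency bins.
x = np.linspace(0, 1, 64)
img = np.outer(np.sin(2 * np.pi * x), np.cos(2 * np.pi * x))
spectrum = radial_power_spectrum(img)
print(spectrum[:8])
```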
@BingchenZhao
Bingchen Zhao
4 months
3️⃣ How It Works. 💡 Causal Tokenization with ViT → 1D sequence where early tokens capture key semantics, later ones refine. 💡 Nested Classifier-Free Guidance → Enforces a PCA-like hierarchy, prioritizing important details first. 💡 Diffusion-Based Reconstruction → DiT decoder.
1
0
1
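One plausible reading of the nested guidance mentioned above (not necessarily the paper's exact formulation) is that the decoder is trained while conditioning only on a random-length prefix of the token sequence, with the dropped suffix replaced by a null embedding; that pressure is what pushes early tokens to carry the most information. The names and shapes below are hypothetical.

```python
# Hedged sketch of a prefix-keeping conditioning trick: keep a random-length
# prefix of the token sequence and replace the suffix with a null embedding,
# so earlier tokens are pushed to carry the most information.
# Hypothetical shapes/names; see the Semanticist paper for the real mechanism.
import torch

def keep_random_prefix(tokens: torch.Tensor, null_token: torch.Tensor) -> torch.Tensor:
    """tokens: (batch, num_tokens, dim); null_token: (dim,).
    Replace a random-length suffix of each sequence with the null embedding."""
    b, n, d = tokens.shape
    keep = torch.randint(1, n + 1, (b, 1))        # how many leading tokens survive
    positions = torch.arange(n).unsqueeze(0)      # (1, n)
    mask = (positions < keep).unsqueeze(-1)       # (b, n, 1)
    return torch.where(mask, tokens, null_token.view(1, 1, d))

tokens = torch.randn(4, 32, 256)   # toy tokenizer output
null_token = torch.zeros(256)      # learned in practice; zeros here
conditioned = keep_random_prefix(tokens, null_token)
print(conditioned.shape)           # torch.Size([4, 32, 256])
```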
@BingchenZhao
Bingchen Zhao
4 months
2️⃣ Our Solution: Semanticist. 🔹 PCA-Guided Tokenization: Tokens are causally ordered, ensuring earlier tokens capture more salient features. 🔹 Semantic-Spectrum Decoupling: Tokens separate semantics from spectral details, avoiding inefficiencies. 🔹 Diffusion-Based Decoding: A
Tweet media one
Tweet media two
1
0
1
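To make "causally ordered tokens" concrete, here is a generic stand-in (not the actual Semanticist architecture): learned query tokens cross-attend to ViT patch features, while a causal mask over the queries lets token i see only tokens 0..i, so an ordering is baked into the 1D sequence.

```python
# Generic stand-in for a causally ordered 1D tokenizer; hypothetical names,
# not the Semanticist code. Learned queries cross-attend to patch features;
# a causal mask over the queries enforces an ordering among tokens.
import torch
import torch.nn as nn

class CausalQueryTokenizer(nn.Module):
    def __init__(self, dim=256, num_tokens=32, num_heads=8, depth=4):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(num_tokens, dim) * 0.02)
        layer = nn.TransformerDecoderLayer(d_model=dim, nhead=num_heads,
                                           batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=depth)
        self.num_tokens = num_tokens

    def forward(self, patch_features: torch.Tensor) -> torch.Tensor:
        # patch_features: (batch, num_patches, dim), e.g. from a ViT encoder.
        b = patch_features.size(0)
        queries = self.queries.unsqueeze(0).expand(b, -1, -1)
        # Additive causal mask: -inf above the diagonal blocks future tokens.
        causal = torch.triu(
            torch.full((self.num_tokens, self.num_tokens), float("-inf")),
            diagonal=1,
        )
        return self.decoder(tgt=queries, memory=patch_features, tgt_mask=causal)

tokenizer = CausalQueryTokenizer()
patches = torch.randn(2, 196, 256)   # toy ViT patch features (14x14 grid)
tokens = tokenizer(patches)
print(tokens.shape)                  # torch.Size([2, 32, 256])
```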
@BingchenZhao
Bingchen Zhao
4 months
1️⃣ The Problem. Modern visual tokenizers (e.g., VQ-VAE, SD-VAE) optimize for reconstruction fidelity but ignore semantic structure. ❌ Tokens are arbitrarily learned. ❌ High-level semantics & low-level details are entangled. We ask: Can we tokenize images like PCA—compact,.
1
0
1
@BingchenZhao
Bingchen Zhao
4 months
📢 New Paper Alert! 📢 "Principal Components" Enable A New Language of Images ✨ We introduce Semanticist, a PCA-guided tokenizer that revolutionizes visual tokenization for generative models! 🧵 Thread below! 👇
Tweet media one
1
2
4
@BingchenZhao
Bingchen Zhao
7 months
RT @JeffDean: @moderncpp7 @clu_cheng @NeurIPSConf @drfeifei @jhyuxm @edchi I didn't see the talk, but the images I've seen of the slide see….
0
158
0