Bingchen Zhao

@BingchenZhao

Followers
354
Following
945
Media
16
Statuses
95

PhD student at the University of Edinburgh @ancAtEd @EdinburghVision. https://t.co/WDUG64sGBu

Joined April 2022
@BingchenZhao
Bingchen Zhao
6 days
Great work! Our concurrent work on Semanticist ( gives a proof of the PCA structure and also some insight into why semantically meaningful structure seems to emerge (it's all about the diffusion-based decoder!).
@zamir_ar
Amir Zamir
15 days
We open-sourced the codebase of Flextok. Flextok is an image tokenizer that produces flexible-length token sequences and represents image content in a compressed coarse-to-fine way. Like in PCA: the 1st token captures the most compressed representation of the image, the 2nd
Tweet media one
0
0
8
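The PCA analogy drawn above (the first token captures the most, later ones refine) can be seen in a few lines of NumPy. The sketch below is illustrative only and is unrelated to FlexTok's actual API: it builds a synthetic low-rank "image", then reconstructs it from an increasing number of leading principal components and prints the shrinking error.

```python
# Coarse-to-fine reconstruction from leading principal components.
# Synthetic data only; this is NOT FlexTok's API, just the PCA idea.
import numpy as np

rng = np.random.default_rng(0)

# Fake "image": 64 rows x 128 cols with low-rank structure plus noise.
basis = rng.normal(size=(64, 4)) @ rng.normal(size=(4, 128))
img = basis + 0.1 * rng.normal(size=(64, 128))

# PCA via SVD on mean-centered rows.
mean = img.mean(axis=0, keepdims=True)
U, S, Vt = np.linalg.svd(img - mean, full_matrices=False)

for k in (1, 2, 4, 8, 16):
    # Keep only the first k components: the "coarse" part of the representation.
    recon = mean + (U[:, :k] * S[:k]) @ Vt[:k]
    err = np.linalg.norm(img - recon) / np.linalg.norm(img)
    print(f"components={k:2d}  relative reconstruction error={err:.3f}")
```

The first few components already remove most of the error, which is the ordering property the tweet compares flexible-length tokens to.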
@BingchenZhao
Bingchen Zhao
8 days
RT @yorambac: AI Research Agents are becoming proficient at machine learning tasks, but how can we help them search the space of candidate….
0
62
0
@BingchenZhao
Bingchen Zhao
10 days
RT @SamuelAlbanie: TLDR: LLM agents don't yet crush the NanoGPT speedrun. interesting benchmarking study from @MinqiJiang and others https….
0
20
0
@BingchenZhao
Bingchen Zhao
15 days
RT @karpathy: Love this project: nanoGPT -> recursive self-improvement benchmark. Good old nanoGPT keeps on giving and surprising :). - Fi….
0
689
0
@BingchenZhao
Bingchen Zhao
15 days
RT @iScienceLuvr: The Automated LLM Speedrunning Benchmark: Reproducing NanoGPT Improvements. "To evaluate the ability of AI agents to repr….
0
34
0
@BingchenZhao
Bingchen Zhao
15 days
RT @j_foerst: The AIRA team @metaai has the ambitious goal of building/training an agent that can do frontier AI research to help the open-….
0
10
0
@BingchenZhao
Bingchen Zhao
15 days
RT @MinqiJiang: Recently, there has been a lot of talk of LLM agents automating ML research itself. If Llama 5 can create Llama 6, then sur….
0
190
0
@BingchenZhao
Bingchen Zhao
2 months
RT @HaoqiFan: 🚀 BAGEL — the Unified Multimodal Model with emergent capabilities and production-ready performance — is finally live!. Dive i….
0
25
0
@BingchenZhao
Bingchen Zhao
2 months
RT @j_foerst: Hello World: My team at FAIR / @metaai (AI Research Agent) is looking to hire contractors across software engineering and ML.….
0
23
0
@BingchenZhao
Bingchen Zhao
2 months
RT @xuandongzhao: 🚀 Excited to share the most inspiring work I’ve been part of this year:. "Learning to Reason without External Rewards"….
0
511
0
@BingchenZhao
Bingchen Zhao
2 months
RT @AI4VAWorkshop: 🎨 The AI for Visual Arts (AI4VA) Workshop is back for its 2nd edition at #ICCV2025 in Honolulu, HI, USA! 📢 Now accepting….
0
5
0
@BingchenZhao
Bingchen Zhao
3 months
RT @natanielruizg: I'm sharing the product of an exciting collaboration between UC Santa Cruz, Google and others. The first of its kind: a….
0
20
0
@BingchenZhao
Bingchen Zhao
4 months
7️⃣ Try It Yourself! 📄 Paper: 💻 Code: ⚡ Demos: 🔹 Tokenizer: 🔹 Autoregressive Gen: 💻 Webpage: If you're interested in efficient & interpretable image.
0
0
2
@BingchenZhao
Bingchen Zhao
4 months
5️⃣ Results: State-of-the-Art Performance. 🏆 Lowest reconstruction FID among visual tokenizers. 🎨 Better generation quality with far fewer tokens, thanks to our PCA-like hierarchical structure. 📈 Higher linear probing accuracy due to structured, interpretable tokenization. 💡 Why
Tweet media one
Tweet media two
1
0
2
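For context on the linear-probing metric cited above: a linear probe freezes the representation and fits only a linear classifier on top of it. The sketch below uses random stand-in features (not the actual Semanticist tokens) purely to show the evaluation recipe.

```python
# Generic linear-probe recipe: freeze some representation, fit a linear
# classifier on top, report held-out accuracy. Random features stand in
# for the (hypothetical) tokenizer embeddings here.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n, d, n_classes = 2000, 256, 10

labels = rng.integers(0, n_classes, size=n)
# Toy "frozen features": class-dependent means plus noise.
class_means = rng.normal(size=(n_classes, d))
features = class_means[labels] + rng.normal(size=(n, d))

X_tr, X_te, y_tr, y_te = train_test_split(
    features, labels, test_size=0.2, random_state=0
)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print(f"linear-probe accuracy: {probe.score(X_te, y_te):.3f}")
```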
@BingchenZhao
Bingchen Zhao
4 months
4️⃣ Why This Matters. 🎯 Semantic-Spectrum Coupling Limits Existing Tokenizers. In VQ-VAE and TiTok, increasing the number of tokens simultaneously affects both: 🔹 Power spectrum (low-level intensity details) 🔹 Semantic content (high-level image meaning). This means early tokens
Tweet media one
Tweet media two
1
0
1
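The "power spectrum" side of the coupling described above can be measured with a radially averaged Fourier power spectrum. The sketch below is a generic diagnostic, not code from the paper: comparing this curve for reconstructions at different token counts shows how much of the added capacity goes into low-level spectral detail rather than semantics.

```python
# Radially averaged power spectrum of an image: a way to quantify the
# low-level "spectral" content of a reconstruction, separate from semantics.
import numpy as np

def radial_power_spectrum(img: np.ndarray) -> np.ndarray:
    """Return Fourier power averaged within integer radial-frequency bins."""
    f = np.fft.fftshift(np.fft.fft2(img))
    power = np.abs(f) ** 2
    h, w = img.shape
    yy, xx = np.indices((h, w))
    r = np.hypot(yy - h // 2, xx - w // 2).astype(int)
    sums = np.bincount(r.ravel(), weights=power.ravel())
    counts = np.bincount(r.ravel())
    return sums / np.maximum(counts, 1)

# Smooth synthetic test image; power concentrates in the low-frequency bins.
x = np.linspace(0, 1, 64)
img = np.outer(np.sin(2 * np.pi * x), np.cos(2 * np.pi * x))
spectrum = radial_power_spectrum(img)
print(spectrum[:8])
```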
@BingchenZhao
Bingchen Zhao
4 months
3️⃣ How It Works. 💡 Causal Tokenization with ViT → 1D sequence where early tokens capture key semantics, later ones refine. 💡 Nested Classifier-Free Guidance → Enforces a PCA-like hierarchy, prioritizing important details first. 💡 Diffusion-Based Reconstruction → DiT decoder.
1
0
1
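One plausible reading of the nested guidance mentioned above (not necessarily the paper's exact formulation) is that the decoder is trained while conditioning only on a random-length prefix of the token sequence, with the dropped suffix replaced by a null embedding; that pressure is what pushes early tokens to carry the most information. The names and shapes below are hypothetical.

```python
# Hedged sketch of a prefix-keeping conditioning trick: keep a random-length
# prefix of the token sequence and replace the suffix with a null embedding,
# so earlier tokens are pushed to carry the most information.
# Hypothetical shapes/names; see the Semanticist paper for the real mechanism.
import torch

def keep_random_prefix(tokens: torch.Tensor, null_token: torch.Tensor) -> torch.Tensor:
    """tokens: (batch, num_tokens, dim); null_token: (dim,).
    Replace a random-length suffix of each sequence with the null embedding."""
    b, n, d = tokens.shape
    keep = torch.randint(1, n + 1, (b, 1))        # how many leading tokens survive
    positions = torch.arange(n).unsqueeze(0)      # (1, n)
    mask = (positions < keep).unsqueeze(-1)       # (b, n, 1)
    return torch.where(mask, tokens, null_token.view(1, 1, d))

tokens = torch.randn(4, 32, 256)   # toy tokenizer output
null_token = torch.zeros(256)      # learned in practice; zeros here
conditioned = keep_random_prefix(tokens, null_token)
print(conditioned.shape)           # torch.Size([4, 32, 256])
```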
@BingchenZhao
Bingchen Zhao
4 months
2️⃣ Our Solution: Semanticist. 🔹 PCA-Guided Tokenization: Tokens are causally ordered, ensuring earlier tokens capture more salient features. 🔹 Semantic-Spectrum Decoupling: Tokens separate semantics from spectral details, avoiding inefficiencies. 🔹 Diffusion-Based Decoding: A
Tweet media one
Tweet media two
1
0
1
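To make "causally ordered tokens" concrete, here is a generic stand-in (not the actual Semanticist architecture): learned query tokens cross-attend to ViT patch features, while a causal mask over the queries lets token i see only tokens 0..i, so an ordering is baked into the 1D sequence.

```python
# Generic stand-in for a causally ordered 1D tokenizer; hypothetical names,
# not the Semanticist code. Learned queries cross-attend to patch features;
# a causal mask over the queries enforces an ordering among tokens.
import torch
import torch.nn as nn

class CausalQueryTokenizer(nn.Module):
    def __init__(self, dim=256, num_tokens=32, num_heads=8, depth=4):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(num_tokens, dim) * 0.02)
        layer = nn.TransformerDecoderLayer(d_model=dim, nhead=num_heads,
                                           batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=depth)
        self.num_tokens = num_tokens

    def forward(self, patch_features: torch.Tensor) -> torch.Tensor:
        # patch_features: (batch, num_patches, dim), e.g. from a ViT encoder.
        b = patch_features.size(0)
        queries = self.queries.unsqueeze(0).expand(b, -1, -1)
        # Additive causal mask: -inf above the diagonal blocks future tokens.
        causal = torch.triu(
            torch.full((self.num_tokens, self.num_tokens), float("-inf")),
            diagonal=1,
        )
        return self.decoder(tgt=queries, memory=patch_features, tgt_mask=causal)

tokenizer = CausalQueryTokenizer()
patches = torch.randn(2, 196, 256)   # toy ViT patch features (14x14 grid)
tokens = tokenizer(patches)
print(tokens.shape)                  # torch.Size([2, 32, 256])
```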
@BingchenZhao
Bingchen Zhao
4 months
1️⃣ The Problem. Modern visual tokenizers (e.g., VQ-VAE, SD-VAE) optimize for reconstruction fidelity but ignore semantic structure. ❌ Tokens are arbitrarily learned. ❌ High-level semantics & low-level details are entangled. We ask: Can we tokenize images like PCA—compact,.
1
0
1
@BingchenZhao
Bingchen Zhao
4 months
📢 New Paper Alert! 📢 "Principal Components" Enable A New Language of Images ✨ We introduce Semanticist, a PCA-guided tokenizer that revolutionizes visual tokenization for generative models! 🧵 Thread below! 👇
Tweet media one
1
2
4
@BingchenZhao
Bingchen Zhao
7 months
RT @JeffDean: @moderncpp7 @clu_cheng @NeurIPSConf @drfeifei @jhyuxm @edchi I didn't see the talk, but the images I've seen of the slide see….
0
158
0