Junhao Chen

@Cumquaaa

Followers
14
Following
32
Media
4
Statuses
12

Senior @Tsinghua_Uni, previously interned @tsvetshop. My interests lie in NLP, CV and RL.

Joined August 2023
@Cumquaaa
Junhao Chen
6 days
🚀 Training an image generation model and picking sides between autoregressive (AR) and diffusion? Why not both? Check out MADFormer with half of the model layers for AR and half for diffusion. AR gives a fast guess for the next patch prediction while diffusion helps refine the
4
4
15
@Cumquaaa
Junhao Chen
4 days
RT @yuqirose: How do we ground #LLMs for Scientific Problems to mitigate the issue of hallucination? Check out our #icml2025 paper on ``Ad….
0
3
0
@Cumquaaa
Junhao Chen
6 days
🎯 Takeaway: Tune your AR⇄Diffusion balance to match your compute budget: more AR for speed, more diffusion for quality. 🙏 Huge thanks to @XiaochuangHan and Yulia @tsvetshop for the collaboration! Keen to hear your thoughts! (4/🧵)
0
0
2
@Cumquaaa
Junhao Chen
6 days
💡 Key findings:
- Under a low inference budget/NFE, simply allocating more layers to AR and fewer to diffusion (i.e., an AR-heavy design) yields up to 75% FID gains; with a high inference budget, diffusion-heavy designs win for finer details.
- More on optimal AR block size,
1
0
2
@Cumquaaa
Junhao Chen
6 days
@tsvetshop @XiaochuangHan 🔍 MADFormer explores the Mixed AR + Diffusion design space in image generation Transformers.
- Token axis: split an image into blocks of patches (e.g. 4/16/64 blocks), with AR conditioning across the blocks and diffusion within the block.
- Layer axis: allocate the first (N–D)
1
0
2
@Cumquaaa
Junhao Chen
1 month
RT @stingning: Our framework supports various online RL algorithms. In our experiments, we use GRPO with the following optimizations: 1️⃣ P….
0
2
0
@Cumquaaa
Junhao Chen
1 month
RT @stingning: SOTA: SimpleVLA-RL achieves 98.4% on LIBERO. 🎯 With only 1 trajectory/task for SFT: 🚀 LIBERO-Avg: 48.9%→94.1%. 🚀 LIBERO-Long….
0
1
0
@Cumquaaa
Junhao Chen
1 month
RT @stingning: Moonlighting a bit: we implement Online RL for VLA models with @verl🤖, and find simple outcome rewards can work surprisingly….
0
8
0
@Cumquaaa
Junhao Chen
3 months
RT @ZhiyuanZeng_: 🛠️ Build your own LLM (or benchmark) analysis/debugging tool 🚀 Try our demo.
0
17
0
@Cumquaaa
Junhao Chen
4 months
RT @ZhiyuanZeng_: Is a single accuracy number all we can get from model evals? 🤔 🚨 Does NOT tell where the model fails. 🚨 Does NOT tell how to….
0
87
0