Junsong_Chen

@lawrence_cjs

Followers
203
Following
79
Media
9
Statuses
53

HKU Ph.D., NVIDIA Research Internship

Hong Kong
Joined February 2022
@xieenze_jr
Enze Xie
7 days
We (@lawrence_cjs, @yuyangzhao_ , @shanasaimoe) from the SANA team just posted a blog on the core of Linear Attention: how it achieves infinite context lengths with global awareness but constant memory usage! We explore state accumulation mechanics, the evolution from Softmax to
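The state-accumulation mechanic the blog describes can be sketched in a few lines of NumPy: a minimal illustration of causal linear attention with a ReLU feature map (my own sketch, not the SANA kernels). The running state never grows, no matter how many tokens stream through.

```python
import numpy as np

def linear_attention_stream(qs, ks, vs):
    """Process tokens one at a time with a constant-size running state.

    S accumulates sum_i phi(k_i) v_i^T and z accumulates sum_i phi(k_i),
    so memory stays O(d*d) regardless of how many tokens have been seen.
    """
    d = qs.shape[1]
    S = np.zeros((d, d))   # running sum of outer products phi(k_i) v_i^T
    z = np.zeros(d)        # running sum of feature-mapped keys (normalizer)
    outs = []
    for q, k, v in zip(qs, ks, vs):
        phi_q, phi_k = np.maximum(q, 0), np.maximum(k, 0)  # ReLU feature map
        S += np.outer(phi_k, v)
        z += phi_k
        outs.append(phi_q @ S / (phi_q @ z + 1e-6))
    return np.stack(outs)
```

Each output needs only the current (S, z), which is what allows, in principle, unbounded context at constant memory.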
@xieenze_jr
Enze Xie
1 month
The training/inference code and checkpoints are released. Welcome to try them!
4
34
179
@lawrence_cjs
Junsong_Chen
7 days
How Linear Attention and Softmax Attention differ in compute and KV-Cache for LLMs and long-video generation. Let's start with this blog. https://t.co/Ja5El08muf
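As a back-of-the-envelope illustration of the KV-cache difference (my own accounting, per attention head, ignoring dtype and implementation details):

```python
def kv_cache_floats(seq_len: int, d: int, kind: str) -> int:
    """Per-head cache size in float count: softmax attention keeps every
    K/V row; linear attention keeps only a d x d state plus a d-dim normalizer."""
    if kind == "softmax":
        return 2 * seq_len * d      # grows linearly with context length
    if kind == "linear":
        return d * d + d            # constant, independent of context length
    raise ValueError(kind)
```

At d = 64, a 100k-token context needs 12.8M floats per head under softmax attention but only 4,160 under the linear-attention state, and the latter does not grow as the video gets longer.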
0
1
1
@xieenze_jr
Enze Xie
2 months
Sora 2 is amazing, but AI video generation inference is still too slow. Try our Deep Compression Autoencoder + Linear Attention! 🚀🔥 https://t.co/ooNowz8HH7 https://t.co/PU8oUI2hsU
github.com
DC-VideoGen: Efficient Video Generation with Deep Compression Video Autoencoder - dc-ai-projects/DC-VideoGen
@xieenze_jr
Enze Xie
2 months
🚀 SANA-Video: Linear Attention + Constant-Memory KV Cache = Fast Long Videos 💥 Key Features 🌟 🧠 Linear DiT everywhere → O(N) complexity on video-scale tokens 🧰 Constant-memory Block KV cache → store cumulative states only (no growing KV) 🔄 🎯 Temporal Mix-FFN + 3D RoPE
1
8
71
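The "store cumulative states only" idea can be sketched as blockwise causal linear attention: context from earlier blocks reaches the current block only through a fixed-size state, so the "cache" handed between blocks never grows (my illustration, not the released kernels).

```python
import numpy as np

def block_causal_linear_attn(q_blocks, k_blocks, v_blocks):
    """Blockwise linear attention with a cumulative cross-block state.

    Within a block, causal attention is computed directly; everything
    before the block enters only via (S, z), whose size is fixed.
    """
    d = q_blocks[0].shape[1]
    S, z = np.zeros((d, d)), np.zeros(d)
    outs = []
    for Q, K, V in zip(q_blocks, k_blocks, v_blocks):
        Qp, Kp = np.maximum(Q, 0), np.maximum(K, 0)   # ReLU feature map
        A = np.tril(Qp @ Kp.T)                        # causal within-block scores
        num = Qp @ S + A @ V
        den = Qp @ z + A.sum(axis=1)
        outs.append(num / (den[:, None] + 1e-6))
        # fold this block into the cumulative state before moving on
        S += Kp.T @ V
        z += Kp.sum(axis=0)
    return np.vstack(outs)
```

Blockwise processing gives identical outputs to token-by-token streaming while letting each block use dense matrix multiplies.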
@lawrence_cjs
Junsong_Chen
2 months
Thanks so much @_akhaliq for sharing our recent work. Our homepage is here:
@_akhaliq
AK
2 months
SANA-Video Efficient Video Generation with Block Linear Diffusion Transformer
0
0
1
@hancai_hm
Han Cai
2 months
Changing the autoencoder in latent diffusion models is easier than you think. 🚀 Introducing DC-Gen – a post-training acceleration framework that works with any pre-trained diffusion model, boosting efficiency by transferring it into a deeply compressed latent space with
5
38
222
@hancai_hm
Han Cai
2 months
We release DC-VideoGen, a new post-training framework for accelerating video diffusion models. Key features: 🎬 Supports video generation up to 2160×3840 (4K) resolution on a single H100 GPU ⚡ Delivers 14.8× faster inference than the base model while achieving comparable or
2
28
145
@lawrence_cjs
Junsong_Chen
2 months
Finally: 36s for a 5s 720p clip on H100, a 4× speedup vs vanilla attention at 720p; 29s on RTX 5090 with NVFP4 (2.4× faster). Fixed VRAM regardless of sequence length; strong text–video alignment.
0
0
0
@lawrence_cjs
Junsong_Chen
2 months
3. Temporal Mix-FFN + 3D RoPE → local fidelity + temporal coherence 🎯 4. AR block training with self-rollout → minute-length generation 📊
1
0
0
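For the 3D RoPE piece, one common construction (an assumption about the layout, shown purely for illustration) splits the channel dimension into three groups and rotates each by its own (t, h, w) coordinate:

```python
import numpy as np

def rope_1d(x, pos, base=10000.0):
    """Standard rotary embedding over the last dim (must be even)."""
    d = x.shape[-1]
    freqs = base ** (-np.arange(0, d, 2) / d)
    ang = pos[:, None] * freqs[None, :]
    cos, sin = np.cos(ang), np.sin(ang)
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = np.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin   # rotate each channel pair
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

def rope_3d(x, t, h, w):
    """Factorized 3D RoPE: one channel group per axis (time, height, width)."""
    d = x.shape[-1] // 3
    return np.concatenate(
        [rope_1d(x[..., :d], t),
         rope_1d(x[..., d:2 * d], h),
         rope_1d(x[..., 2 * d:], w)], axis=-1)
```

Because rotations are norm-preserving, positional information is injected without changing token magnitudes, and relative offsets along each axis fall out of the dot products.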
@lawrence_cjs
Junsong_Chen
2 months
2. Constant-Memory Block KV cache → cumulative states only (no growing KV) 🔄
1
0
0
@lawrence_cjs
Junsong_Chen
2 months
Keys 🌟 1. Linear DiT everywhere → O(N) complexity on video-scale tokens
1
0
0
@lawrence_cjs
Junsong_Chen
2 months
🚀 SANA-Video: Linear Attention + Constant-Memory KV Cache = Fast Long Videos 💥 It's time for a new SANA family member! Links 🌐 📖 Paper: https://t.co/snV4bF8jUM 💻 Project Page: https://t.co/9WZIp7ryX6
1
1
3
@lawrence_cjs
Junsong_Chen
2 months
Explore recent work from our team: LongLive generates minute-length videos and lets you interact in real time, at fast speed! Very cool project. 🎉
@yukangchen_
Yukang Chen
2 months
🚀 We open-sourced LongLive — interactive, real-time long-video generation. 👥 Generates video in real time as users enter text prompts. ⚡️ 20.7 FPS on a single H100, ⏱️ up to 240s per clip. 🎬 Fine-tunes SOTA short-video models (e.g., Wan) into long-video generators. 🌍 One step
0
0
1
@songhan_mit
Song Han
3 months
Explore Deep Compression Autoencoder (DC-AE) 1.5 with higher token compression ratio (64x) for faster visual generation:
@hancai_hm
Han Cai
3 months
🚀 Excited to announce DC-AE 1.5! With a spatial compression ratio boosted to f64, it accelerates high-res diffusion models while preserving text-to-image quality. Key innovation: channel-wise latent structure for faster convergence with many latent channels. 📍 Catch us at
1
2
24
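To make the f64 figure concrete, here is the token arithmetic (my own accounting; the patch size and exact latent layout are assumptions for illustration):

```python
def latent_tokens(res: int, f: int, patch: int = 1) -> int:
    """Token count for a res x res image after spatial compression by a
    factor f per side, then optional transformer patching."""
    side = res // (f * patch)
    return side * side
```

A 1024x1024 image becomes 32x32 = 1024 tokens at f32 but only 16x16 = 256 tokens at f64, so attention cost (quadratic in token count) drops roughly 16x.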
@RisingSayak
Sayak Paul
9 months
The best few-step sampling model across the speed-memory frontier? 😱 Introducing SANA-Sprint in collaboration with the great SANA team! Beyond the results, perhaps more importantly, the work is about the recipe of SANA-Sprint. Code & model will be open ❀️ Let's go ⬇️
12
26
162
@_akhaliq
AK
9 months
SANA-Sprint One-Step Diffusion with Continuous-Time Consistency Distillation
10
66
425
@songhan_mit
Song Han
9 months
Explore our one-step diffusion model, SANA-Sprint. Very fast:
0
5
36
@clu_cheng
Cheng Lu
9 months
Still think consistency models are bad at scale? In fact, sCM can be stably scaled to modern text-to-image diffusion models, greatly improving generation speed and 1-step generation quality!
3
4
55
@lawrence_cjs
Junsong_Chen
9 months
Excited for πŸƒSANA-Sprint. πŸš€Code and weights will be released very soon along with diffusers. Study tuned!❀️
0
0
3
@lawrence_cjs
Junsong_Chen
10 months
Introducing SANA-1.5: model scaling up, then scaling down. Inference-time scaling also works, as an automatic end-to-end pipeline.
@xieenze_jr
Enze Xie
10 months
🔥 SANA 1.5: A linear Diffusion Transformer pushes SOTA in text-to-image generation! Key innovations: • Depth-growth training: 1.6B → 4.8B params • Memory-efficient 8-bit optimizer • Flexible model pruning • Inference scaling for better quality Achieves 0.80 on GenEval! 🚀
0
0
2
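Depth-growth training initializes a deeper model from a shallower pretrained one. A toy sketch of one plausible scheme, copy-based block duplication (hypothetical here, not necessarily the actual SANA-1.5 recipe):

```python
import copy

def grow_depth(layers, factor=3):
    """Grow a layer stack by interleaving copies of pretrained blocks.

    Each new block starts as a deep copy of an existing one, so the grown
    model can resume training instead of starting from random weights.
    (Illustrative only; the real recipe may differ.)
    """
    grown = []
    for layer in layers:
        grown.append(layer)                      # keep the pretrained block
        for _ in range(factor - 1):
            grown.append(copy.deepcopy(layer))   # duplicated block to train further
    return grown
```

With factor=3, a stack of N blocks becomes 3N, matching the spirit of a 1.6B → 4.8B parameter growth while preserving the pretrained function as a warm start.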