Xingang Pan

@XingangP

Followers
3K
Following
322
Media
21
Statuses
75

Assistant Professor at Nanyang Technological University @NTUsg @MMLabNTU - Computer Vision, Deep Learning, Computer Graphics

Singapore
Joined May 2018
@XingangP
Xingang Pan
10 hours
RT @_akhaliq: STream3R. Scalable Sequential 3D Reconstruction with Causal Transformer
0
4
0
@XingangP
Xingang Pan
10 hours
Cool work that connects the idea of volume rendering with image diffusion!
@xiaohangzhan
Xiaohang Zhan
1 day
Our paper LaRender received full marks at ICCV 2025 and was selected as oral! This paper enables control of occlusion relationships among objects and visual effects in a training-free manner for diffusion-based image generation. Project page:
0
0
0
@XingangP
Xingang Pan
10 hours
Introducing STream3R, a new 3D geometric foundation model for efficient 3D reconstruction from streaming input. Similar to LLMs, STream3R uses causal attention during training and a KV cache at inference. No need to worry about post-alignment or reconstructing from scratch.
@GROS17121524
Yushi LAN
11 hours
🔥Streaming-based 3D/4D Foundation Model🔥 We present STream3R, which reformulates dense 3D/4D reconstruction into a sequential registration task with **causal attention**. - Projects: - Code: - Model:
4
16
92
@XingangP
Xingang Pan
10 hours
RT @_akhaliq: Grok 4 one shots building a gemma-3-270m chatbot with transformers.js. one click deploy in anycoder
0
9
0
@XingangP
Xingang Pan
17 days
Directly training Video Diffusion Models on long videos faces huge memory and learning challenges. How do we model long-range temporal distribution then? Our ICCV 2025 work, 🎞️TokensGen, offers a solution. We compress videos into a highly condensed token space, enabling
0
24
102
@XingangP
Xingang Pan
4 months
๐—ช๐—ผ๐—ฟ๐—น๐—ฑ๐— ๐—ฒ๐—บ is mainly created by @zeqi_xiao .Project page: ArXiv: Github: Demo:
1
0
4
@XingangP
Xingang Pan
4 months
Synthesizing worlds with video diffusion models is often inconsistent: moving the camera back and forth leads to different scenes. We propose WorldMem, a memory-based approach that ensures consistent world simulation without relying on explicit 3D reconstruction.
@zeqi_xiao
Zeqi Xiao
4 months
While recent works like Genie 2, The Matrix, and Navigation World Models explore video generative models as world simulators, world consistency remains underexplored. In this work, we propose WorldMem, introducing a memory mechanism for long-term consistent world simulation.
2
26
148
@XingangP
Xingang Pan
5 months
Diffusion models are sensitive to small changes in the input noise. We introduce Alias-Free Latent Diffusion Models (AF-LDM) at #CVPR2025. It achieves shift-equivariance and generates consistent outputs. Project: arXiv:
8
62
411
@XingangP
Xingang Pan
5 months
The Bokeh Effect is so important in photography, yet existing text2image diffusion models do not support controlling bokeh strength. We introduce Bokeh Diffusion, a T2I diffusion model that supports flexible background blur control! Project:
1
9
43
@XingangP
Xingang Pan
5 months
RT @TheYihangLuo: 💥 Consistent Multi-View Diffusion for 3D Enhancement 💥 Introducing our work #3DEnhancer @CVPR: a multi-view diffusion mo…
0
9
0
@XingangP
Xingang Pan
8 months
RT @he_zexin: 🎉Excited to share Neural LightRig!🎉 It allows for accurate and fast estimation of surface normals and PBR materials from jus…
0
20
0
@XingangP
Xingang Pan
9 months
RT @zeqi_xiao: Introducing 💡Trajectory Attention for Fine-grained Video Motion Control💡 By augmenting attention along predefined trajector…
0
10
0
@XingangP
Xingang Pan
9 months
Introducing SAR3D, which tokenizes 3D objects into multiscale tokens and generates 3D objects by autoregressive next-scale prediction. SAR3D enables fast 3D generation and comprehensive 3D understanding. arXiv: Project:
2
51
239
@XingangP
Xingang Pan
9 months
Project page:
0
0
9
@XingangP
Xingang Pan
9 months
Introducing Gaussian Anything, a new 3D generative model with two key properties:
- A structured point-cloud latent space enabling flexible editing!
- Support for multi-modal conditions, e.g., point cloud, text, single/multi-view images.
arXiv:
9
49
301
@XingangP
Xingang Pan
10 months
Can we drag 3D objects with large structure changes, like what we can do with DragGAN? Our work MVDrag3D makes this possible. Now you can open a lion's mouth or spread a bird's wings in 3D! arXiv: Project:
1
28
141