Xingang Pan

@XingangP

Followers
3K
Following
324
Media
22
Statuses
76

Assistant Professor at Nanyang Technological University @NTUsg @MMLabNTU - Computer Vision, Deep Learning, Computer Graphics

Singapore
Joined May 2018
@XingangP
Xingang Pan
3 days
Introducing 📦ArtiLatent🔧 (SIGGRAPH Asia 2025), a high-quality 3D diffusion model that explicitly models object articulation, paving the way for richer, more realistic assets in embodied AI and simulation: - Generates fully articulated 3D objects - Physically
2
37
167
@_akhaliq
AK
3 months
STream3R: Scalable Sequential 3D Reconstruction with Causal Transformer
4
14
110
@XingangP
Xingang Pan
3 months
Cool work that connects the idea of volume rendering with image diffusion!
@xiaohangzhan
Xiaohang Zhan
3 months
Our paper LaRender received full marks at ICCV 2025 and was selected as oral! This paper enables control of occlusion relationships among objects and visual effects in a training-free manner for diffusion-based image generation. Project page: https://t.co/XzjMZuJ4a4
0
1
8
@XingangP
Xingang Pan
3 months
Introducing STream3R, a new 3D geometric foundation model for efficient 3D reconstruction from streaming input. Similar to LLMs, STream3R uses causal attention during training and a KV cache at inference. No need to worry about post-alignment or reconstructing from scratch.
@GROS17121524
Yushi LAN
3 months
🔥Streaming-based 3D/4D Foundation Model🔥 We present STream3R, which reformulates dense 3D/4D reconstruction into a sequential registration task with **causal attention**. - Projects: https://t.co/zrLlvxJ0FJ - Code: https://t.co/ONYaJDrjhF - Model:
5
58
320
@_akhaliq
AK
3 months
Grok 4 one shots building a gemma-3-270m chatbot with transformers.js one click deploy in anycoder
9
13
106
@XingangP
Xingang Pan
4 months
Directly training Video Diffusion Models on long videos faces huge memory and learning challenges. How do we model long-range temporal distribution then? Our ICCV 2025 work, 🎞️TokensGen, offers a solution. We compress videos into a highly condensed token space, enabling
1
25
106
@XingangP
Xingang Pan
7 months
๐—ช๐—ผ๐—ฟ๐—น๐—ฑ๐— ๐—ฒ๐—บ is mainly created by @zeqi_xiao Project page: https://t.co/vi78xdY2TT ArXiv: https://t.co/Cu4YwGy7YP Github: https://t.co/3PEcUJDYCw Demo:
1
0
4
@XingangP
Xingang Pan
7 months
Synthesizing worlds with video diffusion models is often inconsistent: moving the camera back and forth leads to different scenes. We propose WorldMem, a memory-based approach that ensures consistent world simulation without relying on explicit 3D reconstruction.
@zeqi_xiao
Zeqi Xiao
7 months
While recent works like Genie 2, The Matrix, and Navigation World Models explore video generative models as world simulators, world consistency remains underexplored. In this work, we propose WorldMem, introducing a memory mechanism for long-term consistent world simulation.
2
27
148
@XingangP
Xingang Pan
8 months
Diffusion models are sensitive to small changes in the input noise. We introduce Alias-Free Latent Diffusion Models (AF-LDM) at #CVPR2025. It achieves shift-equivariance and generates consistent outputs. Project: https://t.co/nehjzSFAVU arXiv: https://t.co/CksgC8A0Ph
8
63
409
@XingangP
Xingang Pan
8 months
The bokeh effect is so important in photography, yet existing text-to-image diffusion models do not support controlling bokeh strength. We introduce Bokeh Diffusion, a T2I diffusion model that supports flexible background blur control! Project: https://t.co/YlnSETImsz
1
10
44
@TheYihangLuo
Yihang Luo
9 months
💥 Consistent Multi-View Diffusion for 3D Enhancement 💥 Introducing our work #3DEnhancer @CVPR: a multi-view diffusion model that enhances multi-view images to improve 3D models. 📰arXiv: https://t.co/eNvgSTsKWN 🔥Project: https://t.co/VDPG5NvRSt
1
10
24
@he_zexin
Zexin He
11 months
🎉Excited to share Neural LightRig!🎉 It allows for accurate and fast estimation of surface normals and PBR materials from just one image. We achieve this by generating multi-light images with a diffusion model, overcoming the estimation ambiguity of inverse rendering.🚀 Page:
1
21
66
@zeqi_xiao
Zeqi Xiao
1 year
Introducing 💡Trajectory Attention for Fine-grained Video Motion Control💡. By augmenting attention along predefined trajectories, our approach empowers tasks such as camera motion control in images and videos, as well as video editing.
1
11
62
@XingangP
Xingang Pan
1 year
Introducing SAR3D, which tokenizes 3D objects into multiscale tokens and generates 3D objects by autoregressive next-scale prediction. SAR3D enables fast 3D generation and comprehensive 3D understanding. arXiv: https://t.co/xIKWx8o8I4 Project: https://t.co/8hUptJOubR
2
53
240
@XingangP
Xingang Pan
1 year
Project page:
0
0
9
@XingangP
Xingang Pan
1 year
Introducing Gaussian Anything, a new 3D generative model with two key properties: - A structured point-cloud latent space enabling flexible editing! - Supports multi-modal conditions, e.g., point cloud, text, single/multi-view images arXiv: https://t.co/fahQOFeDAa
9
50
300