Hao Zhao Profile
Hao Zhao

@HaoZhao_AIRSUN

Followers
608
Following
959
Media
11
Statuses
58

https://t.co/l52lTpdO16 Computer vision is good, have fun.

Tsinghua University
Joined July 2024
@HaoZhao_AIRSUN
Hao Zhao
5 days
🚀 GaussianArt (3DV 2026) is here! A single-stage unified geometry–motion model that finally scales articulated reconstruction to 20+ parts with order-of-magnitude higher accuracy. Evaluated on MPArt-90, the largest articulated benchmark to date. Code + project page below 👇 🔗
1
23
158
@HaoZhao_AIRSUN
Hao Zhao
25 days
If you're excited by Tesla's new world model, meet OmniNWM, our research take on panoramic, controllable driving world models • ultra-long demos • precise camera control • RGB/semantics/depth/occupancy • intrinsic closed-loop rewards. arXiv: https://t.co/CKa5Bd9bqr Watch:
10
47
319
@HaoZhao_AIRSUN
Hao Zhao
1 month
InvRGB+L (ICCV'25): inverse rendering of large, dynamic scenes from a single RGB+LiDAR sequence. We add a specular LiDAR reflectance model + RGB↔LiDAR material consistency, yielding reliable albedo/roughness, relighting, night sim, and realistic object insertion. Paper:
2
20
140
@HaoZhao_AIRSUN
Hao Zhao
2 months
TA-VLA: Torque-Aware Vision-Language-Action models. We show how to inject force cues into VLAs for contact-rich manipulation. Three takeaways: ✅ Where: put torque adapters in the decoder, not the encoder. ✅ How: use a single-token summary of torque history. ✅ Why: jointly
9
57
452
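The "single-token summary of torque history" idea above can be sketched in a few lines. This is purely an illustrative reconstruction, not TA-VLA's code: the shapes, the attention-pooling scheme, and every name here are assumptions.

```python
import numpy as np

# Hypothetical sketch of TA-VLA's "single-token torque summary": a short
# history of joint-torque readings is pooled into ONE extra token that is
# appended to the action decoder's input, alongside vision-language tokens.
rng = np.random.default_rng(0)

D = 64          # decoder token width (assumed)
H, J = 16, 7    # torque-history length and number of joints (assumed)

W_in = rng.normal(scale=0.1, size=(J, D))   # per-step torque embedding
w_attn = rng.normal(scale=0.1, size=D)      # learned pooling query

def torque_summary_token(torques):
    """Pool an (H, J) torque history into a single (D,) token."""
    steps = torques @ W_in                  # (H, D) per-step embeddings
    logits = steps @ w_attn                 # (H,) pooling scores
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()                # softmax over history steps
    return weights @ steps                  # (D,) weighted summary

history = rng.normal(size=(H, J))           # fake torque readings
vision_lang_tokens = rng.normal(size=(32, D))

token = torque_summary_token(history)
decoder_input = np.vstack([vision_lang_tokens, token[None]])
print(decoder_input.shape)  # one extra token for the decoder: (33, 64)
```

The point of the single token is that force cues enter the decoder without inflating its sequence length by the full torque history.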
@HaoZhao_AIRSUN
Hao Zhao
4 months
Thrilled to share that our paper "OnePoseViaGen" received three strong accepts at #CoRL2025! 🚀🎉 Check it out 👉 https://t.co/U54kgzIXGy Congrats to the amazing team! 🔥
github.com
[CoRL 2025 Oral] One View, Many Worlds: Single-Image to 3D Object Meets Generative Domain Randomization for One-Shot 6D Pose Estimation. - GZWSAMA/OnePoseviaGen
@ychngji6
Chongjie(CJ) Ye
4 months
Excited to share that our RGB-D version of OnePoseViaGen has been accepted to #CoRL2025 with three strong accepts! 🤖 Code coming soon at https://t.co/1D2sT9l9P2 6D pose estimation is crucial for robotics, but generalizing to generated objects remains challenging. We introduce
0
1
15
@HaoZhao_AIRSUN
Hao Zhao
4 months
🚀 Mind-blowing! OnePoseViaGen can track 6D object poses directly from any input video, no special setup needed! From just an RGB video 🎥, it reconstructs 2D depth, 3D shape, 4D point dynamics, and 6D pose. 👉 Try it here:
huggingface.co
@ychngji6
Chongjie(CJ) Ye
4 months
🚀 We are thrilled to release an alpha version of OnePoseViaGen - a system for Panoptic 4D Scene Reconstruction from RGB video! 👉 Try it at: https://t.co/RFKddVaVCV #AI #3D #GenerativeAI
1
14
58
@HaoZhao_AIRSUN
Hao Zhao
4 months
🚀 New code release! We've open-sourced DiST-4D, the first feed-forward world model that simultaneously handles temporal prediction and spatial novel-view synthesis for autonomous driving scenes. • Disentangled spatio-temporal diffusion • Metric-depth bridge for 4D RGB-D
0
9
21
@CSProfKGD
Kosta Derpanis (sabbatical @ CMU)
5 months
The legend Takeo Kanade
6
46
387
@HaoZhao_AIRSUN
Hao Zhao
5 months
🙌 Great to see @dreamingtulpa showcasing SyncTalk++! We've leveled up the original NeRF-based SyncTalk with 3D Gaussian Splatting: • Dynamic Portrait Renderer for sharper, consistent identity • FaceSync + HeadSync for spot-on lip & pose alignment • 101 fps real-time
@dreamingtulpa
Dreaming Tulpa 🥓👑
5 months
is it over yet?
2
16
48
@HaoZhao_AIRSUN
Hao Zhao
5 months
Photometric stereo meets VGGT: LINO leverages geometry backbones + light register tokens to deliver universal, 4K-detailed normal maps under arbitrary lighting. 👀 Thanks for the post @zhenjun_zhao
@zhenjun_zhao
Zhenjun Zhao
5 months
Light of Normals: Unified Feature Representation for Universal Photometric Stereo Hong Li, Houyuan Chen, @ychngji6, @Frozen_Burning, Bohan Li, @xshocng1, Xianda Guo, Xuhui Liu, Yikai Wang, Baochang Zhang, Satoshi Ikehata, Boxin Shi, @raoanyi, @HaoZhao_AIRSUN tl;dr: learnable
0
13
46
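The "light register tokens" mentioned in the two LINO posts can be pictured as a few extra learnable tokens prepended to the patch sequence, giving the backbone a place to stash per-image lighting information. A minimal numpy sketch, with all shapes and names assumed rather than taken from the paper:

```python
import numpy as np

# Illustrative sketch (NOT the LINO implementation) of register-style
# light tokens: R learnable tokens are prepended to the N patch tokens
# before the transformer backbone processes the image.
rng = np.random.default_rng(1)

D, N, R = 64, 196, 4                      # width, patches, registers (assumed)
light_registers = rng.normal(scale=0.02, size=(R, D))  # learned, image-independent
patches = rng.normal(size=(N, D))         # patch embeddings for one image

tokens = np.vstack([light_registers, patches])  # (R + N, D) backbone input
# After the backbone runs, the first R output tokens would be read off as
# lighting features, and the remaining N decoded into the normal map.
print(tokens.shape)
```

The design mirrors the register-token idea from vision transformers: dedicated slots absorb global (here, lighting) information so patch tokens stay focused on local geometry.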
@HaoZhao_AIRSUN
Hao Zhao
5 months
LINO = VGGT + Learnable Light Tokens + Detail-Aware Losses 🔥 Huge thanks to @raoanyi @chen_yuan76802, loved building this together! Project:
houyuanchen111.github.io
@raoanyi
Anyi Rao
5 months
Universal Photometric Stereo (PS) aims for robust normal maps under any light. 🚨 But big hurdles remain! 1️⃣ Deep coupling: Ambiguous intensity - is it the light changing or the surface turning? 🤔 2️⃣ Detail loss: Complex surfaces (shadows, inter-reflections, fine details) stump
0
15
53
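The "deep coupling" hurdle in the thread above can be written down in one line. Under a simple Lambertian model (a textbook simplification, not LINO's actual formulation), observed intensity confounds light strength and surface albedo:

```latex
% Lambertian image formation: intensity I from albedo \rho, normal n, light l
I = \rho \, \max(\mathbf{n} \cdot \mathbf{l}, 0)
% Scaling the light by \alpha > 0 and the albedo by 1/\alpha leaves I unchanged:
I = \frac{\rho}{\alpha} \, \max\big(\mathbf{n} \cdot (\alpha \mathbf{l}), 0\big)
```

So from intensities alone the model cannot tell "the light changed" from "the surface turned or darkened", which is why universal PS needs a unified feature representation rather than per-pixel photometric cues.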
@HaoZhao_AIRSUN
Hao Zhao
5 months
Combining VGGT with lighting registers gives rise to today's strongest foundation model for photometric stereo. Thanks @_akhaliq for highlighting our work on LINO: predicting ultra-detailed 4K normal maps from unified features! 👀
@_akhaliq
AK
5 months
Light of Normals Unified Feature Representation for Universal Photometric Stereo
1
10
64
@HarryXu12
Huazhe Harry Xu
5 months
Just arrived LA! We will present 4 papers @RoboticsSciSys including award candidate paper Reactive Diffusion Policy @HanXue012 , DemoGen @ZhengrongX , DoGlove @DoubleHan07 , and Morpheus @HaoZhao_AIRSUN robot face. I'll also share thoughts about OOD generalization in workshops.
1
9
75
@HaoZhao_AIRSUN
Hao Zhao
5 months
🚗📡 Simulate Any Radar (SA-Radar) is here! We present a controllable, efficient, and realistic radar simulation system via waveform-parameterized attribute embedding. Supports: • Cross-sensor simulation • Attribute editing • Scene augmentation • RAD cube generation
0
1
4
@HaoZhao_AIRSUN
Hao Zhao
5 months
🤖 Just published in Nature Machine Intelligence! F-TAC Hand embeds high-res touch (0.1 mm) across 70% of a biomimetic robotic hand, enabling adaptive, human-like grasping across 600 real-world trials.
@ZIHANGZHAO2
Zihang Zhao
5 months
Our paper, published in Nature Machine Intelligence, presents a system with full-hand tactile sensing and sensory-motor feedback for adaptive, human-like grasping, advancing embodied agents in real-world operation. Article: https://t.co/LwN7ZzyOfx Demo: https://t.co/rhi0cztkdp
0
1
8
@HaoZhao_AIRSUN
Hao Zhao
6 months
🤖 Meet Morpheus, a neural-driven animatronic face that doesn't just talk, it feels. Hybrid actuation (rigid 💪 + tendon 🧵) makes it expressive and compact. Self-modeling + audio-to-blendshape = real-time emotional reactions 🧠💬 Watch it smile, frown, cringe... all
2
8
20
@HaoZhao_AIRSUN
Hao Zhao
6 months
🚀 Looking forward to presenting three works at #CVPR2025 next week! Come check them out if you're interested in dynamic modeling, 3D physical reasoning, or generative driving scenes: 🔹 PartRM: Modeling Part-Level Dynamics with Large Cross-State Reconstruction Model 👉
github.com
Contribute to GasaiYU/PartRM development by creating an account on GitHub.
@CVPR
#CVPR2026
6 months
We are one week away from #computervision's largest conference #CVPR2025! 🤗 What are you most excited to see?
0
1
5
@HaoZhao_AIRSUN
Hao Zhao
6 months
🚘 Tsinghua & Bosch just dropped Impromptu VLA: the SOTA fully open-source, end-to-end Vision-Language-Action model for autonomous driving - no BEV, no planner, just raw video → natural language → action. Beats BridgeAD on NeuroNCAP (2.15 vs. 1.60). 👉 https://t.co/Ub2jisUnKG
0
22
101
@HaoZhao_AIRSUN
Hao Zhao
6 months
We propose **Challenger** - a framework to generate **photorealistic adversarial driving videos**! ⚠️🚗 - 🚗 Diverse scenarios: **cut-ins, tailgating, blocking**, without human supervision - 💥 **8.6× to 26.1×** higher collision rates for SOTA AD models - 🎯 **Transferable**
1
7
22
@HaoZhao_AIRSUN
Hao Zhao
8 months
The physical realism here is wild. PhysGen3D easily beats most closed-source stuff (Kling, Runway, Pika…) when it comes to physics-informed generation. 🚀 #CVPR2025
@ShenlongWang
Shenlong Wang
8 months
Can we make images interactive with realistic physics? 🚀 Thrilled to share our #CVPR2025 work: PhysGen3D! From just a single image, PhysGen3D creates an interactive, physics-informed 3D scene, enabling us to explore and simulate realistic future scenarios interactively.
1
20
69