Alexey Bokhovkin
@ABokhovkin
Followers: 356
Following: 68
Media: 3
Statuses: 44
Computer Vision researcher @ TUM 3D Indoor Understanding
Joined December 2020
SceneFactor code is released! SceneFactor is a factored latent diffusion for controllable, large-scale scene synthesis and editing! w/ @QTDSMQ, @shubhtuls, @angelaqdai Check out the code here: https://t.co/tJFRAPXKEI. We present SceneFactor at #CVPR2025 on Fri 13, -10:30
0
7
23
Animating the Uncaptured: We animate 3D humanoid meshes using video diffusion priors given a text prompt. https://t.co/EpFW86gaRw https://t.co/suMQs8oQCL Realistic motion generation for 3D characters - without motion capture! Great work by @marcbenedi @angelaqdai
3
40
124
ExCap3D: Multilevel Captioning of Objects in 3D Scenes. @chandan__yes generates consistent object and part-level descriptions of objects in 3D scenes, and introduces a new dataset with 190k captions for 34k ScanNet++ objects. Project: https://t.co/6tWzlYsx5F w/ @david_roz_
0
30
110
ScanNet++ v2 Benchmark Release! Test your state-of-the-art models on Novel View Synthesis and 3D Semantic & Instance Segmentation. Shoutout to @chandan__yes and @liuyuehcheng for their incredible work! Check it out: https://t.co/SKCGM23hA0
2
42
203
MeshArt: Generating Articulated Meshes with Structure-guided Transformers. @DaoyiGao generates articulated meshes with a hierarchical transformer, modeling articulation-aware structures that guide mesh synthesis. w/ @yawarnihal @craigleili Project: https://t.co/aZPVyn8kQd
2
65
290
Excited to announce ScanNet++ v2! @chandan__yes and @liuyuehcheng have been working tirelessly to bring: 1006 high-fidelity 3D scans + DSLR & iPhone captures + rich semantics. Elevating 3D scene understanding to the next level! w/ @MattNiessner
https://t.co/QayR1S8KZZ
6
113
642
GAF: Gaussian Avatar Reconstruction from Monocular Videos via Multi-view Diffusion. We reconstruct animatable Gaussian head avatars from monocular videos captured by commodity devices such as
2
31
125
DNF: Generating 4D animations with dictionary-based neural fields! @xinyi092298 presents a new dictionary-based neural field for unconditional 4D generation of deforming shapes -- generating motions with high-quality shape and temporal consistency. https://t.co/yAZi2k0PjB
0
44
147
I'm so excited to introduce SceneFactor!
SceneFactor: Generating & editing 3D indoor scenes from text! @ABokhovkin presents a factored latent diffusion for controllable, large-scale scene synthesis -- decomposed into high-level semantic generation + geometric refinement w/ @QTDSMQ, @shubhtuls
https://t.co/WGTw70cKIo
0
1
20
GaussianSpeech: Audio-Driven Gaussian Avatars. We synthesize photorealistic and 3D-consistent talking human head avatars driven directly from spoken audio. More specifically, we introduce an efficient 3DGS-based representation, combined with an
2
36
149
How can we generate high-fidelity, complex 3D scenes? @QTDSMQ's LT3SD decomposes 3D scenes into latent tree representations, with diffusion on the latent trees enabling seamless infinite 3D scene synthesis! w/ @craigleili, @MattNiessner
https://t.co/wv9bIhkkYi
3
77
318
Excited to present DiffCAD coming to #SIGGRAPH2024! @DaoyiGao introduces the first probabilistic single-view CAD retrieval & alignment. We train only on synthetic data -> generalize robustly to real images! Check out the code: https://t.co/hBCoN0Hx3w w/ @david_roz_, @StefanLeuteneg1
0
31
116
Excited to present GenZI at #CVPR2024! @craigleili introduces GenZI, the first zero-shot approach to creating realistic 3D human-scene interactions by leveraging interaction priors from large VLMs. Code and data on our website! https://t.co/hUhMgUoU70
https://t.co/rnn1G5HOuu
1
25
102
AutoInst: Automatic Instance-Based Segmentation of LiDAR 3D Scans. A method for unsupervised instance segmentation of 3D outdoor LiDAR scenes. Project: https://t.co/m8DJanWH2T Vid: https://t.co/Z9OyZbskdJ Paper: https://t.co/rrmvQdjmWV
2
6
21
Check out our #CVPR'24 papers on 3D human interactions, generative 3D modeling, and uncertainty-aware and unsupervised 3D semantic scene understanding! Congrats to @craigleili @david_roz_ @chrdiller @yawarnihal @shivangi2201 @jiapeng_tang @AnhQuanCAO for their amazing work!
3
29
118
Check out @chrdiller's CG-HOI :) We generate realistic 3D human-object interactions, from object geometry and text description. A key ingredient is explicit modeling of contact, during training and as guidance during inference. https://t.co/Cl5Jw9oFBO
https://t.co/FVIFqEpjHi
5
53
202
Diffusion models are awesome! Check out our survey on Diffusion Models for Visual Computing! We give an introduction to diffusion models and highlight how they are used by state-of-the-art methods in graphics and vision. https://t.co/FqaqF7tMPM
4
87
378
We've released the ScanNet++ data! Check it out: https://t.co/SKCGM23hA0 - 280 high-fidelity 3D scenes w/ 1mm geometry, DSLR+iPhone images, semantics. We're currently beta-testing, please bear with us - approval may initially take up to 2 weeks. Test scenes and benchmark to come!
0
41
164
Can we match visual features jointly across multiple frames? Yes! @barbara_roessle's #ICCV2023 paper proposes a differentiable pose optimization for end2end feature matching across multiple frames, thus obtaining better poses! https://t.co/CCYtA5PxCS
https://t.co/cM0gaG3ids
1
92
385