Orest Kupyn
@OKupyn
Followers
79
Following
174
Media
4
Statuses
73
CV Researcher, Ukrainian 🇺🇦 | PhD Student at University of Oxford, Visual Geometry Group
Joined February 2022
Sometimes the best solution is a well-established geometric principle from decades ago. Paper: https://t.co/mZ5MAcx7Tf Code: https://t.co/jIwiv0zyrz Check out the project page for comparisons and results!
github.com
Official repo for: Epipolar Geometry Improves Video Generation Models - KupynOrest/epipolar-dpo
0
1
2
Training on static scenes with dynamic cameras surprisingly generalizes to dynamic scenes. This makes sense because stable camera trajectories reduce artifacts across ALL scene types, even with moving objects.
1
0
2
The key insight? Classical geometric constraints provide cleaner optimization signals than modern learned metrics. We found that learnable metrics produce noisy preferences that can compromise alignment - sometimes failing to optimize their own metric!
1
0
2
Our solution: Use Flow-DPO (which only needs relative rankings!) with a surprisingly simple metric from 1982 - the Sampson epipolar error. Add rigorous data filtering (ensure meaningful gaps, remove static scenes) and you get a strong, principled approach to 3D consistency.
1
0
2
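The metric named in the post above can be illustrated with a minimal sketch. It assumes a fundamental matrix `F` and point correspondences between two frames are already available; how they are estimated is not stated in this feed. The helper `make_preference_pair` and its `min_gap` threshold are hypothetical names that illustrate "relative rankings" with a "meaningful gap" filter. This is not the authors' implementation.

```python
from typing import Optional, Sequence, Tuple

Matrix3 = Sequence[Sequence[float]]
Point = Tuple[float, float]


def _matvec(m: Matrix3, v: Sequence[float]) -> list:
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


def _transpose(m: Matrix3) -> list:
    """Transpose a 3x3 matrix."""
    return [[m[j][i] for j in range(3)] for i in range(3)]


def sampson_error(F: Matrix3, p1: Point, p2: Point) -> float:
    """First-order (Sampson) approximation of the squared epipolar distance.

    p1 and p2 are pixel coordinates of the same point in frame 1 and frame 2.
    """
    x1 = [p1[0], p1[1], 1.0]  # homogeneous coordinates
    x2 = [p2[0], p2[1], 1.0]
    Fx1 = _matvec(F, x1)
    Ftx2 = _matvec(_transpose(F), x2)
    residual = sum(x2[i] * Fx1[i] for i in range(3))  # x2^T F x1
    denom = Fx1[0] ** 2 + Fx1[1] ** 2 + Ftx2[0] ** 2 + Ftx2[1] ** 2
    if denom == 0.0:
        return float("inf")  # degenerate configuration
    return residual ** 2 / denom


def mean_sampson_error(F: Matrix3, matches: Sequence[Tuple[Point, Point]]) -> float:
    """Average Sampson error over a set of correspondences (lower is better)."""
    return sum(sampson_error(F, a, b) for a, b in matches) / len(matches)


def make_preference_pair(
    err_a: float, err_b: float, min_gap: float
) -> Optional[Tuple[int, int]]:
    """Return (preferred_index, rejected_index), or None if the gap is too small.

    Hypothetical filter: pairs whose errors differ by less than min_gap are
    dropped because their ranking would be mostly noise.
    """
    if abs(err_a - err_b) < min_gap:
        return None
    return (0, 1) if err_a < err_b else (1, 0)
```

For a camera translating purely along the x-axis, `F = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]` and the error reduces to half the squared vertical offset between matched points, so matches on the same image row score zero. Only the ordering of two videos' errors is used, which is why a non-differentiable metric is sufficient for preference-based training.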
Why is enforcing 3D consistency so tricky? 1️⃣ Reconstruct → measure quality → use as reward: too slow 2️⃣ Manual labels → VLM → reward: noisy and expensive 3️⃣ Classical geometry metrics: often non-differentiable
1
0
2
Video diffusion models have made incredible progress and are increasingly used for 3D tasks like reconstruction and novel view synthesis. But there's still one major problem: they struggle with 3D consistency, producing geometric artifacts and unstable camera trajectories.
1
0
2
Epipolar Geometry Improves Video Generation Models. Excited to share our new work in collaboration with Fabian Manhardt, @fedassa and Christian Rupprecht! Project Page: https://t.co/2rttYX9LPP Thread 🧵
1
8
67
Released: Paper: https://t.co/NAzpEcXv26 Project: https://t.co/rD8HxMtGNb Dataset: https://t.co/R6fhxd2gzZ Demo: https://t.co/WoV7FkDa0t Diffusion models are knowledge engines. Time to tap into them.
0
2
1
Architecture: multi-mask decoder that predicts multiple valid hypotheses. Salient object detection is ambiguous: embrace it, don't average it away. Results: up to 50% error reduction on cross-dataset eval, SOTA on DIS & HR-SOD. Purely synthetic training matches real data.
1
1
0
Generated 139K+ photorealistic samples with occlusions, complex backgrounds, diverse scenes. The pipeline scales with compute - generate as many as you need. The generation adapts to hard samples: evaluate performance → prioritize challenging categories → continuous improvement
1
1
0
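The "evaluate performance → prioritize challenging categories" loop in the post above can be sketched as a reweighting step. This is an illustrative assumption only: the function name, the category names, and the choice of weights proportional to per-category error are hypothetical and not taken from the paper.

```python
from typing import Dict


def category_sampling_weights(error_by_category: Dict[str, float]) -> Dict[str, float]:
    """Turn per-category validation error into generation sampling weights.

    Assumption: categories where the current model performs worse should be
    generated more often, so weights are proportional to the measured error.
    """
    total = sum(error_by_category.values())
    if total == 0.0:
        # No category is harder than another: fall back to uniform sampling.
        uniform = 1.0 / len(error_by_category)
        return {name: uniform for name in error_by_category}
    return {name: err / total for name, err in error_by_category.items()}
```

Calling it with `{"glass": 0.3, "animals": 0.1}` gives the harder category three quarters of the next generation budget; repeating the evaluate-and-reweight step after each training round gives the loop described in the post.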
We extract complementary information from three sources during generation: FLUX DiT features, concept attention maps, and DINO-v3 representations. This extends the diffusion model to auto-generate masks alongside images, streamlining the entire data creation pipeline.
1
2
0
Modern diffusion models generate complex scenes with detailed lighting, occlusions, photorealistic textures. During generation, they encode massive amounts of spatial & semantic knowledge. Meanwhile, we're manually annotating data at 10 hours per sample. Something's broken. 🧵
2
2
0
Paper Release: S3OD: Towards Generalizable Salient Object Detection with Synthetic Data. Project Link:
1
4
8
In Donetsk region, russians killed more than 20 people, civilians who had gathered at that moment to receive their pensions. An air bomb was dropped on them. Earlier Trump said he had a good conversation with Vladimir Putin and that Putin definitely wants peace.
487
6K
8K
We built a system that detects highly complex objects NO vision model can find: introducing tool use for vision. 🧵 Say you wanted to detect the @ycombinator logo in this image. (▶️ see step-by-step thinking) Using our solution, it's able to detect the logo perfectly,
10
13
38
Excited to share VMem: a novel memory mechanism for consistent video scene generation. VMem evolves its understanding of scene geometry to retrieve the most relevant past frames, enabling long-term consistency. https://t.co/AHBj6j1ecE https://t.co/FbUbJHWW4F 1/ 🧵
huggingface.co
4
12
59
Many Congratulations to @jianyuan_wang, @MinghaoChen23, @n_karaev, Andrea Vedaldi, Christian Rupprecht and @davnov134 for winning the Best Paper Award @CVPR for "VGGT: Visual Geometry Grounded Transformer" #CVPR2025!!!!!!
17
70
494
We are presenting Dual Point Maps as a #CVPR highlight tomorrow! Learn about our novel, data-efficient representation for 3D/4D deformable objects, an alternative to classical template shape models. ExHall D, Poster #100, afternoon session https://t.co/XclARf2SK7
1
7
35
Do you have a PhD, and want to push the frontier of computer vision and robotics? The Visual Geometry Group (VGG) in Oxford is hiring a postdoc! PI: Dr. João Henriques. Deadline: 2 June at noon (UK). More details: https://t.co/XTAx9qgFEt
0
14
61
Sunday. Christian holiday before Easter. Barbaric brutal murder of civilians in Sumy. Russians did it intentionally. But never mind, @SteveWitkoff will thank Putin again and lick his bloody hands.
810
10K
15K