
Willi Menapace
@WilliMenapace
177 Followers · 95 Following · 4 Media · 27 Statuses
PhD Student - University of Trento, Italy
Trento, Trentino-South Tyrol
Joined June 2021
Why is progressive generation so complex? 🤔 It doesn't have to be. Our Decomposable Flow Matching (DFM) simplifies the process into a single, straightforward flow model, 🚀 beating prior work in image and video synthesis. #AI #Research #MachineLearning
Where are the good old progressive diffusion models? 🤔 Breaking generation into multiple resolution scales is a great idea, but the complexity (multiple models, a custom diffusion process, etc.) stalled scaling. Our Decomposable Flow Matching packs the multi-scale perks into one scalable model.
RT @ashmrz10: [1/9] 🚀 We introduce 4Real-Video-V2, a method that can generate 4D scenes from a simple text prompt, viewable from any angle…
RT @Dazitu_616: 📢 Introducing DenseDPO: Fine-Grained Temporal Preference Optimization for Video Diffusion Models. Compared to vanilla DPO,…
RT @isskoro: In the past 1.5 weeks, there appeared 2 papers by 2 different research groups which develop the exactly same (and embarrassing…
RT @AbermanKfir: We discovered that imposing a spatio-temporal weight space via LoRAs on DiT-based video models unlocks powerful customizat…
RT @AbdalRameen: What if you could compose videos, merging multiple clips, even capturing complex athletic moves where video models struggl…
Check out Video Alchemist! Our latest work enables multi-subject, open-set personalization with no need for inference-time tuning. 👇👇👇
Introducing ⚗️ Video Alchemist, our new video model supporting:
👪 Multi-subject open-set personalization
🏞️ Foreground & background personalization
🚀 Without the need of inference-time tuning
[Results] 1. Sora girl rides a dinosaur on a savanna. 🧵👇
Video-to-Audio and Audio-to-Video models struggle with temporal alignment. AV-Link solves the problem by conditioning on diffusion model features. Great collaboration with @moayedhajiali, @siarohin9013, @isskoro, @alpercanbe, Kwot Sin Lee, Vicente Ordonez and @SergeyTulyakov.
Can pretrained diffusion models connect for cross-modal generation? 📢 Introducing AV-Link ♾, bridging unimodal diffusion models in one framework to enable:
📽️ ➡️ 🔊 Video-to-Audio
🔊 ➡️ 📽️ Audio-to-Video
🌐: 📄: ⤵️ Results
RT @Dazitu_616: MinT beats Sora in multi-event generation! One week after the release of MinT, Sora also released a *storyboard* tool that…
RT @Dazitu_616: 📢 MinT: Temporally-Controlled Multi-Event Video Generation 📢 TL;DR: We identify a fundamental failu…
RT @taiyasaki: 📢📢📢 𝐀𝐂𝟑𝐃: Analyzing and Improving 3D Camera Control in Video Diffusion Transformers. TL;DR: for 3D…
Excited to share our latest work, 'Snap Video'! A great collaboration with @siarohin9013 @isskoro @deyneka_e @tsaishien_chen @anilkagak2 @studyfang_ Aleksei Stoliar @eliricci_ @JianRen_ @SergeyTulyakov. More to come soon! Project page:
RT @SergeyTulyakov: 2. Want to generate the whole city in 3D? Check out InfiniCity - a method that does exactly that! Project: https://t.c…
RT @VGolyanik: "Quantum Multi-Model Fitting", #CVPR2023 (Highlight). Our formulation can be efficiently sampled by a quantum annealer wi…
It was great to work with @eliricci_ @SergeyTulyakov Aliaksandr Siarohin @Steph_lat @VGolyanik and Christian Theobalt.
Can we reconstruct a 3D environment and make it playable? In our #CVPR2022 work on Playable Environments, we create a 3D representation of an environment, making it possible to control players and move the camera using a joystick, and to change the style of each object.
RT @ICCV_2021: Honourable Mention @ICCV_2021. Viewing Graph Solvability via Cycle Consistency. Federica Arrigoni (University of Trento), A…
RT @ElisaRi34560608: I am looking for a new PhD student to join my research group and work on Tiny Machine Learning. Send me a DM or email…