Robin Courant
@robin_courant
Followers: 61 · Following: 66 · Media: 5 · Statuses: 31
Happy to present “E.T. the Exceptional Trajectories: Text-to-Camera-Trajectory Generation with Character Awareness” (ECCV 2024), with @nico_dufour, @xiwang92, @MarcChristie4, and @VickyKalogeiton. Paper: https://t.co/eNdGW9EOSz Webpage: https://t.co/UqrE6FlXyp
Text-to-Image models don't need 3 training stages anymore! 🤯 Our new MIRO method integrates human alignment directly into pretraining. 19x faster convergence ⚡ and 370x less compute than FLUX-dev. Train once, align to many rewards. The era of multi-stage training is over!
Introducing Chapter-Llama [#CVPR2025], a framework for video chaptering using Large Language Models! Check it out: Paper: https://t.co/1KhPsgZYUN Project: https://t.co/68GevYyznx Code: https://t.co/MysWVlewRm Demo: https://t.co/zKmL6v3PKU
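For intuition, here is a minimal sketch of the general recipe (a language model reads a timestamped transcript and emits chapter boundaries); `complete` is a hypothetical text-completion function, not Chapter-Llama's actual interface:

```python
# Hedged sketch of LLM-based video chaptering: feed a timestamped
# transcript to a language model and parse chapter boundaries from its
# reply. `complete` is a hypothetical text-completion function.

def chapter_video(transcript_lines, complete):
    prompt = (
        "Segment this video into chapters. For each chapter, output one "
        "line formatted as 'HH:MM:SS - title'.\n\n"
        + "\n".join(transcript_lines)  # e.g. "00:01:23 welcome back everyone"
    )
    chapters = []
    for line in complete(prompt).splitlines():
        if " - " in line:
            start, title = line.split(" - ", 1)
            chapters.append((start.strip(), title.strip()))
    return chapters
```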
Masked Diffusion Models (MDMs) are a hot topic in generative AI 🔥: powerful, but slow due to multiple sampling steps. We (@Polytechnique and @Inria) introduce Di[M]O, a novel approach to distill MDMs into a one-step generator without sacrificing quality.
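As a rough illustration of one-step distillation (not Di[M]O's actual objective or API), a student can be trained to match, in a single forward pass from a fully masked input, the token distribution that the multi-step teacher produces:

```python
import torch
import torch.nn.functional as F

# Hedged sketch of one-step distillation for masked diffusion models.
# `teacher.sample_logits`, `student`, and `mask_id` are hypothetical
# stand-ins, not Di[M]O's actual interfaces.

def distill_step(student, teacher, batch_size, seq_len, mask_id):
    masked = torch.full((batch_size, seq_len), mask_id, dtype=torch.long)
    with torch.no_grad():
        target_logits = teacher.sample_logits(masked)  # slow multi-step teacher
    student_logits = student(masked)                   # one-step student pass
    # Match the student's one-step distribution to the teacher's output.
    return F.kl_div(F.log_softmax(student_logits, dim=-1),
                    F.softmax(target_logits, dim=-1),
                    reduction="batchmean")
```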
1/13 Introducing our latest work on improving relative camera pose regression with a novel pre-training approach, Alligat0R (https://t.co/Mi6iy5rQ1A)!
@GBourmaud @VincentLepetit2
🚨 News! 🚨 We have released the models from our latest paper "How far can we go with ImageNet for text-to-image generation?" Check out the models on HuggingFace: 🤗 https://t.co/jaNyoNDN6u... https://t.co/gH6gct7lUA
🎥 AKiRa provides control over camera motion and optics (focal length, distortion, aperture) in video diffusion, enabling cinematic effects like fisheye, focus shifts, and dolly zoom. Paper: https://t.co/0ajalZXZ3a Project Page: https://t.co/uGqwhFPWLK 🧵
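The dolly zoom is a nice example of why optics control matters: a pinhole camera renders a subject at distance d with focal length f at a size proportional to f/d, so the effect keeps f/d constant while the camera moves. A small sketch of that math (illustration only, not AKiRa's interface):

```python
import numpy as np

# A dolly zoom pulls the camera back while zooming in, so the subject's
# image size (proportional to f / d for a pinhole camera) stays fixed.

def dolly_zoom_focals(f0_mm, d0_m, distances_m):
    """Per-frame focal length that keeps the subject's image size fixed."""
    return f0_mm * np.asarray(distances_m) / d0_m

# Camera starts 2 m from the subject at 24 mm and retreats to 6 m over
# 48 frames; the lens must zoom from 24 mm to 72 mm to compensate.
distances = np.linspace(2.0, 6.0, num=48)
focals = dolly_zoom_focals(f0_mm=24.0, d0_m=2.0, distances_m=distances)
print(focals[0], focals[-1])  # 24.0 72.0
```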
Guessing where an image was taken is a hard and often ambiguous problem. Introducing diffusion-based geolocation: we predict global locations by refining random guesses into trajectories across the Earth's surface! Paper, code, and demo: https://t.co/pNRFZk9NYP
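Conceptually, diffusion-based sampling of a location looks something like the sketch below; `denoiser` is a hypothetical trained network and the Euler-style update is deliberately crude (real methods typically operate on the sphere):

```python
import torch

# Conceptual sketch only: `denoiser` is a hypothetical network that
# predicts the noise on a 2D location given the noisy location, the
# timestep, and an image embedding. Plain (lat, lon) is used for brevity.

@torch.no_grad()
def sample_location(denoiser, image_emb, steps=50):
    x = torch.randn(2)  # random initial (lat, lon) guess
    for t in reversed(range(1, steps + 1)):
        eps = denoiser(x, torch.tensor(t), image_emb)
        x = x - eps / steps                      # crude Euler-style update
        if t > 1:
            x = x + torch.randn_like(x) / steps  # keep sampling stochastic
    return x  # the successive values of x trace a trajectory over the globe
```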
I am presenting our paper MusicGen-Style, “Audio Conditioning for Music Generation via Discrete Bottleneck Features”, at @ISMIRConf this afternoon. The code and model weights are available at https://t.co/tSvrr446v3. You can now play with it!
github.com: Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable...
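The repository's MusicGen API makes trying the model straightforward. A hedged usage sketch, where the 'facebook/musicgen-style' checkpoint name is an assumption based on the paper's model name:

```python
# Usage sketch built on audiocraft's MusicGen API; the checkpoint name
# 'facebook/musicgen-style' is an assumption, not a confirmed identifier.
from audiocraft.models import MusicGen
from audiocraft.data.audio import audio_write

model = MusicGen.get_pretrained('facebook/musicgen-style')
model.set_generation_params(duration=8)                     # seconds of audio
wavs = model.generate(['lo-fi hip hop with a warm piano'])  # text prompt
audio_write('sample', wavs[0].cpu(), model.sample_rate)     # writes sample.wav
```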
#ECCV2024 Oct 2 (PM): E.T. the Exceptional Trajectories: Text-to-Camera-Trajectory Generation with Character Awareness. @robin_courant, @nico_dufour, @xiwang92, @MarcChristie4, and @VickyKalogeiton. pdf: https://t.co/ehKsij3e2V webpage: https://t.co/CbimUjiGhR
(1/8) 🎬 Introducing the Short Film Dataset (SFD), a long video QA benchmark with 1k short films and 5k questions. Why another videoQA dataset?
- Story-level QAs
- Publicly available videos
- Minimal data leakage
- Long temporal-context questions
https://t.co/FJQzIRgDxV
We now have a first version of the 512x512 model in the demo! Still training, but we will release the weights soon!
🚨 We updated the demo; you can try your favorite text2image prompts: https://t.co/pBv4RHEd6l Be gentle: it's a tiny model by image-generation standards (300M params), trained from scratch on a ridiculously small dataset (20M img+txt pairs), so it doesn't have high-def capabilities.
Very happy to announce that my paper “Audio Conditioning for Music Generation via Discrete Bottleneck Features”, with @honualx, @adiyossLC, @jadecopet, and Axel Roebel, has been accepted at ISMIR 2024. Paper: https://t.co/2KwG6Bk1jH Sample: https://t.co/Dkom70Eoie Code: soon
Paper: https://t.co/eNdGW9EOSz Webpage: https://t.co/UqrE6FlXyp Code: https://t.co/h2PxDUXEYn Dataset: https://t.co/1AFQWtQoU5 Demo: huggingface.co
Additionally, we train CLaTr, a Contrastive Language-Trajectory embedding, to facilitate the evaluation of camera trajectory generation models.
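A contrastive language-trajectory embedding in the CLIP mold can be sketched as below; the two encoders are hypothetical stand-ins for CLaTr's actual architecture:

```python
import torch
import torch.nn.functional as F

# Minimal sketch of a CLIP-style contrastive objective between camera
# trajectories and captions; `traj_encoder` and `text_encoder` are
# hypothetical stand-ins, not CLaTr's actual design.

def contrastive_loss(traj_encoder, text_encoder, trajs, captions, temp=0.07):
    z_traj = F.normalize(traj_encoder(trajs), dim=-1)     # (B, D)
    z_text = F.normalize(text_encoder(captions), dim=-1)  # (B, D)
    logits = z_traj @ z_text.t() / temp                   # (B, B) similarities
    targets = torch.arange(len(logits), device=logits.device)
    # Symmetric InfoNCE: each trajectory matches its own caption, and vice versa.
    return (F.cross_entropy(logits, targets)
            + F.cross_entropy(logits.t(), targets)) / 2
```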
DIRECTOR exhibits high controllability and diversity, is character-aware, and handles complex input conditions.
We demonstrate the potential of our dataset with DIRECTOR, a camera trajectory diffusion model that leverages both character trajectories and captions.
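In the abstract, one conditional-diffusion training step for such a model looks like the following sketch; the model signature and the cosine noise schedule are illustrative assumptions, not DIRECTOR's actual design:

```python
import torch
import torch.nn.functional as F

# Hedged sketch of a conditional-diffusion training step for camera
# trajectories, conditioned on character trajectories and a caption
# embedding. All interfaces here are hypothetical stand-ins.

def training_step(model, cam_traj, char_traj, text_emb, T=1000):
    t = torch.randint(0, T, (cam_traj.shape[0],))         # random timesteps
    noise = torch.randn_like(cam_traj)
    alpha_bar = torch.cos(t / T * torch.pi / 2) ** 2      # toy cosine schedule
    a = alpha_bar.view(-1, 1, 1)                          # cam_traj is (B, L, D)
    noisy = a.sqrt() * cam_traj + (1 - a).sqrt() * noise  # forward diffusion
    pred = model(noisy, t, char_traj, text_emb)           # conditioned denoiser
    return F.mse_loss(pred, noise)                        # noise-prediction loss
```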
We introduce a camera trajectory dataset called Exceptional Trajectories (E.T.), extracted from real movies. E.T. includes camera and character trajectories along with textual captions.