
Sihyun Yu
@sihyun_yu
Followers
1K
Following
1K
Media
13
Statuses
159
Ph.D. student @ KAIST | Ex-intern @NVIDIAAI and @GoogleAI | Generative models | https://t.co/wTvMmsks3e
Daejeon
Joined July 2020
Introducing REPA! We show that learning high-quality representations in diffusion transformers is crucial for boosting generation performance. With REPA, we speed up SiT training by 17.5x (without CFG) and achieve state-of-the-art FID = 1.42 using CFG with the guidance interval.
6
46
286
RT @_akhaliq: Enhancing Motion Dynamics of Image-to-Video Models via Adaptive Low-Pass Guidance
0
21
0
RT @SoojungYang2: 🚀 Come check our poster at ICML @genbio_workshop! We show that pretrained MLIPs can accelerate training of Boltzmann emul…
0
18
0
I’ve wondered why I2V models tend to generate more static videos compared to their T2V counterparts. This project, led by @june_suk_choi, provides an analysis of this phenomenon and introduces a very simple (yet effective) fix to address it! Excited to have been part of this.
Excited to share Adaptive Low-Pass Guidance (ALG): a simple training-free, drop-in fix that brings dynamic motion back to Image-to-Video models! Demo videos, paper, & code below! (🧵 1/7)
0
2
29
RT @sainingxie: @joserf28323 @CVPR @ICCVConference @nyuniversity Thanks for bringing this to my attention. I honestly wasn’t aware of the s…
0
29
0
Excited to share MDMs for molecule generation led by @bellaseo72 and @taewonKKK!
Meet MELD: a masked diffusion model (MDM) designed for de novo molecule generation. MELD assigns a per-element learnable noise schedule that tailors noise at the atom & bond level to avoid the state-clashing problem. With MELD we achieve state-of-the-art property alignment in
0
1
11
RT @ZhengyangGeng: now the code is up here:
github.com
JAX implementation of MeanFlow. Contribute to Gsunshine/meanflow development by creating an account on GitHub.
0
17
0
RT @wenhaocha1: We introduce LiveCodeBench Pro. Models like o3-high, o4-mini, and Gemini 2.5 Pro score 0% on hard competitive programming p…
0
27
0
RT @ArashVahdat: The slides for my CVPR talks are now available at
latentspace.cc
Arash Vahdat is a Research Director, leading the fundamental generative AI research (GenAIR) team at NVIDIA Research. Before joining NVIDIA, he was a research scientist at D-Wave Systems where he...
0
19
0
RT @RickyTQChen: Padding in our non-AR sequence models? Yuck. 🙅 👉 Instead of unmasking, our new work *Edit Flows* perform iterative refine…
0
79
0
RT @sainingxie: Had a great time at this CVPR community-building workshop---lots of fun discussions and some really important insights for…
0
66
0
RT @younggyoseo: Excited to present FastTD3: a simple, fast, and capable off-policy RL algorithm for humanoid control -- with an open-sourc…
0
114
0
RT @sainingxie: Indeed. For text-to-image, @xichen_pan had a great summary supporting this decoupled design philosophy: "Render unto diffus…
0
36
0
1. Controllable human generation: led by @cpis9898
2. Long video tokenization: led by @huiwon0516 and @younggyoseo
3. Long video generation: an internship project at Google Research in collaboration with
arxiv.org
Diffusion models are successful for synthesizing high-quality videos but are limited to generating short clips (e.g., 2-10 seconds). Synthesizing sustained footage (e.g. over minutes) still...
0
0
5
I'll be at #CVPR2025 to present three papers on controllable human generation, efficient long video tokenization, and long video generation with memory modules. Would love to catch up — feel free to DM me if you're around and up for coffee!
1
0
30
RT @ZhengyangGeng: Excited to share our work with my amazing collaborators, @Goodeat258, @SimulatedAnneal, @zicokolter, and Kaiming. In a…
0
39
0
RT @iScienceLuvr: Mean Flows for One-step Generative Modeling. "We introduce the notion of average velocity to characterize flow fields, i…
0
62
0