Jaihoon Kim Profile
Jaihoon Kim

@KimJaihoon

Followers
69
Following
40
Media
25
Statuses
78

PhD Student @ KAIST

Joined June 2023
@yuseungleee
Phillip (Yuseung) Lee
25 days
🌴 Happy to attend #ICCV2025 in Hawaii! I'll be presenting our paper on enabling VLMs to perform spatial reasoning from arbitrary perspectives. 📔 Paper: https://t.co/iX9Pt0AWEh 🖥️ Project Page: https://t.co/sh5W8VLwZO ✔️ Poster: Oct 21 (Tue), Session 2 & Exhibit Hall, #858
2
7
48
@KimJaihoon
Jaihoon Kim
1 month
📢 Excited to share that our paper "Inference-Time Scaling for Flow Models via Stochastic Generation and Rollover Budget Forcing" has been accepted to #NeurIPS 2025 🔗 https://t.co/DACWrdERIE 📌
0
0
9
@KimJaihoon
Jaihoon Kim
5 months
๐Ÿง Can we define a better initial prior for Sequential Monte Carlo in reward alignment? That's exactly what ฮจ-Sampler ๐Ÿ”ฑ does. Check out the paper for details: ๐Ÿ“Œ
arxiv.org
We introduce $Ψ$-Sampler, an SMC-based framework incorporating pCNL-based initial particle sampling for effective inference-time reward alignment with a score-based generative model...
@taehoonyoon_
Taehoon Yoon
5 months
We present our paper "Ψ-Sampler: Initial Particle Sampling for SMC-Based Inference-Time Reward Alignment in Score Models". Check out more details: arXiv: https://t.co/pDSllDC79O Website:
0
0
6
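To make the abstract above concrete, here is a minimal, hedged sketch of the two ingredients it names: reward-tilted initial particle sampling followed by an SMC loop with reward-weighted resampling. The paper's initializer is pCNL; a plain (unadjusted) Langevin step stands in for it here, and `reward_grad`, `denoise_step`, and `reward` are hypothetical stand-ins, not the paper's API.

```python
import numpy as np

def langevin_init(reward_grad, n_particles, dim, n_steps=20, step=1e-2, lam=1.0):
    """Tilt the Gaussian prior toward high reward before SMC starts.
    The paper uses pCNL; this is a plain Langevin stand-in."""
    x = np.random.randn(n_particles, dim)
    for _ in range(n_steps):
        # grad log [ N(0, I)(x) * r(x)^lam ] = -x + lam * grad log r(x)
        x = (x + step * (-x + lam * reward_grad(x))
             + np.sqrt(2 * step) * np.random.randn(n_particles, dim))
    return x

def smc_reward_alignment(denoise_step, reward, x, n_steps=50):
    """Toy SMC loop: propagate particles through the reverse process and
    resample them in proportion to an (unnormalized) reward weight."""
    n = x.shape[0]
    for t in range(n_steps, 0, -1):
        x = denoise_step(x, t)                   # one reverse-generation step
        w = np.exp(reward(x))                    # crude importance weights
        w = w / w.sum()
        x = x[np.random.choice(n, size=n, p=w)]  # multinomial resampling
    return x
```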
@KimJaihoon
Jaihoon Kim
7 months
📈 Can pretrained flow models generate images from complex compositional prompts, including logical relations and quantities, without further fine-tuning? 🚀 We have released our code for inference-time scaling for flow models:
github.com
[NeurIPS 2025] Official code for Inference-Time Scaling for Flow Models via Stochastic Generation and Rollover Budget Forcing - KAIST-Visual-AI-Group/Flow-Inference-Time-Scaling
@_akhaliq
AK
8 months
Inference-Time Scaling for Flow Models via Stochastic Generation and Rollover Budget Forcing
0
5
29
@MinhyukSung
Minhyuk Sung
7 months
I recently presented our work, "Inference-Time Guided Generation with Diffusion and Flow Models," at HKUST (CVM 2025 keynote) and NTU (MMLab), covering three classes of guidance methods for diffusion models and their extensions to flow models. Slides: https://t.co/yl2KPYGTRc
0
20
108
@yuseungleee
Phillip (Yuseung) Lee
7 months
โ—๏ธVision-Language Models (VLMs) struggle with even basic perspective changes! โœ๏ธ In our new preprint, we aim to extend the spatial reasoning capabilities of VLMs to โญ๏ธarbitraryโญ๏ธ perspectives. ๐Ÿ“„Paper: https://t.co/qq5s8jHtVN ๐Ÿ”—Project: https://t.co/sh5W8VLwZO ๐Ÿงต[1/N]
4
37
151
@MinhyukSung
Minhyuk Sung
7 months
#ICLR2025 Come join our StochSync poster (#103) this morning! We introduce a method that combines the best parts of Score Distillation Sampling and Diffusion Synchronization to generate high-quality and consistent panoramas and mesh textures. https://t.co/5TAJxvEUcL
stochsync.github.io
@KyeongminYeo
Kyeongmin Yeo
7 months
🎉 Join us tomorrow at the #ICLR2025 poster session to learn about our work, "StochSync," extending pretrained diffusion models to generate images in arbitrary spaces! 📌: Hall 3 + Hall 2B #103 📅: Apr. 25, 10AM-12:30PM [1/8]
0
7
21
@KyeongminYeo
Kyeongmin Yeo
7 months
🎉 Join us tomorrow at the #ICLR2025 poster session to learn about our work, "StochSync," extending pretrained diffusion models to generate images in arbitrary spaces! 📌: Hall 3 + Hall 2B #103 📅: Apr. 25, 10AM-12:30PM [1/8]
2
9
13
@KimJaihoon
Jaihoon Kim
7 months
🇸🇬 Attending #ICLR2025? Check out how we extend pretrained diffusion models to generate images in arbitrary spaces. 📌: Hall 3 + Hall 2B #103 📅: 10AM-12:30PM
0
3
18
@KimJaihoon
Jaihoon Kim
7 months
How can VLMs reason from arbitrary perspectives? 🔥 "Perspective-Aware Reasoning in Vision-Language Models via Mental Imagery Simulation" proposes a framework that enables spatial reasoning of VLMs from arbitrary perspectives.
@_akhaliq
AK
7 months
Perspective-Aware Reasoning in Vision-Language Models via Mental Imagery Simulation
0
0
7
@_akhaliq
AK
7 months
Perspective-Aware Reasoning in Vision-Language Models via Mental Imagery Simulation
5
19
152
@KimJaihoon
Jaihoon Kim
7 months
🔥 KAIST Visual AI Group is hiring interns for Summer 2025. ❓ Can non-KAIST students apply? Yes! ❓ Can international students who are not enrolled in any Korean institution apply? Yes! More info at 🔗
@MinhyukSung
Minhyuk Sung
7 months
🚀 We're hiring! The KAIST Visual AI Group is looking for Summer 2025 undergraduate interns. Interested in: 🌀 Diffusion / Flow / AR models (images, videos, text, more) 🧠 VLMs / LLMs / Foundation models 🧊 3D generation & neural rendering Apply now 👉 https://t.co/h7FdzC8Hmt
0
1
9
@myh4832
Yunhong Min
7 months
🔥 Grounding 3D Orientation in Text-to-Image 🔥 🎯 We present ORIGEN, the first zero-shot method for accurate 3D orientation grounding in text-to-image generation! 📄 Paper: https://t.co/x20WdG96Hs 🌐 Project: https://t.co/fE7ozSbf46
3
19
92
@MinhyukSung
Minhyuk Sung
7 months
Introducing ORIGEN: the first orientation-grounding method for image generation with multiple open-vocabulary objects. It's a novel zero-shot, reward-guided approach using Langevin dynamics, built on a one-step generative model like Flux-schnell. Project:
@myh4832
Yunhong Min
7 months
🔥 Grounding 3D Orientation in Text-to-Image 🔥 🎯 We present ORIGEN, the first zero-shot method for accurate 3D orientation grounding in text-to-image generation! 📄 Paper: https://t.co/x20WdG96Hs 🌐 Project: https://t.co/fE7ozSbf46
0
5
30
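As a rough illustration of the "reward-guided approach using Langevin dynamics" described above (the exact procedure is in the paper): run noisy gradient steps on the input noise of a differentiable one-step generator, balancing a reward term against a Gaussian-prior term. `generator` and `orientation_reward` are hypothetical stand-ins.

```python
import torch

def langevin_reward_guidance(generator, orientation_reward, z,
                             n_steps=50, step=1e-2, lam=100.0):
    """Noisy gradient ascent on log N(z; 0, I) + lam * reward(generator(z))."""
    for _ in range(n_steps):
        z = z.detach().requires_grad_(True)
        img = generator(z)  # one-step generation, kept differentiable
        logp = -0.5 * (z ** 2).sum() + lam * orientation_reward(img)
        (grad,) = torch.autograd.grad(logp, z)
        with torch.no_grad():
            z = z + step * grad + (2 * step) ** 0.5 * torch.randn_like(z)
    return generator(z.detach())
```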
@KimJaihoon
Jaihoon Kim
7 months
🚀 Check out our inference-time scaling with FLUX. GPT-4o struggles to follow user prompts involving compositional logical relations. Our inference-time scaling enables efficient search to generate samples with precise alignment to the input text. 🔗
@MinhyukSung
Minhyuk Sung
7 months
GPT-4o vs. our test-time scaling with FLUX (2/2): GPT-4o cannot precisely understand the text (e.g., misinterpreting "occupying chairs" on the left), while our test-time technique generates an image perfectly aligned with the prompt. Check out more 👇 🌐 https://t.co/3zMdsrp1Ln
0
2
9
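At its simplest, the "efficient search" mentioned above amounts to drawing several stochastic candidates and keeping the one a verifier scores highest; the actual method goes further by reallocating the compute budget during sampling (rollover budget forcing). A minimal best-of-N sketch, where `sample_flow_sde` and `text_alignment_score` are hypothetical stand-ins:

```python
def best_of_n(prompt, n=8):
    """Naive best-of-N search: generate n stochastic samples and keep the
    one that a text-alignment verifier (e.g., a VLM scorer) rates highest."""
    candidates = [sample_flow_sde(prompt) for _ in range(n)]
    return max(candidates, key=lambda img: text_alignment_score(img, prompt))
```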
@TheTuringPost
TuringPost
8 months
Inference-time scaling can work for flow models. @kaist_ai proposed 3 key ideas to make it possible: • SDE-based generation – Adding controlled randomness allows flow models to explore more outputs, like diffusion models do. • VP interpolant conversion – Guides the model from
1
8
29
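A minimal sketch of the first bullet (SDE-based generation), assuming a velocity-prediction flow model: the deterministic ODE update x += v(x, t) dt becomes a stochastic update with a score-based drift correction so the injected noise does not shift the marginals. `velocity` and `score_from_velocity` are hypothetical stand-ins; the exact drift and the VP interpolant conversion are specified in the paper.

```python
import numpy as np

def flow_sde_step(x, t, dt, velocity, score_from_velocity, sigma=0.5):
    """One Euler-Maruyama step of an SDE whose marginals match the flow ODE:
    drift = velocity + (sigma^2 / 2) * score, plus sigma * dW of fresh noise."""
    drift = velocity(x, t) + 0.5 * sigma ** 2 * score_from_velocity(x, t)
    return x + drift * dt + sigma * np.sqrt(dt) * np.random.randn(*x.shape)
```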
@MinhyukSung
Minhyuk Sung
8 months
Unconditional Priors Matter! The key to improving CFG-based "conditional" generation in diffusion models actually lies in the quality of their "unconditional" prior. Replace it with a better one to improve conditional generation! 🌐
@PrinPhunya
Prin P.
8 months
Unconditional Priors Matter! Improving Conditional Generation of Fine-Tuned Diffusion Models without Additional Training Costs arXiv: https://t.co/sxAHpY5e2P Project: https://t.co/618Ut10yGc
0
4
25
@_akhaliq
AK
8 months
ORIGEN Zero-Shot 3D Orientation Grounding in Text-to-Image Generation
5
35
191
@yuseungleee
Phillip (Yuseung) Lee
8 months
🔎 Unconditional priors matter! When fine-tuning diffusion models for conditional tasks, the **unconditional** distribution often breaks down. 🔑 We propose a simple fix: mix the predicted noise from the **fine-tuned** model and its **base** model!
1
14
77
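The fix described in the tweet above reduces to one line of classifier-free guidance: take the conditional noise prediction from the fine-tuned model, but take the unconditional one from (or mixed with) the base model. A minimal sketch, assuming `finetuned` and `base` are denoisers returning predicted noise and `null_cond` is the empty conditioning:

```python
def cfg_with_base_prior(finetuned, base, x_t, t, cond, null_cond,
                        w=7.5, alpha=1.0):
    """Classifier-free guidance with the unconditional prior swapped out:
    alpha=1 fully replaces the fine-tuned model's degraded unconditional
    prediction with the base model's; 0 < alpha < 1 mixes the two."""
    eps_cond = finetuned(x_t, t, cond)
    eps_uncond = (alpha * base(x_t, t, null_cond)
                  + (1 - alpha) * finetuned(x_t, t, null_cond))
    return eps_uncond + w * (eps_cond - eps_uncond)
```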
@KimJaihoon
Jaihoon Kim
8 months
📌 Unconditional Priors Matter! Fine-tuned diffusion models often degrade in unconditional quality, hurting conditional generation. We show that plugging in richer unconditional priors from other models boosts performance. No retraining needed. 🚀 🔗:
@PrinPhunya
Prin P.
8 months
Unconditional Priors Matter! Improving Conditional Generation of Fine-Tuned Diffusion Models without Additional Training Costs arXiv: https://t.co/sxAHpY5e2P Project: https://t.co/618Ut10yGc
0
0
6