Jaihoon Kim
@KimJaihoon
Followers: 69 · Following: 40 · Media: 25 · Statuses: 78
Happy to attend #ICCV2025 in Hawaii! I'll be presenting our paper on enabling VLMs to perform spatial reasoning from arbitrary perspectives. Paper: https://t.co/iX9Pt0AWEh Project Page: https://t.co/sh5W8VLwZO Poster: Oct 21 (Tue) Session 2 & Exhibit Hall, #858
Excited to share that our paper "Inference-Time Scaling for Flow Models via Stochastic Generation and Rollover Budget Forcing" has been accepted to #NeurIPS 2025! https://t.co/DACWrdERIE
Can we define a better initial prior for Sequential Monte Carlo in reward alignment? That's exactly what Ψ-Sampler does. Check out the paper for details:
arxiv.org
We introduce $\Psi$-Sampler, an SMC-based framework incorporating pCNL-based initial particle sampling for effective inference-time reward alignment with a score-based generative model...
We present our paper "Ψ-Sampler: Initial Particle Sampling for SMC-Based Inference-Time Reward Alignment in Score Models". Check out more details. arXiv: https://t.co/pDSllDC79O Website:
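To make the SMC idea concrete, here is a minimal sketch of one reweight-and-resample step of Sequential Monte Carlo for reward alignment. This is not the Ψ-Sampler algorithm itself (which concerns pCNL-based *initial* particle sampling); the reward function, temperature, and particle representation are all illustrative assumptions.

```python
import numpy as np

def smc_resample_step(particles, reward_fn, temperature=0.1, rng=None):
    """One reweight-and-resample step of Sequential Monte Carlo.

    Particles with higher reward get exponentially larger weights and are
    duplicated by multinomial resampling; low-reward particles die out.
    """
    rng = rng or np.random.default_rng(0)
    rewards = np.array([reward_fn(p) for p in particles])
    # Softmax weights: w_i proportional to exp(r_i / temperature)
    logits = rewards / temperature
    logits -= logits.max()                     # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum()
    # Multinomial resampling according to the normalized weights
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return [particles[i] for i in idx]

# Toy usage: particles are scalars, reward prefers values near 1.0
particles = list(np.linspace(-2.0, 2.0, 8))
resampled = smc_resample_step(particles, reward_fn=lambda x: -(x - 1.0) ** 2)
```

After one step the surviving population concentrates near the reward peak; a full SMC sampler would interleave such steps with model transitions.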
Can pretrained flow models generate images from complex compositional prompts, including logical relations and quantities, without further fine-tuning? We have released our code for inference-time scaling for flow models:
github.com
[NeurIPS 2025] Official code for Inference-Time Scaling for Flow Models via Stochastic Generation and Rollover Budget Forcing - KAIST-Visual-AI-Group/Flow-Inference-Time-Scaling
I recently presented our work, "Inference-Time Guided Generation with Diffusion and Flow Models," at HKUST (CVM 2025 keynote) and NTU (MMLab), covering three classes of guidance methods for diffusion models and their extensions to flow models. Slides: https://t.co/yl2KPYGTRc
Vision-Language Models (VLMs) struggle with even basic perspective changes! In our new preprint, we aim to extend the spatial reasoning capabilities of VLMs to *arbitrary* perspectives. Paper: https://t.co/qq5s8jHtVN Project: https://t.co/sh5W8VLwZO [1/N]
#ICLR2025 Come join our StochSync poster (#103) this morning! We introduce a method that combines the best parts of Score Distillation Sampling and Diffusion Synchronization to generate high-quality and consistent panoramas and mesh textures. https://t.co/5TAJxvEUcL
stochsync.github.io
Join us tomorrow at the #ICLR2025 poster session to learn about our work, "StochSync," extending pretrained diffusion models to generate images in arbitrary spaces! Location: Hall 3 + Hall 2B #103. Time: Apr. 25, 10AM-12:30PM [1/8]
How can VLMs reason from arbitrary perspectives? "Perspective-Aware Reasoning in Vision-Language Models via Mental Imagery Simulation" proposes a framework that enables spatial reasoning of VLMs from arbitrary perspectives.
Perspective-Aware Reasoning in Vision-Language Models via Mental Imagery Simulation
KAIST Visual AI Group is hiring interns for Summer 2025. Can non-KAIST students apply? Yes! Can international students who are not enrolled in any Korean institution apply? Yes! More info below:
We're hiring! The KAIST Visual AI Group is looking for Summer 2025 undergraduate interns. Interested in: Diffusion / Flow / AR models (images, videos, text, more); VLMs / LLMs / foundation models; 3D generation & neural rendering. Apply now: https://t.co/h7FdzC8Hmt
Grounding 3D Orientation in Text-to-Image: We present ORIGEN, the first zero-shot method for accurate 3D orientation grounding in text-to-image generation! Paper: https://t.co/x20WdG96Hs Project: https://t.co/fE7ozSbf46
Introducing ORIGEN: the first orientation-grounding method for image generation with multiple open-vocabulary objects. It's a novel zero-shot, reward-guided approach using Langevin dynamics, built on a one-step generative model like Flux-schnell. Project:
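The reward-guided Langevin idea can be sketched in a few lines: iterate noisy gradient ascent on the reward in the latent space of a one-step generator. This is a toy sketch under assumed names, with an analytic gradient standing in for a real reward model and no actual image generator involved.

```python
import numpy as np

def langevin_reward_sampling(grad_reward, z0, step=0.01, n_steps=200, rng=None):
    """Reward-guided Langevin dynamics in the latent space of a one-step
    generator: z <- z + eta * grad r(z) + sqrt(2 * eta) * noise.

    `grad_reward` returns the gradient of the (toy) reward w.r.t. the latent.
    """
    rng = rng or np.random.default_rng(0)
    z = np.array(z0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(z.shape)
        z = z + step * grad_reward(z) + np.sqrt(2 * step) * noise
    return z

# Toy reward r(z) = -||z - mu||^2 / 2, whose gradient is (mu - z):
mu = np.array([3.0, -1.0])
z_final = langevin_reward_sampling(lambda z: mu - z, z0=np.zeros(2))
```

With this quadratic reward the chain is an Ornstein-Uhlenbeck process drifting toward `mu`; in the actual method the gradient would flow through the generator and an orientation reward instead.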
Check out our inference-time scaling with FLUX. GPT-4o struggles to follow user prompts involving compositional logical relations. Our inference-time scaling enables efficient search to generate samples with precise alignment to the input text.
GPT-4o vs. our test-time scaling with FLUX (2/2). GPT-4o cannot precisely understand the text (e.g., misinterpreting "occupying chairs" on the left), while our test-time technique generates an image perfectly aligned with the prompt. Check out more: https://t.co/3zMdsrp1Ln
Inference-time scaling can work for flow models. @kaist_ai proposed 3 key ideas to make it possible:
• SDE-based generation: adding controlled randomness allows flow models to explore more outputs, like diffusion models do.
• VP interpolant conversion: guides the model from
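The underlying search idea behind inference-time scaling can be illustrated with the simplest variant: draw several stochastic samples and keep the highest-reward one. This sketch uses placeholder names and a trivial "generator"; the paper's rollover budget forcing allocates the sampling budget adaptively rather than uniformly as here.

```python
import numpy as np

def best_of_n_search(generate, reward_fn, n_candidates=16, rng=None):
    """Simplest form of inference-time scaling: spend extra compute by
    drawing N stochastic samples and keeping the highest-reward one.
    """
    rng = rng or np.random.default_rng(0)
    best, best_r = None, -np.inf
    for _ in range(n_candidates):
        noise = rng.standard_normal(4)      # fresh stochastic seed per sample
        sample = generate(noise)            # stand-in for SDE-based generation
        r = reward_fn(sample)
        if r > best_r:
            best, best_r = sample, r
    return best, best_r

# Toy usage: the "generator" is the identity map and the reward
# simply prefers a large first coordinate.
sample, score = best_of_n_search(lambda z: z, lambda x: x[0])
```

Because generation is stochastic, each candidate explores a different output, which is exactly why the SDE-based formulation matters for flow models.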
Unconditional Priors Matter! The key to improving CFG-based "conditional" generation in diffusion models actually lies in the quality of their "unconditional" prior. Replace it with a better one to improve conditional generation!
Unconditional Priors Matter! Improving Conditional Generation of Fine-Tuned Diffusion Models without Additional Training Costs arXiv: https://t.co/sxAHpY5e2P Project: https://t.co/618Ut10yGc
ORIGEN: Zero-Shot 3D Orientation Grounding in Text-to-Image Generation
Unconditional priors matter! When fine-tuning diffusion models for conditional tasks, the **unconditional** distribution often breaks down. We propose a simple fix: simply mix the predicted noise from the **fine-tuned** model and its **base** model!
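The mixing idea can be sketched as a small change to the classifier-free guidance combination: swap (or blend) the fine-tuned model's degraded unconditional prediction with the base model's. This is a hedged sketch; the function name, `mix` parameterization, and exact blending rule are illustrative assumptions, not the paper's precise formulation.

```python
import numpy as np

def cfg_with_base_unconditional(eps_cond_ft, eps_uncond_ft, eps_uncond_base,
                                guidance_scale=7.5, mix=1.0):
    """Classifier-free guidance where the fine-tuned model's unconditional
    noise prediction is (partially) replaced by the base model's.

    mix = 0 recovers standard CFG using only the fine-tuned model;
    mix = 1 uses the base model's unconditional prior entirely.
    """
    eps_uncond = (1.0 - mix) * eps_uncond_ft + mix * eps_uncond_base
    return eps_uncond + guidance_scale * (eps_cond_ft - eps_uncond)

# Toy arrays standing in for latent noise predictions
e_c = np.ones((2, 2))
e_u_ft = np.zeros((2, 2))
e_u_base = 0.5 * np.ones((2, 2))
out = cfg_with_base_unconditional(e_c, e_u_ft, e_u_base,
                                  guidance_scale=2.0, mix=1.0)
```

With `mix=1.0` the guidance direction is computed against the base model's unconditional prediction, so no retraining of the fine-tuned model is needed.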
Unconditional Priors Matter! Fine-tuned diffusion models often degrade in unconditional quality, hurting conditional generation. We show that plugging in richer unconditional priors from other models boosts performance. No retraining needed.