Jaehong Yoon
@jaeh0ng_yoon
2K Followers · 1K Following · 38 Media · 729 Statuses
Assistant Professor @ntusg CCDS | Prv: @uncnlp @MLAI_KAIST @MSFTResearch | Trustworthy & Continually Adaptable Multimodal AI
Singapore
Joined April 2016
🎉 Excited to share that 5/5 of my papers (3 main, 2 findings) have been accepted at #EMNLP2025, in video/multimodal reasoning, instructional video editing, and efficient LLM adaptation & reasoning! 🚨 I’m recruiting Ph.D. students to join the Multimodal AI Group at NTU College of Computing and Data Science (CCDS)…
15 replies · 31 reposts · 308 likes
Happy to introduce my internship work at @Google and @GoogleDeepMind, collab w/ @googlecloud. We introduce TIR-Judge, an end-to-end agentic RL framework that trains LLM judges with tool-integrated reasoning 🧠🛠️ 🔗 https://t.co/rtfqlvuzJ0
#Agents #LLMs #Judges #RL #reasoning
11 replies · 67 reposts · 508 likes
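For readers curious what tool-integrated judging looks like mechanically, here is a minimal sketch of the general pattern the tweet describes: a judge model that may call a tool (here, a toy Python executor) before committing to a verdict. The `call_judge` interface and the `TOOL:`/`VERDICT:` protocol are illustrative assumptions, not TIR-Judge's actual API.

```python
# Minimal sketch of tool-integrated judging (NOT the TIR-Judge implementation;
# `call_judge` is a hypothetical stand-in for any LLM endpoint).
import io
import contextlib
from typing import Callable

def run_python(code: str) -> str:
    """Toy tool: execute a Python snippet and capture its stdout."""
    buf = io.StringIO()
    try:
        with contextlib.redirect_stdout(buf):
            exec(code, {})  # sandboxing omitted for brevity
        return buf.getvalue()
    except Exception as e:
        return f"Tool error: {e}"

def judge_with_tools(question: str, answer_a: str, answer_b: str,
                     call_judge: Callable[[str], str], max_steps: int = 4) -> str:
    """Loop: the judge may emit TOOL:<code> to verify claims before a VERDICT."""
    transcript = f"Question: {question}\nA: {answer_a}\nB: {answer_b}\n"
    for _ in range(max_steps):
        step = call_judge(transcript)
        if step.startswith("VERDICT:"):
            return step                      # e.g. "VERDICT: A"
        if step.startswith("TOOL:"):
            transcript += f"\nTool output: {run_python(step[len('TOOL:'):])}"
    return "VERDICT: tie"                    # fallback when no verdict is reached
```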
Scaling Agent Learning via Experience Synthesis 📝: https://t.co/3WXayMsHrD Scaling training environments for RL by simulating them with reasoning LLMs! Environment models + replay buffer + new tasks = cheap RL for any environment! - Strong improvements over non-RL-ready…
7 replies · 60 reposts · 325 likes
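A minimal sketch of the recipe the tweet summarizes (environment model + replay buffer + new tasks), assuming an LLM-backed `world_model` that simulates transitions; all interfaces here are hypothetical stand-ins, not the paper's code.

```python
# Hedged sketch of "experience synthesis": an LLM stands in for the environment's
# transition function so a policy can be trained cheaply on simulated rollouts.
from collections import deque
from typing import Callable, Deque, Tuple

Transition = Tuple[str, str, str, float]  # (state, action, next_state, reward)

def synthesize_experience(world_model: Callable[[str, str], Tuple[str, float]],
                          propose_task: Callable[[], str],
                          policy: Callable[[str], str],
                          steps: int, buffer: Deque[Transition]) -> None:
    """Roll the policy out inside the LLM-simulated environment."""
    state = propose_task()                               # new-task generation
    for _ in range(steps):
        action = policy(state)
        next_state, reward = world_model(state, action)  # simulated env step
        buffer.append((state, action, next_state, reward))
        state = next_state

buffer: Deque[Transition] = deque(maxlen=10_000)  # replay buffer
# synthesize_experience(my_world_model, my_task_gen, my_policy, 32, buffer)
# ...then sample minibatches from `buffer` for any off-policy RL update.
```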
@cyjustinchen @ArchikiPrasad @swarnaNLP @EliasEskin -- Video-RTS: Rethinking Reinforcement Learning and Test-Time Scaling for Efficient and Enhanced Video Reasoning @ZiyangW00 @jaeh0ng_yoon @shoubin621 @mmiemon @gberta227
https://t.co/THxKAhgCPX
https://t.co/c6s8hnrKFH
🚨Introducing Video-RTS: Resource-Efficient RL for Video Reasoning with Adaptive Video TTS! While RL-based video reasoning with LLMs has advanced, the reliance on large-scale SFT with extensive video data and long CoT annotations remains a major bottleneck. Video-RTS tackles…
1 reply · 3 reposts · 8 likes
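One plausible reading of "adaptive video TTS" is test-time scaling that spends more frames only when needed. A hedged sketch under that assumption; the `answer_fn`, budgets, and consensus rule are illustrative, not Video-RTS's exact procedure.

```python
# Hedged sketch of adaptive video test-time scaling: start from sparse frames and
# densify only when sampled answers disagree. Assumes `answer_fn` is stochastic
# (e.g., temperature > 0), so repeated calls can differ.
from collections import Counter
from typing import Callable, List

def adaptive_video_tts(frames: List, answer_fn: Callable[[List], str],
                       budgets=(4, 8, 16), votes: int = 5, agree: float = 0.8) -> str:
    """Increase the frame budget until sampled answers reach a consensus threshold."""
    for k in budgets:
        subset = frames[:: max(1, len(frames) // k)][:k]  # uniform subsample of k frames
        samples = [answer_fn(subset) for _ in range(votes)]
        top, count = Counter(samples).most_common(1)[0]
        if count / votes >= agree:                        # confident: stop early
            return top
    return top  # fall back to the densest budget's majority answer
```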
@cyjustinchen @ArchikiPrasad @swarnaNLP @EliasEskin @ZiyangW00 @jaeh0ng_yoon @shoubin621 @mmiemon @gberta227 -- RACCooN: A Versatile Instructional Video Editing Framework with Auto-Generated Narratives @jaeh0ng_yoon @shoubin621
https://t.co/Wx3eIJKDfp
https://t.co/2XV9Yhjx9l
arxiv.org: Recent video generative models primarily rely on carefully written text prompts for specific tasks, like inpainting or style editing. They require labor-intensive textual descriptions for input...
🚨New paper👉RACCooN: remove/add/change video content effortlessly/interactively via our MLLM+Video Diffusion (V2P2V) framework with auto-generated descriptions! ▶️ 1. Video-to-Paragraph (V2P): RACCooN first generates well-structured, detailed descriptions of videos with an MLLM…
1 reply · 1 repost · 5 likes
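The V2P2V flow the tweet outlines can be pictured as three composable stages. A minimal sketch, with both model interfaces as hypothetical stand-ins rather than RACCooN's API:

```python
# Hedged sketch of a V2P2V (video-to-paragraph-to-video) editing flow as described
# in the tweet; `describe` and `generate` are placeholders for an MLLM captioner
# and a video diffusion model, not RACCooN's actual components.
from typing import Callable, List

def v2p2v_edit(frames: List,
               describe: Callable[[List], str],                 # MLLM: video -> paragraph
               edit_text: Callable[[str], str],                 # user edits the paragraph
               generate: Callable[[List, str], List]) -> List:  # diffusion: (video, text) -> video
    paragraph = describe(frames)      # V2P: auto-generated narrative of the video
    edited = edit_text(paragraph)     # user removes/adds/changes content in text space
    return generate(frames, edited)   # P2V: re-render the video to match the edited text

# e.g. v2p2v_edit(frames, mllm_caption,
#                 lambda p: p.replace("a red car", "a blue car"), video_diffusion)
```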
🚨 Excited to announce Gistify!, a task where a coding agent must extract the gist of a repository: generate a single, executable, and self-contained file that faithfully reproduces the behavior of a given command (e.g., a test or entrypoint). ✅ It is a lightweight, broadly applicable…
4 replies · 39 reposts · 96 likes
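A natural way to check a Gistify-style output is behavioral equivalence: run the original command and the generated single file, then compare exit codes and stdout. A minimal sketch; the commands and paths are illustrative, not from the paper.

```python
# Hedged sketch of checking that a generated single file reproduces a repo command.
import subprocess

def behaviors_match(repo_cmd: list[str], gist_file: str, cwd: str = ".") -> bool:
    """Compare stdout and exit codes of the repo command vs. the standalone file."""
    original = subprocess.run(repo_cmd, cwd=cwd, capture_output=True, text=True)
    gist = subprocess.run(["python", gist_file], capture_output=True, text=True)
    return (original.returncode == gist.returncode
            and original.stdout == gist.stdout)

# e.g. behaviors_match(["python", "-m", "pytest", "tests/test_core.py"], "gist.py")
```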
🎉 Thanks for the shoutout! I’ll be virtually presenting our new work Video-RTS at #EMNLP2025 (my co-lead @jaeh0ng_yoon will present in person). If you’re into advanced video-reasoning frameworks, check it out: - No SFT, pure RL: trains with simple output-based rewards (GRPO)—no…
0 replies · 6 reposts · 14 likes
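"Simple output-based rewards" with GRPO typically means a binary match on the final answer, with advantages normalized within each group of sampled completions. A hedged sketch, where the `<answer>` tag convention is an assumption rather than Video-RTS's exact format:

```python
# Hedged sketch of an output-based reward for GRPO-style RL on video QA:
# reward 1 if the final answer matches gold, else 0.
import re

def output_reward(completion: str, gold: str) -> float:
    """Extract the model's final answer and compare it to the reference."""
    m = re.search(r"<answer>(.*?)</answer>", completion, re.DOTALL)
    pred = (m.group(1) if m else completion).strip().lower()
    return 1.0 if pred == gold.strip().lower() else 0.0

# GRPO then normalizes such rewards within each group of sampled completions:
# advantage_i = (r_i - mean(r)) / (std(r) + eps)
```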
Excited to be at #EMNLP2025 in Suzhou! I’ll present our work: (1) MEXA (Fri 12:30 PM CST) on general multimodal reasoning with dynamic multi-expert aggregation, and (2) RACCooN (Wed 4:30 PM CST) on editing videos via auto-generated narratives. Please stop by our…
1 reply · 9 reposts · 25 likes
🚨 Check out our awesome students/postdocs' papers at #EMNLP2025 and say hi to them 👋! Also, I will give a keynote (virtually) on "Attributable, Conflict-Robust, and Multimodal Summarization with Multi-Source Retrieval" at the NewSumm workshop. -- Jaehong (in-person) finished…
2 replies · 29 reposts · 63 likes
Details available on my website: https://t.co/3y0FFlP4E4 ▶️ Video-RTS: Rethinking Reinforcement Learning and Test-Time Scaling for Efficient and Enhanced Video Reasoning (Main) 🔗: https://t.co/9g45H9qBX7 ▶️ RACCooN: Versatile Instructional Video Editing with Auto-Generated Narratives…
0 replies · 0 reposts · 3 likes
It was an honor and pleasure to give a keynote at the 28th European Conference on Artificial Intelligence (#ECAI2025) in beautiful Bologna, and engage in enthusiastic discussions about trustworthy + calibrated agents, collaborative reasoning + privacy, and controllable multimodal…
1 reply · 26 reposts · 68 likes
🥳🥳 Honored and grateful to be awarded a 2025 Google PhD Fellowship in Machine Learning and ML Foundations for my research on machine unlearning, defenses against adversarial attacks, and multi-agent privacy! ✨ Deep gratitude to my advisor @mohitban47 for his constant…
🎉 We're excited to announce the 2025 Google PhD Fellows! @GoogleOrg is providing over $10 million to support 255 PhD students across 35 countries, fostering the next generation of research talent to strengthen the global scientific landscape. Read more: https://t.co/0Pvuv6hsgP
30 replies · 18 reposts · 138 likes
OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM
4 replies · 16 reposts · 108 likes
📢 #ICCV2025 Welcome to check out our work @ICCVConference📢 * 3D/4D - DPoser-X: https://t.co/v0VFv8ZDj3 - Free4D: https://t.co/EKsPhiOLg7 * Video - DCM: https://t.co/zOxqu1vpyt - FreeScale / FreeMorph * VLM - Video-TT: https://t.co/LvyNtbKGUK - MM-SAE: https://t.co/Z6FexYSxOF
1 reply · 7 reposts · 90 likes
I’m really sad to hear this news... but it’s also a great chance to connect with @yilin_sung, who has brilliant ideas and solid experience in efficient AI, reinforcement learning, and (multimodal) LLMs! Anyone working in this area should definitely chat with him. 💡
Tough week! I also got impacted less than 3 months after joining. Ironically, I just landed some new RL infra features the day before. Life moves on. My past work spans RL, PEFT, Quantization, and Multimodal LLMs. If your team is working on these areas, I’d love to connect.
0 replies · 2 reposts · 9 likes
🥳 Honored and grateful to be awarded an NDSEG Fellowship in Computer Science! 💫🇺🇸 Big thanks to my advisor @mohitban47 for his guidance, and shoutout to my lab mates at @unc_ai_group, collaborators, internship advisors, and mentors for their support 🤗 Excited to continue…
🎉 Congratulations to our student Zaid Khan (advised by @mohitban47) for being awarded a prestigious NDSEG Fellowship for his work on environment generation! Established in 1989, the fellowship has an acceptance rate of <7% and covers diverse science and engineering disciplines.
15 replies · 20 reposts · 49 likes
three years ago, DiT replaced the legacy unet with a transformer-based denoising backbone. we knew the bulky VAEs would be the next to go -- we just waited until we could do it right. today, we introduce Representation Autoencoders (RAE). >> Retire VAEs. Use RAEs. 👇(1/n)
56 replies · 333 reposts · 2K likes
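A minimal sketch of the idea as stated in the tweet: keep a frozen pretrained encoder, train only a decoder, and let diffusion operate in the encoder's representation space. The architecture and training details below are assumptions, not the RAE paper's implementation.

```python
# Hedged sketch of a Representation Autoencoder: frozen pretrained encoder
# (e.g., a ViT) + trained decoder, replacing the VAE in a DiT-style pipeline.
import torch
import torch.nn as nn

class RAE(nn.Module):
    def __init__(self, encoder: nn.Module, latent_dim: int, image_dim: int):
        super().__init__()
        self.encoder = encoder.eval()          # frozen pretrained representation encoder
        for p in self.encoder.parameters():
            p.requires_grad = False
        self.decoder = nn.Sequential(          # only the decoder is trained
            nn.Linear(latent_dim, 1024), nn.GELU(),
            nn.Linear(1024, image_dim),        # reconstruct flattened pixels
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():
            z = self.encoder(x)                # representation, no VAE bottleneck
        return self.decoder(z)

# Training: plain reconstruction loss on decoder outputs; the diffusion
# transformer is then trained to denoise in z-space instead of VAE latent space.
```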
🚨 New Paper Alert! Introducing SciVideoBench — a comprehensive benchmark for scientific video reasoning! 🔬SciVideoBench: 1. Spans Physics, Chemistry, Biology & Medicine with authentic experimental videos. 2. Features 1,000 challenging MCQs across three reasoning types: …
3 replies · 29 reposts · 39 likes
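Evaluating on a benchmark like this usually reduces to option-letter accuracy over the MCQs. A small sketch, where the item schema and the `predict` interface are assumptions rather than SciVideoBench's released format:

```python
# Hedged sketch of MCQ accuracy over a video benchmark.
from typing import Callable, Dict, List

def mcq_accuracy(items: List[Dict], predict: Callable[[str, str, List[str]], str]) -> float:
    """items: [{'video': path, 'question': str, 'options': [...], 'answer': 'B'}, ...]"""
    correct = 0
    for ex in items:
        choice = predict(ex["video"], ex["question"], ex["options"])  # e.g. returns "B"
        correct += (choice == ex["answer"])
    return correct / len(items)
```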