Jaehong Yoon

@jaeh0ng_yoon

Followers: 2K | Following: 1K | Media: 38 | Statuses: 729

Assistant Professor @ntusg CCDS | Prv: @uncnlp @MLAI_KAIST @MSFTResearch | Trustworthy & Continually Adaptable Multimodal AI

Singapore
Joined April 2016
@jaeh0ng_yoon
Jaehong Yoon
4 days
🎉 Excited to share that 5/5 of my papers (3 main, 2 findings) have been accepted at #EMNLP2025, in video/multimodal reasoning, instructional video editing, and efficient LLM adaptation & reasoning! 🚨 I’m recruiting Ph.D. students to join the Multimodal AI Group at NTU College of Computing and Data Science ...
15
31
308
@ritaranx
Ran Xu
2 days
Happy to introduce my internship work at @Google and @GoogleDeepMind, collab w/ @googlecloud. We introduce TIR-Judge, an end-to-end agentic RL framework that trains LLM judges with tool-integrated reasoning 🧠🛠️ 🔗 https://t.co/rtfqlvuzJ0 #Agents #LLMs #Judges #RL #reasoning
11
67
508
@jaseweston
Jason Weston
23 hours
Scaling Agent Learning via Experience Synthesis 📝: https://t.co/3WXayMsHrD Scaling training environments for RL by simulating them with reasoning LLMs! Environment models + Replay-buffer + New tasks = cheap RL for any environment! - Strong improvements over non-RL-ready ...
7
60
325
@mohitban47
Mohit Bansal
2 days
@cyjustinchen @ArchikiPrasad @swarnaNLP @EliasEskin -- Video-RTS: Rethinking Reinforcement Learning and Test-Time Scaling for Efficient and Enhanced Video Reasoning @ZiyangW00 @jaeh0ng_yoon @shoubin621 @mmiemon @gberta227 https://t.co/THxKAhgCPX https://t.co/c6s8hnrKFH
@ZiyangW00
Ziyang Wang
4 months
🚨Introducing Video-RTS: Resource-Efficient RL for Video Reasoning with Adaptive Video TTS! While RL-based video reasoning with LLMs has advanced, the reliance on large-scale SFT with extensive video data and long CoT annotations remains a major bottleneck. Video-RTS tackles ...
1
3
8
@mohitban47
Mohit Bansal
2 days
arxiv.org
Recent video generative models primarily rely on carefully written text prompts for specific tasks, like inpainting or style editing. They require labor-intensive textual descriptions for input...
@jaeh0ng_yoon
Jaehong Yoon
1 year
🚨New paper👉RACCooN: remove/add/change video content effortlessly/interactively via our MLLM+Video Diffusion (V2P2V) framework with auto-generated descriptions! ▶️ 1. Video-to-Paragraph (V2P): RACCooN first generates well-structured/detailed descriptions of videos with MLLM ...
1
1
5
@hyunji_amy_lee
hyunji amy lee
3 days
🚨 Excited to announce Gistify, where a coding agent must extract the gist of a repository: generate a single, executable, and self-contained file that faithfully reproduces the behavior of a given command (e.g., a test or entrypoint). ✅ It is a lightweight, broadly applicable ...
4
39
96
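Reading the task description above: a hypothetical way to check "faithfully reproduces the behavior" is to compare exit code and stdout between the original repo-level command and the generated single file. The minimal sketch below works under that assumption; `run`, `reproduces`, and the file names are my own illustration, not the Gistify benchmark's actual harness.

```python
import subprocess

def run(cmd: str, cwd: str | None = None):
    """Run a shell command; treat (exit code, stdout) as its observable behavior."""
    p = subprocess.run(cmd, cwd=cwd, shell=True, capture_output=True, text=True)
    return p.returncode, p.stdout

def reproduces(repo_dir: str, command: str, gist_file: str) -> bool:
    """True if the self-contained file behaves like the command run in the repo."""
    return run(command, cwd=repo_dir) == run(f"python {gist_file}")

# e.g., reproduces("my_repo", "pytest tests/test_core.py", "gist.py")
```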
@ZiyangW00
Ziyang Wang
3 days
🎉Thanks for the shoutout! I’ll be virtually presenting our new work Video-RTS at #EMNLP2025 (my co-lead @jaeh0ng_yoon will present in person). If you’re into advanced video-reasoning frameworks, check it out: - No SFT, pure RL: trains with simple output-based rewards (GRPO)—no ...
0
6
14
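For context on the "simple output-based rewards (GRPO)" mentioned above: GRPO scores a group of sampled outputs with a cheap reward (e.g., exact match on the final answer) and uses within-group standardized rewards as advantages, so no learned critic is needed. A minimal sketch, assuming a binary exact-match reward; this is a generic illustration of the recipe, not Video-RTS's training code.

```python
import numpy as np

def output_reward(pred: str, gold: str) -> float:
    """Binary output-based reward: 1.0 iff the predicted answer matches."""
    return 1.0 if pred.strip() == gold.strip() else 0.0

def grpo_advantages(rewards, eps: float = 1e-6):
    """Group-relative advantages: standardize rewards within one sampled group."""
    r = np.asarray(rewards, dtype=np.float64)
    return (r - r.mean()) / (r.std() + eps)

# Toy usage: 4 sampled answers to one video question; gold answer is "B".
samples = ["B", "C", "B", "A"]
rewards = [output_reward(s, "B") for s in samples]
print(grpo_advantages(rewards))  # matching samples get positive advantage
```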
@shoubin621
Shoubin Yu @ EMNLP
3 days
Excited to be at #EMNLP2025 in Suzhou! I’ll present our work: (1) MEXA (Fri 12:30 PM CST) about general multimodal reasoning with dynamic multi-expert aggregation and (2) RACCooN (Wed 4:30 PM CST) about editing videos via auto-generated narratives. Please stop by our ...
1
9
25
@mohitban47
Mohit Bansal
3 days
🚨 Check out our awesome students/postdocs' papers at #EMNLP2025 and say hi to them 👋! Also, I will give a keynote (virtually) on "Attributable, Conflict-Robust, and Multimodal Summarization with Multi-Source Retrieval" at the NewSumm workshop. -- Jaehong (in-person) finished ...
2
29
63
@_akhaliq
AK
4 days
Revisiting Multimodal Positional Encoding in Vision-Language Models
3
29
210
@jaeh0ng_yoon
Jaehong Yoon
4 days
Details available on my website: https://t.co/3y0FFlP4E4 ▶️ Video-RTS: Rethinking Reinforcement Learning and Test-Time Scaling for Efficient and Enhanced Video Reasoning (Main) 🔗: https://t.co/9g45H9qBX7 ▶️ RACCooN: Versatile Instructional Video Editing with Auto-Generated Narratives ...
0
0
3
@mohitban47
Mohit Bansal
9 days
It was an honor and pleasure to give a keynote at the 28th European Conference on Artificial Intelligence (#ECAI2025) in beautiful Bologna, and engage in enthusiastic discussions about trustworthy + calibrated agents, collaborative reasoning + privacy, and controllable multimodal ...
1
26
68
@vaidehi_patil_
Vaidehi Patil
11 days
🥳🥳 Honored and grateful to be awarded a 2025 Google PhD Fellowship in Machine Learning and ML Foundations for my research on machine unlearning, defenses against adversarial attacks, and multi-agent privacy! ✨ Deep gratitude to my advisor @mohitban47 for his constant ...
@Googleorg
Google.org
15 days
🎉 We're excited to announce the 2025 Google PhD Fellows! @GoogleOrg is providing over $10 million to support 255 PhD students across 35 countries, fostering the next generation of research talent to strengthen the global scientific landscape. Read more: https://t.co/0Pvuv6hsgP
30
18
138
@_akhaliq
AK
18 days
OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM
4
16
108
@liuziwei7
Ziwei Liu
15 days
📢 #ICCV2025 Welcome to check out our work @ICCVConference 📢
* 3D/4D
- DPoser-X: https://t.co/v0VFv8ZDj3
- Free4D: https://t.co/EKsPhiOLg7
* Video
- DCM: https://t.co/zOxqu1vpyt
- FreeScale / FreeMorph
* VLM
- Video-TT: https://t.co/LvyNtbKGUK
- MM-SAE: https://t.co/Z6FexYSxOF
1
7
90
@_akhaliq
AK
17 days
MoGA: Mixture-of-Groups Attention for End-to-End Long Video Generation
4
13
126
@jaeh0ng_yoon
Jaehong Yoon
15 days
I’m really sad to hear this news... but it’s also a great chance to connect with @yilin_sung, who has brilliant ideas and solid experience in efficient AI, reinforcement learning, and (multimodal) LLMs! Anyone working in this area should definitely chat with him. 💡
@yilin_sung
Yi Lin Sung
15 days
Tough week! I also got impacted less than 3 months after joining. Ironically, I just landed some new RL infra features the day before. Life moves on. My past work spans RL, PEFT, Quantization, and Multimodal LLMs. If your team is working on these areas, I’d love to connect.
0
2
9
@codezakh
Zaid Khan
16 days
🥳 Honored and grateful to be awarded an NDSEG Fellowship in Computer Science! 💫🇺🇸 Big thanks to my advisor @mohitban47 for his guidance, and shoutout to my lab mates at @unc_ai_group, collaborators, internship advisors, and mentors for their support 🤗 Excited to continue ...
@unccs
UNC Computer Science
16 days
🎉 Congratulations to our student Zaid Khan (advised by @mohitban47) for being awarded a prestigious NDSEG Fellowship for his work on environment generation! Established in 1989, the fellowship has an acceptance rate of <7% and covers diverse science and engineering disciplines.
15
20
49
@sainingxie
Saining Xie
25 days
three years ago, DiT replaced the legacy unet with a transformer-based denoising backbone. we knew the bulky VAEs would be the next to go -- we just waited until we could do it right. today, we introduce Representation Autoencoders (RAE). >> Retire VAEs. Use RAEs. 👇(1/n)
56
333
2K
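As a rough reading of the thread above: a Representation Autoencoder pairs a frozen pretrained representation encoder with a lightweight decoder trained only for pixel reconstruction, so the diffusion transformer denoises in the encoder's feature space instead of a VAE latent. The toy sketch below reflects that reading; the patchify encoder and conv decoder are stand-ins of my own, not the paper's architecture.

```python
import torch
import torch.nn as nn

class FrozenEncoder(nn.Module):
    """Stand-in for a pretrained representation model (e.g., a ViT); kept frozen."""
    def __init__(self, dim: int = 768):
        super().__init__()
        self.patchify = nn.Conv2d(3, dim, kernel_size=16, stride=16)
        for p in self.parameters():
            p.requires_grad = False  # the representation encoder is not trained

    def forward(self, x):
        return self.patchify(x)  # (B, dim, H/16, W/16) feature map

class PixelDecoder(nn.Module):
    """Lightweight decoder trained to reconstruct pixels from the frozen features."""
    def __init__(self, dim: int = 768):
        super().__init__()
        self.unpatchify = nn.ConvTranspose2d(dim, 3, kernel_size=16, stride=16)

    def forward(self, z):
        return self.unpatchify(z)

enc, dec = FrozenEncoder(), PixelDecoder()
x = torch.randn(2, 3, 224, 224)           # toy image batch
z = enc(x)                                # latents a diffusion model would denoise
loss = nn.functional.mse_loss(dec(z), x)  # reconstruction-only objective
loss.backward()                           # gradients reach the decoder only
```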
@shoubin621
Shoubin Yu @ EMNLP
28 days
🚨 New Paper Alert! Introducing SciVideoBench — a comprehensive benchmark for scientific video reasoning! 🔬SciVideoBench: 1. Spans Physics, Chemistry, Biology & Medicine with authentic experimental videos. 2. Features 1,000 challenging MCQs across three reasoning types: ...
3
29
39