MMLab@NTU
@MMLabNTU
2K Followers · 213 Following · 12 Media · 79 Statuses
Multimedia Laboratory @NTUsg, affiliated with S-Lab. Large Multimodal Models, Computer Vision, Image Processing, Computer Graphics, Deep Learning
Singapore
Joined May 2021
Congratulations to Ziqi and Ziwei! Grateful for the opportunity to work with so many gifted students at @MMLabNTU. Their passion and creativity continue to inspire us! Their achievements are listed here: https://t.co/GMvhTMUl09
Freshly picked: #NTUsg PhD student Huang Ziqi has been selected as one of 21 global recipients of the 2025 Apple Scholars in AIML PhD Fellowship, a prestigious programme that supports emerging leaders in AI and machine learning through funding, mentorship, and…
RT @ccloy: Congrats to Yuekun @YuekunDai and @ziangcao_, both from @MMLabNTU, for winning the prestigious Google PhD Fellowship! Yuekun…
Introducing Thinking with Camera, a unified multimodal model that integrates camera-centric spatial intelligence to interpret and create scenes from arbitrary viewpoints. Project Page: https://t.co/KxwIpuDUBg Code: https://t.co/DO52LFyL9m
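As a rough illustration of what camera-centric conditioning can involve (a sketch under assumed OpenCV conventions, not the paper's actual code), the snippet below converts camera intrinsics and extrinsics into per-pixel ray origins and directions, one common way to give a model an explicit notion of viewpoint:

```python
import torch

def camera_rays(K, R, t, H, W):
    """Per-pixel ray origins and directions in world coordinates.

    Assumes the OpenCV convention x_cam = R @ x_world + t with 3x3
    intrinsics K. A generic illustration of camera conditioning,
    not Thinking-with-Camera's actual implementation.
    """
    # Pixel grid sampled at pixel centers.
    v, u = torch.meshgrid(
        torch.arange(H, dtype=torch.float32) + 0.5,
        torch.arange(W, dtype=torch.float32) + 0.5,
        indexing="ij",
    )
    pix = torch.stack([u, v, torch.ones_like(u)], dim=-1)  # (H, W, 3)

    # Back-project pixels to camera-frame directions, rotate to world.
    dirs = pix @ torch.linalg.inv(K).T                     # (H, W, 3)
    dirs = dirs @ R                                        # rowwise R^T @ d
    dirs = dirs / dirs.norm(dim=-1, keepdim=True)

    # Camera center in world coordinates: C = -R^T @ t.
    origin = (-R.T @ t).expand(H, W, 3)
    return origin, dirs
```

Such ray maps (or their Plücker embeddings) can be fed to a backbone alongside image tokens so the model knows where each pixel's viewing ray sits in the scene.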
Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation
Join us at #ICCV2025 for the Mobile Intelligent Photography & Imaging (MIPI) Workshop! Leading keynotes: Profs. @songhan_mit, Michal Irani, Boxin Shi, and @MingHsuanYang, on intelligent photography and efficient GenAI. Oct 20, 8:50am–12:30pm HST. https://t.co/CqdCqzdsY1
Congratulations to @liuziwei7 of @MMLabNTU, recipient of the Young Scientist Award, recognised for his impactful contributions to computer vision and generative AI.
Congrats to #NTUsg Prof Ng Geok Ing on the Singapore President's Technology Award 2025. A pioneer in Gallium Nitride (#GaN), found in fast chargers, EVs, satellites & defence, he built Singapore's global standing in this field and led the creation of the national GaN centre. We also…
Congrats to Weichen (https://t.co/f6HHqV96CK) and Mutian (https://t.co/APOYvyVfQg)!
#ICCV2025 Congrats to Weichen (https://t.co/3EHLciKwgP) and Mutian (https://t.co/eL1sdcXIuo) on being selected as outstanding reviewers @ICCVConference
Video is already a tough modality for reasoning. Egocentric video? Even tougher: it is longer, messier, and harder. How do we tackle these extremely long, information-dense sequences without exhausting GPU memory or hitting API limits? We introduce Ego-R1: a framework…
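The tweet is truncated, and Ego-R1's actual framework is what the paper describes; purely to make the stated memory problem concrete, here is a minimal hierarchical pattern that bounds peak memory: process the video in fixed-size windows, keep only compact text summaries, and reason over those. The summarize/answer callables are hypothetical placeholders:

```python
from typing import Callable, Iterable, List

def reason_over_long_video(
    frames: Iterable,                          # lazily decoded frame stream
    summarize: Callable[[list], str],          # hypothetical per-window captioner
    answer: Callable[[List[str], str], str],   # hypothetical text-only reasoner
    question: str,
    window: int = 64,
) -> str:
    """Never hold more than `window` frames in memory at once.

    A generic two-stage sketch, not Ego-R1's actual method.
    """
    summaries: List[str] = []
    buf: list = []
    for frame in frames:
        buf.append(frame)
        if len(buf) == window:
            summaries.append(summarize(buf))   # compress window to text
            buf.clear()                        # raw frames can be freed
    if buf:
        summaries.append(summarize(buf))
    # Second stage: reason over compact summaries, not raw pixels.
    return answer(summaries, question)
```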
CVPR 2025 Tutorial: From Video Generation to World Model. Hosted by MMLab@NTU × Kuaishou, etc.
June 11 | Nashville. https://t.co/YcQ6pb30R0 Video is just the start. World modeling is the goal. #CVPR2025 #WorldModel
Aero-1-Audio is out on Hugging Face. Trained in <24h on just 16×H100 GPUs. Handles 15+ min audio seamlessly. Outperforms bigger models like Whisper, Qwen-2-Audio & commercial services from ElevenLabs/Scribe.
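The tweet does not show usage. As a hedged sketch of how long audio is commonly handled with Hugging Face's ASR pipeline (the Whisper checkpoint below is only a stand-in so the snippet runs; Aero-1-Audio's actual repo ID and loading code may differ):

```python
from transformers import pipeline

# Stand-in checkpoint so this runs as-is; swap in the Aero-1-Audio
# repo ID once you have it. chunk_length_s / stride_length_s are the
# pipeline's standard knobs for long inputs.
asr = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-small",
    chunk_length_s=30,   # split long audio into 30 s windows
    stride_length_s=5,   # overlap windows to avoid cutting words
)

result = asr("meeting_recording.wav")  # works for 15+ minute files
print(result["text"])
```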
We release Harmon: a unified framework for multimodal understanding & generation with a shared visual encoder (vs. the decoupled Janus/-Pro). SOTA on GenEval, MJHQ, WISE. Strong understanding performance. Paper: https://t.co/RFhEl9NEN7 Code:
github.com
[ICCV 2025] Code Release of Harmonizing Visual Representations for Unified Multimodal Understanding and Generation - wusize/Harmon
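The design point worth noting is the shared visual encoder feeding both tasks, versus Janus-style decoupled encoders. A toy sketch of that wiring (our simplification with placeholder modules and dimensions, not the released architecture):

```python
import torch
import torch.nn as nn

class SharedEncoderUnifiedModel(nn.Module):
    """One visual encoder feeds both the understanding head and the
    generation head, the wiring Harmon's tweet describes. Modules and
    sizes here are placeholders, not the released code."""

    def __init__(self, dim: int = 768, vocab: int = 32000):
        super().__init__()
        self.encoder = nn.TransformerEncoder(            # shared encoder
            nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True),
            num_layers=4,
        )
        self.understand_head = nn.Linear(dim, vocab)     # text-side logits
        self.generate_head = nn.Linear(dim, dim)         # image-side features

    def forward(self, patch_embeds: torch.Tensor, mode: str):
        h = self.encoder(patch_embeds)   # same representation for both tasks
        if mode == "understand":
            return self.understand_head(h)
        return self.generate_head(h)

model = SharedEncoderUnifiedModel()
x = torch.randn(2, 256, 768)             # (batch, patches, dim) embeddings
logits = model(x, mode="understand")
feats = model(x, mode="generate")
```

The pitch of sharing is that understanding and generation shape a single visual representation rather than two separate ones.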
Meet Harmon, a unified model for both image generation and understanding! Trained with a shared masked autoregressive encoder, it sets new benchmarks on GenEval & MJHQ30K. Try the live demo now on Hugging Face: https://t.co/PX7OkVaZbx Paper:
We turned our method, rejected by CVPR and ECCV, into the iOS app "Cutcha". EdgeSAM, our fast Segment Anything Model, runs at over 30 FPS on an iPhone 14. Enjoy intuitive one-touch object selection and precise editing, all processed locally on your device. No cloud needed!
Attention, all photography and imaging enthusiasts! Join us at the Third MIPI Workshop at #CVPR2024! Location: Arch 213. Time: 08:30 AM - 12:10 PM. Website: https://t.co/3x06T1AvaF Don't miss out on an exciting lineup of speakers: Lei Zhang: How Far Are We From…
Upcoming AI talk: LLaVA, a Vision-and-Language Approach to Computer Vision in the Wild, by Chunyuan Li @ChunyuanLi. More info: https://t.co/ap7S1osxAm Subscribe: https://t.co/m7NoJNciLe
(1/2) We are actively seeking PhD candidates from various countries to foster diversity in our research group at Nanyang Technological University. Know someone interested in a PhD with us? Please refer them to our team. Thanks for supporting diversity in academia!
Our study introduces "Upscale-A-Video", a text-guided latent diffusion framework for video upscaling. It ensures temporal coherence locally & globally, balancing fidelity and quality. Project page: https://t.co/3UkaXXyMCC GitHub: https://t.co/irQRuPHxED Video:
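The paper's propagation modules are the real mechanism; as a generic baseline for the temporal-coherence idea only (our assumption, not Upscale-A-Video's method), one can denoise overlapping temporal windows of latents and average the overlaps so adjacent windows agree:

```python
import torch

def denoise_with_overlap(latents, denoise, window=8, overlap=2):
    """Run a denoiser over overlapping temporal windows of a
    (T, C, H, W) latent video and average overlapping frames.
    A generic consistency baseline; `denoise` is a hypothetical
    stand-in for a latent-diffusion denoising call.
    """
    T = latents.shape[0]
    out = torch.zeros_like(latents)
    counts = torch.zeros(T, 1, 1, 1)
    start, step = 0, window - overlap
    while start < T:
        end = min(start + window, T)
        out[start:end] += denoise(latents[start:end])
        counts[start:end] += 1
        if end == T:
            break
        start += step
    return out / counts                # averaged where windows overlap
```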
EdgeSAM: Prompt-In-the-Loop Distillation for On-Device Deployment of SAM. Project page: https://t.co/ydz7rZ78sS GitHub: https://t.co/VJS0YQHJI0 Hugging Face:
huggingface.co
Excited to share our latest work: "EdgeSAM: Prompt-In-the-Loop Distillation for On-Device Deployment of SAM". Supercharged SAM for edge devices! #EdgeSAM is a faster, optimized version of SAM, now tailored for edge devices. We've reimagined SAM's ViT-based image encoder…
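From the tweet we only know the recipe's shape: a light encoder distilled from SAM's ViT, with prompts kept in the loop. A hedged sketch of what one such distillation step could look like (interfaces are placeholders, not EdgeSAM's real API):

```python
import torch
import torch.nn.functional as F

def distill_step(teacher, student, decoder, image, optimizer, n_points=4):
    """One prompt-in-the-loop distillation step, sketched: rather than
    matching encoder features alone, sample point prompts, decode masks
    from teacher and student embeddings with a frozen decoder, and
    supervise the student on the teacher's masks. All module interfaces
    here are hypothetical placeholders.
    """
    with torch.no_grad():
        t_embed = teacher(image)              # heavy ViT image encoder
    s_embed = student(image)                  # light on-device encoder

    # Random point prompts in normalized image coordinates.
    points = torch.rand(n_points, 2, device=image.device)

    with torch.no_grad():
        t_masks = decoder(t_embed, points)    # teacher masks as soft targets
    s_masks = decoder(s_embed, points)

    loss = F.binary_cross_entropy_with_logits(s_masks, t_masks.sigmoid())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```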