
Josmy Faure
@JosmyFaure1
Software Engineer at @Google | PhD Student at National Taiwan University
Taipei, Taiwan
Joined July 2017
🚀 New at #ICCV2025: HERMES — a Video Understanding framework that’s both ⚡ efficient and 🎯 accurate. No more trade-off between speed and performance.
Super excited to share HERMES at @ICCVConference: our video understanding framework that boosts accuracy and speed while reducing memory use! Please check the original post from @JosmyFaure1 for more details! Website: https://t.co/LFQgh9lDq4
#ICCV2025 #NVIDIA #LLM #Video #multimodal
Huge thanks to my collaborators: @CMHungSteven, Jia-Fong Yeh, Hung-Ting Su, Shang-Hong Lai, Winston H. Hsu. Excited to see how the community builds on this 🙌
And it’s open-source! You can plug HERMES into your own VLM today: 📄 Paper: https://t.co/N43RxfSaTk 💻 Code: https://t.co/CLNx7XXjTX 🌐 Project page:
github.com
[ICCV'25] HERMES: temporal-coHERent long-forM understanding with Episodes and Semantics - joslefaure/HERMES
With ECO + SeTR, HERMES: ✅ 43% faster inference ✅ 46% less GPU memory ✅ +3.8% accuracy boost on top VLMs ✅ New SOTA on multiple benchmarks
(2) 💡 Semantics reTRiever (SeTR): captures overarching themes (e.g. an “80s rock party vibe”) scattered across the whole video.
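To picture what "themes scattered across the whole video" could mean in practice, here is a toy sketch. It is not the HERMES implementation (see the linked repo for that): the function name `retrieve_themes`, the cosine threshold, and the running-centroid grouping are all assumptions made for illustration. It groups frame features by similarity regardless of where they occur in time, so frames far apart can land in the same theme.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def retrieve_themes(frames, threshold=0.8):
    """Hypothetical helper: group frames into themes, ignoring temporal order.

    Each frame joins the most similar existing theme if the cosine similarity
    to that theme's centroid is at least `threshold`; otherwise it starts a
    new theme. This is an illustrative stand-in, not the authors' SeTR.
    """
    themes = []  # each theme: {"centroid": [...], "members": [frame indices]}
    for idx, feat in enumerate(frames):
        best, best_sim = None, threshold
        for theme in themes:
            sim = cosine(theme["centroid"], feat)
            if sim >= best_sim:
                best, best_sim = theme, sim
        if best is None:
            themes.append({"centroid": list(feat), "members": [idx]})
        else:
            n = len(best["members"])
            # Update the centroid as a running mean of its member features.
            best["centroid"] = [(c * n + x) / (n + 1)
                                for c, x in zip(best["centroid"], feat)]
            best["members"].append(idx)
    return themes


# Frames 0 and 2 look alike, as do frames 1 and 3, although they alternate in time.
toy_frames = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]
toy_themes = retrieve_themes(toy_frames)
print([t["members"] for t in toy_themes])
```

In this toy input the two themes interleave in time, which is the case a purely sequential grouping would miss.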
Our approach consists of two cognitively inspired modules: (1) 🧠 Episodic Compressor (ECO): processes long videos like humans do, bundling frames into meaningful episodes (“arriving,” “singing,” “cake-cutting”). Dense → efficient memory.
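The idea of bundling frames into episodes can be illustrated with a small sketch. This is not the authors' ECO code (the linked repo has the real thing): the function name `compress_episodes`, the fixed episode `budget`, and the greedy merge of the most similar adjacent pair are assumptions chosen to show how a dense frame sequence might shrink into a few contiguous episodes.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def compress_episodes(frames, budget):
    """Hypothetical helper: merge adjacent frames until `budget` episodes remain.

    Each episode is a (mean_feature, frame_count) pair. At every step the most
    similar pair of neighbouring episodes is merged into their weighted mean,
    so episodes always cover contiguous stretches of the video.
    """
    if budget < 1:
        raise ValueError("budget must be at least 1")
    episodes = [(list(f), 1) for f in frames]
    while len(episodes) > budget:
        # Index of the adjacent pair with the highest similarity.
        best = max(range(len(episodes) - 1),
                   key=lambda i: cosine(episodes[i][0], episodes[i + 1][0]))
        (fa, ca), (fb, cb) = episodes[best], episodes[best + 1]
        merged = [(x * ca + y * cb) / (ca + cb) for x, y in zip(fa, fb)]
        episodes[best:best + 2] = [(merged, ca + cb)]
    return episodes


# Four frames: the first two resemble each other, as do the last two.
toy_frames = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
toy_episodes = compress_episodes(toy_frames, budget=2)
print([count for _, count in toy_episodes])
```

Memory then scales with the episode budget rather than with the number of frames, which is the sense of "Dense → efficient memory" in this toy setting.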
Until now, better video models were slower and more resource-intensive. We asked: can we break this trade-off?