Gedas Bertasius Profile
Gedas Bertasius

@gberta227

Followers: 1K · Following: 3K · Media: 45 · Statuses: 502

Assistant Professor at UNC, previously a postdoc at Meta AI, PhD from UPenn, video understanding, multimodal AI, a basketball enthusiast.

Chapel Hill, NC
Joined June 2020
@gberta227
Gedas Bertasius
8 days
Is language a "terrible abstraction" for video understanding? Many in the video community often dismiss language-driven approaches in favor of complex, video-native solutions. However, I believe this resistance stems more from internal bias—validating a research identity as a
2
4
20
@mohitban47
Mohit Bansal
4 days
🚨 Check out our awesome students/postdocs' papers at #EMNLP2025 and say hi to them 👋! Also, I will give a keynote (virtually) on "Attributable, Conflict-Robust, and Multimodal Summarization with Multi-Source Retrieval" at the NewSumm workshop. -- Jaehong (in-person) finished
2
30
63
@mohitban47
Mohit Bansal
3 days
@cyjustinchen @ArchikiPrasad @swarnaNLP @EliasEskin -- Video-RTS: Rethinking Reinforcement Learning and Test-Time Scaling for Efficient and Enhanced Video Reasoning @ZiyangW00 @jaeh0ng_yoon @shoubin621 @mmiemon @gberta227 https://t.co/THxKAhgCPX https://t.co/c6s8hnrKFH
@ZiyangW00
Ziyang Wang
4 months
🚨Introducing Video-RTS: Resource-Efficient RL for Video Reasoning with Adaptive Video TTS! While RL-based video reasoning with LLMs has advanced, the reliance on large-scale SFT with extensive video data and long CoT annotations remains a major bottleneck. Video-RTS tackles
1
3
8
@AiYiyangZ
Yiyang Zhou
3 days
🚨 BREAKING: AI Can't Actually See Videos. New benchmark shows mainstream LVLMs barely hit 60% accuracy—while humans reach 94.82%. This isn’t a glitch—it’s a fundamental failure in video understanding. LVLMs are doing visual theater, not real comprehension.
2
9
19
@jaeh0ng_yoon
Jaehong Yoon
5 days
🎉 Excited to share that 5/5 of my papers (3 main, 2 findings) have been accepted at #EMNLP2025, in video/multimodal reasoning, instructional video editing, and efficient LLM adaptation & reasoning! 🚨 I’m recruiting Ph.D. students to join the Multimodal AI Group at NTU College
15
32
311
@codezakh
Zaid Khan
17 days
🥳 Honored and grateful to be awarded an NDSEG Fellowship in Computer Science! 💫🇺🇸 Big thanks to my advisor @mohitban47 for his guidance, and shoutout to my lab mates at @unc_ai_group, collaborators, internship advisors, and mentors for their support 🤗 Excited to continue
@unccs
UNC Computer Science
17 days
🎉 Congratulations to our student Zaid Khan (advised by @mohitban47) for being awarded a prestigious NDSEG Fellowship for his work on environment generation! Established in 1989, the fellowship has an acceptance rate of <7% and covers diverse science and engineering disciplines.
15
20
48
@mangahomanga
Homanga Bharadhwaj
19 days
I'll be joining the faculty @JohnsHopkins late next year as a tenure-track assistant professor in @JHUCompSci Looking for PhD students to join me tackling fun problems in robot manipulation, learning from human data, understanding+predicting physical interactions, and beyond!
87
112
861
@gberta227
Gedas Bertasius
20 days
Can AI models teach you to shoot like Steph Curry? 🏀 Come to my talk on Challenges in Expert-Level Skill Analysis at 4:30 pm in Room 318-A tomorrow (Sunday) to find out! https://t.co/gYPFtEB1ZU #ICCV2025
sauafg-workshop.github.io
ICCV 2025 SAUAFG Workshop on AI-driven skill assessment, understanding, and feedback generation.
@ParitoshParmar_
Paritosh Parmar
28 days
🗓Oct 19, 2025 | 📍Hawaii Convention Center, Room 318-A 👉 Learn more: https://t.co/J9BCFRmuo7 🔍 We'll explore AI-driven Skilled Activity Understanding, Assessment & Guidance generation in various domains from Surgery to Sports, from Robotics and Manufacturing to Education
0
3
15
@codezakh
Zaid Khan
24 days
How can an agent reverse engineer the underlying laws of an unknown, hostile & stochastic environment in “one life”, without millions of steps + human-provided goals / rewards? In our work, we: 1️⃣ infer an executable symbolic world model (a probabilistic program capturing
2
42
89
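To make the "executable symbolic world model (a probabilistic program ...)" idea above concrete, here is a minimal toy sketch: rule probabilities for symbolic transitions estimated from a single observed trajectory, then reusable for prediction. The predicates, actions, and counting scheme are editorial illustrations, not the paper's actual formulation.

```python
# Toy sketch (illustration only): a symbolic world model as a tiny probabilistic
# program whose rule probabilities are estimated from one observed trajectory.
from collections import defaultdict
from fractions import Fraction

# Each observed step is (state_predicate, action, next_state_predicate).
trajectory = [
    ("door_closed", "push", "door_open"),
    ("door_closed", "push", "door_closed"),   # pushing sometimes fails
    ("door_open", "walk", "in_next_room"),
    ("door_closed", "push", "door_open"),
]

# Count outcomes per (state, action) pair, then normalize into rule probabilities.
counts = defaultdict(lambda: defaultdict(int))
for state, action, nxt in trajectory:
    counts[(state, action)][nxt] += 1

world_model = {
    cond: {nxt: Fraction(n, sum(outcomes.values())) for nxt, n in outcomes.items()}
    for cond, outcomes in counts.items()
}

# The learned "program" can now be executed to predict outcomes of hypothetical actions,
# e.g. door_open with probability 2/3, door_closed with probability 1/3.
print(world_model[("door_closed", "push")])
```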
@ParitoshParmar_
Paritosh Parmar
28 days
📣 Announcing 1st International Workshop on Skilled Activity Understanding, Assessment & Feedback Generation @ICCVConference! 🎙️ All-star keynotes: @anfurnari, @walteriomayolc, @rgnespolo, @gberta227, Kristen Grauman, @eadeli 🧠+ Poster Presentations 🗓 Oct 19 · 1:45–6 PM HST
1
1
3
@han_junlin
Junlin (Hans) Han
1 month
Excited to share our new work: “Learning to See Before Seeing”! 🧠➡️👀 We investigate an interesting phenomenon: how do LLMs, trained only on text, learn about the visual world? Project page: https://t.co/9mQt3qnckL
7
24
149
@ZiyangW00
Ziyang Wang
3 months
🎉Our Video-RTS paper has been accepted at #EMNLP2025 Main!! We propose a novel video reasoning approach that combines data-efficient reinforcement learning (GRPO) with video-adaptive test-time scaling, improving reasoning performance while maintaining efficiency on multiple
@ZiyangW00
Ziyang Wang
4 months
🚨Introducing Video-RTS: Resource-Efficient RL for Video Reasoning with Adaptive Video TTS! While RL-based video reasoning with LLMs has advanced, the reliance on large-scale SFT with extensive video data and long CoT annotations remains a major bottleneck. Video-RTS tackles
1
30
40
@mmiemon
Mohaiminul (Emon) Islam (on job market)
4 months
Check out our new paper: Video-RTS 🎥 A data-efficient RL method for complex video reasoning tasks. 🔹 Pure RL w/ output-based rewards. 🔹 Novel sparse-to-dense Test-Time Scaling (TTS) to expand input frames via self-consistency. 💥 96.4% less training data! More in the thread👇
@ZiyangW00
Ziyang Wang
4 months
🚨Introducing Video-RTS: Resource-Efficient RL for Video Reasoning with Adaptive Video TTS! While RL-based video reasoning with LLMs has advanced, the reliance on large-scale SFT with extensive video data and long CoT annotations remains a major bottleneck. Video-RTS tackles
0
7
13
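For readers skimming the thread, a minimal sketch of the sparse-to-dense test-time scaling idea described in the tweet above: answer from a few frames first, and only densify the frame sampling when independently sampled answers disagree (self-consistency). The `answer_with_frames` helper, frame budgets, and thresholds are hypothetical stand-ins, not Video-RTS's actual API.

```python
# Hedged sketch of sparse-to-dense test-time scaling via self-consistency.
from collections import Counter
import random

def answer_with_frames(video, num_frames, sample_id):
    """Placeholder for a video-LLM call that samples an answer given `num_frames` frames."""
    random.seed(hash((video, num_frames, sample_id)))
    # In this toy stand-in, more frames -> more consistent answers.
    return "A" if random.random() < min(0.5 + 0.1 * num_frames, 0.95) else "B"

def sparse_to_dense_answer(video, start_frames=4, max_frames=32, num_samples=5):
    num_frames = start_frames
    while True:
        votes = Counter(answer_with_frames(video, num_frames, s) for s in range(num_samples))
        answer, top = votes.most_common(1)[0]
        # Stop when all sampled answers agree (self-consistent) or the frame budget is hit.
        if top == num_samples or num_frames >= max_frames:
            return answer, num_frames
        num_frames *= 2  # densify the frame sampling and try again

print(sparse_to_dense_answer("demo_video.mp4"))
```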
@ZiyangW00
Ziyang Wang
4 months
🚨Introducing Video-RTS: Resource-Efficient RL for Video Reasoning with Adaptive Video TTS! While RL-based video reasoning with LLMs has advanced, the reliance on large-scale SFT with extensive video data and long CoT annotations remains a major bottleneck. Video-RTS tackles
1
37
42
@mmiemon
Mohaiminul (Emon) Islam (on job market)
4 months
🚀 On the job market! Final-year PhD @ UNC Chapel Hill working on computer vision, video understanding, multimodal LLMs & AI agents. 2x Research Scientist Intern @Meta 🔍 Seeking Research Scientist/Engineer roles! 🔗 https://t.co/z9ioZPFCi9 📧 mmiemon [at] cs [dot] unc [dot] edu
md-mohaiminul.github.io
A highly-customizable Hugo academic resume theme powered by Wowchemy website builder.
0
4
18
@mmiemon
Mohaiminul (Emon) Islam (on job market)
5 months
Great to see our paper ReVisionLLM featured by MCML blog! @gberta227 #CVPR2025
@hannan_tanveer
Tanveer Hannan (on job market)
5 months
🚀 Check out our latest work, ReVisionLLM, now featured on the MCML blog! 🔍 A Vision-Language Model for accurate temporal grounding in hour-long videos. 👉 https://t.co/cTNNcRLsFE #VisionLanguage #MultimodalAI #MCML #CVPR2025
0
1
2
@mmiemon
Mohaiminul (Emon) Islam (on job market)
5 months
Come to our poster today at #CVPR2025! 🗓️ June 15 | 🕓 4–6PM 📍 Poster #282 | ExHall D 📝 Paper: https://t.co/4XCHPFWchy 🌐 Project: https://t.co/alktUQtIzE 💻 Code: https://t.co/mRWxTRCh6z 🎥 Youtube:
@mmiemon
Mohaiminul (Emon) Islam (on job market)
8 months
🚀New #CVPR2025 Paper🚀 Introducing BIMBA, an efficient multimodal LLM for long-range video QA💡 It sets SOTA on 7 VQA benchmarks by intelligently selecting key spatiotemporal tokens utilizing the selective scan mechanism of Mamba models. 🧵Thread below👇 https://t.co/yP9ZLkUX2N
0
2
10
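A schematic sketch of the token-reduction idea behind BIMBA as described above: keep only the most informative spatiotemporal tokens before they reach the LLM. The tweet credits Mamba's selective scan mechanism for this; the simple query-similarity top-k scoring below is an editorial stand-in for illustration, not the model's actual mechanism.

```python
# Illustrative token reduction: select the key spatiotemporal tokens for a query.
import numpy as np

def select_key_tokens(video_tokens: np.ndarray, query: np.ndarray, keep: int) -> np.ndarray:
    """video_tokens: (num_tokens, dim) spatiotemporal features; query: (dim,) text feature."""
    scores = video_tokens @ query          # relevance of each token to the query
    top = np.argsort(scores)[-keep:]       # indices of the `keep` highest-scoring tokens
    return video_tokens[np.sort(top)]      # preserve temporal order of the selected tokens

tokens = np.random.randn(4096, 256)        # e.g., 16 frames x 256 patches, 256-dim features
query = np.random.randn(256)
compact = select_key_tokens(tokens, query, keep=512)
print(compact.shape)                       # (512, 256): far fewer tokens reach the LLM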
@mmiemon
Mohaiminul (Emon) Islam (on job market)
5 months
Great to see a lot of interest among the video understanding community about ReVisionLLM! If you missed it, check out https://t.co/KAF47QI7yp @hannan_tanveer
@mmiemon
Mohaiminul (Emon) Islam (on job market)
5 months
Presenting ReVisionLLM at #CVPR2025 today! Recursive Vision-Language Model for Temporal Grounding in Hour-Long Videos If you are at CVPR, please stop by 📍 Poster #307, Session 4 🗓️ June 14, 5–7PM | ExHall D 🔗 https://t.co/qrBvf2UUAo @hannan_tanveer @gberta227
0
2
10
@mmiemon
Mohaiminul (Emon) Islam (on job market)
5 months
Presenting ReVisionLLM at #CVPR2025 today! Recursive Vision-Language Model for Temporal Grounding in Hour-Long Videos If you are at CVPR, please stop by 📍 Poster #307, Session 4 🗓️ June 14, 5–7PM | ExHall D 🔗 https://t.co/qrBvf2UUAo @hannan_tanveer @gberta227
0
3
7
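A toy sketch of what recursive temporal grounding over an hour-long video could look like, for readers new to the idea: score coarse segments, recurse only into the most promising one, and stop once the window is short enough to localize. The `score_segment` helper and the fixed four-way split are illustrative assumptions, not ReVisionLLM's actual architecture.

```python
# Illustrative coarse-to-fine recursion for grounding an event in a long video.
def score_segment(video, start, end, query):
    """Placeholder relevance score for the event described by `query` inside [start, end)."""
    target = 47 * 60                 # pretend the event happens around minute 47
    mid = (start + end) / 2
    return -abs(mid - target)

def ground_event(video, query, start=0.0, end=3600.0, num_splits=4, min_window=60.0):
    if end - start <= min_window:
        return start, end            # window is fine enough: return it (in seconds)
    step = (end - start) / num_splits
    windows = [(start + i * step, start + (i + 1) * step) for i in range(num_splits)]
    best = max(windows, key=lambda w: score_segment(video, w[0], w[1], query))
    return ground_event(video, query, best[0], best[1], num_splits, min_window)

print(ground_event("lecture.mp4", "the speaker draws the attention matrix"))
```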
@gberta227
Gedas Bertasius
5 months
Another great accomplishment by Emon this #CVPR2025. Interestingly, rather than using some complex ensemble model, Emon won the EgoSchema challenge by simply applying his latest BIMBA model, which he will also present at the poster session on Sunday 4-6pm. Be sure to stop by!
@mmiemon
Mohaiminul (Emon) Islam (on job market)
5 months
🚀 Excited to share that we won 1st place at the EgoSchema Challenge at EgoVis, #CVPR2025! Our method (81%) outperformed human accuracy (76.2%) for the first time on this challenging task 🎯 Stop by #CVPR: 📍 Poster #282 | June 15, 4–6PM | ExHall D 🔗 https://t.co/alktUQtIzE
1
4
26