Frederik Hvilshøj
@FHvilshoj
ML Lead @ Encord
Ringkøbing, Denmark
Joined October 2011
Here are some useful resources
1️⃣ A technical blog post: https://t.co/M4HtR3KYfr
2️⃣ My 5-minute explainer: https://t.co/5x8xxwqNfu
... and by the way,
> We have also integrated SAM 2 into our labeling platform.
4/ A big new open dataset
1️⃣ An ingenious data engine cuts annotation time from 38 to 5 seconds per frame
2️⃣ Contains 35M+ masks (50x the previously largest dataset)
3️⃣ Contains 50K+ videos (11x the previously largest dataset)
3/ The technical masterstrokes
1️⃣ A new memory mechanism (bank + encoder) gives SAM 2 its tracking ability
2️⃣ An occlusion detector helps remove false positives across frames
3️⃣ A hierarchical image encoder (Hiera) allows skip connections, improving accuracy
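The memory idea above can be sketched in a few lines. This is a toy NumPy illustration of the concept, not Meta's implementation: a FIFO bank keeps features from recent frames, and the current frame's features cross-attend to the bank before segmentation. The sizes, the plain softmax attention, and the residual update are all my assumptions for illustration.

```python
import numpy as np
from collections import deque

rng = np.random.default_rng(0)
D = 8                                  # toy feature dimension

def cross_attention(queries, keys, values):
    """Current-frame queries attend over memory keys/values (plain softmax attention)."""
    scores = queries @ keys.T / np.sqrt(queries.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)       # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax over memory entries
    return weights @ values

memory_bank = deque(maxlen=4)          # FIFO bank: remember the last 4 frames

for t in range(6):                     # pretend video of 6 frames
    frame_feats = rng.normal(size=(16, D))             # 16 "patch" features per frame
    if memory_bank:
        mem = np.concatenate(list(memory_bank))        # stack all remembered frames
        # Residual update: fuse memory context into the current frame's features.
        frame_feats = frame_feats + cross_attention(frame_feats, mem, mem)
    # In SAM 2 a memory *encoder* compresses features before storing; omitted here.
    memory_bank.append(frame_feats)

print(len(memory_bank), memory_bank[-1].shape)  # 4 (16, 8)
```

The `maxlen` on the deque is what makes this a bounded bank: old frames fall out automatically, so memory cost stays constant regardless of video length.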
2/ Quantitative implications
1️⃣ Image segmentation: mIoU 59.1 (SAM) => 69.6 (SAM 2) across 14 different tasks
2️⃣ Video segmentation: J&F 56.7 (SAM+Cutie) => 64.3 (SAM 2) across 17 video datasets
3️⃣ Inference is 6x faster than SAM!
SAM 2 blows everything else out of the water
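For readers unfamiliar with the metrics above: mIoU averages intersection-over-union across classes or tasks, and the J in J&F is the same region-similarity (IoU) idea applied to video masks. A minimal sketch of the IoU computation on toy boolean masks (the masks and the empty-mask convention are my own choices):

```python
import numpy as np

def iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """Intersection-over-union (the J / region-similarity term) for boolean masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return np.logical_and(pred, gt).sum() / union

# Toy 4x4 masks: the prediction is the ground-truth square shifted one column right.
gt = np.zeros((4, 4), dtype=bool)
gt[1:3, 1:3] = True            # 4-pixel square
pred = np.zeros((4, 4), dtype=bool)
pred[1:3, 2:4] = True          # shifted right by one column

print(round(iou(pred, gt), 3))  # intersection=2, union=6 -> 0.333
```

The F in J&F additionally measures boundary accuracy (contour precision/recall), which is why video benchmarks report the pair rather than IoU alone.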
1/ Practical implications
1️⃣ Annotating segmentation “tracks” in videos is easy, interactive, and fast (8.5x faster, with 3x fewer interactions)
2️⃣ No need for additional trackers like Cutie; SAM 2 does it all
3️⃣ It’s lightweight and easier to run, even on cheaper hardware
Last night, Meta AI announced the release of Segment Anything 2 (SAM 2) 🤯
The implications for AI teams are HUGE.
I’ve taken a deep dive into SAM 2 and explored:
Practical implications
Quantitative implications
The technical masterstrokes
A big new open dataset
You can read the paper here: https://t.co/zdiJBGxhKn
Learn more about DINO here: https://t.co/1QNNGSZNnI
(9/9)
The conceptual idea: use temporal information by applying cross-attention to intelligently mask the input to the student network (6/9)
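The masking idea can be made concrete with a toy NumPy sketch. To be clear, this is my own stand-in for the paper's procedure, under stated assumptions: cross-attention between the current frame's patches and the previous frame's features yields a per-patch "temporal salience" score, and the student's input is masked at the most salient patches so it must predict them from context. The salience rule (max attention weight) and the 50% masking ratio are illustrative choices, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(1)
num_patches, dim = 16, 8

prev_feats = rng.normal(size=(num_patches, dim))   # features from frame t-1
curr_feats = rng.normal(size=(num_patches, dim))   # patch features of frame t

# Cross-attention scores: how strongly each current patch attends to the past frame.
scores = curr_feats @ prev_feats.T / np.sqrt(dim)
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)     # softmax over past patches

# Per-patch temporal salience: maximum attention weight onto any past patch
# (an assumed heuristic for "this patch is strongly tied to the previous frame").
salience = weights.max(axis=-1)

# Mask the most salient half of the patches in the student's input,
# forcing the student to reconstruct temporally informative regions from context.
k = num_patches // 2
masked = np.argsort(salience)[-k:]
student_input = curr_feats.copy()
student_input[masked] = 0.0
```

The point of attention-guided (rather than random) masking is that the patches hidden from the student are exactly the ones where temporal context carries signal.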
Can you do better than just applying DINO on your data? (3/9)
Imagine you're the ML lead at a company with many videos but no labels (2/9)
ICLR'24 is wrapping up. VAEs got the test-of-time award and an outstanding paper award was given to an interesting successor to DINO specifically tailored for videos. I put some more words on it here. (1/9)
(3/3) Excited to try it!
Here's a link to an interactive demo: https://t.co/8ur5VzbYNr
Webinar is here: https://t.co/vu1LalvAYM
(2/3) Three bottlenecks of pre-existing open source models
Did you notice the huge step forward for MM LLMs that happened this week?
(3/3) HuggingFace paper: https://t.co/GqTdxIqVqU
Colab Notebook: https://t.co/DstKbSytAW
CLIP paper: https://t.co/GCOo3cNY65