SkalskiP

@skalskip92

Followers
17,879
Following
904
Media
1,006
Statuses
5,768

Computer Vision @roboflow. Open-source. GPU poor. Dog person. Coffee addict. Dyslexic. | GH: | HF:

Kraków, Polska
Joined February 2014
Pinned Tweet
@skalskip92
SkalskiP
5 months
here is the final version of my vehicle speed estimation demo. read the thread below to learn how I built it. I will cover: - detection - tracking - perspective transformation - speed calculation - some bonus ideas ↓
74
285
3K
@skalskip92
SkalskiP
7 months
RIP image annotation companies. Fully automated image labeling with GroundingDINO + SAM + OpenAI Vision API. code:
65
397
3K
@skalskip92
SkalskiP
10 months
supervision-0.13.0 is out! Now you can effortlessly build advanced video analytics. Trackers, Zones, Annotators, and much more. GitHub repository:
37
547
3K
@skalskip92
SkalskiP
4 months
REAL-TIME object detection WITHOUT TRAINING. YOLO-World is a new SOTA open-vocabulary object detector that outperforms previous models in terms of both accuracy and speed. 35.4 AP with 52.0 FPS on V100. ↓ read more
35
388
3K
@skalskip92
SkalskiP
2 months
supervision, the open-source library I created a year ago, has crossed 10,000 stars on GitHub this weekend! thank you to everyone who helped me build this project! it took us 2,000+ commits, 500+ PRs and 50+ contributors to do it. repository:
21
315
2K
@skalskip92
SkalskiP
19 days
almost fully functional version of my football AI project. today, I added player tracking using ByteTrack and projection of players onto the map. code coming soon:
54
210
2K
@skalskip92
SkalskiP
10 months
supervision-0.13.0 is out! We added ByteTrack support! Now you can easily plug in any object detector and use it for tracking. GitHub repository:
47
311
2K
@skalskip92
SkalskiP
3 months
I'm starting to get more and more serious with YOLO-World; trying to solve real-life problems. I wanted to see if YOLO-World could recognize that the holes had been filled out. It was pretty tricky, but I learned a little about prompting. ↓ read more
16
167
2K
@skalskip92
SkalskiP
9 months
The traffic analysis project is growing! The YouTube tutorial will be out this week. Progress: I can now identify that the car is in a specified zone. Next: Match entrance and exit zones for every tracker ID to analyze the traffic flow. GitHub repo:
20
292
1K
@skalskip92
SkalskiP
7 months
Chat with the webcam using @OpenAI vision API
45
182
1K
@skalskip92
SkalskiP
20 days
I'm taking my football/soccer project to the next level. today, I worked on detecting players, referees, and the ball and mapping their positions from video frames to positions on the field. ↓ read more
60
121
1K
@skalskip92
SkalskiP
11 days
I fine-tuned my first vision-language model. PaliGemma is an open-source VLM released by @GoogleAI last week. I fine-tuned it to detect bone fractures in X-ray images. thanks to @mervenoyann and @__kolesnikov__ for all the help! ↓ read more
30
196
1K
@skalskip92
SkalskiP
9 months
ball and player 3d pose estimation - easily one of the coolest computer vision projects I have ever made repository:
24
187
1K
@skalskip92
SkalskiP
2 months
detecting AI-generated text: researchers studied the impact of ChatGPT on AI conference peer reviews, confirming what we all knew. paper: ↓ read more
33
119
1K
@skalskip92
SkalskiP
6 months
Nov 6th, 2023: We love you guys! Nov 17th, 2023: Sam is fired!
40
161
1K
@skalskip92
SkalskiP
2 months
manual data labeling is (almost) dead. 1,500,000 images auto-annotated within 2 weeks of release. now, we also support automatic segmentation labeling. ↓ read more about open-source models that power this feature
51
143
1K
@skalskip92
SkalskiP
3 months
YOLOv9 is out. looks like a new SOTA real-time object detector. I'm already working on a custom training tutorial
@_akhaliq
AK
3 months
YOLOv9 Learning What You Want to Learn Using Programmable Gradient Information Today's deep learning methods focus on how to design the most appropriate objective functions so that the prediction results of the model can be closest to the ground truth. Meanwhile, an appropriate
9
165
799
24
164
1K
@skalskip92
SkalskiP
13 days
I need to take a break from football AI for a while. I plan to experiment with PaliGemma, Google's new open-source VLM, over the next few days. but don't worry, I'll be back. In the meantime, the football AI code is slowly making its way to this repo.
37
140
1K
@skalskip92
SkalskiP
3 months
train YOLOv9 on your dataset tutorial - run inference with a pre-trained COCO model - fine-tune model on custom dataset - evaluate the trained model - run inference with a fine-tuned model blogpost: ↓ read more
14
156
1K
@skalskip92
SkalskiP
9 months
supervision-0.13.0 is out! Now you can effortlessly count crops in the fields with a single drone flyby. GitHub repository:
11
153
1K
@skalskip92
SkalskiP
14 days
taking my football/soccer AI to the next level - image embeddings - dimension reduction - player clustering - awesome visualizations code: (code migration in progress...) ↓ read more
32
92
982
@skalskip92
SkalskiP
4 months
what stops you from using supervision today? link:
24
108
938
@skalskip92
SkalskiP
6 months
looking for OpenAI-4V alternatives? - LLaVA - BakLLaVA - CogVLM - Fuyu-8B - Qwen-VL I am working on a short blog post discussing some GPT-4V alternatives. It will probably come out today. links all resources:
43
154
929
@skalskip92
SkalskiP
7 months
Automated @NBA match commentary using @OpenAI vision and TTS (with code!) Everyone is bragging about projects that generate automatic video commentary, but no one is showing the code. I did it while waiting for the plane. code:
43
144
913
@skalskip92
SkalskiP
3 months
manual data labeling is almost dead. define prompts, tweak the confidence threshold, and make manual adjustments if necessary. this feature is now available to all users, even on free accounts. read more:
12
126
922
@skalskip92
SkalskiP
4 months
how to calculate the TIME objects spend IN THE ZONE? - that's the topic of my next tutorial. here's a short (and a bit creepy) demo I built a few months ago. do you have ideas for a less creepy use case for this tech? github repository:
56
127
862
@skalskip92
SkalskiP
3 months
taking traffic analysis to the next level with supervision-0.19.0: speed estimation + 3d road visualization. link: ↓ read more
12
109
865
@skalskip92
SkalskiP
6 months
analyzing store traffic to find the most frequently visited areas super demo created by @Hine__Po - member of Supervision community link to repo if you want to build something over the weekend:
13
147
819
@skalskip92
SkalskiP
3 months
The YOLO-World YouTube tutorial is out! please, let us know what you think! - model architecture - processing images and video in Colab - prompt engineering and detection refinement - pros and cons of the model watch here: ↓ more resources
12
137
810
@skalskip92
SkalskiP
2 months
now you can run real-time object detection on multiple streams with 10 lines of code link: ↓ code snippet
13
142
791
@skalskip92
SkalskiP
3 months
YOLOv9 tutorial: train model on custom dataset - running inference with pre-trained COCO weights - fine-tuning the model on a custom dataset - model evaluation - model deployment sorry it took me so long; hope you like it
15
100
756
@skalskip92
SkalskiP
1 month
it took us a while, but the supervision-0.20.0 release will finally add support for key points. what are your thoughts on annotators? so far, we only have EdgeAnnotator and VertexAnnotator. supervision repo:
21
97
748
@skalskip92
SkalskiP
8 months
supervision-0.15.0 is out! This time, we bring highly customizable annotators. We added eight annotators - box, mask, ellipse, label, circle, corner, trace, and blur. But the best part is... you can freely mix them! GitHub repository:
9
126
740
@skalskip92
SkalskiP
4 months
improving object counting logic. today I solved an interesting bug that has existed in my library for a loooooong time. repository: ↓ WARNING: lots of math in the thread below
8
80
739
@skalskip92
SkalskiP
8 months
Easily one of the most exciting projects built with Supervision! Our community member Vriza Wahyu Saputra built this fantastic ball juggling counting demo using the moving LineZone available in our API.
12
95
719
@skalskip92
SkalskiP
4 months
parking occupancy analysis: calculation of percentage occupancy in individual parking zones. all this was done with supervision: btw, @UenoLeo is cooking a blog post covering this project, so stay tuned! ↓ read more
13
98
706
@skalskip92
SkalskiP
6 months
Am I the last person who didn't know about OpenAI Cookbook? link:
23
89
707
@skalskip92
SkalskiP
2 months
support for pose estimation and key point detection is coming soon to supervision. you can expect connectors for the most popular models and the first annotators in the next supervision release. can't wait to build demos like this with supervision
14
84
707
@skalskip92
SkalskiP
3 months
smart self-service checkout powered by YOLOv9. the value of the basket is updated live based on its changing content; what else should I add? demo built with supervision:
15
90
708
@skalskip92
SkalskiP
2 months
I love watching other people build cool demos with the supervision library; traffic analysis examples built by Anant Jaiswal - object tracking - zone counting - heat-map analysis link:
4
95
702
@skalskip92
SkalskiP
7 months
What papers should I read to expand my knowledge of Transformers? Please send links in the comments and write why this paper is worth reading. Thanks for your help!
32
103
687
@skalskip92
SkalskiP
4 months
Qwen-VL-Plus is SCARY good! (better than GPT-4V) here it is casually solving reCAPTCHA! - You don't have to give any additional instructions other than 'Solve it.' - It can even mark the exact position of the objects it is looking for. ↓ it can do so much more
24
105
679
@skalskip92
SkalskiP
5 months
speed estimation tutorial is finally out! - object detection - multi-object tracking - filtering detections with polygon zone - perspective transformation and speed estimation link: below are some interesting visualizations I created for this video ↓
13
112
674
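A minimal sketch of the last step listed in the tutorial tweet above (speed estimation after perspective transformation). It assumes the tracker already yields one position per frame for a single vehicle, expressed in meters along the road after the perspective transform. The function name `estimate_speed_kmh` and its parameters are hypothetical and are not taken from the tutorial or from the supervision API.

```python
from typing import Sequence


def estimate_speed_kmh(positions_m: Sequence[float], fps: float) -> float:
    """Estimate speed in km/h from per-frame road positions in meters.

    positions_m: one position per consecutive frame for a single tracked
                 object, measured along the direction of travel (assumption).
    fps:         frame rate of the video the positions came from.
    """
    if len(positions_m) < 2:
        raise ValueError("need at least two positions to estimate speed")
    if fps <= 0:
        raise ValueError("fps must be positive")

    # Distance covered between the first and the last observation.
    distance_m = abs(positions_m[-1] - positions_m[0])
    # N positions span N - 1 frame intervals.
    elapsed_s = (len(positions_m) - 1) / fps
    # Convert meters per second to kilometers per hour.
    return distance_m / elapsed_s * 3.6


# Six positions, one meter apart, sampled at 5 frames per second.
speed = estimate_speed_kmh([0.0, 1.0, 2.0, 3.0, 4.0, 5.0], fps=5)
```

Using only the first and last position averages out per-frame jitter in the detections, at the cost of reacting slowly to changes in speed.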
@skalskip92
SkalskiP
6 months
Sports Analytics with GPT-4 Vision I wondered whether GPT-4V had the capability to automatically separate players into teams based on the color of their uniforms. It took me a ridiculously long time to create this image, but in the meantime, I learned a lot about GPT-4V.
20
90
670
@skalskip92
SkalskiP
2 months
new YouTube tutorial: compute dwell time using computer vision in live streams (seems easy, yet tricky) - static file vs stream processing - preventing growing latency and frame buffer overflow - efficient stream processing full tutorial: ↓ read more
6
77
675
@skalskip92
SkalskiP
9 months
- Object detection over HTTP? - Easy! We just open-sourced our inference server under Apache 2.0 Left terminal: @roboflow inference Right terminal: video client
7
81
664
@skalskip92
SkalskiP
9 months
The traffic analysis project is done! The YouTube tutorial will be out tomorrow. Stay tuned! Wait till flow counters appear around 0:06. Github repo:
17
103
651
@skalskip92
SkalskiP
7 months
SAM + MetaCLIP + ProPainter produce masks: remove object: I'm working on combo space!
7
104
619
@skalskip92
SkalskiP
4 months
it blows my mind to see things that are created using my code
14
26
621
@skalskip92
SkalskiP
4 months
It took me ONE HOUR to craft this demo using supervision-0.18.0 - Three new annotators: PercentageBar, RoundedBox, and OrientedBox - Enhanced LineZone feature for improved counting - OBB (oriented bounding boxes) integration ↓ read more repo:
13
94
611
@skalskip92
SkalskiP
3 months
YOLO-World + EfficientSAM + StableDiffusion for language-guided inpainting I was inspired yesterday by the work of @MrDravcan (see attached), and I decided to try to replicate it. SPOILER ALERT: it didn't quite work out for me. ↓ read more
17
97
612
@skalskip92
SkalskiP
2 months
time-in-zone (dwell time) tutorial is coming. this is the third time I'm trying to make this video; hopefully, the last one. I finally have a good use case - waiting time for service. here is the first iteration. what do you think? link:
11
61
602
@skalskip92
SkalskiP
2 months
awesome example of using Supervision for the detection, annotation, and counting of coffee seedlings kudos to community member Eric Kimwatan supervision repo: ↓ youtube tutorial and colab
8
87
599
@skalskip92
SkalskiP
21 days
always triple-check the correctness of your datasets and data augmentations. today, I found two separate errors that ruined my model training. but finally, we are on the right track ↓ here's where I messed up
12
38
602
@skalskip92
SkalskiP
4 months
supervision-0.18.0 is almost here! we had planned to release it tomorrow, but we're still putting the finishing touches on the OBB (oriented bounding box) support repository:
6
67
586
@skalskip92
SkalskiP
6 months
Manually annotate ONE image and let GPT-4V annotate ALL of them. 1. generate boxes for all images with GroundingDINO 2. provide categories for the reference image 3. prompt GPT-4V to map generated boxes to reference categories
8
86
580
@skalskip92
SkalskiP
2 months
detecting small objects is hard. I spent some time today writing a short how-to guide on using supervision (in combination with the most popular CV libraries) to detect small objects. btw is that a good idea for a video tutorial? link: ↓ read more
18
59
580
@skalskip92
SkalskiP
9 months
I'm working on a new YouTube tutorial... It's going to be sick! Here is v1 of my custom vehicle detector.
19
62
570
@skalskip92
SkalskiP
6 days
I'm experimenting with PaliGemma tonight. a single open-source model allowing you to: - detect car (detection) - answer questions about its color and brand (VQA) - read license plate number (OCR) all that on a single consumer-grade GPU. is there any other model that can do it?
25
84
635
@skalskip92
SkalskiP
6 months
me: Find dog. gpt-4 vision: for the past few days I have been working on a library for advanced prompting of LMMs here it is:
16
76
555
@skalskip92
SkalskiP
2 months
I'm experimenting with a new annotator that zooms in on small detections do you think it is something useful? or am I just wasting my time here? more cool annotators:
41
55
561
@skalskip92
SkalskiP
3 months
processing documents with Claude 3 - Good OCR capabilities - Process up to 20 images with a single API call - API seems slow and a bit unstable; expect a lot of variance in call execution time - ~2x cheaper than GPT4-V (please check my math) ↓ read more
10
47
544
@skalskip92
SkalskiP
8 months
Working on a new tutorial - time in the zone. Detection, tracking, and zones are ready. Time to add timers. GitHub repository:
9
95
536
@skalskip92
SkalskiP
2 months
time analysis with computer vision - blurring faces - detection and tracking - smoothing detections - filtering detections by zone - calculating time let me know if you want me to explain anything else. ;) code: ↓ read more
8
58
524
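A rough sketch of the "calculating time" step from the list above, assuming a video file with a constant frame rate and a tracker that reports which tracker IDs are inside the zone on each frame. The `ZoneTimer` class below is hypothetical and is not the supervision API; for live streams, where frames can be dropped, a wall-clock timestamp per frame would likely be a better basis than a frame count.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable


class ZoneTimer:
    """Accumulate time in zone per tracker ID by counting frames."""

    def __init__(self, fps: float) -> None:
        if fps <= 0:
            raise ValueError("fps must be positive")
        self._fps = fps
        self._frames_in_zone: Dict[Hashable, int] = defaultdict(int)

    def update(self, tracker_ids_in_zone: Iterable[Hashable]) -> None:
        """Call once per frame with the IDs currently inside the zone."""
        for tracker_id in tracker_ids_in_zone:
            self._frames_in_zone[tracker_id] += 1

    def seconds(self, tracker_id: Hashable) -> float:
        """Total time this ID has spent in the zone so far, in seconds."""
        # .get avoids creating an entry for IDs that were never seen.
        return self._frames_in_zone.get(tracker_id, 0) / self._fps


timer = ZoneTimer(fps=30)
for _ in range(60):
    timer.update([7])        # only ID 7 is in the zone
for _ in range(15):
    timer.update([7, 8])     # ID 8 enters the zone later
```

After these updates, `timer.seconds(7)` covers 75 frames and `timer.seconds(8)` covers 15 frames at 30 frames per second.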
@skalskip92
SkalskiP
5 months
finally had a little bit of time to work on my upcoming vehicle speed estimation tutorial. any improvement ideas? the demo was built with supervision. code will soon land on GitHub:
26
77
500
@skalskip92
SkalskiP
8 months
Is that demo too creepy? Ignore that one lady who has been sitting in the zone since the beginning and is undetected. I am still trying to figure out why... But zone timers work! GitHub repository:
39
81
513
@skalskip92
SkalskiP
3 months
🔴 stream: YOLO-World Q&A + coding in less than 15 minutes, I start my first YT stream; I'll be talking about YOLO-World and answering your questions that you left under my last YT video stop by to say hello link: ↓ some of the topics we will cover
7
76
517
@skalskip92
SkalskiP
1 year
Two months ago, I created a @github repository where I gathered links to the best free AI courses. 🔥 I started with five links, and now there are almost 20. 🚀 The entire repository already has 1200+ ⭐ ⮑ 🔗 GitHub repository: ↓🧵some of the courses
8
154
499
@skalskip92
SkalskiP
2 months
OpenAI, xAI, and... Supervision top 5 on GitHub trending!
19
43
486
@skalskip92
SkalskiP
6 months
What OpenAI-4V alternatives would you recommend? - LLaVA - BakLLaVA
45
44
489
@skalskip92
SkalskiP
6 months
supervision-0.17.0 release is just around the corner - plug in your favorite detection/segmentation model - compose the perfect visualization github:
9
93
490
@skalskip92
SkalskiP
9 months
Traffic Analysis Tutorial is out! I'm sorry for the delay. I didn't expect it to be 23 minutes long. Please let me know what you think. YouTube video:
13
102
490
@skalskip92
SkalskiP
6 months
using GPT-4V to split players into teams. blending detections with the same tracker ID allows you to significantly reduce the number of GPT-4V API calls when you process video: 1 call / 25 frames. kudos to @ikuma_uchida18 for coming up with this strategy. read more, it's cool ↓
9
79
472
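One possible reading of the strategy described above, sketched under assumptions: keep the last team label per tracker ID and only query the expensive model again once 25 frames have passed for that ID. The `TrackerLabelCache` class and the `classify` callable are hypothetical stand-ins; the real GPT-4V request and the exact way detections are blended in the original project are not shown in the tweet.

```python
from typing import Callable, Dict, Hashable, Tuple


class TrackerLabelCache:
    """Reuse a label per tracker ID and refresh it every N frames."""

    def __init__(
        self,
        classify: Callable[[object], str],  # hypothetical expensive model call
        refresh_every: int = 25,
    ) -> None:
        self._classify = classify
        self._refresh_every = refresh_every
        # tracker_id -> (frame index of the last call, label)
        self._cache: Dict[Hashable, Tuple[int, str]] = {}
        self.calls = 0

    def label(self, tracker_id: Hashable, crop: object, frame_index: int) -> str:
        cached = self._cache.get(tracker_id)
        if cached is not None and frame_index - cached[0] < self._refresh_every:
            return cached[1]  # still fresh, no model call needed
        self.calls += 1
        label = self._classify(crop)
        self._cache[tracker_id] = (frame_index, label)
        return label


# Stand-in classifier so the sketch runs without any API access.
cache = TrackerLabelCache(classify=lambda crop: "team_a")
for frame_index in range(50):
    cache.label(tracker_id=1, crop=None, frame_index=frame_index)
```

Over these 50 frames the stand-in classifier is invoked twice for tracker ID 1 (at frames 0 and 25) instead of 50 times.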
@skalskip92
SkalskiP
7 months
The second day of work on my SAM + MetaCLIP + ProPainter HF Space - Automated object masking [done] - Automated inpainting using ProPainter [in progress]
13
68
466
@skalskip92
SkalskiP
6 months
I just added the polygon annotator to the supervision package. you can now use masks or polygons to visualize the result of the instance segmentation model. polygon annotator will be available in supervision-0.17.0. code:
5
54
456
@skalskip92
SkalskiP
6 months
processing this one-second video exhausted my entire daily quota of 500 GPT-4V requests. but if you were wondering, @OpenAI GPT-4V can automatically divide players into teams based on the color of their uniforms
26
62
451
@skalskip92
SkalskiP
4 months
defect detection with computer vision: training and deploying manufacturing defect detector. step-by-step guide. blog post: ↓ read more
9
70
455
@skalskip92
SkalskiP
25 days
estimating traffic density based on the live feed from NYC street cameras. you can find out in real-time which streets are congested. shoutout to @UenoLeo for creating this cool project!
7
50
449
@skalskip92
SkalskiP
7 months
Segment Anything (SAM) + MetaCLIP - unleashing the full power of @Meta open source! I'm having fun with @Gradio today!
@NielsRogge
Niels Rogge
7 months
CLIP by @OpenAI was revolutionary, but its data curation pipeline was never detailed nor open-sourced. @Meta has now released MetaCLIP, a fully open-source replication. Models are on the hub:
17
172
1K
10
74
437
@skalskip92
SkalskiP
3 months
YOLO (unofficial and incomplete) history: who made what? while I wait for my first YOLOv9 model custom dataset fine-tuning to finish, I decided to share with you an incomplete YOLO history with links to papers and code. YOLO (2016) Joseph Redmon et al. - paper:
4
60
427
@skalskip92
SkalskiP
3 months
zero-shot video object detection with YOLO-World as promised, I just updated my @huggingface ; have fun! space:
8
70
417
@skalskip92
SkalskiP
8 months
supervision-0.15.0 will be out tomorrow! This time we bring highly customizable annotators. Just plug in your model and we'll take care of the rest. GitHub repository:
4
62
412
@skalskip92
SkalskiP
7 months
Adding meaningful regions and labels significantly improves GPT-4V's reasoning capabilities.
18
70
403
@skalskip92
SkalskiP
7 months
the must-have resource for anyone who wants to experiment with and build on the @OpenAI Vision API code:
7
58
402
@skalskip92
SkalskiP
1 month
zone analysis is awesome; you can use it to calculate an object's precise position in space, determine its movement path, or measure its distance traveled. air traffic monitoring demo by @carlos_melo_py supervision repo: ↓ youtube tutorial and code
5
59
405
@skalskip92
SkalskiP
2 months
whenever I show zone analysis in my tutorials, people ask me how I designed the polygons. I decided to spend a few hours and create for you a small tool you can fire up locally to draw zones. code:
8
37
400
@skalskip92
SkalskiP
6 months
looking for OpenAI-4V alternatives? - LLaVA - BakLLaVA - CogVLM - Qwen-VL different tasks: - VQA - answering questions about images - OCR - reading text - zero-shot detections link:
12
58
385
@skalskip92
SkalskiP
1 month
I'm having a good time building a new supervision annotator. the hardest part is to ensure that labels do not overlap
16
34
382
@skalskip92
SkalskiP
5 years
For a month I've been working on an open source tool for labelling images in a browser - here it is @justadudewhohax @benhamner @PyImageSearch @lavanyaai @shiffman Maybe you'll find it useful in the next ML/CV project? visit GH:
15
122
375
@skalskip92
SkalskiP
5 months
CS25: Transformers United V3 by @Stanford Stanford has recently updated its free course on Transformers, adding a fresh set of lectures. Among the new content is a lecture by @DrJimFan demonstrating how agents based on GPT-4 can be used to play Minecraft.
7
76
371
@skalskip92
SkalskiP
7 months
the most critical paper for anyone planning to build applications with GPT-4V (Vision) arXiv:
4
45
368
@skalskip92
SkalskiP
8 months
Kosmos-2: Image Description + Bounding Boxes Adds grounding capabilities to understand object descriptions and link text to visual elements. GitHub repository: #computervision #multimodality #llm #objectdetection
6
78
361
@skalskip92
SkalskiP
6 months
counting people in zone (with code) some time ago, I showed you how to use polygon zones to count people. I just added a refreshed version of this project to supervision/examples. the new version offers a fast change of zone configuration. code:
6
72
355
@skalskip92
SkalskiP
21 days
this new DiffMOT tracker looks pretty good. I'd love to test it in one of my demos. have any of you managed to get it working on your own video? (if so, let me know)
11
34
359
@skalskip92
SkalskiP
6 months
supervision-0.17.0 is out! - added PixelateAnnotator, TriangleAnnotator, and PolygonAnnotator - made MaskAnnotator 5x faster - added integration with @OpenAI CLIP and @huggingface Timm github:
6
50
353
@skalskip92
SkalskiP
2 months
top 100 adjectives that are disproportionately used more frequently by AI
10
50
343
@skalskip92
SkalskiP
4 months
time in zone tutorial is coming! btw, would you like to watch me build my computer vision demos on Twitch? zone timer will be released with supervision-0.18.0 this week:
12
33
340
@skalskip92
SkalskiP
7 months
count vehicles on the road with supervision supervision-0.16.0 is coming out this week! stay tuned GitHub repository:
3
43
340
@skalskip92
SkalskiP
7 months
supervision-0.16.0 will be out tomorrow! Among other things, we are adding more annotators. GitHub repository:
10
51
333