Pieter Robberechts
@p_robberechts
Followers
2K
Following
866
Media
40
Statuses
354
PhD student @DTAI_KULeuven, applying Machine Learning on sports data.
Belgium
Joined January 2010
The third and final blog post in our series on possession value models design decisions: Can the features chosen to represent the game state inadvertently bias player ratings? w/ @p_robberechts @jessejdavis1 @lodevantente
dtai.cs.kuleuven.be
Conceptually, possession value approaches such as VAEP, PV, OBV, and g+ are all identical: they estimate the chances of scoring (and…
1
6
13
The workshop welcomes research on applying ML & data mining to sports: all disciplines, including e-sports.
Monday 15 or Friday 19 September 2025 | Porto, Portugal | Submission info: https://t.co/DTXTrcPNkb Feel free to reach out with any questions!
1
0
3
After a fantastic run by @JanVanHaaren, @jessejdavis1 and Ulf Brefeld, we're honored to carry the torch and continue the workshop's legacy! 🔥
2
0
2
📣 Excited to share that the Workshop on Machine Learning and Data Mining for Sports Analytics (MLSA) will be held again this September as part of @ECMLPKDD! w/ @MaaikeVanRoy @hugoriosneto and @azimmerm_dm
https://t.co/BCk6GQU3B3
dtai.cs.kuleuven.be
Workshop on Machine Learning and Data Mining for Sports Analytics at ECML/PKDD 2025
1
4
18
One of my hobbies is doing some light data science for soccer. Best package in the game is SoccerData. Makes it easy to pull down dataframes of match and season data from the major soccer providers.
2
4
24
kloppy==… Happy Holidays to the Sports Analytics Community! This release contains some exciting updates; you can find the highlights in this thread. But first, we're really excited that @p_robberechts has officially joined as a maintainer!
1
2
8
Part 2 in our series on possession value models design decisions: How the definition of "near future" has interesting effects on player ratings. It's not about 'better,' but about understanding the nuances. w/ @p_robberechts @jessejdavis1 @lodevantente
https://t.co/QGAsvnkh41
dtai.cs.kuleuven.be
Conceptually, possession value approaches such as VAEP, PV, OBV, and g+ are all identical: they estimate the chances of scoring (and…
1
9
20
I promise there will be some great insights into this blog post series.
We're doing a deep dive into possession value models. While VAEP, g+, PV & OBV are conceptually identical, they make different design choices. First, we look at using (no) goal vs. xG as the target variable. w/@p_robberechts @jessejdavis1 @lodevantente
https://t.co/5hyFDrsDDo
0
0
3
Nothing quite like a 'quick update' turning into a two-day saga of dependency chaos. But hey, d3-soccer is updated for the first time in 4 years! https://t.co/NjhN1smZgF
2
12
117
Has soccer gone too far in its obsession with keeping possession? ⚽ A few weeks ago I presented our latest research paper "Boot It: A Pragmatic Alternative to Build-Up Play" at the #StatsBombConference. https://t.co/9k20mIyYc3 https://t.co/rS3DXs4XO1 [1/3]
3
4
15
Attending @ECMLPKDD and interested in sports ⚽? Don't miss our tutorial on Team Sports Analytics tomorrow! Our goal is to provide an accessible overview of existing work on the use of machine learning in sports. Check out the details here: https://t.co/WmMwCzH7R6
1
6
20
Part of @kloppy_dev 3.15.0 is the aggregate method. This allows you to go from dataset to aggregation in a single line. The first one implemented is minutes_played. It returns the time a player was on the pitch (including start and end timestamps).
1
2
9
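To make the minutes_played idea from the post above concrete, here is a minimal, library-free sketch of what such an aggregation computes: one on-pitch interval per player, with start and end timestamps. It does not use kloppy and is not kloppy's implementation. The `Stint` class, the `minutes_played` function signature, and the input shapes (a starting lineup plus a list of substitutions) are all assumptions made for illustration, and red cards and injuries are ignored.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class Stint:
    """Hypothetical record: one continuous spell on the pitch."""
    player: str
    start: timedelta
    end: timedelta

    @property
    def duration(self) -> timedelta:
        return self.end - self.start


def minutes_played(lineup, substitutions, match_end):
    """Return one Stint per player who appeared in the match.

    lineup:        names of the starting players
    substitutions: (time, player_off, player_on) tuples
    match_end:     timestamp of the final whistle
    """
    # Every starter is on the pitch from kick-off.
    on_since = {player: timedelta(0) for player in lineup}
    stints = []
    for time, player_off, player_on in sorted(substitutions, key=lambda s: s[0]):
        # Close the interval of the player leaving, open one for the player entering.
        stints.append(Stint(player_off, on_since.pop(player_off), time))
        on_since[player_on] = time
    # Players still on the pitch stay until the final whistle.
    for player, start in on_since.items():
        stints.append(Stint(player, start, match_end))
    return stints


stints = minutes_played(
    lineup=["A", "B"],
    substitutions=[(timedelta(minutes=60), "A", "C")],
    match_end=timedelta(minutes=90),
)
for stint in stints:
    print(stint.player, stint.start, stint.end, stint.duration)
```

Returning intervals rather than a single total keeps the start and end timestamps available, which is what the post highlights about the aggregation's output.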
At long last, we have a (fingers crossed) bug-free implementation of orientation and pitch dimension transforms in @kloppy_dev! ✨
kloppy==3.15.0 A new version of kloppy (3.15.0) is now available! This release includes major additions:
✅ DatasetTransformer (to easily transform pitch dimensions)
✅ Minutes Played Aggregator
✅ Time Based Positions
✅ Improved Orientation
0
4
13
Where do you normally store large assets required for automated tests? Should they be part of the repository? Those assets can be several GB, so I don't want to download them on every single run. Oh, and the assets are private, so access should require some sort of authentication.
3
1
0
Curious about the favorites for #CopaAmèrica2024? Our projections are below, with home advantage for the US. w/ @p_robberechts
0
3
11
With 51 matches ahead, our computer model has a standout #euro2024 favorite: France at 26%. Other top contenders include: Germany 16%, England 15%, Portugal 11%, Spain 10%. Our interactive visualization provides detailed odds for each team (w/ @_dnzcn @jessejdavis1) https://t.co/UVHjcYiRJK
1
0
14
A key challenge in developing sports analytics metrics is how to evaluate them. I shared some insights on this topic at the @PySportOrg meetup a few months ago, covering various approaches and lessons learned. A recording of the talk is now available online.
The videos of our last meetup at the @statsperform office are out on YouTube! Presentations by @p_robberechts, @numberstorm and @patricklucey. Thanks @andycoops83 for co-organising this great event! https://t.co/k6CmoPQsTp
0
4
17
Our projections are back! 🔮 We have simulated #euro2024. Our model rates France as the strong favorite with a 26% chance of winning, followed by Germany, England, Portugal, and Spain. w/ @p_robberechts @jessejdavis1
https://t.co/V9YIJNSPyI
dtai.cs.kuleuven.be
It is hard to believe, but it is already time for another Euros Football Tournament. So, who are the favorites and dark horses heading into…
1
5
14
Looking for places to publish #sportsanalytics research? We've put together a list: https://t.co/qGVi67xwXc
dtai.cs.kuleuven.be
The DTAI Sports Analytics Lab is a group within the KU Leuven DTAI lab that focuses on the applications of machine learning and data mining in sports.
1
11
41
11. soccerdata (scrape data)
Python Packages to explore for football analytics: 1. mplsoccer (football viz) 2. statsbombpy (statsbomb package) 3. openai (AI) 4. numpy (math) 5. scikit-learn (machine learning) 6. matplotlib (data viz) 7. bs4 (web scraping) 8. requests (web scraping) 9. pandas (data cleaning)
0
3
77