Analytics in Football - Official Thread

I've heard Mark O'Haire make that very point about the data on the Betfair podcast. I actually meant to ask you about it, but forgot.
The average Championship club has comfortably more value than the average club in any of the leagues newly added to FBref's xG set, so I dunno.

It's worth digging into. Maybe Phil could create a nice viz relating to the new leagues they've added and which ones are more/less subject to wild displacement from expected performance.
 
Interesting alright.

I've always felt slightly uncomfortable with the xG assumption that, from a given position, any two strikers are equally likely to convert, even when we restrict the pool of strikers to only those in the PL.

Even at the elite level, if I had to put my house on it I'd choose Harry Kane over Raheem Sterling to take on the shot (or the 100+ shots per season, or the 1500+ shots across a career).

Maybe we just have to get comfortable that it fits the 99.9% but there are players (at either end) who have characteristics which mean they'll consistently under or overperform the benchmark.

Likewise, maybe there are factors which mean some clubs lower down the footballing pyramid can be significant outliers in terms of the attacking xG underperformance?

- mentality: are they at that level (and at that specific club) because they choke in pressure situations more often?
- nature of chances created: is there a uniform profile at PL/Championship level in terms of the distribution of type of chances created? (e.g. set-piece, header from cross, through ball leading to 1-on-1, shot from distance etc.)

Very hard to pin down with any accuracy though I'd imagine.

1667389057089.png

1667389332904.png

1667389343247.png
 
There's a general +/- % that all top players fall into though as regards finishing. The very best/worst finishers will only over/under perform from zero to around 20-25%, and if they step outside that mark for a season or so, they'll revert towards the mean rapidly enough after that. Kane, Messi & Sterling all fall into these parameters from the figs you show.

Maybe these parameters loosen as you travel down the pyramid, but my instinct would have said otherwise.
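
To put rough numbers on that band, here's a minimal sketch of how far pure chance alone can move goals away from xG. It assumes every shot is an independent coin flip with its xG as the scoring probability, and the flat 0.10 xG per shot is a made-up illustrative figure, not data from any league. The 100 and 1500 shot counts are the per-season and per-career ballparks mentioned earlier in the thread.

```python
import math

def goals_sd(shot_xgs):
    """Standard deviation of goals scored, assuming each shot is an
    independent Bernoulli trial with probability equal to its xG."""
    return math.sqrt(sum(p * (1 - p) for p in shot_xgs))

def noise_band_pct(n_shots, xg_per_shot=0.10):
    """One-standard-deviation swing in goals, as a % of total xG.
    xg_per_shot is an illustrative assumption, not a measured value."""
    shots = [xg_per_shot] * n_shots
    total_xg = sum(shots)
    return 100 * goals_sd(shots) / total_xg

season = noise_band_pct(100)    # roughly one season of shots
career = noise_band_pct(1500)   # roughly a career of shots
print(f"season: +/-{season:.1f}%  career: +/-{career:.1f}%")
```

Under those assumptions a single season can swing by about 30% on luck alone, while 1500 shots narrows that to under 8%, which would fit with one-season outliers drifting back towards the mean.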
 
I also wonder whether financial disparity in a league actually causes less wild variance.

You'd think it would be the opposite, but that's another possible explanation for the lack of a quick correction in the Championship as a whole.
 
This is a great visual. Interesting to see an objectively outstanding player like Benzema vary so wildly.

la-liga-xg-over-under-performance-1536x737.jpg




Karim Benzema, Real Madrid, 2017-18 | Goals (5), Expected Goals (13.2), Underperformance (-8.2)

It’s May 2018. You’re Real Madrid and you’ve relinquished your La Liga crown to rivals Barcelona. You finished third, 17 points off the winners, despite scoring 94 goals. You’re about to lose your all-time record scorer Cristiano Ronaldo, the man who scored 450 goals in 438 games, to Juventus and you’ve got [checks notes] a 30-year-old Karim Benzema to fill the void. The same Benzema who managed just five goals that season. *gulp*.

But again, look under the hood and expected goals tells us that Benzema should have scored over 13 goals that season. His underperformance of 8.2 is the biggest single underperformance of any player in La Liga in recorded history.

So, Ronaldo leaves. And what happens next? Benzema scores 20+ goals for three consecutive seasons, is currently on track for another 20 this season, and comes fourth in the 2021 Ballon d’Or rankings. Ok, so Real aren’t at the peak of their powers right now, but on an individual level, Benzema is perhaps in the form of his life. The Frenchman is renowned as a world-class finisher, and given that, his 2017-18 numbers were likely to positively regress.
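
For anyone who wants to check the arithmetic in that excerpt, a minimal sketch: the only inputs are the 5 goals and 13.2 xG quoted above, and the helper name `xg_delta` is just a hypothetical label for this example.

```python
def xg_delta(goals, xg):
    """Return (raw difference, percentage difference) of goals vs xG."""
    raw = goals - xg
    pct = 100 * raw / xg
    return round(raw, 1), round(pct, 1)

# Benzema 2017-18, figures as quoted in the excerpt
raw, pct = xg_delta(5, 13.2)
print(raw, pct)
```

That gives the -8.2 from the caption, which works out at roughly 62% below his xG for the season.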
 
The overlap of people who actively watched the 1990 and 1994 World Cups and also have coding / data visualisation skills may not be that large.
 
I was wondering about that graphic. I was surprised Messi was so high, then I noticed the x-axis is raw goals. In my opinion, it would make more sense if it were shown as a percentage.

I'm still kind of surprised he's so high because I'd have thought his free kicks would have dragged him down a bit.

But maybe I just don't know how it's 'calculated'. Is it based on where the scorer receives the ball initially? I thought until now it was based on the shooting.

I'm still kind of unconvinced about the value of the metric as giving much meaningful information.

Also, what's the y axis in that visualization? They should have used seasons as it would make it easier to track progress, but I think they just spaced it out at random.
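
On the raw goals vs percentage point, and how it's calculated: as far as I know, public xG models put a scoring probability on each shot and add those up, so it is based on the shooting rather than where the ball was first received. Here's a minimal sketch under that assumption. Both players and all the per-shot values are invented for illustration, and `summarise` is a hypothetical helper, not anything from FBref or another provider.

```python
def summarise(shot_xgs, goals):
    """Total xG is the sum of per-shot probabilities; compare goals
    against it as a raw difference and as a percentage."""
    total_xg = sum(shot_xgs)
    raw = goals - total_xg
    return {
        "xg": round(total_xg, 2),
        "raw": round(raw, 2),
        "pct": round(100 * raw / total_xg, 1),
    }

# Hypothetical players: one shoots a lot, one shoots rarely
volume = summarise([0.20] * 100, goals=25)
sparse = summarise([0.10] * 40, goals=7)
print(volume)
print(sparse)
```

The high-volume shooter comes out ahead on raw goals above xG (+5 vs +3), but the low-volume one is well ahead as a percentage (75% vs 25%), so which axis the graphic uses changes who looks like the outlier.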
 