Fink Tank falls to the tyranny of numbers

This is around 2 weeks late, but try as I might, I can’t resist poking holes in it.

On May 29th, The Times (the online version) proudly announced the Fink Tank Premiership Player ratings. Fans from all Premiership clubs (perhaps with the exception of Chelsea and Manchester United fans) criticised the ratings, calling them all sorts of unpleasant names.

I’ve had the chance to follow the Fink Tank for almost a year now – I used the online version extensively throughout the 2006/2007 Premiership season and despite its quirks in predicting final scores, it was quite accurate in predicting win/loss/draw results. It also predicted, as soon as Manchester United took the top spot in the Premiership, that Man Utd would win the title.

However, there are significant differences between the app I was using and the algorithms used to rank 403 Premiership players.

Tony Cascarino, in their defence, says that stats are not always perfect because they lack the human element. He’s right, of course, but he’s missing something more important here.

Like I said, I’ve been following the Fink Tank for a while now, and last year I had the chance to read a few of the documents / papers / articles authored by its creators. One thought stuck with me; I’ll paraphrase it here, and then present a suitable quote from this year’s ratings.

1. Basically, Finkelstein argues that most prediction algorithms suffer from too much detail when trying to predict match results, and that skews the final answers. Instead of using every available detail / stat, Fink Tank supposedly picks only the important factors that can be measured.

Contrast the above with this:

2. “Dr Ian Graham and Dr Henry Stott used the model to allow us to identify the relationship between goals scored and every kick of the ball made by every player for every club.”

Now, when I read that for the first time, I was like – why the hell do you need to go into so much detail?

DF goes into more detail here, and while I think they’ve done a great job, they’ve made the very mistake they’ve tried so hard to avoid in the past – too much information.
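To make the “too much information” worry concrete, here is a toy sketch in Python. It is not the Fink Tank model – the factor names and numbers are entirely invented – but it shows the general parsimony point: a predictor of goal difference built on one genuinely informative factor usually beats one that also fits fifty irrelevant stats, once you test it on matches it hasn’t seen.

```python
import numpy as np

# Toy illustration of the parsimony argument, not the Fink Tank model.
# One hypothetical factor ("attack-strength gap") genuinely drives goal
# difference; fifty other "stats" are pure noise. All numbers are invented.
rng = np.random.default_rng(0)
n_train, n_test, n_noise = 80, 200, 50

def make_season(n):
    strength_gap = rng.normal(0, 1, n)             # the factor that matters
    noise_stats = rng.normal(0, 1, (n, n_noise))   # irrelevant detail
    goal_diff = 1.5 * strength_gap + rng.normal(0, 1, n)
    return strength_gap, noise_stats, goal_diff

s_tr, x_tr, y_tr = make_season(n_train)   # "last season" to fit on
s_te, x_te, y_te = make_season(n_test)    # "next season" to predict

# Parsimonious model: least squares on the single informative factor.
A_small = np.column_stack([np.ones(n_train), s_tr])
w_small, *_ = np.linalg.lstsq(A_small, y_tr, rcond=None)
pred_small = np.column_stack([np.ones(n_test), s_te]) @ w_small

# Kitchen-sink model: the same factor plus every noise column.
A_big = np.column_stack([np.ones(n_train), s_tr, x_tr])
w_big, *_ = np.linalg.lstsq(A_big, y_tr, rcond=None)
pred_big = np.column_stack([np.ones(n_test), s_te, x_te]) @ w_big

mse = lambda pred, y: float(np.mean((pred - y) ** 2))
print("out-of-sample error, one factor: ", round(mse(pred_small, y_te), 3))
print("out-of-sample error, every stat: ", round(mse(pred_big, y_te), 3))
```

The exact numbers change with the random seed, but the pattern – the detailed model fitting last season better while predicting next season worse – is precisely the trap the Fink Tank used to warn against.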

The problem with stats is that you can use them to prove anything, as long as you have the numbers (and the time to play with them). McCarthy’s 20+ goals may lead you to think that he’s as good as Berbatov, until you consider that he has 2-3 assists while Berbatov has 10 or 11 (from memory). And THEN you might argue that while Berbatov had good players around him, McCarthy was a lone ranger, and as such his goals are more valuable. Stats go both ways, unfortunately.

And what really turned me off was their brash haughtiness in this article, where Robinson’s rating is penalised because Tottenham were defensively incoherent, not because Robinson himself made a lot of clangers – are you really going to blame every conceded goal on the goalkeeper rather than the defence?

For DF not to see that tells me that they’re putting too much faith in numbers.

For the record – Lampard does deserve to be in the top 10; his goals and work-rate for Chelsea were excellent. Silva, maybe, maybe not. Hleb – well, are you also counting the number of attacks a player like Fabregas might have started if he had the ball in the same positions Hleb did? Of course Hleb is better than the average player; that’s not the complaint against him – he just frustrates fans too much with certain aspects of his football, because he gets such great chances and they go nowhere.

The rest – you be the judge. All I know is that they need to rework their algorithm – not because Lamps is 2nd or Robinson is 402nd – but because they’ve got the fundamentals wrong.

How would I do it?

Well, that’s a question for another day – and will probably take a whole day to answer anyway.
