The Need For Advanced Football Stats

Stewart Downing

As a sports fan living in America, you’re bombarded with statistics of increasing complexity across every sport – except football (or soccer, as the US calls it).

Football analysis still relies on the same football statistics we’ve seen for years, which are generally cumulative totals. These totals can be of cards, shots, possession, corners – almost anything that can be an indicator of success or failure is totalled.

The same is true in Britain, but that is changing as organizations like OptaSports corner the market on Sabermetric analysis. A quick glance at Opta’s website shows who relies on their advanced stats; their Twitter feed is a must-follow for anyone looking for juicy football tidbits. Their system produces vast amounts of top-quality information.

For American sports, there are (generally) free or low-cost metrics available which convincingly demonstrate factors like a quarterback’s preferred side, how many shots per possession an NBA player will heave up or any number of baseball efficiency stats. NBA, NFL and MLB stat-geeks can even measure player value in “Win Shares” – how crucial a player is to their team, as defined by a variety of detailed analytics.

Sabermetrics is the analysis of baseball using objective, measurable numbers; the term was coined by Bill James. It gained widespread fame through Michael Lewis’ book “Moneyball”, which detailed the role advanced baseball metrics had in the 2001 success of the financially-challenged Oakland Athletics. The A’s subscribed to Sabermetrics before anyone else and their results were far in excess of what was expected of a mid-market, struggling team.

In football, the Moneyball theorem got its most public airing when Liverpool signed Stewart Downing at great cost from Aston Villa. Liverpool owner John W. Henry is said to subscribe to the Moneyball Principle, and advanced football statistics said Downing had the greatest number of completed crosses in the Premiership over the past five years. Saying Henry had turned the Boston Red Sox into a winner using Sabermetrics made good copy, but while true, it also ignored the multi-millions (billions?) of dollars NESV spent building such a winner.

The old cliche says that “Statistics can be used to prove (or disprove) anything”. With bad statistics, this is quite true; however, information which incorporates the rate at which events happen is much more informative and doesn’t conform to the cliche. Rate, or how frequently you can expect one result from a process, is key to all statistical analysis and must be the basis of a new football evaluation.

Sabermetrics, and advanced statistics in general, give context to information. This is because they equate rate to efficiency – the more (or less) frequently a player or team does X, Y or Z over time directly relates to their success in wins and losses.

The available basic football stats leave readers to draw their own conclusions. If Nani makes 3.5 successful dribbles per game, the result could be a completed cross, turnover in the box, shot on target or many other possibilities. Teams have access to this information, both through their own scouting and through the services of Sabermetric analysts. The data is available, but at a cost; the football public should now ask for similar detailed numbers.

Some basic examples have been published on Soccerlens. These include Goalkeeper Save percentage, indicating either ‘keeper error or defensive breakdown; and how frequently players contribute a “Scoring Stat” (a goal or assist).

It’s time to add a couple more: one team-based and one individual.

A new way of measuring form

The Form table is now a matter only of wins, losses and draws. This pays no heed to the quality of opposition the team has faced, nor the scoreline of each result – the confidence gained from an 8-2 triumph is (probably) greater than scraping a 1-0 victory. While form is, in the strictest sense, a matter of wins and losses, reading a normal form table and then an enhanced form table – taking into account opposition strength and scorelines – would be more enlightening.

To evaluate the quality of opposition, we could use a method similar to the one the NFL uses for assessing schedule strength. In the NFL, where each team plays 12-13 of the 32 franchises, a team’s fixture list for the season is evaluated for its strength by measuring its opponents’ collective wins against their collective losses. For example, the Houston Texans and Tennessee Titans had the hardest NFL schedules in 2010, playing teams who combined for 140 wins and 116 losses (or a .547 winning percentage).
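The NFL-style calculation is simple enough to sketch in a few lines of Python. This is a minimal illustration rather than the league's official method: the function name `win_loss_schedule_strength` is made up for this example, and ties are ignored because the figures quoted above only mention wins and losses.

```python
def win_loss_schedule_strength(opponent_wins, opponent_losses):
    """Combined winning percentage of a team's opponents (ties ignored)."""
    games = opponent_wins + opponent_losses
    if games == 0:
        raise ValueError("opponents have not played any games")
    return opponent_wins / games


# Figures quoted above for the Texans' and Titans' 2010 opponents.
hardest_2010 = win_loss_schedule_strength(140, 116)
print(f"{hardest_2010:.3f}")  # 0.547
```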

This doesn’t work for a football league campaign as every team plays each other twice; as a start-of-season indicator, however, it works well. It’s perhaps best to evaluate the earliest of early-season form (the first five games) using the whole of the previous season’s results, rather than the percentage of points opponents took from the last five games of that season, because of the vagaries of late-season matches. After five games have been completed, the same calculation is then performed for the current season.

In their first five matches this season, Manchester United face(d) West Brom, Spurs, Arsenal, Bolton and Chelsea. These teams combined for 294 points (from a possible 570) last season, for a schedule strength of 0.516. With promoted teams scoring the average points obtained by the relegated teams they replaced (37), Wigan’s schedule strength for the first five matches is significantly weaker at .414.
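The points-based version can be sketched the same way. In the example below the function name, the dictionary layout and the individual points totals are assumptions for illustration: the per-team figures come from my reading of the 2010-11 final table (they add up to the 294 quoted above), and Wigan's opening fixture list is my reconstruction, not something stated in this article. Any team missing from the table is treated as promoted and given the average of the relegated sides' points.

```python
from statistics import mean

GAMES_PER_SEASON = 38  # Premier League season length
POINTS_PER_WIN = 3


def points_schedule_strength(opponents, last_season_points, relegated_points):
    """Share of available points the listed opponents took last season.

    Opponents missing from last_season_points are treated as promoted
    teams and given the mean points total of the relegated sides.
    """
    if not opponents:
        raise ValueError("need at least one opponent")
    promoted_default = mean(relegated_points)
    total = sum(last_season_points.get(team, promoted_default) for team in opponents)
    available = len(opponents) * GAMES_PER_SEASON * POINTS_PER_WIN
    return total / available


# Assumed 2010-11 points totals; verify against an official table before reuse.
points_2010_11 = {
    "West Brom": 47, "Spurs": 62, "Arsenal": 68, "Bolton": 46,
    "Chelsea": 71, "Manchester City": 71, "Everton": 54,
}
relegated_2010_11 = [39, 39, 33]  # averages to the 37 used above

united_first_five = ["West Brom", "Spurs", "Arsenal", "Bolton", "Chelsea"]
# Assumed fixture list; the three promoted sides are absent from the table.
wigan_first_five = ["Norwich", "Swansea", "QPR", "Manchester City", "Everton"]

united = points_schedule_strength(united_first_five, points_2010_11, relegated_2010_11)
wigan = points_schedule_strength(wigan_first_five, points_2010_11, relegated_2010_11)
print(f"United {united:.3f}, Wigan {wigan:.3f}")
```

Under those assumptions the sketch reproduces both schedule-strength figures given above.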

A second wrinkle would be to include goals in the form table – not as goal difference, but as a scoring percentage: goals scored divided by goals conceded. For example, over the first four games of the season United would have a form line of WWWW with a scoring percentage of 6.00. Manchester City’s would be similar – WWWW, 5.00; mid-table QPR ranks as LWLD, 0.167 and bottom-placed Blackburn Rovers LLLD, 0.429.
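Here is a rough sketch of how such an enhanced form line could be built, reading the figures above as goals scored divided by goals conceded. The function name `enhanced_form` is hypothetical, and the scorelines are my own reconstruction of United's and QPR's first four league results; they are consistent with the 6.00 and 0.167 quoted above but are not taken from this article.

```python
def enhanced_form(results):
    """Build a form string and a goals-for / goals-against ratio.

    results is a list of (goals_for, goals_against) tuples, oldest first.
    """
    letters = "".join(
        "W" if scored > conceded else "L" if scored < conceded else "D"
        for scored, conceded in results
    )
    goals_for = sum(scored for scored, _ in results)
    goals_against = sum(conceded for _, conceded in results)
    # A team that has conceded nothing has no finite ratio.
    ratio = goals_for / goals_against if goals_against else float("inf")
    return letters, ratio


# Assumed scorelines (goals for, goals against), oldest first.
united_results = [(2, 1), (3, 0), (8, 2), (5, 0)]
qpr_results = [(0, 4), (1, 0), (0, 2), (0, 0)]

for name, results in [("Manchester United", united_results), ("QPR", qpr_results)]:
    letters, ratio = enhanced_form(results)
    print(f"{name}: {letters}, {ratio:.3f}")
```

One design question a ratio raises is what to do with a side that has yet to concede; this sketch returns infinity, though a real table would probably need a friendlier convention.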

Scoring efficiency

Outside totals, individual stats are also difficult to find in the public domain.

Scoring efficiency is one factor often disregarded. While “Chicharito” Hernandez is said to be a “predator in the box”, how much more efficient is he than other forwards? This issue is complicated by matters of sample size – statisticians need a suitable number of attempts before being able to draw any conclusions.

Measuring a player’s conversion rate is done easily enough by comparing the number of goals they score against the number of chances they are afforded. It’s too early to do so for this season (many players have a 1.000 scoring efficiency rating based on one shot for one goal), but looking at the most efficient scorers from last season is interesting.

Amongst the top twenty-four scorers last season, Chicharito was significantly more accurate than most of his peers. He scored on nearly one quarter of all his chances, at 0.241. Next best was Wolves’ Steven Fletcher, who recorded 0.227 efficiency; only two more players – Berbatov and Kuyt – shot with over 20% accuracy, at 0.204 and 0.203 respectively. Of the top twenty-four scorers last season, Didier Drogba shot with the least efficiency at 0.073. Three other players needed eleven shots to score one goal – including slumping pair Fernando Torres and Wayne Rooney.
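The conversion-rate calculation, with a guard against the small-sample problem mentioned earlier, might look something like this. The function name and the `min_shots` cut-off of 20 are arbitrary choices for illustration, not an established standard, and the goal and shot counts are hypothetical.

```python
def scoring_efficiency(goals, shots, min_shots=20):
    """Goals per shot, or None when the sample is too small to judge."""
    if goals > shots:
        raise ValueError("a player cannot score more goals than shots taken")
    if shots < min_shots:
        return None  # e.g. one shot, one goal: a 1.000 rate that tells us nothing
    return goals / shots


# One goal from one shot early in a season: too few attempts to rate.
print(scoring_efficiency(1, 1))  # None

# Hypothetical striker who needs eleven shots per goal.
eleven_per_goal = scoring_efficiency(10, 110)
print(f"{eleven_per_goal:.3f}")  # 0.091
```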

Advanced metrics have their place in football. It’s now time to look further into so-called “statistical analysis” – and demand more than just basic stats.

For more football stats and analysis, visit WhoScored.com.
