June 26 2012 10:08AM
- graphic courtesy of The Russian Machine Never Breaks
For two years, my standard explanation for why we pay a lot of attention to shot differential (Corsi or Fenwick) has been that it correlates strongly with scoring chances, zone time, and puck possession, and is the best predictor we have of future goal scoring. Each of those claims is based on seminal work by Vic Ferrari and JLikens.
The claim about scoring chances was based largely on this plot from Vic, showing a correlation between scoring chances and Fenwick of 0.92:
A comprehensive data review
If scoring chances correlate that closely with shot differential, that would suggest shot quality differences are negligible and that tracking scoring chances isn't really adding much information. With more and more bloggers tracking scoring chances these days, we actually have a pretty large body of work to draw from. Let's see whether this tight correlation is a general phenomenon.
I managed to dig up 18 teams' scoring chance data: The 2009-10 and 2010-11 Canadiens, Oilers, Maple Leafs, and Flames, and the 2011-12 Canucks, Stars, Flames, Hurricanes, Blue Jackets, Canadiens, Rangers (~3/4 season), Sharks, Capitals, and Flyers. (Thanks to Canucks Army, Defending Big D, Shutdown Line, Jackets Cannon, Leafs Nation, En Attendant les Nordiques, The Copper N Blue, Flames Nation, Blueshirt Banter, Fear the Fin, The Washington Post, and Broad Street Hockey for either publishing or emailing me their results.)
Across those 18 teams, the correlation between a player's shot differential and his scoring chance differential is about 0.80-0.85 (depending on whether the games-played cutoff is set at something like 20 or something like 70).
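For anyone who wants to run the same kind of check on their own tracking data, here is a minimal sketch of a correlation computed at two games-played cutoffs. The player rows, the field names (`games`, `fenwick_pct`, `chance_pct`), and the helper names are all made up for illustration; none of them come from the data set described above.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two lists of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)


def correlation_with_cutoff(players, min_games):
    """Correlate shot share with chance share for players at or above min_games."""
    kept = [p for p in players if p["games"] >= min_games]
    return pearson_r(
        [p["fenwick_pct"] for p in kept],
        [p["chance_pct"] for p in kept],
    )


# Hypothetical rows, NOT real tracking data.
players = [
    {"name": "Player A", "games": 82, "fenwick_pct": 0.54, "chance_pct": 0.56},
    {"name": "Player B", "games": 75, "fenwick_pct": 0.48, "chance_pct": 0.47},
    {"name": "Player C", "games": 71, "fenwick_pct": 0.51, "chance_pct": 0.50},
    {"name": "Player D", "games": 34, "fenwick_pct": 0.45, "chance_pct": 0.49},
    {"name": "Player E", "games": 22, "fenwick_pct": 0.52, "chance_pct": 0.50},
]

for cutoff in (20, 70):
    print(cutoff, round(correlation_with_cutoff(players, cutoff), 3))
```

Raising the cutoff drops the small-sample players, whose numbers are the noisiest, so it is not surprising for the correlation to move with the cutoff.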
There is a strong correlation between shot differential and scoring chance differential, so the simple shot differential tells us most of what we need to know. And since others have shown this for a team here or there, this isn't news; the analytical community has long known that shot differential is very important and shot quality effects are minor.
The thing that surprised me is that the spread was actually larger than I had expected -- some players do lie a reasonable distance above or below the fit line, getting scoring chances at the rate you'd expect from a player whose shot differential is a few percent better or worse. Does that mean that we need the scoring chance totals to get the full story? Or is that just random noise?
Reproducible differences or random fluctuations?
To address that, we'll look at year-over-year correlations; if players are showing an actual talent in this area, we'll expect their numbers to be at least somewhat reproducible from year to year.
Dividing a player's scoring chance differential by his shot differential (SC% / Fenwick) gives us a metric assessing how much of a shot quality edge the team had with him on the ice that year. We can look at the same shot quality index for the following year and ask whether players who did well one year were likely to do well again the next year. With 105 players who had games tracked two years in a row, we should be able to tell whether this shot quality index is measuring a persistent talent.
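As a rough sketch of that calculation, the snippet below computes the index for each player-season and pairs up players who appear in both years. It assumes both inputs are expressed as on-ice shares (for example 0.52 for 52%); the function names, the dictionary layout, and the sample numbers are hypothetical.

```python
def shot_quality_index(chance_pct, fenwick_pct):
    """Scoring chance share divided by Fenwick share; 1.0 means no edge either way."""
    if fenwick_pct <= 0:
        raise ValueError("fenwick_pct must be positive")
    return chance_pct / fenwick_pct


def pair_seasons(season_one, season_two):
    """Return (player, index_year_one, index_year_two) for players tracked in both years.

    Each season maps a player name to a (chance_pct, fenwick_pct) tuple.
    """
    shared = sorted(set(season_one) & set(season_two))
    return [
        (
            name,
            shot_quality_index(*season_one[name]),
            shot_quality_index(*season_two[name]),
        )
        for name in shared
    ]


# Hypothetical values, NOT real tracking data.
season_one = {"Player A": (0.55, 0.50), "Player B": (0.45, 0.50), "Player C": (0.50, 0.52)}
season_two = {"Player A": (0.51, 0.52), "Player B": (0.49, 0.48)}

for name, first, second in pair_seasons(season_one, season_two):
    print(f"{name}: {first:.3f} -> {second:.3f}")
```

The paired index values are what would go on the two axes of a year-over-year plot; a player tracked in only one season (Player C here) simply drops out.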
That's not much of a trend. With an R^2 of 0.049, the correlation is strong enough to be statistically significant, but weak enough to be insignificant in practice. If a guy is at 0.9 this year (one of the worst in the league), you can't even say that he'll be below average next year -- just that he won't be one of the very best.
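For readers who want to see where "statistically significant but practically insignificant" comes from, here is a back-of-the-envelope check using the standard t-test for a correlation coefficient. It assumes the R^2 of 0.049 comes from a simple linear fit over the 105 paired players; the function name is just for illustration.

```python
from math import sqrt


def correlation_t_statistic(r_squared, n):
    """t statistic for testing whether a correlation differs from zero (n - 2 degrees of freedom)."""
    if n < 3 or not 0 <= r_squared < 1:
        raise ValueError("need n >= 3 and 0 <= r_squared < 1")
    return sqrt(r_squared) * sqrt(n - 2) / sqrt(1 - r_squared)


r_squared = 0.049   # from the year-over-year fit described above
n_players = 105     # players with games tracked two years in a row

t_stat = correlation_t_statistic(r_squared, n_players)
print(f"r = {sqrt(r_squared):.3f}")
print(f"t = {t_stat:.2f} on {n_players - 2} degrees of freedom")
print(f"share of variance left unexplained = {1 - r_squared:.1%}")
```

The t value of about 2.3 clears the roughly 1.98 needed for significance at the two-sided 5% level with 103 degrees of freedom, but one year's index still explains under 5% of the variance in the next year's.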
Whatever tendency certain players might have for driving their team to get more scoring chances than a simple shot differential predicts is small and swamped by random noise. This suggests tracking scoring chances isn't adding much information to the readily available shot differential numbers.