October 24 2014 09:21AM
Normal distributions of team PDO over nine seasons at each score situation
The idea that PDO converges to a value of 1.000 (or 100.0 or 1000, depending on where you want your decimal point) over long periods of time is a useful one for hockey analysts. However, it can also provide an additional lens on performance over short periods of time. Knowing that the sum of shooting percentage and save percentage will eventually converge to 1 can indicate whether short-term results are indicative of what we can expect over larger samples.
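As a minimal sketch of the definition above, PDO is just the two percentages added together. The function name and the goal and shot totals below are hypothetical, chosen only for illustration.

```python
def pdo(goals_for, shots_for, goals_against, shots_against):
    """Return PDO: on-ice shooting percentage plus on-ice save percentage."""
    shooting_pct = goals_for / shots_for
    save_pct = 1 - goals_against / shots_against
    return shooting_pct + save_pct


# Hypothetical totals: a team that scores on 10% of its shots and stops
# 94% of the shots it faces ends up above the 1.000 baseline.
hot_team = pdo(goals_for=10, shots_for=100, goals_against=6, shots_against=100)
print(round(hot_team, 3))
```

Scale the result by 100 or 1000 if you prefer the other decimal conventions.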
You can call it a measure of luck or randomness or chance or bounces or whatever. The point is that they all even out eventually.
Or do they?
If you break down PDO into its component parts, you're left with shooting percentage and save percentage. There are indications that team-level performance can affect both of those. Some teams can consistently shoot above league average, while elite goaltending is by definition above average. So, for example, you would expect that if Boston can put up anything close to a league-average shooting percentage, their PDO will be above 1 just by virtue of the fact that they have Tuukka Rask putting up an even-strength save percentage of 0.939 over 76 games.
But that's a bit of a subjective observation because it's still based on relatively small sample sizes. So while we can recognize that team shooting and goaltending skill can affect PDO, it's difficult to take into account because player shooting ability changes over time, goalies age, rosters change, etc. As a result, trying to normalize PDO for team shooting and goaltending is going to be much more subject to variance and unsubstantiated inference.
That being said, we do have large historical samples of score effects, which provide a much more stable statistical source on which to make some inferences.
A Different Take On Score Effects
Thanks to the great work by the folks over at war-on-ice, I was able to grab team-by-team situational stats for every season back to 2005-06. That's 270 team-seasons of data to work with, with breakdowns by score situation:
- Trailing by 2 or more (Down 2+)
- Trailing by 1 (Down 1)
- Score tied (Tied)
- Leading by 1 (Up 1)
- Leading by 2 or more (Up 2+)
We know that the score affects how teams play and that this shows up in the on-ice stats. We account for this either by eliminating situations where the score is no longer close, or by adjusting for it. So you'll see references to "score-close" stats or "score-adjusted" Fenwick in many analyses.
The most common adjustment is to use "score-close" stats. That means we count on-ice events while the score is within one goal during the first two periods, or tied during the third period. But looking at the historical stats, it is clear that teams change their style of play when down a goal, no matter at what point in the game it happens. Here are even-strength shot attempts (Corsi For) per 60 minutes of ice time at each score situation over nine seasons:
I chose raw shot attempts because that's the chart I had handy, but you'd see a similar effect if I showed it as a percentage of shot attempts for and against.
Now, I'm not discounting the use of score-close numbers. The effect of a one-goal difference on run of play is definitely more pronounced in the third period than in the first. But at the same time, I don't think we should completely discount that being down on the scoreboard is going to impact play no matter how much time has or has not elapsed.
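For reference, the score-close rule described above can be written as a small predicate. This is only a sketch: the function name and arguments are hypothetical, and treating overtime like the third period is my assumption, not part of the definition.

```python
def is_score_close(period, goal_diff):
    """Return True if an event counts toward score-close stats.

    period    -- 1, 2, 3 (4 or more means overtime; assumed to follow
                 the third-period rule)
    goal_diff -- team goals minus opponent goals at the time of the event
    """
    if period <= 2:
        # First two periods: score within one goal either way.
        return abs(goal_diff) <= 1
    # Third period (and, by assumption, overtime): tied only.
    return goal_diff == 0


# A one-goal lead counts in the second period but not in the third.
print(is_score_close(2, 1), is_score_close(3, 1))
```

Note that a one-goal deficit in the third period is thrown out entirely under this rule, which is exactly the playing time the charts here are looking at.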
So we know that teams press more when they're behind, and tend to ease up a bit when they're ahead, but interestingly enough that doesn't mean trailing teams score more goals than teams that are ahead:
The Down 2+/Up 2+ numbers are right on the border of being statistically significant.
Even so, the impact on scoring rates is basically imperceptible.
That's because, while the leading team has fewer shot attempts, they are presumably higher-quality chances. If so, we should expect to see this reflected in shooting percentages at each score situation:
Normal distributions of team sh% over nine seasons at each score situation.
Sure enough, the shooting percentage is higher when leading even by one goal, and much higher when leading by two or more.
That chart's a little messy, so here are the shooting percentages at each score situation over the last nine seasons in bar chart format:
The jump in shooting percentage is much more apparent in this format.
It's interesting to note that the shooting percentages while trailing do go up marginally. Both are statistically significant, but only barely. I would expect that this is due to the increase in rebounds as the shot volume increases.
Conversely, if the score has an impact on shooting percentage, we can expect it will have an inverse effect on save percentage, especially at the league level:
Normal distributions of team sv% based on nine seasons at each score situation
Not going to bother with the bar chart because it will just be the direct inverse of the shooting percentage chart up above.
Putting the Pieces of PDO Back Together
So, what does this mean, exactly?
Well, we've seen that the score situation will consistently impact both shooting percentage and save percentage over long periods of playing time at even strength.
If we now put these two components of PDO back together, we get this:
So yes, while we can expect PDO to converge to 1.000 over large samples, empirically, we can say that:
- PDO will converge to 1.013 over large samples of even strength playing time while leading by a goal, and
- PDO will converge to 1.026 over large samples of even strength playing time while leading by two or more goals.
And because it's a zero-sum stat, the PDO while trailing will be symmetrical around 1.000.
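To spell out the symmetry argument: at the league level, every shot taken by a leading team is a shot faced by a trailing team in the mirrored score situation, so each side's shooting percentage is one minus the other side's save percentage. The subscripts below are my own notation, and the identity holds for league-wide totals, not necessarily for any single team.

```latex
\begin{align*}
\mathrm{Sh\%}_{\text{trail}} &= 1 - \mathrm{Sv\%}_{\text{lead}}, \qquad
\mathrm{Sv\%}_{\text{trail}} = 1 - \mathrm{Sh\%}_{\text{lead}} \\
\mathrm{PDO}_{\text{trail}} &= \mathrm{Sh\%}_{\text{trail}} + \mathrm{Sv\%}_{\text{trail}}
  = 2 - \left(\mathrm{Sh\%}_{\text{lead}} + \mathrm{Sv\%}_{\text{lead}}\right)
  = 2 - \mathrm{PDO}_{\text{lead}}
\end{align*}
```

Plugging in the leading values above gives 2 - 1.013 = 0.987 while trailing by one, and 2 - 1.026 = 0.974 while trailing by two or more.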
Now that we have long-term average PDOs in each score situation, we can use this to adjust the PDO of teams over a shorter time period to account for these score effects.
For example, if a team plays significant periods of time leading by two or more goals, their PDO should be higher than 1.000. We can account for this by taking the ratio of actual PDO to the reference PDO at each score situation and weighting it by the TOI at that score situation. The result will give us a better indication of the true impact of luck, randomness, chance, bounces, puck luck, whatever you want to call it, over short time spans.
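Here is a rough sketch of one way to read that adjustment: a TOI-weighted average of the observed-to-reference PDO ratios. The function name, the dictionary layout, and the sample team are hypothetical. The trailing reference values are not quoted in the text; they are mirrored around 1.000 from the leading values.

```python
# Long-run reference PDO by score situation. Leading and tied values are
# taken from the text; trailing values are assumed mirror images.
REFERENCE_PDO = {
    "Down 2+": 0.974,
    "Down 1": 0.987,
    "Tied": 1.000,
    "Up 1": 1.013,
    "Up 2+": 1.026,
}


def adjusted_pdo(observed_pdo, toi, reference=REFERENCE_PDO):
    """Return the TOI-weighted average of observed/reference PDO ratios.

    observed_pdo -- dict of score situation -> PDO observed in that situation
    toi          -- dict of score situation -> even-strength time on ice
    """
    total_toi = sum(toi.values())
    if total_toi <= 0:
        raise ValueError("total time on ice must be positive")
    # Situations with no ice time carry no weight and are skipped.
    return sum(
        (toi[s] / total_toi) * (observed_pdo[s] / reference[s])
        for s in toi
        if toi[s] > 0
    )


# Hypothetical early-season team that has spent a lot of time protecting leads.
team_pdo = {"Down 1": 0.990, "Tied": 1.010, "Up 1": 1.030, "Up 2+": 1.040}
team_toi = {"Down 1": 40.0, "Tied": 120.0, "Up 1": 90.0, "Up 2+": 60.0}
print(round(adjusted_pdo(team_pdo, team_toi), 3))
```

A team whose PDO matches the reference value in every score situation comes out at exactly 1.000 under this weighting, which is the behaviour we want from a luck-neutral baseline.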
Applying this method to the 2014/15 season to date gives us:
I've gone on long enough, so I'm not going to comment much on how the adjustment impacts individual teams and what the changes might mean. Suffice it to say that some of the adjustments are material.
One last note, which is less a limitation than an observation: as the season progresses, the amount of time played with the score tied will start to dominate the weighting of adjPDO. So over the course of the season, we can expect the difference between PDO and adjPDO to diminish, and with it the usefulness of the adjustment. But early in the season, or when calculating an x-game rolling average PDO, this method should prove useful.