September 18 2012 09:23AM
Previously, we described how tracking zone entries for the Flyers this year at Broad Street Hockey led us to some startling conclusions. There was very strong evidence suggesting that puck possession on a zone entry is quite important, that carrying the puck in generates more than twice as much offense as dumping it in.
More surprisingly, the data also strongly suggested that shot differential was almost entirely determined in the neutral zone. Claude Giroux was more likely than Zac Rinaldo to push the puck forwards into the offensive end and more likely to carry the puck in, but shockingly his carry-ins did not generate any more shots than Rinaldo's. In fact, there was no evidence that any Flyer consistently did well or poorly at generating or preventing shots in the attack zones.
While this seemed to be unequivocally true for last year's Flyers, I was hesitant to generalize beyond that for such an unexpected result. So I called for assistance, encouraging readers to try tracking their favorite teams. Several people expressed interest, and I've now received my first significant packet of data -- Bob Spencer has tracked zone entries for the first 50 games of the Wild's 2011-12 season (his analysis of the data can be found here).
In this article, we will compare the 5-on-5 data from the Wild and the Flyers. We will find the following:
- As with the Flyers, the Wild data show that the ability to control the neutral zone is a persistent talent.
- As with the Flyers, no Wild players can be identified as having an ability to get more shots per offensive zone possession.
- Unlike with the Flyers, the data for the Wild suggests that there may be players who limit shots in the defensive zone.
- Overall, the Wild were much less effective in the neutral zone than the Flyers, likely because of both talent and coaching.
Checking for scorer bias
Before we get started, let's see what differences we might have between scorers. Zone entry tracking does not involve many subjective decisions; it is generally clear which player sent the puck across the blue line and whether they dumped it in or carried it in.
But there is one key decision that could cause significant bias: we exclude dump-and-change plays where no significant effort is made to recover the puck and generate offense. If Geoff excluded more dump-ins than Bob, he would have fewer zone entries per game, a higher percentage of them being with possession, and more shots coming from each dump-in (since the same number of shots would be distributed across fewer dump-ins).
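To make the direction of each effect concrete, here is a minimal sketch using made-up totals for a single game. None of these numbers come from either dataset, and the function name `tracking_summary` is only for illustration.

```python
def tracking_summary(carry_ins, dump_ins, dump_in_shots, games):
    """Compute the three scorer-bias indicators for one tracker."""
    entries = carry_ins + dump_ins
    return {
        "entries_per_game": entries / games,
        "possession_fraction": carry_ins / entries,
        "shots_per_dump_in": dump_in_shots / dump_ins,
    }

# Hypothetical totals for one game, not taken from either dataset.
lenient = tracking_summary(carry_ins=40, dump_ins=40, dump_in_shots=10, games=1)

# A stricter scorer discards 10 of those dump-ins as dump-and-change plays.
# Those plays produced no shots, so the shot total stays the same.
strict = tracking_summary(carry_ins=40, dump_ins=30, dump_in_shots=10, games=1)

for name, summary in (("lenient", lenient), ("strict", strict)):
    print(name, summary)
```

The stricter scorer ends up with fewer entries per game, a higher possession fraction, and more shots per dump-in, which are the three signals we check for.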
As it turns out, one indicator (percentage of entries with possession) pointed to Geoff excluding more dump-ins, but one (shots per dump-in) pointed to him excluding fewer. And the most direct measure -- total number of entries per game -- was almost identical, with Bob recording 152.5 per game and Geoff recording 152.6.
Moreover, the game where the Wild played the Flyers gives us a means to directly compare their results. There were some small differences: for example, Bob recorded a number of plays as carry-ins that Geoff recorded as pass-ins. But since we collapse both of those plays into a single "entry with possession" category for our analysis, that won't impact our numbers.
They recorded the same results on 94% of the zone entries, and the differences do not indicate a systematic bias; each of them recorded 50% of the 5-on-5 entries in that game as being with possession. So while the results are not identical, the differences between them are relatively minor, especially when compared to more subjective analysis such as scoring chance recording. The scorer bias when comparing numbers from the two datasets should be small.
Where do the data sets match?
So how does it look -- do we see the same trend for the Wild, with neutral zone indicators being reproducible and attack zone performance being random over the course of a season?
Like before, we can look at a split-half reliability to see which statistics represent a reproducible talent. If I give you the stats from the odd-numbered games and you can't guess roughly how the player did in the even-numbered games, then the stats in question aren't really telling us much.
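For readers who want to try this on their own tracking data, here is a rough sketch of the odd/even split. The input layout (a dictionary mapping each player to an ordered list of per-game `(successes, attempts)` pairs) and the sample numbers are assumptions made for illustration; they are not the format or the data used in this project.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def split_half_correlation(games_by_player):
    """Correlate each player's rate in odd-numbered games with even-numbered games."""
    odd_rates, even_rates = [], []
    for games in games_by_player.values():
        odd = games[0::2]   # games 1, 3, 5, ...
        even = games[1::2]  # games 2, 4, 6, ...
        odd_rates.append(sum(s for s, _ in odd) / sum(a for _, a in odd))
        even_rates.append(sum(s for s, _ in even) / sum(a for _, a in even))
    return pearson(odd_rates, even_rates)

# Hypothetical data: (entries with possession, total entries) for four games each.
sample = {
    "Player A": [(3, 4), (3, 4), (2, 4), (4, 4)],
    "Player B": [(1, 4), (2, 4), (2, 4), (1, 4)],
    "Player C": [(0, 4), (1, 4), (1, 4), (0, 4)],
}
print(round(split_half_correlation(sample), 2))
```

A correlation near 1 means the odd-game numbers predict the even-game numbers well; a correlation near 0 means the statistic is mostly noise at that sample size. The sketch assumes every player has at least one attempt in each half.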
For the Flyers, the only metrics that showed significant reproducibility at the single season level were those that relate to the neutral zone performance:
- Fraction of a player's individual entries that are with possession
- Team's entry differential with given player on ice
- Percent of team's entries that are with possession with a given player on the ice
- Percent of opponent's entries that are with possession with a given player on the ice (D only)
- Overall neutral zone score (contribution of neutral zone play to shot differential)
The results we see for the Wild are very similar, but not identical (see appendix). The same neutral zone factors all show up as significant again. And the offensive zone factors -- number of shots per entry -- all show up as being insignificant again.
This is worth emphasizing: we now have a second team for which shots per offensive zone possession appears irreproducible over 50-82 game sample sizes. It is hard to believe, but it looks increasingly likely that if some players have the ability to get more shots from each offensive zone possession, it takes more than a season to tell which players they are.
Moreover, the previous observation that zone entries with possession generate twice as much offense as entries without possession still clearly holds:
|Sample||Shots per entry with possession||Shots per entry without possession|
|Flyers entries for||0.56||0.29|
|Flyers entries against||0.59||0.29|
|Flyers games overall||0.57||0.29|
|Wild entries for||0.58||0.28|
|Wild entries against||0.60||0.30|
|Wild games overall||0.59||0.29|
Comparing the #2 offense in the league to one that was #29 in goals per game through these 50 games, we see almost identical numbers of shots per entry, with the lesser offense actually getting very slightly more shots per entry.
The number of shots a team gets seems to be driven almost entirely by how effective their players are in the neutral zone -- how often they push the puck into their offensive zone and how often they retain possession as they do so.
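The "twice as much" figure can be checked directly from the overall rows of the shots-per-entry table. This is just arithmetic on the published values; the dictionary layout and function name are only for illustration.

```python
# Overall shots per entry, copied from the "games overall" rows of the table.
shots_per_entry = {
    "Flyers": {"with_possession": 0.57, "without_possession": 0.29},
    "Wild": {"with_possession": 0.59, "without_possession": 0.29},
}

def possession_advantage(rates):
    """Ratio of shots per entry with possession to shots per entry without."""
    return rates["with_possession"] / rates["without_possession"]

for team, rates in shots_per_entry.items():
    print(f"{team}: {possession_advantage(rates):.2f}x")
```

Both ratios land within a few percent of 2, for teams at opposite ends of the scoring table.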
Defensive zone performance
The two data sets conflict to some degree on whether players possess a defensive zone skill for suppressing shots against. For the Flyers, the defensive zone metrics all showed up as irreproducible. In contrast, for the Wild, the defensive zone metrics show fairly strong reproducibility.
So what does this mean? Undoubtedly, it means that the picture for defensive zone performance remains muddled. However, there is a reason to mistrust the apparent reproducibility of the Wild defensive zone data.
When we say that those metrics appear to be reproducible, we mean that the odd/even correlations are large enough that there is less than a 5% chance of getting correlations that strong by random chance. However, we have calculated a lot of correlations in this study -- enough that we should expect a couple of one-in-twenty outliers to show up. Since we have conflicting evidence from the Flyers' data, let's look a little closer at what we see for the Wild.
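As a rough illustration of the multiple-comparisons problem, the sketch below counts how many correlations we would expect to clear a 5% threshold if every metric were pure noise. The count of 78 comes from the appendix table (13 metrics, six columns). Treating the tests as independent is a simplifying assumption that does not really hold here, since many of the metrics overlap.

```python
def expected_false_positives(num_tests, alpha=0.05):
    """Expected number of noise-only tests that still pass the threshold."""
    return num_tests * alpha

def chance_of_at_least_one(num_tests, alpha=0.05):
    """Probability that at least one noise-only test passes (assumes independence)."""
    return 1 - (1 - alpha) ** num_tests

num_correlations = 13 * 6  # rows x columns in the appendix table
print(expected_false_positives(num_correlations))
print(round(chance_of_at_least_one(num_correlations), 3))
```

So even if none of these metrics reflected a real skill, a handful of apparently significant correlations would be unsurprising.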
Here are the defensive zone metrics for the Wild defensemen who got significant playing time in the first 50 games, listed in order of their average ice time (from most-used to least):
|Name||TOI per game||Shots against per entry w/possession||Shots against per entry w/out possession|
While I generally put quite a bit of faith in statistics, this seems a little bit fishy. If we take the defensive zone metrics at face value, then the conclusion would be that the Wild consistently gave the most ice time to the defensemen who played the worst defense.
We can't attribute this to facing top opposition, since we showed in the previous section that better players don't appear to produce more per offensive zone possession. If we can't find players who get more shots from each possession, then we can't argue that Scandella faced those players.
I am skeptical that the Wild would have used their worst defensive zone performers as their top pairing. So while we definitely need more data to draw firm conclusions (join the project! Email me at bsh.erict -at- gmail or catch me on Twitter), the combination of seeing irreproducibility for the Flyers and seeing unintuitive results for the Wild leaves me suspecting that defensive zone performance is not reproducible at the season level.
Comparing the teams
Finally, it's interesting to look at how the two teams' performances compared. We saw earlier that their offensive and defensive zone performances were quite similar, but how do they compare in the neutral zone?
|5v5 entries per game||For||Against||Entry fraction|
|Flyers score close||37.5||34.6||52.0%|
|Wild score close||37.8||42.2||47.2%|
The Flyers got a much larger share of the entries, which means they were able to win the neutral zone and advance the puck more often than the Wild. Score effects play a role in zone entry totals, as a team with the lead will tend to dump the puck and go for a change rather than chase the puck to generate offense; since we exclude the dump-and-change plays, this makes the leading team's entry fraction go down. However, in situations where score effects are minimal (within one goal in the first two periods or tied in the third), we see the Flyers won the neutral zone much more often.
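The entry fraction in the table appears to be the team's entries divided by all entries in the game. That formula is inferred from the published numbers rather than stated anywhere, so treat this as a sketch:

```python
def entry_fraction(entries_for, entries_against):
    """Share of all 5v5 zone entries in a game that belong to the team."""
    return entries_for / (entries_for + entries_against)

# Score-close entries per game (for, against), copied from the table.
score_close = {"Flyers": (37.5, 34.6), "Wild": (37.8, 42.2)}

for team, (entries_for, entries_against) in score_close.items():
    print(f"{team}: {entry_fraction(entries_for, entries_against):.3f}")
```

Both results agree with the published 52.0% and 47.2% to within rounding of the per-game inputs.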
Another aspect of neutral zone performance is how often the team was able to retain possession on their zone entries and deny their opponents possession.
|Entries with possession||For/game||For/total||Against/game||Against/total|
|Flyers score close||19.7||52.4%||17.6||50.7%|
|Wild score close||15.7||41.4%||20.5||48.4%|
On offense, the Flyers gained the zone with possession far more often than the Wild did. This is a major driving force behind their superior offense -- the two teams had a similar number of entries for per game, but the Wild were forced to dump and chase on a much larger fraction of those entries.
The Flyers were also more effective defensively in the neutral zone, giving up fewer entries with possession per game. Because they also gave up fewer dump-ins, the fraction of opposing entries that came with possession was actually a bit higher (50.7% versus 48.4%), but that was a slightly higher percentage of a much lower total. Overall, facing far fewer defensive zone possessions means facing fewer shots against.
We can also look at individual players' contributions to the neutral zone outcomes. Here is a plot showing how involved each forward was in the neutral zone (how often was he the one sending the puck into the offensive zone) and how successful he was (how often did he avoid having to dump the puck in):
The Flyers were clearly much better at gaining the zone with possession than the Wild were. The Flyers had four players who were better at it than anyone on the Wild, and the Wild had six players who dumped it in more often than any Flyers forward, so the Flyers may simply have been more talented with the puck.
In addition, the Flyers' offense was much more efficiently constructed. By and large, the players asked to bring the puck through the neutral zone were the ones who were most effective at it (with the one notable exception of Wayne Simmonds #17 having more entries than his linemate Daniel Briere #48). In contrast, the Wild show absolutely no correlation between involvement and results.
The Flyers were destined to retain possession on more of their entries, since they had more skilled puck-handlers. However, the disparity was widened by the Wild's failure to get the puck on the stick of the people who were most effective with it; they did not get as much out of their players as they could have.
We now have a second set of data that suggests that shot differential is largely determined in the neutral zone, as teams strive to push the puck forwards and retain possession as they cross the blue line. Entries with possession produce twice as much offense as entries without possession, and some individuals have a clear talent for gaining the zone.
The data on defensive zone performance remains murky, but neither team's results match what we might have expected going in, namely that the best defenders would limit how many shots the opponents get after they enter the zone.
In the offensive zone, what was previously observed for the Flyers also appears to be true for the Wild. At the 50-82 game level, players do not show a clear offensive zone talent for generating shots; shot generation comes largely from how (and how often) they enter the zone, with offensive zone talent then limited to shot quality effects. We may still find otherwise with more teams or with more years to evaluate each player, but what was previously a conclusion held tenuously pending further data now seems reasonably likely to prove true.
Previously by Eric T.
- Zone entries: Introduction to a unique tracking project
- How important is neutral zone play?
- More on the advantages of puck possession over dump and chase
- A competition metric based on ice time
- Time on ice competition plots for all 30 teams
- 2012-13 Philadelphia Flyers preview: No longer with Pronger
- 2012-13 Boston Bruins preview: Lots of promise without Thomas
Appendix
We set playing time thresholds that gave us the top 20 skaters for each team and calculated the correlation between odd and even game results for a number of key statistics. With so few players, correlations of at least 0.5 (forwards, or overall) or 0.6 (defensemen) are required for us to be 95% certain that our correlation is not simply the product of random chance.
|Correlations between odd and even games||Flyers correlation||Wild correlation|
|Fraction of a player's entries that are with possession||0.8||0.9||0.9||0.8||0.4||0.9|
|Team's shots per time player carries puck into zone||<0||0.2||0||<0||0.3||0|
|Team's shots per time player dumps puck into zone||<0||0||<0||0||0.2||0.1|
|Team's entry differential with given player on ice||0.4||0.9||0.5||0.6||0.8||0.6|
|% of team's entries that are with possession w/player on ice||0.9||0.3||0.9||0.8||0.2||0.7|
|Team's shots per entry w/ possession w/ given player on ice||0||0||<0||<0||<0||<0|
|Team's shots per dump-in with given player on ice||<0||0.5||0.1||0.3||<0||0.1|
|% of opponent's entries that are with poss. w/ player on ice||0||0.7||0.2||0.4||0||0.3|
|Opp's shots per entry w/ possession w/ given player on ice||0.2||0.4||<0||0.3||0.4||0.4|
|Opp's shots per dump-in with given player on ice||0.4||<0||<0||0.6||0.7||0.5|
|Overall offensive zone score (contribution to shot differential)||<0||0.2||<0||<0||0||<0|
|Overall defensive zone score (contribution to shot differential)||0.4||<0||0.2||0.4||0.6||0.4|
|Overall neutral zone score (contribution to shot differential)||0.5||0.7||0.5||0.6||0.5||0.6|
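To get a feel for why the significance threshold rises as the group shrinks, here is a small simulation sketch. It draws pairs of unrelated random numbers for a given number of players and reports the correlation size that pure noise exceeds only 5% of the time. The group size of 8, the normally distributed noise, and the two-sided cutoff are all assumptions for illustration; this is not the calculation used to set the thresholds quoted in this appendix.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def critical_r(num_players, trials=5000, level=0.95, seed=0):
    """Estimate the |r| that unrelated noise stays below `level` of the time."""
    rng = random.Random(seed)  # fixed seed so the estimate is repeatable
    abs_rs = []
    for _ in range(trials):
        xs = [rng.gauss(0, 1) for _ in range(num_players)]
        ys = [rng.gauss(0, 1) for _ in range(num_players)]
        abs_rs.append(abs(pearson(xs, ys)))
    abs_rs.sort()
    return abs_rs[int(level * trials)]

# Hypothetical group sizes: a small group of defensemen and all 20 skaters.
for group_size in (8, 20):
    print(group_size, round(critical_r(group_size), 2))
```

The smaller the group, the larger the correlation that noise alone can produce, which is why the bar for defensemen sits above the bar for forwards.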