June 12 2014 09:00AM
In 2012, Eric Tulsky published a call to arms for tracking offensive zone entries in hockey spurred by his initial findings on the subject. Zone entries have since become a focal point in the advanced statistics community, with many contributing to the endeavour by collecting data from games involving their favourite teams. Tulsky, along with Geoffrey Detweiler and Corey Sznajder, presented their work at the 2013 Sloan Sports Analytics Conference in Boston, wherein they outlined the importance of neutral zone performance and how we may use zone entries to separate play in all three zones.
It was found that entries in which the zone was gained with uninterrupted control of the puck generated a far greater number of shots and goals on average than those in which the puck was shot in. In addition, players failed to show consistency in their performances at either end, while their play in the neutral zone showed statistically significant reproducibility. I will henceforth assume prior knowledge of this topic, so I urge anybody who hasn't read up on it yet to do so.
This season I undertook a tracking project involving every blue line event at five-on-five in Ottawa Senators games. By including every instance of the puck crossing either line, I aimed to divide play into three independently observable pieces. An interesting by-product of this method is the possibility of extrapolating zone time; that is to say, the amount of time the puck spends in either offensive end. For the sake of full disclosure, I’ll explain the guidelines I abided by in tracking offensive zone entries:
- Passes over the blue line were counted as controlled entries and credited to the receiver.
- Indirect passes were counted as controlled entries if the attacking team could be considered to have possession of the puck at every moment.
- Both the puck and the entering player must remain in the zone for a minimum of two seconds following a controlled entry.
- Redirected pucks into the zone were counted as entries without control and credited to the player redirecting the puck.
- In the case of a player from the defending team redirecting the puck into their own zone, the entry was credited to the last attacking player to touch the puck.
- A minimum of one attacking player must enter the zone within two seconds of the puck and both must remain in the zone for a minimum of two seconds following an entry without control.
For those interested, I’ve shared all my data in a public Google Doc that includes raw game data and on-ice stats for all Senators players. Over the games I tracked, entries with uninterrupted possession generated 0.780 attempts at the net on average while entries without control generated 0.282. This ratio is greater than that obtained in Eric Tulsky’s work for unblocked shot attempts and may partially be due to leniency on my part in counting dump-ins. That my data only reflect one season’s worth of entries and in no way prove league-wide phenomena is a disclaimer I’ll undoubtedly repeat throughout this article.
The fact remains that gaining the zone with possession creates more offensive opportunities on average than dumping the puck in, which is an obvious yet important conclusion. Over Ottawa’s season, only 21.7% of the 5v5 play took place in the neutral zone. The standard deviation of that percentage from game to game is only 1.64%, meaning you can expect the puck to spend between 9 and 10.5 minutes in neutral ice during a game with 45 minutes of 5v5 play and be right about 70% of the time. What occurs during those crucial minutes, however, has been shown to have a powerful influence over the rest of them.
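The range quoted above follows from simple arithmetic, sketched here in Python. The variable names are illustrative only, and reading one standard deviation either side of the mean as "right about 70% of the time" assumes the game-to-game share is roughly normally distributed.

```python
# Share of 5v5 play spent in the neutral zone over the tracked season.
mean_share = 0.217     # 21.7% of 5v5 play
sd_share = 0.0164      # game-to-game standard deviation of that share
minutes_5v5 = 45.0     # example amount of 5v5 time in one game

# One standard deviation either side of the mean, converted to minutes.
low = (mean_share - sd_share) * minutes_5v5
high = (mean_share + sd_share) * minutes_5v5

print(f"{low:.2f} to {high:.2f} minutes")
```

Rounded, that is the 9 to 10.5 minute window given in the text.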
In order to investigate the extent to which they represent true talent, I performed split-half reliability tests on a number of the on-ice statistics I obtained, both for and against. For each player, the data were divided into two sets of equal size; one set limited to even game numbers and the other, odd. In principle, if these values are a measure of reproducible skill more than they are the product of randomness we should expect to observe close correlations between these arbitrarily chosen samples. I’ll reiterate that nothing can be ascertained from a single team’s season’s worth of games. The purpose of this exercise is to raise questions rather than attempt to answer them.
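For readers who want to try this on their own tracking data, here is a minimal sketch of an odd/even split-half test. It assumes each player's data is stored as a list of per-game (numerator, denominator) pairs, for example controlled entries and total entries; that layout, the function names and the sample numbers are hypothetical and are not taken from my spreadsheet.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

def split_rate(games, parity):
    """Aggregate rate over games whose 1-based number is odd (1) or even (0)."""
    picked = [g for number, g in enumerate(games, start=1) if number % 2 == parity]
    numerator = sum(n for n, _ in picked)
    denominator = sum(d for _, d in picked)
    return numerator / denominator

def split_half_r2(players):
    """R^2 between the odd-game and even-game rates across all players."""
    odd = [split_rate(games, 1) for games in players.values()]
    even = [split_rate(games, 0) for games in players.values()]
    return pearson_r(odd, even) ** 2

# Made-up example: (controlled entries, total entries) per game.
sample = {
    "A": [(3, 5), (2, 5), (4, 5), (3, 5)],
    "B": [(1, 5), (2, 5), (1, 5), (1, 5)],
    "C": [(2, 5), (3, 5), (2, 5), (2, 5)],
}
r2 = split_half_r2(sample)
```

Rates are pooled within each half rather than averaged game by game, so games with very few events don't carry undue weight.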
Much as has been observed in the past, only the metrics relating to neutral zone play showed appreciable internal consistency. Here, the acronym WZER serves as our neutral zone score. It is a projection of Corsi differential based on the entries occurring with a given player on the ice and the average shot production of each entry type. In using the average number of attempts generated per entry of either type, we’re essentially cropping out performance in either offensive zone. While Ottawa’s defencemen did show consistency in their shot attempt production following entries without control, it’s worth noting that the sample of six players who qualified for the minimum 50 GP is minuscule.
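As a rough illustration of the idea, the sketch below projects shot attempts from entry counts using the per-entry averages quoted earlier (0.780 for entries with control, 0.282 without) and expresses the result as a percentage. Applying the same two averages to entries for and against, and the function names themselves, are assumptions made for the example; the exact calculation behind the published WZER values may differ.

```python
# Average shot attempts generated per entry over the tracked games.
ATTEMPTS_PER_CONTROLLED = 0.780
ATTEMPTS_PER_UNCONTROLLED = 0.282

def projected_attempts(controlled, uncontrolled):
    """Shot attempts expected from entry counts at average production."""
    return (controlled * ATTEMPTS_PER_CONTROLLED
            + uncontrolled * ATTEMPTS_PER_UNCONTROLLED)

def wzer(ctrl_for, unctrl_for, ctrl_against, unctrl_against):
    """Projected share of shot attempts (in percent) from on-ice entries alone."""
    proj_for = projected_attempts(ctrl_for, unctrl_for)
    proj_against = projected_attempts(ctrl_against, unctrl_against)
    return 100.0 * proj_for / (proj_for + proj_against)

# Same number of entries on both sides, but more carry-ins for than against.
example = wzer(ctrl_for=120, unctrl_for=80, ctrl_against=100, unctrl_against=100)
```

With identical entry counts on both sides the score is 50; in the example, shifting twenty entries from dump-ins to carry-ins pushes it to a little over 52.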
TOA represents the time the puck spent in the offensive zone following an entry. It begins the moment the entry occurs and ends with a stoppage in play or an exit. To my knowledge, the repeatability of TOA per offensive zone entry hasn’t yet been tested. Interestingly, the Senators’ forwards produced an R² value for dump-ins great enough to at least be significant. What’s perhaps more interesting is that the ability of forwards to retrieve the puck on shoot-in plays didn’t show reliability for the Philadelphia Flyers of this past regular season. If we make the gross assumption that this holds true in Ottawa’s case, we’re left to deduce that any consistent ability to spend more or less time in the offensive end following dump-ins is tied to the ability to forecheck and disrupt exits rather than to recover the puck and initiate possession. At the very least, this trend merits further investigation.
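The TOA definition above can be sketched as a pass over a time-ordered event log for one offensive zone. The (seconds, kind) tuple format and the event names are hypothetical, chosen only for the example.

```python
def toa_per_entry(events):
    """Return zone time in seconds for each entry.

    events: time-ordered (seconds, kind) tuples for one offensive zone,
    where kind is "entry", "exit" or "stoppage".
    """
    durations = []
    entry_time = None
    for seconds, kind in events:
        if kind == "entry":
            entry_time = seconds  # the clock starts when the entry occurs
        elif kind in ("exit", "stoppage") and entry_time is not None:
            durations.append(seconds - entry_time)  # the clock stops here
            entry_time = None
    return durations

# Made-up sequence: two entries, one ended by an exit, one by a whistle.
log = [(10, "entry"), (25, "exit"), (40, "entry"), (48, "stoppage")]
times = toa_per_entry(log)
average_toa = sum(times) / len(times)
```

Exits or stoppages that arrive with no open entry are ignored, so stray events in the log don't produce negative or double-counted durations.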
Though multiple data sets now suggest performance in the offensive/defensive zones following entries is riddled with randomness and noise, the eventual productivity of those entries occurring while a given player is on the ice can significantly impact his shot differential. Let ZE CF% define the differential of real shot attempts resulting from a player’s on-ice entries.
Instances where this value surpasses that predicted by the average production of said entries indicate that player benefited from some combination of fewer or greater shots per entry for or against. In some cases involving players having appeared in a large number of games, this difference can amount to almost two points, which is quite obviously important. Over data sets spanning multiple seasons, it’s possible players would begin to display a true ability to separate themselves from the pack.
For this quantity of games, however, true talent is too heavily masked by variance to attribute this ability to outperform one’s predicted CF% to player skill. For some form of confirmation, I also took a look at in-zone shot density. By this, I mean the rate of on-ice shot attempts per ten seconds of time in zone following entries. If the differences observed in players’ average entry productions were meaningless, we’d expect to see a near-perfect correlation between shot density and control rates. In reality, this correlation is high, as expected, but there is still a great deal of wiggle room for players to over- or under-perform in the offensive ends.
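Shot density as described here is a simple rate. A minimal sketch, with a hypothetical function name and made-up numbers:

```python
def shot_density(attempts, zone_seconds):
    """On-ice shot attempts per ten seconds of zone time following entries."""
    if zone_seconds <= 0:
        raise ValueError("zone_seconds must be positive")
    return 10.0 * attempts / zone_seconds

# Example: 12 attempts over 300 seconds of post-entry zone time.
density = shot_density(12, 300)
```

In the example, 12 attempts in 300 seconds works out to 0.4 attempts per ten seconds in the zone.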
It isn’t terribly surprising that many of these metrics relate to one another in some fashion. The following correlation matrix summarizes the relationships between all these values:
Though performance in the neutral zone undoubtedly helps drive possession, there exist enough degrees of separation between what a player’s entries predict and his eventual shot differential that only a moderate correlation is observed here. Since I tabulated these R² values, Tyler Dellow of mc79hockey has been sharing open-play CF% values for the Senators among other teams. They represent the 5v5 on-ice shot differential with the 21 seconds following face-offs omitted. In free-flowing situations such as these, on-ice entries did a very good job of predicting possession, with an R² of 0.61331. The relationship between the ratio of zone time and Corsi is noteworthy, as is that of Quality of Competition and on-ice control rates on entries against. The fact that New York Islanders blogger Garik16 did not observe this last correlation in his team’s data set will serve as my final reminder that we’re still only dealing with one season involving one team.
Whether or not this trend applies to the remaining 28 NHL teams won’t become clear until enough data become available. In the Senators’ case, however, given that the players’ on-ice control rates against showed only modest internal consistency, it’s reasonable to suppose that the quality of the attacking players trumped the defenders’ ability to deter clean entries. I suspect neutral zone defence is club-specific and systems put in place to prevent opponents from gaining the zone vary from team to team. In Eric Tulsky’s work, the Flyers data suggested that only their defence showed reliability for control rates against, while the Wild set and my own showed that only the forwards displayed such consistency.
If performance in neutral ice is driven by repeatable skill, we should expect WZER to provide some truth in smaller samples of games. A metric which can’t predict its final value decently well at the halfway mark of the season doesn’t have much value. In a comparison test between CF%, WZER and EVENT%, I plotted reliability as a function of the number of games in the sample. Here, “events” simply describes the sum of zone entries and shot attempts. Our measure of reliability is ∆i defined as follows:
Where i is a given number of games out of a complete set of n, and Xi is the cumulative value of our metric at that point. In simpler terms, ∆ for a given number of games gives us the average deviation of our metric from its eventual final value the rest of the way. Intuitively, delta decreases as the data set grows and the cumulative value of the metric nears its final value. Averaged over all players having played over 50 games, ∆i for our three stats produces the following trend lines:
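One plausible reading of this description, sketched in Python: take the cumulative value of the metric after each game, and for a given i average the absolute gap between each cumulative value from game i onward and the final value after game n. The use of absolute deviations, the inclusion of games i through n in the average, and the function name are all assumptions made for the sketch; the exact form behind the plotted curves may differ.

```python
def delta(cumulative, i):
    """Average absolute deviation from the final value over games i..n.

    cumulative: cumulative metric values X_1..X_n, one per game played.
    i: 1-based game number at which the assessment is made.
    """
    if not 1 <= i <= len(cumulative):
        raise ValueError("i must be between 1 and the number of games")
    final = cumulative[-1]        # X_n, the eventual final value
    rest = cumulative[i - 1:]     # X_i through X_n
    return sum(abs(x - final) for x in rest) / len(rest)

# Made-up cumulative CF%-style values settling toward 50 over five games.
values = [60.0, 55.0, 52.0, 51.0, 50.0]
curve = [delta(values, i) for i in range(1, len(values) + 1)]
```

Under this reading the curve always ends at zero, since at game n the cumulative value is the final value by definition.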
Both WZER and EVENT% outperformed Corsi in predicting their own final values in smaller samples of games. Though the spread in Ottawa’s players’ final neutral zone ratings was larger than that of Corsi, far fewer games were needed to provide adequate assessments of their skill. At the 20-game mark of each player’s season, the correlation between their cumulative WZER and their final rating produced an R² of 0.74529, which vastly exceeds that of CF% at the same mark, 0.37643. Given what’s been shown regarding the importance of neutral zone play in driving puck possession, being able to identify skill or lack thereof in this area despite small samples is incredibly useful.
The Senators performed quite well in neutral ice, boasting a much higher rate of controlled offensive zone entries than they allowed for their opponents. Their CF% at even strength, good for 7th in the league, certainly agrees. There’s no doubt score effects accounted for a significant portion of their impressive ratings, given they played 138 more 5v5 minutes while trailing than they did leading and spent the fifth-most time of all 30 teams trailing by two goals or more.
In the final two periods of games I tracked, the leading team entered the offensive zone far less frequently and with a lower rate of control than the trailing team. Teams ahead by at least one goal gave up roughly three and a half points to teams chasing the lead with respect to percentage of controlled entries and dropped three and a third below 50% in WZER. I’d expect a far less flattering control rate on entries against had Ottawa found themselves in more close-score situations.
Because of the large gap between the Senators’ frequency of carry-ins and that of their opponents, the shot differential decisively favoured Ottawa despite comparable time spent in the offensive zone. Recall that shot density closely follows control rates. This trend stunts Corsi’s ability to approximate zone time. While shot differential has been shown to tightly correlate with possession time in the offensive zone, raw puck-in-zone time doesn’t fare nearly as well:
Note that my TOA is defined by zone time following entries and not true possession time.
The labels that adorn the three largest discrepancies show control rates for Ottawa (blue) and their opponents (red) averaged over the last five games and identify the issue with Corsi as a proxy for time spent in zone. Truthfully, Corsi (and by extension, offensive-zone possession time) is the far more meaningful metric, but it’s an important distinction to make.
In contrast, Vic Ferrari has shown that a very clear correlation exists between teams’ shot-based possession metrics and their zone time differential. In Ferrari’s work, we’re looking at shots and zone time through all game situations as opposed to strictly 5v5, which I believe accounts for quite a bit of the observed relationship. To be fair, it’s likely that a correlation between teams’ final 5v5 CF% and ZT% does in fact exist, despite the large discrepancy in Ottawa’s values. Is one showing the other, or do teams that excel at one tend to excel at the other? The puzzling lack of zone time data prevents further investigation of the subject, but given the massive gaps that can be observed in the plot above, it’s probably best to emphasize that even-strength Corsi serves as a proxy for offensive zone possession rather than plain time.
For one, the idea that shot production following on-ice entries is not driven by readily repeatable skill is becoming increasingly difficult to doubt. The time spent in the offensive zone following dump-ins, on the other hand, showed signs of reliability and warrants a closer look. Better performance between the blue lines breeds better possession, especially in open situations. Moreover, strength of neutral zone play seems to reveal itself in much smaller samples of games than overall shot differential. Players may perform significantly better or worse than what their blue-line events would project, but over multiple seasons it would be fair to expect regression. In what is a pretty telling fact, the Senators players’ ability to gain the zone with control was a larger contributor to shot density than actual offensive zone performance, piling further importance on what occurs at and between the lines.
I intend to expand my data set to include next season’s Ottawa Senators in order to analyze year-to-year repeatability. The prospect of using zone entries to help predict future shot-based metrics is also one that intrigues me. The painstaking nature of collecting blue line data outweighs my curiosity at times and serves as a reminder of how important the work people do in tracking games truly is. For those who may be unaware, Corey Sznajder of Shutdown Line is undertaking a massive project in which he intends to track entries and exits for every game of the 2013-2014 hockey season. Anybody with an interest in this stuff should help fund him, as the work he’s doing will provide the data needed to dig deeper into this topic.
Thanks for reading!