Almost famous: Lessons from a near miss

Eric T.
August 07 2012 06:19AM

The universe - it's not so static.
Photo by NASA, Public Domain

It was one of those moments that you live for in research. A shocking result popped out of the analysis. "Holy crap, this is going to change the game," I thought.

But that's not the story here. The real story is what happened next.

Let's back up. It all started for me when I read this quote from Vic Ferrari:

Nah, the obvious weakness of this model (and Gabe's for that matter) is that defenders and forwards are given equal weight. It becomes obvious very quickly that forwards are driving the outchancing-at-evens bus ... defensemen are just riding it.

This has stayed with me ever since, because it's a pretty important claim. If scoring chance totals are really driven by forwards, then we probably overrate the defensemen who play with good forwards -- and overrate defensemen in general, for that matter.

This year's work on zone entries brought it to the forefront of my mind again. Recall that we have preliminary data suggesting that shot differential (and therefore scoring chances) may be driven largely by performance in the neutral zone. I think the forwards do the majority of the work in the neutral zone, so if the neutral zone is really that important, it stands to reason that the forwards might indeed be responsible for the majority of the outcomes.

Yet if you look at the top defensemen, it is clear that everyone on the team does much better with them on the ice, which doesn't seem consistent with them being pure passengers on the outchancing bus. How much of a hand do they have on the wheel?

It occurred to me that we could get an empirical assessment by looking at defensemen who changed teams. If the forwards are really doing the bulk of the work, then when a defenseman moves to a whole new group of forwards, that should lead to a big change in his Corsi rating (the team's shot differential with him on the ice). So I pulled some Corsi data from behindthenet.ca, and here's what I found for players who played at least 500 even strength minutes both seasons:

The correlation between '10-11 relative Corsi and '11-12 relative Corsi for defensemen who stayed on the same team was 0.54; for the 40 defensemen who changed teams it was just 0.16.

In contrast, for forwards it was 0.62 if they stayed with the same team and 0.58 if they changed teams.
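The number being compared here is just Pearson's r over paired season values. For anyone who wants to replicate the comparison, a minimal sketch -- the lists below are toy stand-ins, not the actual behindthenet.ca data:

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Toy stand-ins: relative Corsi in '10-11 and '11-12 for the same players.
corsi_1011 = [5.2, -3.1, 0.4, 7.8, -6.0]
corsi_1112 = [4.1, -2.5, 1.0, 6.9, -5.2]
r = pearson_r(corsi_1011, corsi_1112)
```

Run once per group (same-team defensemen, team-changing defensemen, and likewise for forwards) and compare the resulting r values.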

Whoa.


This is where things get exciting. It's also where things get dangerous.

At this point the circumstances seem pretty compelling. I have a hypothesis that came from an analytical authority I deeply respect. I have some circumstantial data making Ferrari's claim seem plausible. And now I have some direct evidence showing a huge effect. So far, everything lines up to support this conclusion. It's exhilarating; this data could change the way we view the game.

But that emotion is part of what makes this a dangerous moment -- once a person forms a conclusion, it becomes difficult to view data on the subject objectively, especially if the issue is emotionally charged. That's why a Wild fan who dug in early in the year against the predictive power of statistics was arguing at the end of the year that not one person had used stats to predict the Wild's save percentage would fall, even though he had personally written about several stat-based articles saying their save percentage was unsustainable. The subconscious desire not to be wrong can be so powerful that when reminded of those articles, he concluded they must have been changed after the fact.

I don't want to fall into that trap. I need to be sure about this before I go any further.


So what do I do? I start looking for flaws in my approach, other ways to view the situation. I describe the data to Derek Zona, who immediately starts in with a rapid-fire list of potential issues for me to look into.

  • Are the formulas wrong?
  • What impact does quality of competition have? Are defensemen who change teams usually changing roles?
  • What impact do zone starts have?
  • Is there a change in ice time?
  • What if we use straight Corsi instead of relative? Or a zone-start-adjusted version?
  • Can we look at the list of players and see if anything obvious jumps out about who went up and who went down?

I plug through all of these possibilities, and at the end, the effect is looking as strong as ever. With all of the correction factors we could think of, the year-over-year correlation for defensemen who changed teams this year never got above 0.2.

There's one last hole to check. There were 40 defensemen in my study -- a decent sample, but not huge. Just by random variation, there's about a 1-2% chance I could have gotten a correlation as low as 0.16 for those 40 guys even if the true year-over-year relationship were as strong as what we saw for forwards. It's an outside chance, but to make an extraordinary claim, I want extraordinary proof. So let's see what we find...
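That rough probability can be sanity-checked with the Fisher z-transform, under which a sample correlation is approximately normal with standard error 1/sqrt(n-3). A sketch -- the assumed true correlation of 0.5 is my stand-in for "repeats about like the forwards did," not a number pulled from the data:

```python
from math import atanh, erf, sqrt

def p_at_most(r_obs, r_true, n):
    """One-tailed probability of a sample correlation <= r_obs from n pairs
    when the true correlation is r_true, via the Fisher z-transform."""
    z = (atanh(r_obs) - atanh(r_true)) * sqrt(n - 3)
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))  # standard normal CDF

# How often would 40 team-changing defensemen show r <= 0.16 by luck alone,
# if their true year-over-year repeatability were forward-like (~0.5)?
p = p_at_most(0.16, 0.5, 40)  # small, but not negligible
```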

For the 44 defensemen who changed teams between '09-10 and '10-11, the year-over-year correlation is 0.41. For the 45 guys who changed teams between '08-09 and '09-10, the correlation is 0.45.

Poof, just like that, the amazing result is gone. It turns out to be just an improbable statistical quirk. I sink into my chair, deflated.

And then it hits me, a flood of relief as I realize how close I came to an embarrassingly public error. Only the desire to pressure-test my findings from every possible angle saved me here. Having a critic trying to poke holes helped. And this is what I take away from that emotional ride; this is what I hope to share with you today.


Hockey is a complicated game, with a lot of things to see, track, and think about -- and a lot of ways to be wrong. If I am trying to generalize and develop my understanding of the game, I need to investigate my theories as rigorously as possible and constantly remind myself which beliefs are not yet well-tested.

This is not a simple stats-versus-eyes issue; it is a general approach to rational thought. We need to constantly challenge our beliefs, examine the evidence against (both visual and statistical), seek out the critics, and adjust our conclusions as needed. As Hawerchuk's tagline for Behind The Net used to read, "The facts have changed, so my position has changed. What do you do, sir?"

Every approach has weaknesses; everybody makes mistakes. No matter how you come to conclusions, you should be constantly updating your information and reassessing your beliefs. The more angles you use to come at a problem, the better your solution will be.

I may not have changed the game today, but the reminder to seek and embrace criticism will serve me well tomorrow.


Eric T. writes for NHL Numbers and Broad Street Hockey. His work generally focuses on analytical investigations and covers all phases of the game. You can find him on Twitter as @BSH_EricT.
#1 Kent Wilson
August 07 2012, 07:05AM

Very interesting. Incidentally it's always been my assumption that forwards drive the play as well, so it's interesting to see that's probably not true.

#3 Kent Wilson
August 07 2012, 07:34AM

@Eric T.

I probably should have said "it's making me re-think my assumptions".

#4 Anders
August 07 2012, 07:40AM

Consider this article http://nhlnumbers.com/2012/5/17/the-pronger-effect-how-much-impact-can-one-player-have

Should it come as a surprise that a great D-man can have an effect equal to, if not greater than, a great forward's?

#6 David Johnson
August 07 2012, 08:00AM

Good story and a good reminder that one result, even if it supports a hypothesis, does not provide proof. One result is simply evidence. An open mind, a critical mind, and a willingness to be proven wrong are all useful traits when exploring new ideas.

All that said, I still think Vic Ferrari is right. Forwards drive offense. Last summer I wrote an article on the subject (http://hockeyanalysis.com/2011/05/17/players-worth-by-position-revised/) which used my HARO+, HARD+ and HART+ ratings (which take into account QoC and QoT) to evaluate the importance of position in driving results. I did this by looking at the standard deviation of the ratings for each position. So, for example the standard deviation in HARO+ (offensive rating) is:

Centers: 0.171, Right Wings: 0.162, Left Wings: 0.167, Defense: 0.095

Think of the above as an indication of the variation in ability of players to drive goal scoring.

This indicates that there is less variation in offensive ratings among defensemen meaning the difference between the best defensemen and the worst defensemen at driving offense is smaller than the difference among forwards.

Now this doesn't strictly mean that forwards drive offense, it means that the variation in abilities of forwards to drive results is greater than the variation in abilities among defensemen to drive results. But, if we assume the talent distribution among forwards is the same as the talent distribution among defensemen we can probably conclude that forwards have the greater influence on offense.

On defense we see things a little differently. Here are the standard deviations in HARD+:

Centers: 0.116, Right Wings: 0.096, Left Wings: 0.099, Defense: 0.101, Goalies: 0.080

Since HARD+ is a goal-based stat I can include goalies. Interestingly, the talent level of the center seems to have the greatest impact on defense, followed by the defensemen, the wingers, and then the goalie (yet another study that diminishes the value of goalies).

I didn't do the study for fenwick or corsi ratings but if I do it for the last 3 seasons I get the following standard deviations for FenHARO+ (5v5 ZS adjusted >2000 minutes ice time):

Center: 0.077, Right Wing: 0.074, Left Wing: 0.066, Defense: 0.061

And for FenHARD+: Center: 0.068, Left Wing: 0.068, Right Wing: 0.064, Defense: 0.062

Again, variations in defender talent both offensively and defensively are smaller than any of the forward positions and the greatest variations in talent levels come at the center position.

Again, this doesn't explicitly imply that forwards drive offense (and to a lesser extent defense), but that the variation in talent among forwards has a greater impact on offense (and defense) than the variation in talent among defensemen.

For reference, here are the HARO+ and HARD+ standard deviations using the same players and time period as the FenHARO+ and FenHARD+ standard deviations above.

HARO+: Center: 0.178, Right Wing: 0.173, Left Wing: 0.151, Defense: 0.115

HARD+: Center: 0.125, Defense: 0.114, Right Wing: 0.106, Left Wing: 0.096, Goalie: 0.077
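The per-position spread described above reduces to a standard deviation over player ratings within each position. A minimal sketch -- the ratings here are invented for illustration, and population standard deviation is a guess at the exact estimator used:

```python
from statistics import pstdev  # population std dev; sample stdev would also work

# Invented per-player offensive ratings grouped by position (illustrative only).
ratings = {
    "C":  [1.12, 0.95, 1.30, 0.88, 1.05],
    "RW": [1.08, 0.97, 1.21, 0.90],
    "D":  [1.02, 0.98, 1.06, 0.95, 1.01],
}

spread = {pos: pstdev(vals) for pos, vals in ratings.items()}
# A smaller spread for "D" than for the forward positions mirrors the
# finding that defensemen vary less in their ability to drive offense.
```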

#7 David Johnson
August 07 2012, 08:16AM

One other monkey wrench I want to throw into the equation is style of play. It is my belief that style of play has a major impact on the observed results. Players who are asked to play offensive roles score more goals, but give up more goals too (generally speaking). Players who are asked to play more defensive roles give up fewer goals but score fewer goals too (generally speaking).

Now, the thing is, a lot of forwards are usually given one role or the other, and there are a lot of forward specialists. There are some forwards that almost exclusively focus on defensive play, and others whose primary job is offensive play. This specialized focus allows forwards to reach greater extremes in both offensive rating and defensive rating. I believe this sort of specialized role is much less common among defensemen. Yeah, some coaches will try to match up a key defensive defenseman against the opposition's first line, but generally speaking defensemen are more generalists and play in all situations, offensive and defensive. This mitigates their ability to have a significant impact on either offense or defense because they are trying to do both.

#8 Rhys J
August 07 2012, 08:48AM

Perhaps with the evidence you gathered and the conclusion you began to come to (that we overrate defensemen in general), it would have been a good time to re-visit the shot quality vs shot volume argument. Since hockey is as much preventing goals as it is scoring goals, a fact evidenced by goal differential being more strongly correlated to wins than both GF and GA individually, perhaps the impact of a good defenseman can be observed through contributions in the defensive end more so than at the offensive end of the ice. A couple of numbers I would be curious to see are Fenwick against/60 min relative to teammates (in order to see whether opponents generate fewer shots against that d-man), and on-ice save% relative to teammates (if that d-man is good at allowing only lower quality shots, logically it should be reflected in an increased on-ice sv%).

There's also the unpopular possibility that possession metrics are just bad at measuring and quantifying what happens in a hockey game, especially for defensemen. Doing my own research, I found the relationships between Corsi vs Wins and Corsi vs Goal Differential to both be very weak for every year post-lockout, with R-squared values never exceeding 0.43 for either test. This indicates that the majority of what goes on in a hockey game is not captured by possession, or at the very least not by Corsi. Perhaps by undertaking Corsi-based analysis, we're simply missing how defensemen contribute and run the risk of subsequently undervaluing them.

#9 Stephan Cooper
August 07 2012, 09:26AM

I wouldn't be surprised if forwards have a greater impact on the game per unit time compared to defensemen. But when calculating value you should also factor in that defenders play significantly more than forwards, by an overall factor of about 4/3 at even strength.

So if the true value per unit time at even strength is .45 for D and .58 for forwards, correcting for ice time to equalize value brings the D-men up to .60, essentially equal to the forwards.

Same with David's numbers. Add the HARO+ and HARD+ standard deviations together for each position and you get

C: .287, RW: .258, LW: .266, D: .196 (.261 after the 4/3 ice-time correction)

This is a quick and sloppy way of looking at it, but it's interesting to see the effect basically disappear when we factor in ice time: defenders are less involved in the game per unit time but equally involved per game.

Which makes sense; there is a reason why player endurance caps out around 20-22 minutes a game for forwards but 25-28 minutes a game for defenders. Defensemen probably do less per minute than forwards, which is why the lineup is set up with 3 units of them and 4 units of forwards.

#10 Derek Zona
August 07 2012, 09:38AM

David,

If you'd like to advertise, I can put you in touch with our sales department.

Thanks.

#12 David Johnson
August 07 2012, 10:07AM

@Derek Zona

A wise man once wrote "The more angles you use to come at a problem, the better your solution will be."

Feel free to dismiss my comment as an advertisement (it wasn't, but I really don't care), but know that dismissing it runs counter to everything Eric wrote in his article.

#13 Derek T.
August 07 2012, 11:26AM

This was a terrific read. I was actually thinking about something along these lines after I read the scoring chance recap at BSH yesterday where Todd wrote this:

"There is an interesting trend noticeable from the forwards and defenseman graph; there is more variation in scoring chance generation (horizontal spread) than scoring chance prevention (vertical spread). This is perhaps not surprising that players have more control of the outcome when they have the puck than when they don't, but it makes me wonder what Timonen does to be such an outlier."

With so many people slated to track zone entries and exits next season, I think we'll be able to better express defenseman contributions to possession.

#14 Derek T.
August 07 2012, 11:41AM

Additionally, the Timonen comment in the quoted passage above has me thinking what would happen if we removed the likes of Weber, Chara, Doughty, Keith and other d-men we know to be consistently awesome from the sample. Are the results of non-superstar defensemen largely driven by team factors?

#15 JP Nikota
August 07 2012, 12:02PM

Brilliant stuff, Eric. I know exactly the feeling you're talking about when you find an error in something you're about to publish. I've re-written articles many times to adhere to a completely new and different conclusion.

#16 Ben Wendorf
August 07 2012, 12:14PM

@Rhys J

Corsi-based analysis has never been sold by its proponents as the be-all for identifying possession, which is why said analysts have spent so much cyber-ink on adjustments based on zone starts, quality of competition, and quality of teammates. On the flip side, lacking the data we'd really like (possession-percentage data, a la soccer), Corsi (and really Fenwick, which is shots plus missed shots) has tested as the best indicator of possession that we have. Your .43 r-squared is pretty much right on the mark for the Corsi predictive model (if that's your r-squared, then correlation would be that number times two).

#17 Frag
August 07 2012, 12:24PM

I know that feeling, but with baseball articles. I found something I thought would be good to publish, then find errors in it after thinking it over.

Sleeping on it is what I do, as I usually find things I didn't see the day before.

#18 Ben Wendorf
August 07 2012, 12:28PM

Great stuff. I think there's a misconception that stats analysis guys think they're infallible (although some certainly seem to think that). The people doing the good stats analysis exercise the kind of rigour you're demonstrating here: defend what's defensible, and otherwise share in the uncertainty and curiosity about the rest.

Incidentally, I remember carrying out a study a bit over a year ago with results similar to David's: the variation among defencemen versus forwards was such that it sure seems defencemen can more often be "along for the ride." Intuitively, it's probably because they have more direct effect on shots-against than shots-for, while conversely forwards should have more direct effect on shots-for than shots-against. If you had to compare D effect on shots-for versus F effect on shots-against, I would probably say forwards win that battle too, because their play takes them all over the shooting areas in the defensive zone, whereas defencemen infrequently wander from the blue line anywhere lower than the faceoff dots.

#19 Kent Wilson
August 07 2012, 12:53PM

@David Johnson

I don't think Derek is dismissing your contributions, David... he's just asking you to avoid turning comments into overly long treatises on your own work.

#20 Rhys J
August 07 2012, 01:59PM

Technically, the correlation is the square root of the R-squared value. In many cases, it's significantly less than R-squared times two. But hey, semantics.

I realize that Corsi, Fenwick and other possession stats are the best we have available, but since the R-squared (the proportion of variability in the data explained by the model) is quite low for our "best available" measures, perhaps instead of offering QoC and QoT adjustments (which are themselves based on Corsi, and therefore subject to the same weaknesses) to account for hypothesized results that we aren't seeing, we're simply analyzing the wrong stuff.

The point I was trying to make is that there's just too much else going on to conduct a possession-based study and conclude that defensemen are over-rated, as Eric admitted to almost doing. However, the opposite is also true -- possession analysis does not definitively prove player x is better than player y (I have seen many smart bloggers do this all too frequently).

All in all, I agree with the prevailing message of the article here. Any testing of a hypothesis should be approached from all possible angles and perspectives. In cases of player evaluation, this does not just include Corsi-based metrics like Corsi%, Fenwick%, QoC/QoT, etc., but stuff like zone starts, scoring chances, percentage metrics, and any other tools we have at our disposal. Unfortunately, some of these stats that are desperately needed are still in their infancy and we don't have complete data for the whole league, which makes evaluation with any degree of certainty really, really hard.

I guess I would just like to see fewer "certain" conclusions drawn from Corsi-based stuff because it really isn't telling the whole story. More perspectives are always better.

#21 Ben Wendorf
August 07 2012, 02:43PM

@Rhys J

Right, the point is that nothing tells the whole story, so we either cease all attempts at analysis or forge ahead with the best tools we have... so it's not clear what argument you're trying to make there, unless it's solely an anti-Corsi thing.

Of course, if this is a pro-shot-quality thing, you've got your work cut out for you, since the scoring chance efforts of the last couple of years have suggested that Corsi/Fenwick reflect scoring chance differentials. In other words, the case for shot quality as a big, confounding factor in Corsi analysis has far less bite than the kind of analysis Eric is attempting here.

#22 JDM
August 07 2012, 02:49PM

Good read. And, wow, that Minnesota Wild stuff from HW is just... embarrassing.

#23 DLJr
August 08 2012, 06:03AM

Well, at least I can put your email in context now. Sorry the results almost had you claiming proof from an anomaly.

#25 DLJr
August 08 2012, 08:04AM

@Eric T.

Haha, I was just curious as to what you were putting together at the time; frankly I'm glad I didn't screw up anything for you.
