# Can we make predictions in hockey with Machine Learning? A simple experiment

**Josh W**

May 18 2013 11:23AM

## Introduction

You probably don't know me: my name is Josh Weissbock, and I am a Masters student at the University of Ottawa up in Canada, where I study Machine Learning. Many of the other authors on the Nations Network who use advanced stats typically build statistical models to analyze hockey, whereas I use algorithmic models. Machine Learning is an application of these same statistical ideas: it is simply the ability of a computer to take a large amount of data, learn from it, and make decisions.

Part of my research is combining Sports Analytics and Machine Learning: can we use these algorithms to learn from hockey data and make accurate decisions or predictions? This area has very little academic research; in fact, I could find no peer-reviewed published work on predicting anything in hockey at all. Basketball, American football and soccer all have plenty of machine learning research, but I found none for hockey.

This post is the first of many on applying Machine Learning to hockey and the NHL. It describes a simple experiment to see whether it is possible to predict who will win a game, using known stats on each team. The goal is to beat 50%, a coin flip. This type of prediction model is valuable both to team managers and for betting.

## Data

I've been collecting data daily on every game played over the NHL season. This experiment used games over a ten-week period from February to April 2013, a total of 386 games. Before each game I collected advanced stats and traditional stats on the teams. After the game I recorded who won, the score, and the shots for and against. This gives me the flexibility to calculate more stats in the future.

The traditional stats that were collected were: Home/Away Status, Goals For, Goals Against, Goal Differential, Power Play Success Rate, Penalty Kill Success Rate, Shot Percentage, Save Percentage, Winning Streak and Conference standing. For the advanced stats I collected the Fenwick Close %, 5-on-5 Goals For/Against ratio and the season PDO of each team. These advanced stats have been shown to correlate much more strongly with wins than traditional stats.
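As a sketch, the ten traditional and three advanced stats above can be flattened into a single per-team feature vector before each game. The field names and the example values below are hypothetical, not taken from the actual dataset:

```python
def make_feature_vector(team_stats):
    """Flatten a team's pre-game stats into an ordered feature list."""
    keys = [
        "home",               # 1 = home, 0 = away
        "goals_for", "goals_against", "goal_diff",
        "pp_pct", "pk_pct",   # power play / penalty kill success rates
        "shot_pct", "save_pct",
        "win_streak", "conf_standing",
        "fenwick_close_pct",  # advanced stats start here
        "gf_ga_5on5_ratio", "pdo",
    ]
    return [team_stats[k] for k in keys]

# Hypothetical pre-game stats for one team
example = {
    "home": 1, "goals_for": 130, "goals_against": 112, "goal_diff": 18,
    "pp_pct": 19.5, "pk_pct": 82.1, "shot_pct": 9.8, "save_pct": 91.2,
    "win_streak": 3, "conf_standing": 4,
    "fenwick_close_pct": 52.3, "gf_ga_5on5_ratio": 1.15, "pdo": 101.0,
}
vec = make_feature_vector(example)  # 13 features, ready for a learner
```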

## Classification & Results

After pre-processing the data I plugged the vectors of game data into the algorithms and let them do their learning. The algorithms I used were Neural Networks (NN), because they handle noisy data well; Support Vector Machines (SVM), because they perform well across many machine learning tasks; Naive Bayes (NB), to learn from past data to predict the future; and Decision Trees (J48), because they produce a human-readable output. I won't go into the math of how these algorithms work, but I have linked each to its Wikipedia page for further detail.

Testing of the algorithms was done using 10-fold cross-validation. The results of the learning with these algorithms are presented in the table below. I ran the algorithms on just the traditional stats, just the advanced stats and a mixture of both.
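The evaluation setup described above can be sketched in Python with scikit-learn stand-ins for the four learners. The post does not say which toolkit was used (J48 is Weka's name for C4.5, approximated here by scikit-learn's CART tree), and the data below is synthetic, so this is a sketch of the methodology rather than the original experiment:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real dataset: 386 games, 13 pre-game stats each
rng = np.random.default_rng(0)
X = rng.normal(size=(386, 13))
y = rng.integers(0, 2, size=386)  # 1 = home win, 0 = home loss

models = {
    "SVM": SVC(),
    "NB": GaussianNB(),
    "J48": DecisionTreeClassifier(),          # CART as a J48 stand-in
    "NN": MLPClassifier(max_iter=500),
}
for name, model in models.items():
    # 10-fold cross-validation, as in the experiment
    scores = cross_val_score(model, X, y, cv=10)
    print(f"{name}: mean accuracy {scores.mean():.2%}")
```

On random features like these, every learner should hover near the 50% baseline; the real feature vectors are what lift the accuracy toward 60%.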

|          | Traditional | Advanced | Mixed  |
|----------|-------------|----------|--------|
| Baseline | 50%         | 50%      | 50%    |
| SVM      | 60.10%      | 55.05%   | 60.10% |
| NB       | 59.07%      | 54.53%   | 57.25% |
| J48      | 58.03%      | 52.85%   | 58.29% |
| NN       | 58.68%      | 55.83%   | 58.03% |

*Results of the J48 Decision Tree on traditional data*

## Conclusion & Future Work

60% accuracy is pretty decent for a very simple experiment, and would probably turn a profit in the long run if used for betting. I can't confirm which algorithm is statistically best without running a t-test. What I find most interesting is that the advanced stats we see on the Nation Network are the best for forecasting long-term success, yet here they didn't improve the prediction rate at all. That doesn't mean they aren't valuable in machine learning; it most likely stems from the fact that I only used three of them. Also interesting: if you look at the generated decision tree, ML finds the most value in a team's Goals For, Power Play % and Conference standing when predicting a game winner. I don't think season-long PDO is that useful for prediction, as it regresses to 100%, but it should be much more useful for predicting in the short term.

There is much future work to build on this; first, I would start with a different variation of PDO. Machine Learning can predict more than just who will win a game: it has been used to predict the scores of games, who will win a tournament, and which prospects are likely to pan out, and there has been plenty of ML research in baseball on predicting award winners and Hall of Fame nominees.

If you are interested in using my data for your own work, let me know. If you have any questions or want to discuss, feel free to drop me a line. This experiment is a positive result for making predictions in hockey using Machine Learning, and future posts will explore it in more depth.


**Ty**, May 18 2013, 01:27PM

Hello, Josh. I think it would be best not to use this season to calculate a result. The lockout season will most likely be dramatically different from previous seasons, which will hurt the results going forward. Maybe you could try using the 2012 season to calculate a result. As an example: in the NBA lockout season last year, the teams generally played very poorly at the beginning and then got better as the season went on, but never quite reached the usual talent level.

Also, I find it interesting that the ML puts the most value on GF and PP%, as they aren't as predictive as GA and PK%. Maybe they do have more value, but they seem to fluctuate more than the defensive metrics.

For PDO, do you account for the tendency of teams with a PDO below 1000 to improve and teams above 1000 to regress?

I'm very interested in this project. I am doing something like this myself (although it is a bit more crude).

**RexLibris**, May 18 2013, 05:30PM

Has anyone emailed this link to Brian Burke or perhaps Glenn Healy? It might be enough to finally make their heads explode.

Can I be the first to dub the Machine Learning side Team Turing?

I found the results very interesting and something of a surprise, considering the depth and breadth of data available in advanced stats versus the traditional data.

The Machine Learning test is a great experiment to run, and I am very interested in hearing your results as you go along here.

One thing that stands out to me, though, is that the end result may be that the test provides more information about the advanced stats themselves than about the initial goal of the experiment, that is, teaching a computer to make predictions from those stats.

You mention that you expect to use a different variation of PDO in the future, and I use that as an example. PDO was considered a fairly telling statistic, or even a meta-statistic, as it combines two other performance numbers. Yet here you suggest that, as it is formulated today, it doesn't provide a great indicator of future performance.

I'm excited about this idea and look forward to reading more from you. Thanks!

**Josh W**, May 19 2013, 09:17AM

@RexLibris Thanks for the feedback (and from @Ty too).

I am completely for advanced stats; I see them as very valuable, and they help with my machine learning side of things. I think the difference is that we are looking at macro vs micro predictions. I am trying to predict the next game, whereas advanced stats are good at predicting the long term: who is going to make the playoffs, who is likely to win the Stanley Cup.

PDO is an example of that: we can see who is playing better or worse than their skill level, and in the long run we can see who is going to drop out of playoff contention and who should rise into it.

I have a hypothesis (and I've done some testing to back it up, but that's a future post) that PDO computed over only a small sample of the last n games will do better at predicting the next game than the PDO of the whole season.
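The rolling-window idea can be sketched as follows. The helper name, the game tuples, and the window size are all hypothetical; PDO here is shooting % plus save %, scaled so that the league average sits near 100:

```python
def rolling_pdo(games, n):
    """PDO over the last n games.

    Each game is a tuple:
    (goals_for, shots_for, goals_against, shots_against).
    """
    recent = games[-n:]
    gf = sum(g[0] for g in recent)
    sf = sum(g[1] for g in recent)
    ga = sum(g[2] for g in recent)
    sa = sum(g[3] for g in recent)
    shooting_pct = gf / sf          # team shooting percentage
    save_pct = 1 - ga / sa          # team save percentage
    return 100 * (shooting_pct + save_pct)

# Hypothetical recent results for one team
games = [(3, 30, 2, 28), (1, 25, 4, 33), (2, 31, 2, 30), (5, 34, 1, 26)]
hot_streak = rolling_pdo(games, 2)   # last 2 games only, ~105.4
season = rolling_pdo(games, 4)       # all games, closer to 100
```

The hypothesis amounts to feeding the learner `hot_streak` rather than `season`, on the premise that a short window captures form the season-long number has already averaged away.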

Something @Ty mentioned is that he is concerned this year's data is "useless" because the season is too short and the numbers are wonky. I agree with that in the long-term sense. I don't think Toronto and Anaheim would have made the playoffs over an 82-game schedule, while the Devils would have (and would have done well, IMO). But since we are looking at predicting single games, I think the data is fine for this season.

I will be collecting this data again next season, over a full year, to see if the results are any different. So expect a similar post after next year's regular season.

**scm**, May 19 2013, 11:18PM

Thanks for the article. Couple of questions:

What about injuries / game day roster? This wouldn't be simple random noise in the data.

For the advanced stats did you correct for score effects? or correct for game day roster?

**Josh W**, May 20 2013, 02:42PM

Score-adjusted Fenwick is on my 'future work' list to collect next season. The same goes for injuries, which is something I thought of far too late.

For injuries, I was going to go with either a player's cap hit, to represent how valuable they are, and/or their Corsi Rel from BehindTheNet.

I don't know how to go back and find out who was injured on which day this season; otherwise I would reconstruct the data.

**Hendy**, May 21 2013, 04:29PM

Is there a 40% failure rate on these statistics? That is to say, will you be wrong 40% of the time?

**Louis Pomerantz**, May 26 2013, 09:20AM

I have actually been betting NHL games using the premises that advanced stats follow, such as regression of PDO, but it seemed like this year, more so than the last few, PDOs were more sustainable. If you bet using the regression-to-the-mean premise, you would have ended up with a lot of bets on the Devils, Panthers, Hurricanes and Lightning, and a lot of bets against Pittsburgh, Toronto and Anaheim, and those combinations were not very successful this year. Last year was a godsend if you hopped aboard the LA Kings Express after they dumped Jack Johnson.

It would be awesome if someone smarter than I could do the same analysis JLikens did over at objectivenhl and apply it to this year (link at bottom). Obviously there is a smaller sample this year, but it would be interesting to see how EV GR through 20 or 30 games did in predicting the remaining games in comparison to Corsi T or any possession metric, ideally score-adjusted Fenwick. I would guess EV GR performed stronger, and the possession metrics weaker, than in previous seasons.

It should also be interesting to follow the Penguins, if they can keep this group of players together and healthy, and see what kind of PDO they are capable of posting as a team, especially with Vokoun in net. No team since the lockout appears more capable of sustaining off-the-charts shooting percentages than this one.

I can't manage to copy and paste the exact chart, so the link only leads to the blog. The post I am referencing is about 60% of the way down and is titled "Loose Ends - Part 1: Predicting Future Success": http://objectivenhl.blogspot.com/search?updated-max=2011-04-13T13:54:00-07:00&max-results=7&reverse-paginate=true

**Josh W**, May 27 2013, 10:43AM

@Hendy Yes, I would expect this to work ~60% of the time and fail the other 40%.

@Louis That's a cool blog; I will look at some of their posts. I agree with a lot of the ideas there, but the problem comes down to micro vs macro predictions. Luck accounts for a large part of the results in hockey, 38%, so it definitely becomes difficult to predict a single game, and that is what I am focusing on.