Goal-based Metrics Better Than Shot-based Metrics at Predicting Hockey Success

Author: Rob Found
9432-152 Street
Edmonton, AB, Canada
T5R 1N2
(780) 479-7919

Corresponding author: found@ualberta.ca

ABSTRACT
The growing business of professional sports has led to an increasing demand for effective metrics quantifying factors leading to team success, and evaluating individual player contributions to that success. In the sport of hockey the advancement of analytics has led to a decline in the use of goal-based metrics, and an increased reliance on shot-based metrics. I tested assumptions behind this trend by using statistical modeling of 10 years of NHL data to directly compare the effectiveness of goal-based versus shot-based metrics at predicting team success, and comparative hypothesis testing to determine how well goals and shots quantify player contributions to team success. Goal-based models consistently outperformed their shot-based analogs. Models of team goal differential successfully predicted winning % during the 2015-16 season, while shot differential did not. Goal-based metrics (i.e. relative plus-minus/minute of ice time) were also better than shot-based metrics (i.e. relative Corsi/minute of ice time) for evaluating individual player contributions to team winning %. These results show that team and individual performance is not correlated with all shots, but only those shots effective enough to result in goals. These results will lead to more effective evaluation of individual players, and better understanding and prediction of those factors leading to team success.

Keywords: performance, goals, shots, evaluation, Corsi, plus-minus

INTRODUCTION
There is a growing need to improve methods for evaluating team and individual performance in elite level team sports (Berri & Schmidt, 2010; Moskowitz & Wertheim, 2012). This has led to a recent surge in the use of analytics for predicting individual and team success in the sport of hockey, particularly in the National Hockey League (e.g. Bedford & Baglin, 2009; Chan et al., 2012; Schuckers & Curro, 2013). Along with this growing popularity is an increasing use of shot-based metrics, such as Corsi (net-shots directed at goal) and Fenwick (Corsi with blocked-shots excluded) values for both teams and individuals, and a subsequent decline or even abandonment of goal-based metrics, such as individual plus/minus (e.g. Gramacy et al., 2013). Driving this trend may be a number of known or assumed advantages of shot-based metrics, which include the following.

1. Hockey is low scoring compared to other sports using scoring-based performance metrics, such as basketball (Ilardi & Barzilai, 2008). In the NHL there are approximately 10 shots taken for every goal scored. This greater number of shot “events” allows for greater contextual in-game diversity in measurements of both shots taken and shots allowed, compared to goals (i.e. more data allows more parsing). Shot-based metrics can thus be applied to players who may be only infrequently involved in events resulting in actual goals, but are much more likely to be involved in shot-based events. Generally speaking, increases in sample sizes also result in greater statistical power.

2. There is a prevailing belief that shooting and save percentages regress to a mean over time (e.g. Anderson, 2011; Yost, 2013), and thus that goals scored or allowed are a direct function of shot volume for or against. Such presumptions seem to preclude the potential for players or teams to improve their ability to shoot successfully, and generally ascribe long-term variation in team shooting (and save) percentage purely to randomness (i.e. luck). This assumption is ultimately undermined by the obvious fact that each shot does not actually have an equal chance of resulting in a goal. Using only those high quality shots that can be described as “scoring chances” could go far to meet this assumption, but at the loss of much of the sample-size advantages referred to previously.

3. Shots are assumed to be proxies for zone possession time, which itself is considered a proxy for team success (Thomas, 2006). The combined assumption is that teams with greater offensive possession time both take more shots and have a higher likelihood of winning the game.

4. Since it is assumed that teams that shoot more win more, shot-based metrics are subsequently believed to give a more accurate picture of the contributions an individual player makes to team success, compared to similar goal-based metrics such as plus/minus. Assumptions 1-4 lead to the inevitable conclusion that if team shot-differential predicts winning, players who make the largest positive contributions to that shot differential are therefore contributing the most to team success.

Along with these perceived and assumed advantages of shot-based metrics, goal-based metrics, including individual plus/minus, seem to have largely fallen out of favour for the following reasons (e.g. Macdonald, 2011; Gramacy et al., 2013; Ilardi & Barzilai, 2008):

1. Team quality has an inordinate influence on individual plus/minus, in both directions (Schuckers et al., 2011). For example, teams with high plus/minus are invariably overrepresented amongst the top rankings of individual plus/minus leaders. It thus can appear that one team is successful because it has several of the top 2-way players (i.e. those that excel at both offensive and defensive play) in the league, when instead it may be that those individuals are benefiting from the superior overall play of their team. Conversely, excellent 2-way players can have low plus/minus values if they play on poor teams. For example, 3-time Selke Trophy winner (best defensive forward) Guy Carbonneau had 4 seasons with a negative plus-minus, one of which occurred after he had already won the Selke twice, and the season before he won it a third time. Since it is unlikely that a strong two-way player would lose this ability, then subsequently regain it, such examples suggest plus-minus does not adequately quantify what it is intended to (Bedford & Baglin, 2009).

2. Because shot-based metrics are considered a better proxy for predicting long-term team success than goal-based metrics (see above), individual plus/minus measures derived from goals for and against are also assumed to be inferior measures of individual contributions to that long-term team success.

However, and obviously, winning or losing a hockey game directly depends only on how many goals each team scores, not how many shots each takes. The vast majority (90 to 95%) of shots do not actually contribute to team success at all, while every single goal directly impacts the game result. In essence, while using shots may increase the quantity of data compared to data based on goals, the quality of the data decreases.

The overarching goal of this paper was to compare shot-based and goal-based metrics directly, using multiple measures of team success, and player contributions to team wins. My first objective was to use model ranking and hypothesis testing of past NHL team statistics to determine which metrics actually are the best predictors of team winning %. Models of past contributions to winning % can be used to predict actual winning %, as has been done, for instance, using adjusted plus-minus in basketball (e.g. Walker, 2014). I used the top goal-based and shot-based models to determine which was best at predicting actual winning % during the 2015-16 season. I predicted that goal-based metrics would consistently outperform shot-based metrics at explaining past success, and thus also be superior for predicting future success, because all goals contribute directly to the outcome of games, while the vast majority of shots do not contribute to the outcome at all. My second objective was to create derived statistics for both Relative +/- and Relative Corsi that indicate how well player usage (i.e. ice time allotment) based on either goal-based or shot-based metrics correlated with team winning %. I hypothesized that while giving the most ice time to players with high Relative +/- or high Relative Corsi would lead to greater team success, Relative +/- would be the more directly correlated with team winning %.

METHODS

Objective 1: What are the best metrics for predicting team success?
I used data sourced from the NHL (NHL.com) and additional advanced analytics based on these data that have been calculated and collated by Behind The Net (www.behindthenet.ca), for the seasons 2005-06 through to 2014-15. My response variable for team success, “winning %”, was the percentage of standings points gained out of those available. I excluded shootout results, as shootout success or failure is not relevant to analyses of in-game team success. I created generalized linear models (GLM) using 14 different season statistics that potentially predicted team success (see Results for the list of statistics used), and ranked them using Akaike's Information Criterion (AIC). I then applied the top goal-based and shot-based models to NHL data from the 2015-16 season, which had been withheld from model building. I used linear regression to compare predicted winning % with actual winning % in 2015-16, for both the top goal-based and shot-based models.
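The model-ranking step can be illustrated with a minimal sketch. It assumes ordinary least-squares fits with a single predictor each and the Gaussian form of AIC; the function names and the six toy team-seasons are hypothetical and are not NHL data or the software used for the analysis.

```python
import math

def fit_simple_ols(x, y):
    """Least-squares fit of y = a + b*x; returns (intercept, slope, rss)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, rss

def gaussian_aic(rss, n, n_params):
    """AIC for a Gaussian linear model, n*ln(RSS/n) + 2k.

    Constants shared by all models fit to the same response are dropped,
    so only differences between models are meaningful.
    """
    return n * math.log(rss / n) + 2 * n_params

def rank_models(predictors, response):
    """Return predictor names ordered from lowest (best) to highest AIC."""
    n = len(response)
    scored = []
    for name, values in predictors.items():
        _, _, rss = fit_simple_ols(values, response)
        # k = 3 parameters: intercept, slope, residual variance
        scored.append((gaussian_aic(rss, n, 3), name))
    return [name for _, name in sorted(scored)]

# Hypothetical toy data (assumption, for illustration only)
win_pct = [0.62, 0.55, 0.48, 0.41, 0.58, 0.45]
predictors = {
    "goal_diff_per_game": [0.60, 0.20, -0.10, -0.50, 0.35, -0.25],
    "shot_diff_per_game": [1.5, -2.0, 3.0, -1.0, 0.5, 2.0],
}
ranking = rank_models(predictors, win_pct)
print(ranking)
```

In this toy data set the goal-differential model leaves far less residual variation than the shot-differential model, so it receives the lower AIC and ranks first.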

I also examined PDO %, which is the sum of team shooting % (goals/shots) and team save % (saves/shots against). Because one team’s shooting % is necessarily the complement of its opponent’s save % (i.e. a save by one team can also be considered a failed shot by the other team), PDO % is close to 100 overall, for the league as a whole, in any season. This has been interpreted by some as evidence that the relative success or failure of a shot to result in a goal is highly luck based, and that luck is equal for all teams, over time (Anderson, 2011; Yost, 2013). If shooting and save % are in fact luck based, PDO % should average about 100 in the long term, for every single team. I examined this by calculating the mean PDO % over 10 years for the league, and then conducted an ANOVA on the mean PDO % for each individual team.
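The distinction between a league-wide mean of 100 and a mean of 100 for every team can be shown with a short worked example. It is a sketch only: the function names are hypothetical, and the two-team "league" with invented totals is an assumption chosen to make the arithmetic transparent.

```python
def shooting_pct(goals_for, shots_for):
    """Percentage of a team's shots that became goals."""
    return 100.0 * goals_for / shots_for

def save_pct(goals_against, shots_against):
    """Percentage of opposing shots that were stopped."""
    return 100.0 * (shots_against - goals_against) / shots_against

def pdo(goals_for, shots_for, goals_against, shots_against):
    """PDO % as the sum of shooting % and save %."""
    return shooting_pct(goals_for, shots_for) + save_pct(goals_against, shots_against)

# Hypothetical league in which teams A and B only play each other,
# so every shot by A is a shot against B and vice versa.
pdo_a = pdo(goals_for=250, shots_for=2500, goals_against=200, shots_against=2400)
pdo_b = pdo(goals_for=200, shots_for=2400, goals_against=250, shots_against=2500)
league_mean = (pdo_a + pdo_b) / 2

print(round(pdo_a, 2), round(pdo_b, 2), round(league_mean, 2))
```

Here the league mean is 100 by construction, yet team A sits above 100 and team B below it. A league-wide mean near 100 therefore says nothing, on its own, about whether individual teams converge on 100.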

As a final exploration of the effectiveness of shots at predicting team success, I used a t-test to compare winning % when teams outshot their opponents, versus when teams were outshot. To support interpretation of the previous comparison of shot versus goal-based metrics I conducted linear regressions on data from the 1983-84 to 2014-15 seasons, to quantify long-term trends in goal scoring, shots, and save percentage.
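For readers unfamiliar with the two-sample comparison, the following is a minimal sketch of a pooled-variance (Student) t statistic. Whether the original analysis pooled variances is not stated, so that choice is an assumption, as are the function name and the toy winning proportions.

```python
import math
from statistics import mean, variance

def pooled_t(sample_a, sample_b):
    """Two-sample Student t statistic with pooled variance.

    Returns (t, degrees_of_freedom), where df = n_a + n_b - 2.
    """
    na, nb = len(sample_a), len(sample_b)
    va, vb = variance(sample_a), variance(sample_b)  # sample variances (n - 1)
    df = na + nb - 2
    pooled_var = ((na - 1) * va + (nb - 1) * vb) / df
    se = math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return (mean(sample_a) - mean(sample_b)) / se, df

# Hypothetical winning proportions (assumption, not NHL data)
when_outshooting = [0.52, 0.50, 0.49, 0.51]
when_outshot = [0.50, 0.48, 0.51, 0.49]
t_stat, df = pooled_t(when_outshooting, when_outshot)
print(round(t_stat, 3), df)
```

A small t statistic relative to its degrees of freedom, as in this toy case, indicates that the difference in means is small compared with the spread within each group.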

Objective 2: Are individual contributions to team success best measured by goal-based or shot-based metrics?
If shot-based metrics are better than goal-based metrics at predicting team success, they must also be better for evaluating individual players. To determine this I used “Relative +/-” as my representative goal-based metric, and “Relative Corsi” as my representative shot-based metric, for ultimately measuring individual contributions to team success. Standard +/- is simply the number of goals scored minus the number of goals allowed by a player’s team when that player is on the ice, excluding powerplay goals. Relative +/- is the difference between team +/- when a player is on the ice compared to off (behindthenet.ca). Corsi is calculated similarly to +/-, except it is the total shots directed at net minus the total shots directed at one’s own net, when a particular player is on the ice. Relative Corsi, then, is the difference between team Corsi when a player is on the ice compared to when off the ice. Both of these measures are directly influenced by the quality of competition (QOC) an individual faces when on the ice, but because QOC is itself calculated using either the Relative +/- or Relative Corsi of one’s competition, I expected QOC to have a similar influence on both goal and shot-based metrics, and so excluded it from analysis. Defensive zone starts also cost a player 0.25 shots each (Ryder, 2004), but I assumed defensive zone starts would have a similar influence on goals, so also excluded zone starts from analysis.
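The on-ice versus off-ice comparison behind both relative metrics can be sketched as follows. The per-60-minute normalization is an assumption introduced here so that on-ice and off-ice samples of different length are comparable; the function names and the player totals are hypothetical.

```python
def net_rate_per_60(events_for, events_against, minutes):
    """Net events (for minus against) per 60 minutes of play."""
    return 60.0 * (events_for - events_against) / minutes

def relative_metric(on_for, on_against, on_minutes,
                    off_for, off_against, off_minutes):
    """Team net rate with the player on the ice minus the rate with him off."""
    on_rate = net_rate_per_60(on_for, on_against, on_minutes)
    off_rate = net_rate_per_60(off_for, off_against, off_minutes)
    return on_rate - off_rate

# Hypothetical season totals for one player (assumption)
# Goals give Relative +/-; shot attempts give Relative Corsi.
relative_plus_minus = relative_metric(40, 30, 1000.0, 120, 130, 3000.0)
relative_corsi = relative_metric(900, 850, 1000.0, 2500, 2600, 3000.0)

print(round(relative_plus_minus, 2), round(relative_corsi, 2))
```

The same function serves both metrics; only the event being counted changes, which is what makes the two measures directly comparable.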

I compared Relative +/- to Relative Corsi using a basic assumption about player performance and team success. As in all elite-level sports, coaches give the most playing time (i.e. time-on-ice per game) to those players the coaches feel are most likely to help the team win. In hockey this results in top forwards averaging 17-20 minutes per game, and top defensemen up to 25 minutes/game (nhl.com), while lower ranked players typically see less than 10 minutes per game. Both shot and goal-based metrics might be used to optimize player usage for maximum team success. To compare the two approaches I divided each individual’s Relative +/- by their average ice time, to derive a value for the net contribution of “goals per minute” for each player. I made a similar calculation using Relative Corsi to derive a value for the net contribution of “shots per minute”. Because defensemen typically spend more time on ice than forwards, but also influence net shot and net goal-based metrics differently (Staples, 2015), I grouped defensemen and forwards separately.

I pooled all individual results to create values for each team that measured how well they optimized player usage (whether intentionally or not) using Relative +/- or Relative Corsi. I then used linear regression with each optimization value as the independent variable and team winning % as the dependent variable, to determine whether player usage based on Relative +/- or Relative Corsi is the better predictor of future team success.
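The per-minute conversion and position grouping can be sketched as below. The exact pooling rule used to build each team value is not specified in the text, so the simple mean used here is a placeholder assumption; the field names and the three-player roster are also hypothetical.

```python
from collections import defaultdict
from statistics import mean

def per_minute(relative_value, avg_toi_minutes):
    """Relative metric divided by average time-on-ice per game."""
    return relative_value / avg_toi_minutes

def position_means(players, metric_key):
    """Mean per-minute value of one metric, separately for each position.

    Pooling by simple mean is an assumption made for this sketch.
    """
    grouped = defaultdict(list)
    for player in players:
        value = per_minute(player[metric_key], player["toi"])
        grouped[player["pos"]].append(value)
    return {pos: mean(values) for pos, values in grouped.items()}

# Hypothetical roster (assumption): F = forward, D = defenseman
players = [
    {"pos": "F", "rel_pm": 0.9, "rel_corsi": 6.0, "toi": 18.0},
    {"pos": "F", "rel_pm": -0.3, "rel_corsi": 2.0, "toi": 10.0},
    {"pos": "D", "rel_pm": 0.5, "rel_corsi": -5.0, "toi": 25.0},
]
goals_per_min = position_means(players, "rel_pm")
shots_per_min = position_means(players, "rel_corsi")
print(goals_per_min, shots_per_min)
```

Team-level values of this kind, one per team and position group, would then serve as the independent variable in a regression against team winning %.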

RESULTS

Objective 1: What are the best single metrics for predicting team success, and are these goal-based or shot-based?
All 14 linear models were statistically significant predictors of team success, based on winning % (Table 1). The top model ranked by AIC was goal differential in all situations, followed by goal differential at 5 on 5. The top two models were composite goal-based models derived from two other statistics – goals allowed, and goals scored – which each represented the top 2 models using non-derived statistics. Each of the 4 goal-based models ranked higher than their analogous shot-based models. Defense based models (goals allowed, shots allowed) outperformed offense based models (goals scored, shots taken). The best goal-based model used team goal differential [winning % = 0.5517 + (0.1811*goal differential)], while the comparative shot-based model using shot differential [winning % = 0.5517 + (0.01527*shot differential)] was only the 6th ranked model (Figure 1).
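The two fitted equations can be applied directly, as in the sketch below. The coefficients are those reported above; the units of the differentials are not stated in the text, so treating them as per-game values is an assumption, as are the function names and the example inputs.

```python
def predict_win_pct_from_goals(goal_diff):
    """Reported goal-based model: 0.5517 + 0.1811 * goal differential."""
    return 0.5517 + 0.1811 * goal_diff

def predict_win_pct_from_shots(shot_diff):
    """Reported shot-based model: 0.5517 + 0.01527 * shot differential."""
    return 0.5517 + 0.01527 * shot_diff

# Hypothetical team (assumption): +0.5 goals and +5 shots per game
goal_prediction = predict_win_pct_from_goals(0.5)
shot_prediction = predict_win_pct_from_shots(5.0)
print(round(goal_prediction, 4), round(shot_prediction, 4))
```

Predictions produced this way for a withheld season can then be regressed against the observed winning % to check how well each model generalizes.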

Table 1

When these two models were applied to data from 2015-16, winning % predicted by the goal-based model was correlated with actual 2015-16 winning % (F1,28 = 12.29, P < 0.005), but winning % predicted by the shot-based model was not significantly correlated with actual 2015-16 winning % (F1,28 = 3.37, P = 0.077; Figure 2).

I found that over a 10 year span, winning % was significantly different between the 30 teams (ANOVA; F31 = 3.57, P < 0.0001), but the number of shots taken (ANOVA; F31 = 0.64, P = 0.93) and shots allowed (F31 = 0.87, P = 0.67) were not. In other words, both winning and losing teams take a similar number of shots over time, showing the lack of a relationship between winning and shot-taking. On a game-by-game basis there was no difference in the mean winning percentage of teams outshooting their opponents (0.504 ± 0.007%) compared to those teams that were outshot (0.495 ± 0.007%; t538 = 0.948, P = 0.34). I indirectly tested the assumption that PDO regresses to a common mean of about 100 and found that over that same 10 year span both save % (ANOVA; F31 = 2.58, P < 0.0001) and shooting % (ANOVA; F31 = 2.30, P < 0.001) were significantly different between teams. Based on linear regressions, since 1983-84 goal scoring has been declining (F30 = 73.22, R2 = 0.716, P < 0.0001) while shot taking has remained relatively stable (F30 = 3.76, R2 = 0.115, P = 0.062). As a consequence, linear regression also showed there has been a significant pattern of increasing save % (F30 = 194.18, R2 = 0.870, P < 0.0001; Figure 3).

Objective 2: Are individual contributions to team success best measured by goal or shot-based metrics?
For forwards, relative goals/minute was correlated with team success (z28 = 2.06, P = 0.040) but relative shots/minute was not (z28 = -0.53, P = 0.59). For defensemen neither relative goals/minute (z28 = 0.19, P = 0.852) nor relative shots/minute (z28 = 1.29, P = 0.20) was correlated with team success. Forwards on teams in the top 15 in standings in 2013-14 had mean goals/minute (0.0032 ± 0.000068) that were 285% higher than those of players on teams in the bottom 15 (0.000830 ± 0.000046; t28 = 2.78, P < 0.01; Figure 4). Mean shots/minute were 65% higher for forwards on top 15 teams (0.0038 ± 0.00069) compared to players on bottom 15 teams (0.0023 ± 0.00053) but this difference was not significant (t28 = 1.72, P = 0.097). For defensemen, goals/minute on top 15 teams was 8% lower than on bottom 15 teams, a difference that was not significant (t28 = 0.0271, P = 0.98). For defensemen, shots/minute were 660% higher for players on teams in the bottom 15 of the league, though because of extreme variance this difference was also not statistically significant (t28 = 0.86, P = 0.40; Figure 4).

DISCUSSION
Despite the recent trend towards shot-based metrics for evaluating team and individual success, I found that comparable goal-based metrics consistently outperformed shot-based metrics at predicting team success and individual player contributions to that team success. Linear models showed that the best single predictor of team success was the number of goals a team allows, while the best overall model predicted team success using goal differential. Of all single parameter and composite parameter models, those incorporating goals invariably outperformed those using shots. The “shots for” model was not even as highly ranked as the “faceoff wins” model. When applied to data from 2015-16, which was withheld from model building, the top goal-based model “goal differential” correctly predicted future winning %, while the comparable shot-based model “shot differential” did not.

Given the poor value of shot-based metrics at predicting team success, it was not unexpected that shot-based metrics were also poor measures of individual player contributions to team success. As representative measures I used relative net goals (i.e. +/- relative to the rest of the team) and relative net shots (i.e. Corsi relative to the rest of the team), which were each converted to either net goals/minute or net shots/minute contributions by each player, on each team. I found teams that gave the most ice-time (whether purposefully or not) to forwards with high relative goals/minute had significantly higher winning % than those that gave the most ice time to forwards with low relative goals/minute. Teams giving the most ice time to players with the best relative shots/minute did not have any higher team winning % than teams allotting ice time randomly. There was no correlation between the allotment of ice-time to defensemen based on either relative goals/minute or relative shots/minute and team success. This showed that, as I predicted, using goal-based metrics to evaluate individual players was superior to using shot-based metrics, but as an unanticipated result, this was only the case for forwards. Neither relative net-goals nor net-shots were useful for evaluating individual defensemen, suggesting that defenseman evaluation should be based more on metrics not related to the production or prevention of shots and goals. Examined more broadly, individual goal-based metrics for forwards predicted whether a team would be in the top or bottom half of the NHL standings, while individual shot-based metrics provided no such value. Neither metric was useful at predicting team success when applied to defensemen alone.

There are innumerable sources of variance in the recording of any hockey statistic, but most of those affecting shot-based metrics will affect goal-based metrics similarly. For example, taking more faceoffs in the defensive instead of offensive zones, playing against higher quality of competition (QOC), and the time and stage of a game (i.e. a rout vs. a tied game) all influence the number of shots taken and allowed, but here were assumed to influence the number of goals scored and allowed similarly. Adjustments for QOC would be functionally the same for either method, so I did not expect them to influence conclusions that were strictly comparisons between shot or goal-based metrics, and not necessarily quantifications of either.

The number of shots that actually result in goals is far more important to team winning success than the number of shots taken. There are meaningless shots, but no meaningless goals, and the increased sample sizes available for shots do not seem to be enough to make shot-based metrics as valuable as goal-based metrics. Over a period of 10 seasons, both save and shooting % varied amongst teams, and correlated with winning %, which also varied. What did not vary over time were the mean shots taken and shots allowed. This forces us to conclude that teams take and allow about the same number of shots, year after year, but what determines whether they win or not is how many of those shots result in goals.

Perhaps the most difficult to quantify source of variance in sports performance is the influence of luck. It has been proposed that PDO regresses to around 100, over time, precisely because shooting and save % are primarily the results of luck (e.g. Anderson, 2011; Yost, 2013). Statistically we expect luck to balance out over time, like a flipped coin that is sometimes heads, sometimes tails, but reaches an even 1:1 ratio over time. This does not appear to be the case when it comes to shooting and save %, as PDO did not even out over the 10 season span of these analyses. In the absence of available data on the quality of individual scoring chances (i.e. some teams may take more shots from scoring zones, compared to peripheral and long distance shots), I attribute fluctuations in PDO to some teams taking better shots, some allowing better shots, and some teams simply having better goaltending. As further insight into the correlation between good goaltending and winning, of the 40 teams that reached the Stanley Cup finals in the last 20 years, 28 had goaltenders that earned Vezina Trophy (best goaltender) votes that season, while the remaining 12 had goaltenders good enough to have earned Vezina votes in other seasons. It will surprise no one that good goaltending wins hockey games, so it should also surprise no one that the number of shots your team allows is largely irrelevant if you have a good goaltender that can stop more of them than an average goaltender.

CONCLUSIONS & APPLICATIONS IN SPORT
Rather than sound the death knell for shot-based metrics, it is hoped that my results instead show the value of goal-based metrics. An optimal strategy might in fact be a combination of both measures, to better quantify what truly does determine a player’s worth: a net production of high quality shots that result in a net production of goals. When comparing approaches to evaluating team success, and individual contributions to that success, it is important to recognize that no single metric exists in isolation. Goals can only come from shots, shots can only come from possession of the puck, and possession is an inevitable result of winning faceoffs, getting powerplays, forcing turnovers, and so on. This explains why all of the metrics I tested were actually significant predictors of team success, and my models simply identified which were the best of this demonstrably good bunch. However, when shot-based metrics are used as a proxy for offensive zone possession, possession is itself being used as a proxy for goals. Perhaps instead of abandoning goal-based metrics, the most useful approach to improving our ability to predict team and individual success may be to simply improve our goal-based metrics. Furthermore, neither goal nor shot-based metrics may be effective at evaluating defensemen, so our pursuit of optimal analytics for evaluating players and teams may have to diverge, and finally consider how different the forward positions are from defense.

Since the early 1980s peak in NHL scoring the mean shot volume has remained comparatively stable, while scoring has dropped precipitously. This clearly describes a trend of increasing save % (i.e. the same as describing a declining shooting % by one’s opponents), which has generally been ascribed to some combination of better goaltending, larger goaltending equipment, and team strategies designed to limit quality scoring chances. Either way, this trend clearly describes a declining importance of shots themselves, and an increasing importance of goals. Quite simply, as fewer goals are scored, each one becomes more important, while as more shots result in saves, each shot declines in importance. The emphasis of analytics, then, should be turning even more towards shot quality (of which shots that score are the highest quality), and not shot quantity. The growth of shot quantity based metrics for evaluating teams and individuals has the effect of encouraging shot quantity only, and merely hoping that shot volume will translate into more goals. Instead we should be promoting analytical methods for hockey that better reflect a game whose sole purpose is to score more goals than your opponent, regardless of how many shots are taken.

REFERENCES
1. Anderson, C. 2011. PDO regression to the mean, or why you should ignore shooting percentages. http://www.arcticicehockey.com/2011/12/20/2648333/pdo-regression-to-the-mean-or-why-you-should-ignore-shooting. Accessed September 20, 2015.
2. Bedford, A. & Baglin, J. 2009. Evaluating the performance of an ice hockey team using interactive phases of play. IMA Journal of Management Mathematics 20: 159-166.
3. Berri, D. & Schmidt, M. 2010. Stumbling on Wins: Two Economists Expose the Pitfalls on the Road to Victory in Professional Sports. Pearson Education, Upper Saddle River, New Jersey, USA.
4. Chan, T. C. Y., Chow, J. A. & Novati, D. C. 2012. Quantifying the contribution of NHL player types to team performance. Interfaces 42: 131-145.
5. Gramacy, R. B., Jensen, S. T. & Taddy, M. 2013. Estimating player contribution in hockey with regularized logistic regression. Stat AP, Available at arXiv:1209.5026.
6. Ilardi, S. & Barzilai, A. 2008. Adjusted plus-minus ratings: New and improved for 2007–2008. Available at http://www.82games.com/ilardi2.htm. Accessed September 20, 2015.
7. Macdonald, B. 2011. A regression-based adjusted plus-minus statistic for NHL players. Journal of Quantitative Analysis in Sports 7 (3).
8. Moskowitz, T., & Wertheim, J. L. 2012. Scorecasting: The Hidden Influences Behind How Sports Are Played and Games Are Won. Crown Publishing Group.
9. Ryder, A. 2004. Win Probabilities: A tour through win probability models for hockey. Available: http://www.hockeyanalytics.com. Accessed September 20, 2015.
10. Schuckers, M. & Curro, J. 2013. Total Hockey Rating (THoR): A comprehensive statistical rating of National Hockey League forwards and defensemen based upon all on-ice events. Proceedings of: MIT Sloan Sports Analytics Conference, Boston, MA, USA.
11. Staples, D. 2015. Why it’s problematic to use on-ice stats like Corsi to rate individual NHL players. Edmonton Journal: http://edmontonjournal.com/sports/hockey/nhl/cult-of-hockey/why-its-problematic-to-use-on-ice-stats-to-rate-individual-nhl-players. Accessed September 20, 2015.
12. Thomas, A. C. 2006. The Impact of Puck Possession and Location on Ice Hockey Strategy. Journal of Quantitative Analysis in Sports. Volume 2, Issue 1, ISSN 1559-0410, DOI: 10.2202/1559-0410.1007.
13. Walker, N. 2014. Square root of meh: real plus-minus based wins projections. http://nyloncalculus.com/2014/10/27/square-root-of-meh-2014-2015-real-plus-minus-based-win-projections. Accessed April 9, 2016.
14. Yost, T. 2013. PDO and Luck: Projecting Progression / Regression in the NHL. http://www.hockeybuzz.com/blog/Travis-Yost/PDO-and-Luck-Projecting-Progression–Regression-in-the-NHL/134/44551. Accessed September 20, 2015.

FIGURE LEGENDS
Figure 1. Correlations between season winning % (excluding shootouts) and goal differential (top) and shot differential (bottom), for all NHL teams, from the 2005-06 through 2014-15 seasons.

Figure 2. 2015-16 actual team winning % (Y-axis) compared to predicted winning % (X-axis) using the goal-differential model (top) and shot-differential model (bottom). Selected outliers are identified by team. Prediction using goal model was statistically significant (P < 0.005) but prediction using shot model was not (P = 0.077).

Figure 3. Long term trends of increasing save % and declining goals/game in the NHL.

Figure 4. Mean net goals (Relative +/-) and shots (Relative Corsi) produced, per minute, per player, for 2014-15. Black bars are means for players on teams in the top 15 teams in the NHL, as ranked by winning %, grey bars are for players on teams in the bottom 15. Includes only those players in the top 10 of total season ice-time on their team.

Table 1. Statistics for single-parameter models predicting NHL team success, as measured by percentage of non-shootout points gained versus those available, for the period from 2005-06 to 2014-15. Models are ranked from best to worst, using Akaike's Information Criteria (AIC).