Authors: Louis R. Joslyn, Nicholas J. Joslyn and Mark R. Joslyn
Mark R. Joslyn, PhD.
1541 Lilac Lane
Lawrence, KS 66045-3129
Mark Joslyn is a political scientist and graduate director at the University of Kansas.
Louis Joslyn is a graduate student in Bioinformatics at the University of Michigan.
Nicholas Joslyn is a student at Simpson College majoring in Mathematics and Physics.
In NCAA Division 1 men’s college soccer, what performance measures determine improvement in win percentage from one season to the next? Though systematic research on college soccer is uncommon, using available team box scores we were able to construct robust models for year-to-year improvement in win percentage. For teams that improved win percentage by more than 5%, attacking efficiency (the ratio of goals scored to shots taken) was the most important predictor, followed by defending scoring efficiency (the ratio of goals against to shots against) and total shots ratio (a team’s shots as a share of all shots in its games). We also find that efficiency measures are the most difficult to repeat from one season to the next. In short, the key performance measure for improved team win percentages is converting chances into goals, which is also the most challenging team variable to sustain across seasons.
Keywords: attacking and defending efficiency, shots ratio, win improvement, college soccer
Since the pioneering work of Charles Reep, a major emphasis of statistical research has been team performance (16). Researchers attempt to isolate the measures that distinguish between successful and unsuccessful teams. Such measures were identified by Hughes and Bartlett as action variables (7), and include possession (8), passing (17) and a host of game-related statistics, notably shots and goals (11, 12).
However, research appears exclusive to the professional level, most notably the European leagues and World Cups. This level of the sport of course attracts significant media attention and financial resources, which in turn produce demand for advanced analytics (1, 10). Predictably, the trove of available metrics draws scholarly attention. But below the professional club level, where in the United States a vast majority of players and coaches perform, and thousands of highly competitive games are played, systematic research is largely absent. In this article, we fill this void by analyzing team performance measures for the NCAA Division 1 men’s soccer seasons 2013 and 2014. To our knowledge, this is the first study to examine conventional box score measures in collegiate soccer.
Further, we exploit these data to answer a long-standing concern among coaches at all levels of play: what factors are most important for year-to-year differences in win percentage? Coaches are judged on improvement, or lack thereof, from one year to the next. Some teams simply have poor years in which unlucky circumstances govern, yet bounce back the next year. Other teams enjoy seasons of destiny only to fall short the following year. We concentrate on shot volume, goal chances converted and goal chances denied, and discover that they account for nearly 75 percent of the variance in changes in team win percentage from the 2013 to the 2014 season. A team’s scoring efficiency is the most important predictor of improvement. Teams that enhance their efficiency in front of the net are more likely to improve than teams that increase their shot frequency or decrease the opposition’s conversion rate. Thus the single most important metric that differentiates changes in win total from one season to the next is how well a team takes its chances. Paradoxically, scoring efficiency is the most difficult skill to repeat from one year to the next.
Finishing chances is notoriously difficult, and analysts confirm that it is the least likely metric to be repeated from one year to the next (15). At the team and individual level, finishing includes a large dose of luck (5), and so chance plays a significant role in winning (1, 13). Poor years may then be attributed to dismal finishing and modest defending, that is, high defensive scoring efficiency. Exceptional years, however, may turn on clinical finishing and robust defending. To determine the relative value of finishing toward an improved season win percentage, we sought season total measures of team shots and team goals.
The National Collegiate Athletic Association (NCAA) archives season statistics for men’s soccer (14). For specific years, Division 1 teams are listed and include season box score statistics. For the 2013 and 2014 seasons, we collected shots for and against, goals for and against, and team win percentages for all Division 1 teams. Team wins are counted as 1 point and draws as .5 points. The NCAA kept official statistics for 202 Division 1 teams for the 2013 season and 205 for 2014. Not all teams play the same number of games and season box score totals include NCAA tournament games.
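As a minimal sketch of the scoring rule just described (a win counts as 1 point and a draw as .5 points), the calculation can be written as follows. The function name and the sample record are illustrative assumptions, not NCAA data.

```python
def win_percentage(wins: int, draws: int, losses: int) -> float:
    """Win percentage with a win worth 1 point and a draw worth .5 points."""
    games = wins + draws + losses
    if games == 0:
        raise ValueError("team has no recorded games")
    return (wins + 0.5 * draws) / games


# Hypothetical record: 10 wins, 4 draws and 6 losses in a 20-game season.
example = win_percentage(10, 4, 6)
print(f"{example:.3f}")
```

Because teams play different numbers of games, dividing by each team’s own game count keeps the measure comparable across teams.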
Variables and distributions
For teams competing in the 2013 and 2014 seasons, we derived change in win percentage by subtracting the 2013 team win percentage from the 2014 win percentage, so that positive values indicate improvement. Figure 1 exhibits the distribution of change. The distribution is close to normal, though skewed slightly right, and most teams are clustered within the -.10 to .10 interval. Approximately 3% of teams produced identical win percentages across the two seasons while another 10% ended the 2014 season with nearly the same win percentage as 2013, within 1%. The range of the distribution is instructive as well. One team enjoyed enormous success in 2014, improving from a meager 12.5 win percentage in 2013 to 69 percent a year later, a 56.5 percentage point increase. By contrast, another team dropped precipitously from an outstanding 2013 win percentage of 66.7 to 16.6 in 2014.
For the predictors, we first consider shots. Shots for and shots against were combined into a composite measure called Total Shots Ratio (TSR). The measure is expressed as the ratio of a team’s shots to the total number of shots taken by the team and its opponents. A ratio of .5 means teams are matching shots, whereas a value over or under .5 indicates one team outshooting the other. Research has demonstrated the importance of shot volume in distinguishing successful teams (2, 4). Moreover, analysts have demonstrated the value of TSR for forecasting season points, goals, and wins across various European leagues and World Cups (3, 12, 18). We do not know whether TSR translates to the college game. TSR values range from approximately .24 to .69 across both years.
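The shots measure described above can be sketched in a few lines. The function name and the shot counts below are hypothetical and serve only to illustrate the ratio.

```python
def total_shots_ratio(shots_for: int, shots_against: int) -> float:
    """Share of all shots in a team's games that the team itself took."""
    total = shots_for + shots_against
    if total == 0:
        raise ValueError("no shots recorded")
    return shots_for / total


# Hypothetical season totals: 240 shots taken, 160 shots conceded.
tsr = total_shots_ratio(240, 160)
print(f"{tsr:.2f}")  # a value above .5 means the team outshot its opponents
```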
A second predictor of team success is scoring efficiency (19).
While it is important to control the ball and produce shots, teams must nevertheless convert those chances. It is one matter to create chances; it is another entirely to finish. Losing coaches, players, and fans often identify missed opportunities: “we were not clinical enough in front of the net” is a common lament. Teams that produce a large number of shots may not be able to convert while teams that generate few shots may be excellent finishers. Similarly, teams may enjoy a shot advantage but nevertheless lose because they cannot effectively prevent the opposition from converting chances. So, regardless of a team’s capacity to generate shots, and indeed limit the shots of the opposition (TSR), we expect attacking and defensive scoring efficiency to be key determinants of winning.
We calculated attacking scoring efficiency as the ratio of goals scored over shots taken.
Defensive scoring efficiency is the ratio of goals against over shots against.
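Both efficiency measures are the same ratio applied to different columns of the box score, as the short sketch below illustrates. The function name and the season totals are hypothetical.

```python
def scoring_efficiency(goals: int, shots: int) -> float:
    """Goals divided by shots; used for both attacking and defending efficiency."""
    if shots <= 0:
        raise ValueError("shots must be positive")
    return goals / shots


# Hypothetical season totals for one team.
attacking = scoring_efficiency(goals=20, shots=200)  # goals scored / shots taken
defending = scoring_efficiency(goals=18, shots=150)  # goals against / shots against
print(f"attacking={attacking:.3f} defending={defending:.3f}")
```

A higher attacking value is better for the team, whereas a higher defending value means opponents are converting more of their chances.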
The mean attacking scoring efficiency (ASE) for Division 1 men’s college soccer teams across seasons 2013 and 2014 is .10, approximately 1 goal for every 10 attempts. To compare, the average attacking efficiency rate in the English Premier League is .11, one goal for every 9 attempts (1). The variance of team attacking scoring efficiency is also instructive. The minimum ASE is .025 in 2013, 1 goal out of 40 chances, and the maximum of .22 is slightly better than 1 goal out of 5 attempts. Changes from 2013 to 2014 included a superb increase in scoring efficiency of .07 and a decrease of .09. Change in defending scoring efficiency ranged from a poor increase of .097 to an exceptional decrease of .09.
Table 1 shows the relationship between levels of win improvement and change in the three performance measures. Cells represent the average changes given a specific level of team improvement. From above 0 to 5% is modest improvement in win percentage. This may reflect winning one more game or perhaps two draws. Notable improvement denotes a win percentage interval above 5% to 15%. A fifteen percent increase, assuming a 20 game season, may include 3 additional wins. Change in team win percentage beyond 15% represents excellent improvement.
Two findings are important in Table 1. First, the average changes in team performance measures, with one exception, reflect advances from the prior year. So, among teams that improve win percentages, performance measures predictably also advance. However, attacking scoring efficiency does not show an increase among the modestly improved teams. While average changes in the two other measures are predictably small, given the modest win category, attacking efficiency declines slightly. We are not certain why this would be, but it does imply that finishing chances is perhaps more important to reaching notable and excellent improvements.
Second, performance measures increase with levels of team win improvement. For example, change in shots ratio increases from .011 to .042. Considering that the standard deviation of this measure is .07, the increase is substantial, nearly fourfold from the modest category. Similarly, attacking and defending scoring efficiency exhibit substantial increases. In fact, the increase in attacking efficiency from modest to excellent improvement is especially impressive.
In short, teams that exhibit improvement on performance measures appear to increase their win totals from the previous season. We shall see which performance measure contributes most to an improved season.
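The improvement levels defined above can be expressed as a simple classification rule. In the sketch below the function name and category labels are illustrative, and the "none" label for changes of zero or below is an assumption added for completeness.

```python
def improvement_category(change: float) -> str:
    """Classify a change in win percentage (as a proportion, e.g. .10 for 10%)."""
    if change <= 0:
        return "none"       # same or worse than the prior season (assumed label)
    if change <= 0.05:
        return "modest"     # above 0 up to 5%
    if change <= 0.15:
        return "notable"    # above 5% up to 15%
    return "excellent"      # beyond 15%


# The largest gain reported in the text: from 12.5 to 69 percent.
print(improvement_category(0.69 - 0.125))
```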
The regression model
We used the three categories for change in win percentage as the dependent variables. Teams that increased win percentage modestly (above 0 to 5%, n = 20), notably (above 5% to 15%, n = 36) and exceptionally (above 15%, n = 42) are compared to a common baseline of teams that performed the same or worse than they did a year before (changes from 0 down to -.50, n = 104). Since the dependent variables are dichotomous (1 = category of improvement, 0 = status quo or decline), we employed logistic regression. The win improvement categories correspond well with typical benchmarks of progress. The logistic model will therefore yield the probabilities associated with achieving those benchmarks, and the estimated effects of performance measures on those probabilities. Moreover, our modeling strategy provides consistency across dependent variables, comparing win improvements to a logical reference group: the group of teams that fail to improve from one year to the next. It is this group that teams and coaches are frequently judged against. The model is as follows.
In the model, i indexes the team (i = 1, ..., 202), βk is the coefficient of the kth variable, α is the constant term and µi is the residual term for the ith observation.
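A specification consistent with these symbol definitions, assuming the three change measures enter linearly in the log odds, would take the following form. The abbreviations ΔTSR, ΔASE and ΔDSE (year-to-year change in total shots ratio, attacking scoring efficiency and defending scoring efficiency) and the outcome label Y are shorthand introduced here for illustration.

```latex
\ln\!\left(\frac{P(Y_i = 1)}{1 - P(Y_i = 1)}\right)
  = \alpha
  + \beta_1\,\Delta\mathrm{TSR}_i
  + \beta_2\,\Delta\mathrm{ASE}_i
  + \beta_3\,\Delta\mathrm{DSE}_i
  + \mu_i
```

Here Y_i equals 1 when team i falls in the given improvement category and 0 when it belongs to the baseline group.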
Standard errors, model fit, logistic estimates and standardized estimates are given in Table 2. Since we wish to compare the relative importance of the estimates (βk), a standardized coefficient is obtained by multiplying βk by the standard deviation of the kth variable. These coefficients are found in the Std columns of Table 2.
First, the chi-square and pseudo R-square statistics show that the model is significantly different from the null or intercept-only model. The model fit statistics in fact increase markedly across win improvement categories. For exceptional improvement, the pseudo R-square is three and a half times larger than for modest improvement. Perhaps the interval that represents modest wins is drawn too narrowly. Modest improvement may include only a draw, and at a maximum an additional win. This may be too similar to prior season performance to distinguish random variation from substantive change. In any event, notable and exceptional year models yield better forecasts.
Second, yearly changes in team performance measures across the three categories of win improvement are dependably significant predictors. For modest win change, the most important measure appears to be shot ratio (Std = 1.10). A 1 standard deviation increase in shots ratio produces a 1.10 increase in the log odds of achieving modest win improvement. The relative effect is nearly the same for defending efficiency (Std = -1.01), though here a standard deviation increase reduces the log odds of modest progress. However, enhanced improvement in team win percentage alters the order of relative importance of performance measures. For notable and exceptional seasons, changes in attacking efficiency are essential. In fact, for exceptional seasons a 1 standard deviation increase in attacking efficiency generates an increase of nearly 3.0 in the log odds. This is the maximum impact estimated across the different win improvement categories. In short, exceptional changes in team win percentages require marked improvement in team finishing.
Figure 2 illustrates the relative importance of performance measures. The effect of each measure is plotted across changes in the standard deviation of the specific variable while holding the other measures at mean levels. Beginning at zero, no change in the measures, the probability of an exceptional season is remote, below .10. To improve, teams must augment their performance measures. For example, a significant advancement in attacking efficiency from the previous season (.015, one half of a standard deviation) raises the probability of an exceptional win improvement to .25. Advancing scoring efficiency by 1 standard deviation (.03) raises the probability of a big season to .57. Several more chances well taken can potentially transform a season. For the average NCAA Division 1 men’s soccer team, a .03 increase in attacking scoring efficiency is equivalent to scoring 6 more goals from the same number of shots. An even greater increase in finishing (.045, or 1.5 standard deviations) raises the likelihood of exceptional improvement to .84, a noteworthy gain of nearly .60 over the half standard deviation improvement. Comparable increases in total shots ratio raise the probability of improvement to .20, .48 and .75 respectively. Similar decreases in defending efficiency raise the probability to .22, .45, and .71 respectively. Performance measures are undoubtedly important, attacking efficiency a bit more so.
To elaborate on the effects of changes in scoring efficiency, we examined teams that exhibited the best and worst changes in win percentage. As noted previously, one team improved by 56.5 percentage points while the other declined by 50. The most improved team increased attacking scoring efficiency by .04, reduced defensive scoring efficiency by a remarkable .09, and increased total shots ratio by .05. For both efficiency measures, changes are substantially higher than a standard deviation, but the shots ratio change is less than a standard deviation. Consequently, the marked growth in win percentage occurred primarily because attacking and defending efficiency improved dramatically. Similarly, for the team that exhibited the greatest drop from the prior season (10 more losses), scoring efficiency was the key variable. Attacking scoring efficiency declined by .036 and defending scoring efficiency increased by .097, but the change in shots ratio was trivial (.007). Both teams illustrate that extreme changes in winning can occur when shots ratio remains constant. Rather, it may be scoring efficiency that matters most for sudden reversals of fortune.
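The standardization step can be sketched as below. The coefficient value and the sample of changes are hypothetical, and the use of the sample standard deviation (rather than the population form) is an assumption, since the text does not say which was used.

```python
import statistics


def standardized_coefficient(beta: float, values: list[float]) -> float:
    """Multiply a logistic coefficient by the standard deviation of its predictor."""
    return beta * statistics.stdev(values)  # sample standard deviation (assumed)


# Hypothetical year-to-year changes in a predictor and a hypothetical raw coefficient.
changes = [-0.03, 0.0, 0.03]
raw_beta = 100.0
std_beta = standardized_coefficient(raw_beta, changes)
print(f"{std_beta:.2f}")
```

The result is read as the change in log odds associated with a 1 standard deviation increase in the predictor, which puts predictors measured on different scales on a common footing.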
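The probabilities reported for attacking efficiency can be checked against the logistic form with a short calculation. The sketch below back-solves an implied baseline and per standard deviation effect from two of the reported points (.25 at half a standard deviation and .84 at one and a half); these implied values are derived here for illustration and are not the Table 2 estimates.

```python
import math


def logit(p: float) -> float:
    """Log odds of a probability."""
    return math.log(p / (1.0 - p))


def inv_logit(x: float) -> float:
    """Probability corresponding to a log odds value."""
    return 1.0 / (1.0 + math.exp(-x))


# Reported probabilities at 0.5 and 1.5 standard deviations of improvement.
slope = (logit(0.84) - logit(0.25)) / (1.5 - 0.5)  # implied log-odds change per SD
baseline = logit(0.25) - 0.5 * slope               # implied log odds at no change

for z in (0.0, 0.5, 1.0, 1.5):
    print(z, round(inv_logit(baseline + slope * z), 2))
```

Under these assumptions the implied effect per standard deviation is a little under 3.0 log odds, the probability at no change falls below .10, and the interpolated probability at 1 standard deviation is close to the reported .57.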
A note on repeatability
Coaches and players undoubtedly spend countless hours on finishing technique, the media praise or denounce strikers based on their conversion rates and huge sums of money follow the best finishers. Our empirical analyses demonstrated that the money may be well spent, for finishing matters and is very important for improvement in win percentage. But ironically, as noted above, converting chances is vulnerable to random variation. When attacking efficiency increases, additional wins can be expected. But how consistent is attacking efficiency? Can a reasonable level be repeated from one year to the next? In the college game, no one has accumulated the data to examine the question. It is important because identifying finishing as crucial to a big season means far less if finishing cannot be anticipated.
Table 3 offers Pearson correlations between the 2013 and 2014 performance measures. We can determine reliability by the magnitude of the association. A team performance measure that can be sustained across time should yield a reasonably large positive correlation, which in turn implies a stable team aptitude. A measure that is difficult to repeat from one year to the next should produce a considerably smaller correlation, which implies a larger role for chance.
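A repeatability check of this kind amounts to correlating each team’s value in one season with its value in the next. The sketch below implements the Pearson correlation directly; the function name and the four-team sample are hypothetical and are not the study data.

```python
import math


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation between paired season values for the same teams."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 teams")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a sample has no variance")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical total shots ratios for the same four teams in consecutive seasons.
tsr_2013 = [0.40, 0.50, 0.60, 0.55]
tsr_2014 = [0.42, 0.49, 0.61, 0.53]
print(round(pearson_r(tsr_2013, tsr_2014), 2))
```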
Shots are clearly a team measure that is repeatable. The stability of this measure is striking when compared to the efficiency measures. The shot ratio correlation is three and a half times larger than the attacking efficiency correlation. The size of the attacking efficiency correlation indicates that converting chances into goals is vulnerable to change and unlikely to repeat from one season to the next. Similarly, teams find it difficult to repeat defending efficiency.
The key measure that contributes most to exceptional improvement in win percentage is scoring efficiency. Yet scoring efficiency is a genuine challenge for teams, coaches, and players. Table 3 suggests it is stubbornly unpredictable. To a lesser extent this is true for defending efficiency as well. Exceptional win improvements are also determined by positive changes in shot ratio. This is better news for coaches. Shot ratio can be repeated and is the most stable and predictable measure of the three examined in this paper.
Our results are perhaps not surprising. Shooting and scoring matter. To achieve an exceptionally improved win percentage from one year to the next, a team must be able to shoot, score and defend. Yet the relative impact of these critical performance measures on an improved season is not well understood, nor is the magnitude of the impact well known, especially in college soccer. Empirical analysis of college soccer is in fact largely nonexistent, though thousands of matches are played every fall, involving players of abundant skill and experience. While men’s college soccer does not match the revenue or media coverage of college basketball or football, it does possess an impressive history and at some universities represents a considerable sports presence.
Significant improvements in win percentage are the result of increases in total shots ratio and attacking scoring efficiency and decreases in defending scoring efficiency. The data revealed that shot ratios were stable from season to season, suggesting that TSR is a fundamental team characteristic. Coaching ability, players’ skill level, and style of play are unquestionably key factors in sustaining TSR levels. Teams that repeat high TSR levels are likely strong programs and well positioned for continued success. We in fact believe TSR does not simply reflect shots but rather control, the capacity of teams to impose their system and govern play. The absence of possession statistics in NCAA box scores does not allow us to test this assertion directly. In 2017, however, Joslyn et al. (9) did show that across MLS, EPL, and LaLiga, team total shots ratios and possession statistics were highly correlated.
The analyses also showed that better win percentages required greater efficiency in the dangerous areas of the pitch. A team must sharpen its ability to convert chances and defend opposition chances. But as Table 3 indicated, attacking and defending efficiency levels are the most difficult to maintain from one season to the next. Analysts have noted a large measure of fortune involved in professional team conversion rates and this is true in our college data as well (5). Undoubtedly, success in the middle, attacking and defending thirds of the pitch is essential for improved seasons. But it is success in the attacking third, through improved attacking efficiency, that most prominently affects the likelihood of exceptional changes in win percentage.
APPLICATIONS IN SPORT
College soccer is ready for advanced analytic work. While slower than professional levels to embrace data-driven perspectives, college soccer teams are beginning to incorporate quantitative approaches in all aspects of the game. This will likely further develop performance measures to include variables typically featured at the pro level. Presently, NCAA box scores are rudimentary and do not include valuable metrics such as shot location and time, tackles, clearances, and possession, to name a few. While our analyses used season box scores, individual match data offer greater variation in the limited metrics now available. But accessing and entering game data represents a significant challenge, and it is precisely these barriers, resources and expertise, that prohibit some programs from pursuing analytics (6).
1. Anderson, C., & Sally, D. (2013). The numbers game. Why everything you know about soccer is wrong. Penguin Books. New York, New York.
2. Delgado-Bordonau, J.L., Domenech-Monforte, C., Guzman, J.F., & Mendez-Villanueva, A. (2012). Offensive and defensive team performance: relation to successful and unsuccessful participation in the 2010 Soccer World Cup. Journal of Human Sport and Exercise. 8(4), 894-904.
3. Goodman, M. (2013). What is total shots ratio? And how can it improve your understanding of soccer? Retrieved from Grantland.com website: http://grantland.com/the-triangle/what-is-total-shots-ratio-and-how-can-it-improve-your-understanding-of-soccer/
4. Grayson, J. (2012). Another post about TSR. Retrieved from James Blog website: https://jameswgrayson.wordpress.com/2012/07/15/another-post-about-tsr/
5. Grayson, J. (2011). Predicting future performance revisited. Retrieved from James Blog website: https://jameswgrayson.wordpress.com/2011/10/31/predicting-future-performance-revisited/
6. Hanlon, M. (2014). Current practices and perceptions of notational analysis among United States soccer coaches. Soccer Journal. 60(1), 6-64.
7. Hughes, M.D. and Bartlett, R.M. (2002). The use of performance indicators in performance analysis. Journal of Sports Sciences, 20(10), 739-754.
8. James, N., Jones, P.D., & Mellalieu, S.D. (2004). Possession as a performance indicator in soccer as a function of successful and unsuccessful teams. Journal of Sports Sciences, 22(6), 507-508.
9. Joslyn, L.R., Joslyn, N.J., & Joslyn, M. R. (2017). The impact of shots, shots against and total shots ratio in college soccer. Retrieved at College Soccer News website: http://www.collegesoccernews.com/index.php/articles/1039-the-impact-of-shots-shots-against-and-total-shots-ratio-in-college-soccer-by-louis-joslyn-nicholas-joslyn-and-mark-joslyn
10. Kuper, S., & Szymanski, S. (2014). Soccernomics. Nation Books. New York, NY.
11. Lago-Peñas, C., Lago-Ballesteros, J., Dellal, A., & Gómez, M. (2010a). Game-related statistics that discriminated winning, drawing and losing teams from the Spanish soccer league. Journal of Sports Science and Medicine, 9(2), 288-293.
12. Lago-Ballesteros, J., & Lago-Penas, C. (2010). Performance in team sports: Identifying the keys to success in soccer, Journal of Human Kinetics, 25, 85-91.
13. Lagos, C. (2007). Are winners different from losers? Performance and chance in the FIFA World Cup Germany 2006, International Journal of Performance Analysis in Sport, 7(2), 36-47.
14. NCAA Division 1 Men College Statistics. Retrieved from: http://www.ncaa.com/stats/soccer-men/d1
15. Pugsley, B. (2013). Premier League Strikers and Repeatability. Retrieved from: http://statsbomb.com/2013/08/premier-league-strikers-and-repeatability/#prettyPhoto
16. Reep, C., & Benjamin, B. (1968). Skill and chance in association football. Journal of the Royal Statistical Society. Series A (General) 131(4): 581-5.
17. Saito, K., Yoshimura, M., and Ogiwara, T. (2013). Pass appearance time and pass attempts by teams qualifying for the second stage of FIFA World Cup 2010 in South Africa. Football Science, 10, 65-69.
18. Taylor, M. (2014). Total Shots Ratio & World Cup Betting accessed here:
19. Yue, Z., Broich, H., & Mester, J. (2014). Statistical analysis of the soccer matches of the first Bundesliga. International Journal of Sports Sciences and Coaching. 9(3), 553-60.