Junior Hockey Blog: Sunday Stats Study

Last Sunday I wrote an introductory piece on hockeymetrics, which is to hockey what sabermetrics is to baseball (or at least it will be once some analytical methods become mainstream). Today I’ll start with a basic method, Pythagorean values. Pythagorean values were first developed by Bill James and project a winning percentage based on runs scored and runs allowed:
Winning Pct = RS^2 / (RS^2 + RA^2)

Where:
RS = Runs Scored
RA = Runs Allowed
^2 = raised to the power of 2
James later adjusted his exponent to 1.84, and other sabermetricians developed further refinements to adjust for dead vs. lively ball eras.
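
For readers who would rather see the formula as code, here is a minimal Python sketch. The function name and the sample totals (800 runs scored, 700 allowed) are made up for illustration; the exponent is a parameter so the adjusted values mentioned above can be plugged in.

```python
def pythagorean_pct(scored: float, allowed: float, exponent: float = 2.0) -> float:
    """Projected winning percentage from totals scored and allowed."""
    if scored < 0 or allowed < 0:
        raise ValueError("totals must be non-negative")
    s = scored ** exponent
    a = allowed ** exponent
    if s + a == 0:
        raise ValueError("at least one total must be positive")
    return s / (s + a)


# Hypothetical team: 800 runs scored, 700 runs allowed, exponent 2.
print(round(pythagorean_pct(800, 700), 3))  # 0.566
```

A team that scores exactly as much as it allows comes out at .500 regardless of the exponent, which is a quick sanity check on any implementation.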

About eight years ago I adapted the equation for hockey, substituting goals for runs and using different exponents for different eras. For the pre-Original Six era (1917-42) the exponent is 1.63; for the Original Six era (1942-67) it is 1.93; and for the expansion era (through 1998) it is 2.03. Each exponent was chosen to minimize the average total error per team in its era, which ranged from 3.01 to 3.61 points. In other words, at two points per win, over 82 games I can predict a record to within slightly less than two games.
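
The era exponents can be sketched in Python as a simple lookup. Two things here are my assumptions rather than part of the original method: eras are keyed on the year a season starts (so 1941-42 falls in the first era and 1942-43 in the second), and projected points are taken as two points per projected win, ignoring ties and overtime points. The function names and the sample team are hypothetical.

```python
# (first season start year, last season start year, exponent)
# Assumption: "1917-42" means seasons starting 1917 through 1941, and so on.
ERA_EXPONENTS = [
    (1917, 1941, 1.63),  # pre-Original Six era
    (1942, 1966, 1.93),  # Original Six era
    (1967, 1997, 2.03),  # expansion era, through 1997-98
]


def exponent_for_season(start_year: int) -> float:
    """Return the era exponent for a season identified by its starting year."""
    for first, last, exponent in ERA_EXPONENTS:
        if first <= start_year <= last:
            return exponent
    raise ValueError(f"no exponent defined for season starting {start_year}")


def pythagorean_pct(goals_for: float, goals_against: float, exponent: float) -> float:
    """Projected winning percentage from goals for and goals against."""
    gf = goals_for ** exponent
    ga = goals_against ** exponent
    return gf / (gf + ga)


def projected_points(goals_for: float, goals_against: float,
                     games: int, start_year: int) -> float:
    """Projected points, assuming two points per projected win and no ties."""
    pct = pythagorean_pct(goals_for, goals_against, exponent_for_season(start_year))
    return pct * games * 2


# Hypothetical team: 250 GF, 200 GA over 82 games in the season starting 1995.
print(round(projected_points(250, 200, 82, 1995), 1))
```

Comparing this projection with a team’s actual point total gives the per-team error that the exponents were tuned to minimize.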

When I originally set up the Pythagorean numbers for hockey, I calculated the records for all NHL teams from 1917 to 1998, and also displayed the most over- and under-performing teams. I also left open the question of overtime points, a question that remains unaddressed to this day.

A similar Pythagorean derivative is to split a team’s goals into even strength and special teams goals to determine its even strength and special teams winning percentages. For special teams play, the shorthanded goals a team scores are subtracted from its GA, while conversely the shorthanded goals it allows are subtracted from its GF. The purpose is to make the equation behave more like baseball’s, where runs are scored and allowed in distinct periods (half innings). A power play or penalty kill is similarly a distinct opportunity to score or be scored upon, and so shorthanded goals should be subtracted accordingly.
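
Here is a rough Python sketch of the special teams adjustment as I read the description above: special teams GF is power play goals scored minus shorthanded goals allowed, and special teams GA is power play goals allowed minus shorthanded goals scored. The function names, the default exponent of 2 and the sample totals are all assumptions for illustration, not fitted values.

```python
def pythagorean_pct(goals_for: float, goals_against: float,
                    exponent: float = 2.0) -> float:
    """Projected winning percentage from goals for and goals against."""
    if goals_for < 0 or goals_against < 0:
        raise ValueError("adjusted goal totals must be non-negative")
    gf = goals_for ** exponent
    ga = goals_against ** exponent
    return gf / (gf + ga)


def special_teams_pct(pp_goals_for: int, pp_goals_against: int,
                      sh_goals_for: int, sh_goals_against: int,
                      exponent: float = 2.0) -> float:
    """Special teams Pyth% with shorthanded goals netted out of each side."""
    adj_for = pp_goals_for - sh_goals_against      # GF minus shorthanded goals allowed
    adj_against = pp_goals_against - sh_goals_for  # GA minus shorthanded goals scored
    return pythagorean_pct(adj_for, adj_against, exponent)


# Hypothetical team: 70 PP goals for, 60 PP goals against,
# 10 shorthanded goals scored, 5 shorthanded goals allowed.
print(round(special_teams_pct(70, 60, 10, 5), 3))  # 0.628
```

The even strength percentage needs no adjustment: it is the same formula applied to even strength GF and GA.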

A few years ago, I added a new twist to the Pyth% by taking it down to the player level. Since raw +/- data sums to the total GF/GA for a franchise, why not do the same for players as I did for teams? The theory generates two separate sets of figures. One set accounts for all GF/GA, regardless of the situation. The other incorporates only even strength situations. A third set may be derived from special teams play, but since not all players play in both power play and penalty killing situations, the data isn’t really as valid as one might think.

The final step in personal Pyth% is to incorporate ice time. Applying TOI allows us to determine the magnitude of the Pyth%. Who’s more valuable: a defenseman with a Pyth% of .650 playing 28 minutes a game for 82 games, or a third-liner with a Pyth% of .800 who averages 12 minutes when he isn’t being scratched a third of the time? If you take a player’s total TOI in minutes and divide it by 300 (60 minutes x 5 skaters), you get his player-games. Multiply player-games by personal Pyth% and you get individual wins (and losses). Since the sum of individual wins equals the total team wins (team Pyth% * 82), splitting it down to a player level remains a valid methodology.
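
The two players in the question above can be worked through in a short Python sketch. The function names are made up, and I am assuming the third-liner dresses for 55 of 82 games (roughly two thirds) and that losses are player-games times (1 - Pyth%).

```python
SKATER_MINUTES_PER_GAME = 300  # 60 minutes x 5 skaters


def player_games(toi_minutes: float) -> float:
    """Convert total ice time in minutes into player-games."""
    return toi_minutes / SKATER_MINUTES_PER_GAME


def individual_record(toi_minutes: float, personal_pct: float) -> tuple[float, float]:
    """Return (wins, losses) for a player from total TOI and personal Pyth%."""
    games = player_games(toi_minutes)
    return games * personal_pct, games * (1 - personal_pct)


# Defenseman: 28 minutes a game for 82 games, Pyth% of .650.
d_wins, d_losses = individual_record(28 * 82, 0.650)

# Third-liner: 12 minutes a game, assumed 55 games played, Pyth% of .800.
f_wins, f_losses = individual_record(12 * 55, 0.800)

print(f"Defenseman:  {d_wins:.2f} wins, {d_losses:.2f} losses")  # about 4.97 and 2.68
print(f"Third-liner: {f_wins:.2f} wins, {f_losses:.2f} losses")  # about 1.76 and 0.44
```

Under these assumptions the defenseman is worth nearly three times as many individual wins despite the lower Pyth%, which is why ice time has to be part of the picture.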

Additional Reading - SportsIllustrated.com columns on this subject:
Pythagorean Hockey Primer
Advanced Pythagorean Methods
Situational Pythagorean Wins and Losses
Personal Pythagorean Stats, Part I
Personal Pythagorean stats, Part II

Saturday, May 20, 2006

Sunday Stats Study - Pythagorean Values