Except for stats that occur in the future, the set of baseball statistics is finite, so we should be able to test the Pythagorean Expectation (PE) formula using proof by exhaustion, also known as brute-force proof. First, let’s run through an example of Bill James’ original, simple PE formula:
We’ll use the 2013 Chicago Cubs as an example.
2013 Chicago Cubs
The Cubs won 66 and lost 96 games in 2013. This means
W-L = Actual WAA = 66 – 96 = -30
Actual WAA is the WAA that really happened, not an estimate. We will call the WAA estimated by Pythagorean Expectation the PE WAA.
In 2013 the Cubs scored 602 runs and gave up 689 runs. Thus:
Rs = 602
Ra = 689
Based upon the simple PE formula stated above
PE Win% = (Rs)**2/(Rs**2 + Ra**2) = (602)**2/(602**2 + 689**2) = 0.433
#Wins = PE Win% * (Number Games) = 0.433 * 162 = 70.15
#Loss = (Number Games) – #Wins = 91.85
PE WAA = #Wins – #Loss = 70.15 – 91.85 = -21.7
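The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the simple PE formula as written in this post; the function name `pe_waa` and its parameter names are made up for the example, and the 162-game default is an assumption that matches the Cubs' 2013 schedule.

```python
def pe_waa(runs_scored, runs_allowed, games=162, exponent=2):
    """Return (win_pct, wins, losses, waa) from the simple PE formula.

    Hypothetical helper for illustration; not part of any library.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    win_pct = rs / (rs + ra)      # PE Win%
    wins = win_pct * games        # estimated wins
    losses = games - wins         # estimated losses
    return win_pct, wins, losses, wins - losses


# 2013 Chicago Cubs: Rs = 602, Ra = 689
win_pct, wins, losses, waa = pe_waa(602, 689)
print(round(win_pct, 3), round(wins, 2), round(losses, 2), round(waa, 2))
```

Because the code carries PE Win% at full precision instead of rounding it to 0.433 first, its results can differ from the hand calculation in the second decimal place, but the PE WAA still comes out at roughly -21.7.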
There is a difference between the estimated WAA (PE WAA) and the Actual WAA. The estimate is off because other factors also contribute to wins and losses. We can guess at some of those factors, like efficient field managers, players who choke under pressure, or simple bad luck, but none of them are part of the formula we want to prove.
The only thing we know for a fact is the size of the error.
Error = | Actual WAA – PE WAA | = | -30 – (-21.7) | = 8.3
The WAA values of the players who played for the Cubs in 2013 sum to the PE WAA (-21.7), not the Actual WAA. A proof of the formula this data model uses to compute WAA shows this to be true.
Now that we know how to calculate error, we can run these numbers for each team in 2013 and add them together to get a total error for all 30 teams. In the next post we will show error results for three variations of Pythagorean Expectation, including the original one shown in the example above.
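As a rough sketch of that summation, the Python below adds up | Actual WAA – PE WAA | over a list of teams. The function names and dictionary keys are hypothetical, chosen only for this example. The list holds just the 2013 Cubs, because that is the only team whose numbers appear in this post; the other 29 teams would have to be filled in from real season data.

```python
def pe_waa(runs_scored, runs_allowed, games, exponent=2):
    """Estimated wins minus losses from the simple PE formula."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    wins = rs / (rs + ra) * games
    return wins - (games - wins)


def total_error(teams, exponent=2):
    """Sum |Actual WAA - PE WAA| over every team in the list."""
    total = 0.0
    for team in teams:
        games = team["wins"] + team["losses"]
        actual = team["wins"] - team["losses"]   # Actual WAA
        estimated = pe_waa(team["rs"], team["ra"], games, exponent)
        total += abs(actual - estimated)
    return total


# Only the Cubs row comes from this post; add the remaining teams yourself.
teams_2013 = [
    {"name": "Chicago Cubs", "wins": 66, "losses": 96, "rs": 602, "ra": 689},
]

error_sum = total_error(teams_2013)
print(round(error_sum, 2))
```

With only the Cubs in the list, the total is simply the Cubs' own error of about 8.3. Keeping the exponent as a parameter makes it easy to rerun the same sum for other variations of the formula.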