I have been sitting on these tables for almost a month. In the last part of this series we meandered into run creation estimation. Run creation uses the Total Bases (TB) stat, which is the numerator in Slugging Ratio (SLG). On Base Percentage (OBP) is the chocolate and SLG is the peanut butter. Together they make up the OPS butter cup, proving that, in mathematics, any two numbers can be added together to make a third.
Total Bases is a useful game stat and SLG is a helpful reference for managing players game by game. It was shown in Part 4 of this series that TB/H, the number of Total Bases per hit, converges to almost exactly 1.5, or 3/2. It might be possible to prove that as plate appearances approach infinity, TB/H has to converge to 3/2.
If a batter is hitting 0.300 BA, his SLG should be around 0.450 (0.300 × 1.5). More than that and he’s getting a lot of extra bases; less and he’s getting fewer than average. Neither BA nor SLG is a value stat, but they could be used to help see matchup opportunities, etc. The reason this has meandered into Run Creation is that it’s a big part of how the value stat WAR is determined. OPS is even used as a value stat by TV sports announcers. ( Hello Jim Deshaies! )
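The 3/2 rule of thumb can be sketched in a few lines. This is only an illustration: the function names and the 0.500 SLG in the second call are made up for the example, and the 1.5 default is the TB/H ratio quoted above.

```python
def expected_slg(ba, tb_per_hit=1.5):
    """SLG implied by a batting average if each hit averages tb_per_hit bases."""
    return ba * tb_per_hit


def extra_base_gap(ba, slg, tb_per_hit=1.5):
    """Positive: more extra bases than the 3/2 rule implies; negative: fewer."""
    return slg - expected_slg(ba, tb_per_hit)


print(round(expected_slg(0.300), 3))           # 0.45
print(round(extra_base_gap(0.300, 0.500), 3))  # 0.05 (hypothetical SLG of 0.500)
```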
Another interesting factoid is how the number of runs scored converges to the following formula:
Runs = Hits/2
There might be a way to prove that for an infinite number of PA the above is always true. The very basic runs created formula, according to Wikipedia, is:
(H+W)*TB/PA
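As a rough sketch, the two estimates can be put side by side in code. The function names and the team-season totals below are invented for illustration; only the two formulas come from the text above.

```python
def runs_kiss(hits):
    """Estimate runs as Hits/2."""
    return hits / 2


def runs_created_basic(hits, walks, total_bases, plate_appearances):
    """Basic runs created as written above: (H + W) * TB / PA."""
    return (hits + walks) * total_bases / plate_appearances


# Hypothetical team-season totals, made up for this example only.
H, W, TB, PA = 1400, 550, 2100, 6200

print(round(runs_kiss(H), 1))                      # 700.0
print(round(runs_created_basic(H, W, TB, PA), 1))  # 660.5
```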
The following table is a total compilation of data by decade. We’ll drill down deeper in the next part of this series. Highlighted in blue is the formula with less error for that decade.
Fun with numbers
Decade | H/2 (KISS) | (H+W)*TB/PA |
1920-1929 | 1.021 | 0.953 |
1930-1939 | 0.988 | 0.952 |
1940-1949 | 1.034 | 0.955 |
1950-1959 | 0.994 | 0.981 |
1960-1969 | 1.041 | 0.973 |
1970-1979 | 1.046 | 0.983 |
1980-1989 | 1.026 | 0.986 |
1990-1999 | 0.969 | 0.992 |
2000-2009 | 0.957 | 1.013 |
1920-2017 | 1.004 | 0.985 |
The above ratios are the estimated runs using each formula divided by the actual number of runs scored, so 1.000 would be a perfect estimate. Not sure what the above is supposed to mean other than that the RC formula using TB has been much more accurate in modern baseball.
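One way to read the ratio table is to measure how far each ratio sits from 1.000 and pick the closer formula for each decade. The sketch below does that with the numbers copied from the table; the names `ratios` and `closer_formula` are just for this example.

```python
# (decade, H/2 ratio, (H+W)*TB/PA ratio), copied from the table above
ratios = [
    ("1920-1929", 1.021, 0.953),
    ("1930-1939", 0.988, 0.952),
    ("1940-1949", 1.034, 0.955),
    ("1950-1959", 0.994, 0.981),
    ("1960-1969", 1.041, 0.973),
    ("1970-1979", 1.046, 0.983),
    ("1980-1989", 1.026, 0.986),
    ("1990-1999", 0.969, 0.992),
    ("2000-2009", 0.957, 1.013),
    ("1920-2017", 1.004, 0.985),
]


def closer_formula(kiss_ratio, rc_ratio):
    """Return which estimate/actual ratio is nearer to 1.0."""
    kiss_err = abs(kiss_ratio - 1.0)
    rc_err = abs(rc_ratio - 1.0)
    if kiss_err < rc_err:
        return "H/2"
    if rc_err < kiss_err:
        return "RC"
    return "tie"


for decade, kiss, rc in ratios:
    print(decade, closer_formula(kiss, rc))
```

Run against the table, this picks H/2 for the 1920s through the 1950s and for the 1920-2017 line, and the TB formula for every decade from the 1960s on.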
Below are the total averages for BA, OBP, and SLG for each decade in our study. Not sure how this is relevant but added to provide some perspective. In theory another column could have been added …. but didn’t want to confuse things. All three stats in the below table are useful on their own. Mashing them up into some other number obfuscates their benefit as a game stat.
Decade | BA | OBP | SLG |
1920-1929 | 0.285 | 0.336 | 0.397 |
1930-1939 | 0.279 | 0.336 | 0.399 |
1940-1949 | 0.260 | 0.327 | 0.368 |
1950-1959 | 0.259 | 0.327 | 0.391 |
1960-1969 | 0.249 | 0.311 | 0.374 |
1970-1979 | 0.256 | 0.320 | 0.377 |
1980-1989 | 0.259 | 0.320 | 0.388 |
1990-1999 | 0.265 | 0.331 | 0.410 |
2000-2009 | 0.265 | 0.332 | 0.424 |
2010-2017 | 0.255 | 0.318 | 0.405 |
1920-2017 | 0.262 | 0.325 | 0.395 |
The next table shows some interesting ratios from each decade. TB/H converges to almost 1.5 and Hits/Walks converges to almost 5/2. There are around 1/8, or 12.5%, more recorded plate appearances (PA) than at bats (AB). We covered these stats earlier. Sac Flies and Bunts and all the other little stuff are negligible variables that should be eliminated. Walks are the reason for that 1/8 difference. Since this model uses the official PA stat, IBBs are not included. Those are negligible as well (i.e. they don’t matter in the big picture).
Decade | PA/AB | H/W | TB/H |
1920-1929 | 1.129 | 3.040 | 1.391 |
1930-1939 | 1.116 | 2.873 | 1.432 |
1940-1949 | 1.126 | 2.417 | 1.415 |
1950-1959 | 1.130 | 2.352 | 1.509 |
1960-1969 | 1.120 | 2.510 | 1.503 |
1970-1979 | 1.125 | 2.488 | 1.471 |
1980-1989 | 1.120 | 2.593 | 1.498 |
1990-1999 | 1.128 | 2.443 | 1.548 |
2000-2009 | 1.126 | 2.458 | 1.597 |
2010-2017 | 1.115 | 2.555 | 1.590 |
1920-2017 | 1.123 | 2.541 | 1.507 |
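As a sanity check, the ratios in this table should roughly reproduce the SLG and OBP averages from the earlier table. The sketch below uses the 1920-2017 line and assumes OBP is approximately (H+W)/PA, ignoring hit-by-pitch and sac flies in keeping with the "negligible" treatment above. The function names are made up for the example.

```python
def slg_from_ratios(ba, tb_per_h):
    """SLG = TB/AB = (H/AB) * (TB/H)."""
    return ba * tb_per_h


def obp_approx(ba, pa_per_ab, h_per_w):
    """Approximate OBP as (H + W) / PA. Assumption: HBP and SF are ignored."""
    hits_per_pa = ba / pa_per_ab          # H/PA = (H/AB) / (PA/AB)
    walks_per_pa = hits_per_pa / h_per_w  # W/PA = (H/PA) / (H/W)
    return hits_per_pa + walks_per_pa


# 1920-2017 values from the tables above
ba, pa_per_ab, h_per_w, tb_per_h = 0.262, 1.123, 2.541, 1.507

print(round(slg_from_ratios(ba, tb_per_h), 3))         # 0.395
print(round(obp_approx(ba, pa_per_ab, h_per_w), 3))    # 0.325
```

Both results match the 1920-2017 SLG and OBP in the averages table to three decimals.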
The next part of this series will break all this down so that error can be measured from season to season, team to team, and player to player. Spoiler Alert: The official RC formula with TB has far less error than Hits/2 on a season to season and team to team basis. The two formulae are equal on a player to player basis, each having around a +/- 20% error compared to the actual runs scored.
More on this data and methodology to come. Until then ….