What is an OPS Part 5

I have been sitting on these tables for almost a month. In the last part of this series we meandered into run creation estimation. Run creation uses the Total Bases (TB) stat, which is the numerator in Slugging Ratio (SLG). On Base Percentage (OBP) is the chocolate and SLG is the peanut butter. Together they make up the OPS butter cup, proving that, in mathematics, any two numbers can be added together to make a third.

Total Bases is a useful game stat and SLG is a helpful reference for managing players game by game. It was shown in Part 4 of this series that TB/H, the number of Total Bases per hit, converges to almost exactly 1.5, or 3/2. It might be possible to prove that as plate appearances approach infinity, TB/H has to converge to 3/2.

If a batter is hitting 0.300 BA, his SLG should be around 0.450. More than that and he’s getting a lot of extra bases; less and he’s getting fewer than average. Neither BA nor SLG is a value stat, but they could be used to help see matchup opportunities, etc. The reason this has meandered into Run Creation is that it’s a big aspect of how the value stat WAR is determined. OPS is even used as a value stat by TV sports announcers. (Hello Jim Deshaies!)
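The 0.450 figure falls out of the definitions: SLG = TB/AB = (H/AB) × (TB/H) = BA × TB/H. Here is a minimal Python sketch of that arithmetic, assuming the roughly 1.5 TB/H figure from Part 4. The function name `expected_slg` is made up for illustration.

```python
def expected_slg(ba, tb_per_hit=1.5):
    """Estimate SLG from BA, assuming an average number of total bases per hit.

    tb_per_hit defaults to 1.5, the long-run TB/H figure quoted in this series.
    """
    return ba * tb_per_hit


# A 0.300 hitter with average extra-base power.
print(round(expected_slg(0.300), 3))  # 0.45
```

A batter whose actual SLG sits above this number is collecting more extra bases per hit than average; below it, fewer.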

Another interesting factoid is how the number of runs scored converges to the following formula:

Runs = Hits/2 

There might be a way to prove that, as the number of PA approaches infinity, the above always holds. The very basic Runs Created (RC) formula, according to Wikipedia, is:

(H+W)*TB/PA
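For anyone who wants to play along, both estimators can be sketched in a few lines of Python. The function names and the sample totals are invented for illustration and are not real team data. Note that the basic formula is usually quoted with AB + BB in the denominator; this sketch follows the PA version as written above.

```python
def runs_kiss(hits):
    """Estimate runs as half the number of hits (Runs = Hits/2)."""
    return hits / 2


def runs_created_basic(hits, walks, total_bases, plate_appearances):
    """Estimate runs with the basic formula (H+W)*TB/PA."""
    return (hits + walks) * total_bases / plate_appearances


# Hypothetical team-season totals, made up for this example only.
h, w, tb, pa = 1400, 500, 2200, 6200

print(runs_kiss(h))                                # 700.0
print(round(runs_created_basic(h, w, tb, pa), 1))  # 674.2
```

Dividing either estimate by the actual runs scored gives the kind of ratio shown in the table that follows.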

The following table compiles the data by decade. We’ll drill down deeper in the next part of this series. Highlighted in blue is the formula with less error for that decade.

Fun with numbers

Decade  H/2 (KISS)  (H+W)*TB/PA
1920-1929 1.021 0.953
1930-1939 0.988 0.952
1940-1949 1.034 0.955
1950-1959 0.994 0.981
1960-1969 1.041 0.973
1970-1979 1.046 0.983
1980-1989 1.026 0.986
1990-1999 0.969 0.992
2000-2009 0.957 1.013
1920-2017 1.004 0.985

The above ratios are the estimated runs from each formula divided by the actual number of runs scored, so a value closer to 1 means less error. Not sure what the above is supposed to mean, other than that the RC formula using TB has been much more accurate in modern baseball (every decade from the 1960s on).
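Reading a ratio closer to 1 as less error, the decade-by-decade comparison can be reproduced with a short Python sketch. The ratios are copied straight from the table; the names `RATIOS` and `closer_formula` are my own.

```python
# (decade, H/2 ratio, (H+W)*TB/PA ratio), copied from the table above.
RATIOS = [
    ("1920-1929", 1.021, 0.953),
    ("1930-1939", 0.988, 0.952),
    ("1940-1949", 1.034, 0.955),
    ("1950-1959", 0.994, 0.981),
    ("1960-1969", 1.041, 0.973),
    ("1970-1979", 1.046, 0.983),
    ("1980-1989", 1.026, 0.986),
    ("1990-1999", 0.969, 0.992),
    ("2000-2009", 0.957, 1.013),
    ("1920-2017", 1.004, 0.985),
]


def closer_formula(kiss_ratio, rc_ratio):
    """Return which ratio is closer to 1.0, i.e. which formula has less error."""
    return "H/2" if abs(kiss_ratio - 1) < abs(rc_ratio - 1) else "RC"


closer = {decade: closer_formula(k, r) for decade, k, r in RATIOS}

for decade, label in closer.items():
    print(decade, label)
```

With these numbers, H/2 comes out closer for the 1920s through the 1950s and for the 1920-2017 total, while the TB formula comes out closer for the 1960s through the 2000s.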

Below are the overall averages for BA, OBP, and SLG for each decade in our study. Not sure how this is relevant, but it is added to provide some perspective. In theory another column could have been added …. but I didn’t want to confuse things. All three stats in the table below are useful on their own. Mashing them up into some other number obfuscates their benefit as a game stat.

Decade BA OBP SLG
1920-1929 0.285 0.336 0.397
1930-1939 0.279 0.336 0.399
1940-1949 0.260 0.327 0.368
1950-1959 0.259 0.327 0.391
1960-1969 0.249 0.311 0.374
1970-1979 0.256 0.320 0.377
1980-1989 0.259 0.320 0.388
1990-1999 0.265 0.331 0.410
2000-2009 0.265 0.332 0.424
2010-2017 0.255 0.318 0.405
1920-2017 0.262 0.325 0.395

The next table shows some interesting ratios from each decade. TB/H converges to almost 1.5 and Hits/Walks converges to almost 5/2. There are around 1/8, or 12.5%, more recorded plate appearances (PA) than at bats (AB). We covered these stats earlier. Sac flies, bunts, and all the other little stuff are negligible variables that should be eliminated. Walks are the reason for that 1/8 difference. Since this model uses the official PA stat, IBBs are not included. Those are negligible as well (i.e., they don’t matter in the big picture).

Decade PA/AB H/W TB/H
1920-1929 1.129 3.040 1.391
1930-1939 1.116 2.873 1.432
1940-1949 1.126 2.417 1.415
1950-1959 1.130 2.352 1.509
1960-1969 1.120 2.510 1.503
1970-1979 1.125 2.488 1.471
1980-1989 1.120 2.593 1.498
1990-1999 1.128 2.443 1.548
2000-2009 1.126 2.458 1.597
2010-2017 1.115 2.555 1.590
1920-2017 1.123 2.541 1.507
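As a cross-check between the last two tables: since SLG = TB/AB and BA = H/AB, SLG/BA should reproduce TB/H up to rounding. Here is a small Python sketch using only numbers copied from the tables above; the variable names are my own.

```python
# (decade, BA, SLG, TB/H), copied from the two tables above.
ROWS = [
    ("1920-1929", 0.285, 0.397, 1.391),
    ("1930-1939", 0.279, 0.399, 1.432),
    ("1940-1949", 0.260, 0.368, 1.415),
    ("1950-1959", 0.259, 0.391, 1.509),
    ("1960-1969", 0.249, 0.374, 1.503),
    ("1970-1979", 0.256, 0.377, 1.471),
    ("1980-1989", 0.259, 0.388, 1.498),
    ("1990-1999", 0.265, 0.410, 1.548),
    ("2000-2009", 0.265, 0.424, 1.597),
    ("2010-2017", 0.255, 0.405, 1.590),
    ("1920-2017", 0.262, 0.395, 1.507),
]

# Absolute gap between SLG/BA and the tabulated TB/H for each row.
gaps = {decade: abs(slg / ba - tb_per_h) for decade, ba, slg, tb_per_h in ROWS}

for decade, ba, slg, tb_per_h in ROWS:
    print(decade, round(slg / ba, 3), tb_per_h)

max_gap = max(gaps.values())
```

Every row agrees to within about 0.003, which is roughly what rounding BA and SLG to three decimals would allow.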

The next part of this series will break all this down so that error can be measured from season to season, team to team, and player to player. Spoiler alert: the official RC formula with TB has far less error than Hits/2 on a season to season and team to team basis. The two formulae are equal on a player to player basis, each having around +/- 20% error compared to the actual runs scored.

More on this data and methodology to come.  Until then ….