We are about a week away from MLB having played around 1/6 of a season (27 games), which is when there is enough data to use the player rankings for handicapping purposes. Let’s continue with our journey to discover OPS. In Part 2 we’ll look at the building blocks that make up game stats like OPS.
What is an AB and PA?
According to this data model, a Plate Appearance (PA) is defined by every completed event when a batter faces a pitcher that results in a record in an event file. You can download historical baseball event data here.
Apparently, according to the Wikipedia definition, a Plate Appearance is defined by this formula:
PA = AB + BB + HBP + SH + SF + Catcher Interference
We haven’t covered the above building blocks yet, but it is interesting that something so fundamental to baseball statistics is defined by a formula, and that prompted me to run through the historical league totals to check it. Intentional Walks (IBB) are missing from the formula above. Intentional Walks are so unworthy to the Sabermetric crowd that they feel those don’t even qualify as a Plate Appearance.
Edit for Clarification: It wasn’t the “Sabermetric” crowd that eliminated IBB from PA; MLB did so in the official definition of PA. PA is used throughout this model and we get data from official sources, so we are using the PA without IBB. Including it or not including it doesn’t affect the calculation of WAA. Our historical daily dataset, which gleans data from event files, includes IBB with PA. That would only affect historical daily OBP calculations, which we don’t care about, and the difference is negligible.
An At Bat (AB) appears in the Plate Appearance formula above. Solving that formula for AB gives:
AB = PA – BB – HBP – SH – SF – Catcher Interference
The AB stat was derived at the beginning of baseball to calculate Batting Average (BA).
BA = H / AB
Batting Average is a fraction from 0 to 1 that shows how often a player gets a hit (H). Walks were considered of lesser value than a hit, yet they couldn’t penalize a player for getting a walk. Thus, they deducted walks from Plate Appearances and created the At Bat (AB) stat. Soon thereafter the Sacrifice Hit (SH, aka Bunt) lobby got their deduction, then the Sacrifice Fly (SF) people had to get theirs, and of course so did Catcher Interference, which happened 41 times out of 184580 MLB Plate Appearances in 2016.
The At Bat stat is one of the first occurrences of human bias in baseball statistics. It has been around since forever, and the bias seems reasonable for the purpose this measure is supposed to convey as a game stat. Unfortunately it has been used to rank players and is one of the three crowns in a triple crown for players (RBI, HR, BA). It is used as a category in many Fantasy baseball leagues including Draft Kings, a site that allows people to gamble real money on player game stats.
BTW: From 1950 – 2016, Sacrifice Flies (SF) occur in 0.7% of Plate Appearances and Bunts (SH) occur in 0.9% of Plate Appearances. How they can differentiate between a Sacrifice Fly and someone who missed getting a Home Run is … they don’t.
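The PA, AB and BA relationships above can be sketched in Python. This is only an illustration: the `BattingLine` class, the `at_bats_from_pa` helper and the stat line itself are made up for the example and are not part of the model or any official source.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """Counting stats for one batter (hypothetical container for this sketch)."""
    h: int       # hits
    ab: int      # at bats
    bb: int      # walks
    hbp: int     # hit by pitch
    sh: int      # sacrifice hits (bunts)
    sf: int      # sacrifice flies
    ci: int = 0  # catcher interference

    def plate_appearances(self) -> int:
        # PA = AB + BB + HBP + SH + SF + Catcher Interference
        return self.ab + self.bb + self.hbp + self.sh + self.sf + self.ci

    def batting_average(self) -> float:
        # BA = H / AB; this sketch returns 0.0 when there are no at bats
        return self.h / self.ab if self.ab else 0.0


def at_bats_from_pa(pa, bb, hbp, sh, sf, ci=0):
    # AB = PA - BB - HBP - SH - SF - Catcher Interference
    return pa - bb - hbp - sh - sf - ci


# Made-up stat line, used only to show that the two formulas are inverses.
line = BattingLine(h=150, ab=500, bb=60, hbp=5, sh=2, sf=4)
pa = line.plate_appearances()
ab_again = at_bats_from_pa(pa, line.bb, line.hbp, line.sh, line.sf, line.ci)
print(pa, ab_again, round(line.batting_average(), 3))
```

Walks, HBP, bunts and sac flies all raise PA without touching AB, which is exactly why they cannot lower a Batting Average.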
The following are other types of Plate Appearances and how often they occur in our 1950-2016 dataset:
Intentional Walks (IBB) = 0.72%
Hit By Pitch (HBP) = 0.7%
Regular Walks (BB) = 8.5%
Catcher Interference (CI) = 0.02%
Hits (H) = 23.2%
Outs = 66.9%
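As a quick sanity check, the shares listed above can be added up and turned into expected counts for a given number of Plate Appearances. This is a minimal sketch; the dictionary and the `expected_events` helper are names invented for this example, and the percentages are just the rounded figures quoted in this post.

```python
# Shares of plate appearances quoted in the list above (percent, 1950-2016).
shares_pct = {
    "IBB": 0.72,
    "HBP": 0.7,
    "BB": 8.5,
    "CI": 0.02,
    "H": 23.2,
    "Outs": 66.9,
}

# The rounded shares should add up to roughly 100%.
total_pct = sum(shares_pct.values())


def expected_events(pa, shares=shares_pct):
    """Expected count of each event type over `pa` plate appearances."""
    return {event: pa * pct / 100 for event, pct in shares.items()}


# Catcher interference in 2016, from the figures quoted earlier: 41 of 184580 PA.
ci_2016_pct = 41 / 184580 * 100

print(f"total = {total_pct:.2f}%")
print(f"CI in 2016 = {ci_2016_pct:.3f}%")
print(expected_events(600))
```

The total lands slightly above 100%, presumably because each share is rounded, and the 2016 Catcher Interference rate is in line with the 0.02% long-run figure.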
What is OBP?
On Base Percentage (OBP) is the O in OPS. Now we’re getting somewhere!
The purpose of On Base Percentage is to incorporate the lesser Walks into Batting Average to show how often a player reaches base. This provides managers with another game stat without violating the sanctity of the Batting Average used by fans for over a century to follow their favorite players. Since the book and movie “Moneyball,” OBP has gained popularity.
But what does it mean? Here is the formula according to the Wikipedia page:
OBP = (H + BB + HBP) / (AB + BB + HBP + SF)
Already some peculiarities. Both the numerator and denominator add back the Walks deducted from Batting Average above. They also add back Sacrifice Flies for some reason. Since Intentional Walks are left out of how they define Plate Appearance, they are also left out of AB and thus left out of this formula. Intentional Walks do not count towards OBP, yet they occur at the same rate as HBP.
An Intentional Walk is a walk. Barry Bonds leads all of baseball with 688 IBBs, followed by Pujols with less than half that at 308. Bonds lost 688 Plate Appearances from his career because of this human rule/bias in the OBP calculation. That’s an entire season! Based upon Bonds’ Home Run hitting ability he probably lost 50 home runs due to this. Not only does he lose out on Home Runs, RBIs, etc., he doesn’t even get the benefit of a higher OBP. Teams Intentionally Walked him because he was a feared hitter. He should be rewarded for this in some way and not treated as if those Plate Appearances didn’t matter or didn’t even exist. Can scorekeepers tell an intentional walk from a semi-intentional walk? No.
The Keep It Simple Stupid (KISS) approach is that you don’t include a variable without proof. Including IBB as a deduction from Plate Appearances introduces bias. Just because you may think something is not fair is neither proof nor a valid reason.
This model will include game stats like OBP and others for reference purposes and we will use official sources. Calculating game stats is outside the scope of what we do here. We will use the following formula when doing algebra later on in this series.
OBP = (H + W) / PA where W = HBP + BB + IBB <– simplified OBP
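Here is a minimal Python sketch of the simplified formula, assuming that `bb` counts regular walks only and that `pa` counts every Plate Appearance, Intentional Walks included. The function name and the stat line are illustrative and do not come from any official source.

```python
def simplified_obp(h, bb, ibb, hbp, pa):
    """Simplified OBP = (H + W) / PA, where W = HBP + BB + IBB.

    Assumes `bb` holds regular walks only and `pa` includes every
    plate appearance, intentional walks included.
    """
    if pa <= 0:
        raise ValueError("pa must be positive")
    w = hbp + bb + ibb  # every way of reaching base without a hit
    return (h + w) / pa


# Made-up stat line: 170 hits plus 70 walks of all kinds in 600 PA.
example = simplified_obp(h=170, bb=60, ibb=5, hbp=5, pa=600)
print(round(example, 3))
```

Everything a batter does ends up in the denominator, so there is nothing to argue about when deciding which events count.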
To illustrate how the complexity in the official OBP makes little difference, let’s use an example. Suppose a player has an OBP = 0.400 (pretty good) with 600 Plate Appearances as defined by our simplified equation above.
OBP = 0.400 = 240/600 <– simplified OBP , IBB in numerator and denominator
The 600 Plate Appearances above include Intentional Walks as well as Bunts (SH), Sac Flies (SF) and everything else. IBB make up about 0.7% of PA and SH about 0.9% of PA (see above), so the official formula drops 4.2 IBB from both the numerator and the denominator and a further 5.4 SH from the denominator only.
OBP = 0.399 = (240 – 4.2)/(600 – 4.2 – 5.4) <– official OBP, no SH/IBB in denominator, no IBB in numerator
The difference is negligible. Guys with a lot of IBBs like Bonds and Pujols will see their OBPs rise much more under the simplified formula because they get Intentionally Walked far more than the 0.7% league average. It makes no sense to add complexity to something, especially when your reasoning is dubious and the occurrence of that variable is so marginal. Keep It Simple Stupid is usually the right answer!
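The comparison above can be repeated for any IBB and SH rate with a few lines of Python. This is a rough sketch under the same assumptions as the worked example: the "official" figure is approximated by removing the expected IBB from numerator and denominator and the expected SH from the denominator. The function names are invented here, the league rates are the rounded 1950-2016 shares quoted earlier, and the 5% IBB rate for a heavily walked hitter is a hypothetical number chosen only to show the direction of the effect.

```python
def simplified_obp(on_base, pa):
    # Simplified OBP: every time on base over every plate appearance.
    return on_base / pa


def official_obp_estimate(on_base, pa, ibb_share, sh_share):
    """Approximate the official OBP from the simplified inputs.

    Expected IBB are removed from numerator and denominator, and
    expected SH are removed from the denominator only.
    """
    ibb = pa * ibb_share
    sh = pa * sh_share
    return (on_base - ibb) / (pa - ibb - sh)


on_base, pa = 240, 600  # the 0.400 hitter from the example

# League-average rates: IBB about 0.7% of PA, SH about 0.9% of PA.
gap_league = simplified_obp(on_base, pa) - official_obp_estimate(
    on_base, pa, ibb_share=0.007, sh_share=0.009
)

# Hypothetical feared hitter who is intentionally walked in 5% of PA.
gap_feared = simplified_obp(on_base, pa) - official_obp_estimate(
    on_base, pa, ibb_share=0.05, sh_share=0.009
)

print(f"league-average gap: {gap_league:.4f}")
print(f"feared-hitter gap:  {gap_feared:.4f}")
```

At league-average rates the two versions differ by less than two points of OBP, while the gap grows quickly for a hitter who draws far more Intentional Walks than average.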
In our next part we’ll explore Total Bases and how that makes up not only the Slugging Ratio, the S in OPS, but also an important factor in estimating runs created from hits. Until then ….
Note: I probably screwed up the math above so corrections will follow this Note. The mistakes won’t be enough to change our premise.