The DH argument Part 2

It appears the DH won’t come to the NL until 2022, which will give everyone time to get used to the idea.  In Part 1 we looked at how NL and AL pitchers hit as a group.  tl;dr: not well.  Below is a truncated table showing their Win% for the last two years.

Win % for AL and NL Pitchers

YEAR   AL Win%   NL Win%
2017   0.224     0.224
2018   0.220     0.215

Numbers mean nothing unless put into context.  If you add up the wins of all 30 MLB teams and divide by their total wins plus losses, it comes to exactly 0.500.  For every win someone must lose.  This data model operates on the same principle.  A completely average player would have a Win% of 0.500.  It goes above 0.500 for above-average players and below 0.500 for below-average players.  Not too complicated!

The hypothesis that NL pitchers would hit better than AL pitchers was proven false by that table.  Pitchers in both leagues are pretty equal in their horrible hitting.  To demonstrate how bad a sub-0.250 Win% is for hitting, let’s look at a player from last season, Chris Davis.  Chris Davis was once a great player.  This model has him ranked #3 in 2013, Chris’ career season.  In 2018 things didn’t go so well for him.  He’s ranked #1 in the bottom 200 for WAR.

Chris Davis 2018 WAR

Year  Rank   WAR   oWAR  dWAR  PA   Name_TeamID      Pos
2018  -001-  -2.8  -2.5  -1.0  522  Chris_Davis_BAL  1B-DH

OK, WAR thinks he was bad.  WAR rarely goes negative and this is very, very negative.  Let’s see what this data model thinks.  Below is a long-form record output showing batting average, OBP, plate appearances, RBIs, and runs.  We’ll need PA to calculate the number of games, which in turn is used to calculate Win%.

Chris Davis 2018 WAA

Rank   WAA    BA     OBP    PA   RBI  R   Name_TeamID      Pos
-023-  -3.44  0.168  0.243  522  49   40  Chris_Davis_BAL  1B-DH

Looks like both models are in agreement.  This model does not show Win% for MLB players because it’s a rate and can be deceptive.  For groups of players like lineups and relief staffs, it provides context to the sum of WAAs.  The sum of WAAs, however, is what gets used for tiering in simulation, not the rate.

That said let’s calculate Chris’ Win% according to the formula.

Win% = 0.5 * ( WAA ) / games + 0.5
where games = PA / 38.4

The number 38.4 PA per game is a baseball constant used for batters much like 9 innings exactly per game is used for pitchers.  Plugging in Chris’ numbers we get:

Chris Davis Win% for 2018 = 0.5 * -3.44 / 13.6 games + 0.5 = 0.373
where games = 522 / 38.4 = 13.6
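
The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the formula as written in this post, not the model's actual code; the function name win_pct and the constant name PA_PER_GAME are made up for the example.

```python
# Plate appearances per game, the constant quoted in the post.
PA_PER_GAME = 38.4


def win_pct(waa: float, pa: int, pa_per_game: float = PA_PER_GAME) -> float:
    """Convert a batter's WAA and plate appearances into a Win% rate.

    Implements: Win% = 0.5 * WAA / games + 0.5, where games = PA / 38.4.
    """
    games = pa / pa_per_game
    return 0.5 * waa / games + 0.5


# Chris Davis, 2018: WAA = -3.44 over 522 PA.
davis = win_pct(waa=-3.44, pa=522)
print(f"{davis:.3f}")  # 0.373
```

A player with a WAA of exactly zero comes out at 0.500 no matter how many plate appearances he has, which matches the "completely average player" definition earlier.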

A typical worst team in MLB will lose around 100 or more games.  That Win% is:

Typical bad MLB team Win% = 62 wins / 162 games = 0.383

As you can see from the two calculated Win% values, Chris Davis played about as well as a typical worst team in baseball.  Actually, in 2018 BAL only won 47 games, but that’s not typical.  The above demonstrates how dreadfully awful AL and NL pitchers’ hitting numbers are.

Clarification 3/5/2019:  Comparing Chris’ 0.373 Win% with pitchers’ ~0.220 Win% shows pitchers are far worse than one of the worst non-pitcher hitters in baseball.  A 0.373 Win% is around 60 wins in a 162-game season.  A 0.220 Win% is around 36.
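
For anyone who wants to check the conversion between Win% and wins in a season, here is a small sketch. The helper name wins_per_season is invented for this example; it just multiplies the rate by 162 and rounds.

```python
SEASON_GAMES = 162  # length of an MLB regular season


def wins_per_season(win_pct: float, games: int = SEASON_GAMES) -> int:
    """Scale a Win% rate to a whole number of wins over a full season."""
    return round(win_pct * games)


print(wins_per_season(0.373))  # 60  (Chris Davis, 2018)
print(wins_per_season(0.220))  # 36  (pitchers hitting)
print(round(62 / 162, 3))      # 0.383  (a 100-loss team)
```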

Since the dreadful pitchers hitting in NL parks must be compared to the dreadful hitters batting in the 9th spot of the lineup in AL parks, how does Win% compare between the two leagues?  The table below shows Win% for all 9 lineup positions using a dataset of all games from 2015 to 2018.  The AL Win% values are for games in AL parks, where the DH is used; the NL values are for games in NL parks, where it is not.

WAA can be calculated for any set of events.  The Win% average for both NL Win% and AL Win% comes to exactly 0.500, adhering to the principle that for every win there is a loss.  Since WAA represents W-L, that must always be true.

Pos   NL Win%   AL Win%
1     0.445     0.453
2     0.500     0.506
3     0.609     0.612
4     0.638     0.645
5     0.595     0.539
6     0.502     0.496
7     0.459     0.446
8     0.412     0.422
9     0.309     0.353

The number of runs scored in AL parks is 2.2% higher than in NL parks with no DH.  Lineup position 9 shows how much dreadful-hitting pitchers drag down the average.  At 0.309 it is above the ~0.220 Win% shown for AL and NL pitchers above.  That’s because pinch hitters take over for 1/3 of a game, possibly more since the 9th spot in the lineup gets the fewest PAs.

Lineup position 5 is the only other outlier between DH and no DH.  I do not know why, but without it the run-scoring gap between DH parks and no-DH parks would be much larger than 2.2%.  These numbers could move around slightly if more years are added to this dataset.

Clarification 3/5/2019:  The 5th position in non-DH parks (NL) has a far higher Win% than the 5th position in DH parks.  I do not know why, nor can I come up with a theory as to why this would be.  If it’s a bug in the script that makes this table, an update will be added here.
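
To see which lineup spots differ most between the two columns, the table can be run through a quick sketch like the one below. The values are copied from the table; the variable names are made up here, and this is not the script that produced the table.

```python
# Win% by lineup position (1 through 9), copied from the table.
nl = [0.445, 0.500, 0.609, 0.638, 0.595, 0.502, 0.459, 0.412, 0.309]
al = [0.453, 0.506, 0.612, 0.645, 0.539, 0.496, 0.446, 0.422, 0.353]

# AL minus NL for each lineup position; positive means the DH side is better.
gaps = {pos: round(a - n, 3) for pos, (n, a) in enumerate(zip(nl, al), start=1)}

# The two positions with the largest absolute gap.
outliers = sorted(gaps, key=lambda pos: abs(gaps[pos]), reverse=True)[:2]

for pos, gap in gaps.items():
    print(f"{pos}  {gap:+.3f}")
print(outliers)  # [5, 9]
```

Position 9 comes out at +0.044 in favor of AL parks and position 5 at -0.056 in favor of NL parks; every other spot is within about 0.013.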

What does the above tell us?  To use Joe Maddon language: pitchers suck at hitting.  Maybe having the DH in the NL is a good thing.  Not having pitchers hit often will greatly simplify this data model.  Right now every NL starting pitcher has both a BAT and a PITCH record; the BAT record is filtered out, but it still gets counted so that all the books balance in this model.

That is probably all for DH.  I might have changed my mind on this matter.  Perhaps more new guys when I get around to it.  Until then ….