Who are these new guys? Part 1

The first two games of spring training for 2019 are in the history books. Let’s go through a box score and take a look at the new guys on CHN using our newly updated minor league DB. Subsequent parts will highlight new guys that will rotate in throughout the month.

This post will be a lot of search tables into minor leagues.  Only AAA, AA, and A+ are catalogued.  First, here’s a screenshot taken from Reddit in r/ChiCubs.  What game this is doesn’t really matter.  Explanation as to how to read these tables below the fold.

chn02252018

The order of new guy tables is determined by their order of appearance in the above box score starting with BATters and then PITCHers.

Wynton Bernard

Year Rank WAA Name_TeamID Pos WinPct League Age
2018 XXXXX -0.84 Wynton_Bernard_CHN BAT 0.397 aaa 27
2018 XXXXX -0.55 Wynton_Bernard_CHN BAT 0.411 aa 27
2017 XXXXX -0.88 Wynton_Bernard_SFN BAT 0.420 aaa 26
2016 XXXXX -0.78 Wynton_Bernard_DET BAT 0.408 aaa 25
2016 XXXXX 0.46 Wynton_Bernard_DET BAT 0.534 aa 25
2015 -177- -1.43 Wynton_Bernard_DET BAT 0.453 aa 24
2013 XXXXX 0.06 Wynton_Bernard_SDN BAT NA aplus 22

The above table format for a minor league career search is a work in progress. This is a Keep It Simple Stupid (KISS) data model, so only rank and WAA are important. WAA is the value factor used to rank players; it represents W-L and is the foundation of this data model. A WAA=0 is completely average. Negative means below average, positive above average. As of now we don’t pull in position data for minor leagues other than PITCH or BAT.

Like MLB, the dataset used to calculate WAA is all players for all 30 franchises. For example, AAA has two leagues, International (int) and Pacific Coast League (pcl), which are combined into one dataset. Pitchers and batters are ranked together, and the sum of WAA over all of them adds to exactly 0, just as wins minus losses summed over all real team records is exactly zero. For every win counted a team must lose. Not too complicated!

In the above table a rank of XXXXX means neither top nor bottom 200 in that league for that year. Since players move from league to league in the minors, playing time is cut short, so WAA cannot accumulate as much as it does for players playing a full season in one league. Thus, rank is not as important as it is in MLB. The Win% column puts the WAA calculated weighting factor in perspective. See how it’s calculated here. Win% is not shown for MLB records as it would be deceptive.
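For anyone who wants to work with these search tables programmatically, here is a minimal Python sketch of a row parser. The class and function names are hypothetical, and reading `+NNN+` as a top 200 rank and `-NNN-` as a bottom 200 rank is an assumption inferred from the tables in this post, not a documented format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CareerRow:
    year: int
    rank: Optional[int]       # None when the rank column is XXXXX
    top_200: Optional[bool]   # True for +NNN+, False for -NNN-, None if unranked
    waa: float
    name_team: str
    pos: str
    win_pct: Optional[float]  # None for NA (minors) or XXX (MLB rows)
    league: str
    age: Optional[int]        # MLB rows in these tables carry no age


def parse_row(line: str) -> CareerRow:
    """Parse one whitespace-separated row of a career search table."""
    parts = line.split()
    year, rank_code, waa, name_team, pos, win_pct, league = parts[:7]
    age = int(parts[7]) if len(parts) > 7 else None

    if rank_code == "XXXXX":
        rank, top_200 = None, None
    else:
        rank = int(rank_code.strip("+-"))
        top_200 = rank_code.startswith("+")

    return CareerRow(
        year=int(year),
        rank=rank,
        top_200=top_200,
        waa=float(waa),
        name_team=name_team,
        pos=pos,
        win_pct=None if win_pct in ("NA", "XXX") else float(win_pct),
        league=league,
        age=age,
    )


row = parse_row("2015 -177- -1.43 Wynton_Bernard_DET BAT 0.453 aa 24")
print(row.rank, row.top_200, row.waa)  # prints: 177 False -1.43
```

MLB rows have one fewer column, so the parser treats a missing eighth field as an unknown age rather than failing.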

The last column is age, which is an important factor in player development. The above shows Bernard hasn’t been a very good hitter in the minors and at age 27 probably won’t make it. Decent MLB players usually dominated when they were in the minor leagues, though there are exceptions. The above table shows Bernard is a below average minor league player except for 2016 in AA for the DET franchise (Detroit Tigers) and a roughly average 2013 in A+.

Trent Giambrone

Year Rank WAA Name_TeamID Pos WinPct League Age
2018 XXXXX -0.02 Trent_Giambrone_CHN BAT 0.499 aa 24
2017 XXXXX -0.82 Trent_Giambrone_CHN BAT 0.468 aplus 23

Trent has been on our radar for two years now and bats almost completely average.

Note: This model only measures past results.   There could be fundamentals that scouts and player development coaches see in these players that could make them decent MLB players one day.

Jim Adduci

Year Rank WAA Name_TeamID Pos WinPct League Age
2018 XXXXX -0.34 Jim_Adduci_DET 1B XXX mlb
2018 XXXXX 1.26 Jim_Adduci_DET BAT 0.582 aaa 33
2017 XXXXX 0.19 Jim_Adduci_DET RF XXX mlb
2017 XXXXX 0.10 Jim_Adduci_DET BAT 0.508 aaa 32
2014 XXXXX -0.34 Jim_Adduci_TEX LF XXX mlb
2014 XXXXX -0.10 Jim_Adduci_TEX BAT NA aaa 29
2014 XXXXX 0.40 Jim_Adduci_TEX BAT NA aa 29
2013 XXXXX -0.55 Jim_Adduci_TEX X XXX mlb
2013 XXXXX 1.45 Jim_Adduci_TEX BAT 0.551 aaa 28
2012 XXXXX 0.19 Jim_Adduci_CHN BAT 0.521 aaa 27
2012 XXXXX 0.21 Jim_Adduci_CHN BAT 0.514 aa 27
2011 XXXXX 0.04 Jim_Adduci_CHN BAT 0.503 aa 26
2010 XXXXX -0.63 Jim_Adduci_CHN BAT 0.470 aaa 25
2009 +101+ -0.88 Jim_Adduci_CHN BAT 0.468 aa 24
2008 XXXXX -0.50 Jim_Adduci_CHN BAT 0.482 aplus 23

At age 34 now Jim can be considered a professional minor league-er and based upon his 2018 numbers he could be a good hitting first baseman helping the Iowa Cubs win more than they lose this season.

Charcer Burks

Year Rank WAA Name_TeamID Pos WinPct League Age
2018 -116- -1.83 Charcer_Burks_CHN BAT 0.432 aa 23
2017 -186- -1.41 Charcer_Burks_CHN BAT 0.449 aa 22
2016 XXXXX -0.44 Charcer_Burks_CHN BAT 0.484 aplus 21

Charcer is still very young.

Mark Zagunis

Year Rank WAA Name_TeamID Pos WinPct League Age
2018 XXXXX -0.04 Mark_Zagunis_CHN BAT XXX mlb
2018 XXXXX -0.73 Mark_Zagunis_CHN BAT 0.467 aaa 25
2017 XXXXX -0.34 Mark_Zagunis_CHN BAT XXX mlb
2017 XXXXX 1.49 Mark_Zagunis_CHN BAT 0.570 aaa 24
2016 XXXXX 0.92 Mark_Zagunis_CHN BAT 0.584 aaa 23
2016 XXXXX 0.84 Mark_Zagunis_CHN BAT 0.576 aa 23
2015 +105+ 2.12 Mark_Zagunis_CHN BAT 0.580 aplus 22

Ryan Court

Year Rank WAA Name_TeamID Pos WinPct League Age
2018 XXXXX 1.13 Ryan_Court_CHN BAT 0.555 aaa 30
2017 XXXXX -1.57 Ryan_Court_BOS BAT 0.432 aaa 29
2016 XXXXX -0.06 Ryan_Court_BOS BAT 0.483 aaa 28
2016 XXXXX 0.63 Ryan_Court_BOS BAT 0.535 aa 28
2014 XXXXX -0.19 Ryan_Court_ARI BAT 0.477 aa 26
2014 XXXXX 0.04 Ryan_Court_ARI BAT 0.504 aplus 26
2013 XXXXX -1.01 Ryan_Court_ARI BAT 0.410 aa 25
2013 +164+ 1.95 Ryan_Court_ARI BAT 0.719 aplus 25

Ryan is getting kind of old to be in the minors but he had a good season last year.

Ian Rice

Year Rank WAA Name_TeamID Pos WinPct League Age
2018 XXXXX -0.44 Ian_Rice_CHN BAT 0.475 aa 24
2017 XXXXX 0.36 Ian_Rice_CHN BAT 0.518 aa 23
2016 XXXXX 0.84 Ian_Rice_CHN BAT 0.564 aplus 22

Johnny Field

Year Rank WAA Name_TeamID Pos WinPct League Age
2018 XXXXX -0.44 Johnny_Field_TOT LF-RF-CF XXX mlb
2018 XXXXX -0.69 Johnny_Field_TBA RF-LF-CF XXX mlb
2018 XXXXX 0.27 Johnny_Field_MIN LF XXX mlb
2018 XXXXX 0.65 Johnny_Field_CLE BAT NA aaa 26
2018 XXXXX -0.32 Johnny_Field_MIN BAT NA aaa 26
2018 XXXXX 0.02 Johnny_Field_TBA BAT NA aaa 26
2017 XXXXX 0.59 Johnny_Field_TBA BAT 0.524 aaa 25
2016 XXXXX -0.32 Johnny_Field_TBA BAT 0.478 aaa 24
2016 XXXXX 1.07 Johnny_Field_TBA BAT 0.600 aa 24
2015 +045+ 3.19 Johnny_Field_TBA BAT 0.625 aa 23
2014 XXXXX 0.94 Johnny_Field_TBA BAT 0.607 aplus 22

Jacob Hannemann

Year Rank WAA Name_TeamID Pos WinPct League Age
2018 -045- -3.00 Jacob_Hannemann_CHN BAT 0.343 aaa 27
2017 XXXXX -0.08 Jacob_Hannemann_SEA BAT XXX mlb
2017 XXXXX -1.34 Jacob_Hannemann_CHN BAT 0.420 aaa 26
2017 XXXXX -0.92 Jacob_Hannemann_CHN BAT 0.375 aa 26
2016 XXXXX -0.44 Jacob_Hannemann_CHN BAT 0.474 aa 25
2015 XXXXX -0.29 Jacob_Hannemann_CHN BAT 0.489 aa 24
2015 XXXXX 0.13 Jacob_Hannemann_CHN BAT 0.537 aplus 24
2014 XXXXX -0.99 Jacob_Hannemann_CHN BAT 0.380 aplus 23

Phillip Evans

Year Rank WAA Name_TeamID Pos WinPct League Age
2018 XXXXX -0.34 Phillip_Evans_NYN BAT XXX mlb
2018 +158+ 1.76 Phillip_Evans_NYN BAT 0.638 aaa 25
2017 XXXXX -0.42 Phillip_Evans_NYN BAT XXX mlb
2017 XXXXX -0.50 Phillip_Evans_NYN BAT 0.481 aaa 24
2016 XXXXX 0.71 Phillip_Evans_NYN BAT 0.535 aa 23
2016 XXXXX -0.27 Phillip_Evans_NYN BAT NA aplus 23
2015 XXXXX -0.88 Phillip_Evans_NYN BAT 0.440 aplus 22
2014 -036- -3.15 Phillip_Evans_NYN BAT 0.362 aplus 21

Phillip played well last year for the Mets affiliate in AAA.  It looks like he came up twice to MLB in September call up season and hit below average.

Duncan Robinson

Year Rank WAA Name_TeamID Pos WinPct League Age
2018 +115+ 2.10 Duncan_Robinson_CHN PITCH 0.572 aa 24
2017 +165+ 1.66 Duncan_Robinson_CHN PITCH 0.688 aplus 23

Now we get to PITCHers.  Those are some pretty decent numbers.  A 0.688 Win% shows he dominated A+ in 2017.  He might have spent part of that year in A.  This model does not track A league.

Ryan Kellogg

Year Rank WAA Name_TeamID Pos WinPct League Age
2018 XXXXX 0.71 Ryan_Kellogg_CHN PITCH 0.541 aplus 24
2017 -032- -3.13 Ryan_Kellogg_CHN PITCH 0.350 aplus 23
2015 XXXXX -0.15 Micah_Kellogg_DET PITCH NA aplus 25
2014 XXXXX 0.34 Micah_Kellogg_DET PITCH NA aplus 24

Slightly above average in 2018 in A+ league.

James Norwood

Year Rank WAA Name_TeamID Pos WinPct League Age
2018 XXXXX -0.00 James_Norwood_CHN PITCH XXX mlb
2018 XXXXX 1.24 James_Norwood_CHN PITCH 0.842 aaa 24
2018 XXXXX 1.16 James_Norwood_CHN PITCH 0.660 aa 24
2017 XXXXX -0.59 James_Norwood_CHN PITCH 0.358 aa 23
2017 XXXXX 1.39 James_Norwood_CHN PITCH 0.660 aplus 23
2016 XXXXX 0.44 James_Norwood_CHN PITCH NA aplus 22

He pitched 11 innings for the MLB Cubs last season and had very good numbers in AA and AAA leagues.

Dakota Mekkes

Year Rank WAA Name_TeamID Pos WinPct League Age
2018 +163+ 1.74 Dakota_Mekkes_CHN PITCH 0.787 aaa 23
2018 +168+ 1.66 Dakota_Mekkes_CHN PITCH 0.835 aa 23
2017 +078+ 2.29 Dakota_Mekkes_CHN PITCH 0.776 aplus 22

Those are dominating numbers in all three leagues.  Dominating minor leagues does not guarantee dominating MLB but it does improve the odds.  Since he played AAA last year Cubs could bring him up mid season to help win a World Series.  You can never have enough pitchers.

Allen Webster

Year Rank WAA Name_TeamID Pos WinPct League Age
2018 XXXXX -0.13 Allen_Webster_CHN PITCH XXX mlb
2018 XXXXX 0.44 Allen_Webster_CHN PITCH NA aa 28
2017 -053- -3.11 Allen_Webster_TEX PITCH 0.260 aaa 27
2015 XXXXX -1.39 Allen_Webster_ARI PITCH XXX mlb
2015 -001- -7.39 Allen_Webster_ARI PITCH 0.068 aaa 25
2014 -112- -1.83 Allen_Webster_BOS PITCH XXX mlb
2014 +044+ 3.49 Allen_Webster_BOS PITCH 0.629 aaa 24
2013 -040- -3.36 Allen_Webster_BOS PITCH XXX mlb
2013 XXXXX 1.43 Allen_Webster_BOS PITCH 0.561 aaa 23
2012 XXXXX -0.86 Allen_Webster_BOS PITCH NA aa 22
2012 XXXXX 1.01 Allen_Webster_LAN PITCH 0.537 aa 22
2011 -130- -1.87 Allen_Webster_LAN PITCH 0.408 aa 21
2011 +142+ 2.23 Allen_Webster_LAN PITCH 0.686 aplus 21

Allen Webster’s future is probably professional minor league-er.

This model only shows an accurate representation of the past. The past can show proven capability but cannot be used to predict the future because no one can predict the future, even if you’re time traveling here from the future.

Subsequent parts to this series will show any new guys I happen upon in box scores throughout spring training not mentioned above.   Those parts won’t be as long as this initial post.  Not much more to say.  There will be no spring training team statuses or any player rankings until perhaps the end of spring training.  Then in the regular season we have to wait until 1/6 of the season has been played to show player rankings.

Still working on 2014 – 2018 season simulations where we go mano a mano between this data model, fivethirtyeight’s ELO model, and Vegas betting lines.  Who is most accurate?  Since we’re from the future we know the outcomes for all these estimated probabilities.  More on that when I get around to it.  Until then ….

Foul Balls Are The Pace-Of-Play Problem Nobody’s Talking About

So why are there more foul balls?

Source: Foul Balls Are The Pace-Of-Play Problem Nobody’s Talking About | FiveThirtyEight

This is one of those fun with numbers articles throwing around a bunch of rate gains in an attempt to make a point using numbers to support a narrative.  This data model has a historical dataset of game events from around 1960 to present from retrosheet.org.  That dataset has pitch by pitch data from 1988 to present.

The premise of the above article is that there are more foul balls now than 20 years ago.

tl;dr: Article is correct.  There are more foul balls now.

There can be many reasons why that is. Occam’s Razor dictates the simplest answer is probably the correct answer. The simplest reason would be that there is less foul territory in which fielders can catch foul balls. The article mentions this as one of several possible reasons. It mentions foul territory surface decreased around 20% and our data shows foul balls have increased 20%. Reduced foul territory seems like a pretty solid reason.

The next question, which is the premise of this article: Are foul balls a cause of increased game times in baseball? That is unclear. Let’s look at the output of this data model. We use tables here instead of graphs.

MLB Pitch Counts from 1998 to 2018

Year Total Foul InPlay Strike Ball
1998 650813 16.0 20.2 25.8 38.0
1999 694856 16.0 20.0 25.4 38.7
2000 714344 16.3 19.7 25.4 38.6
2001 695287 16.7 20.0 26.2 37.1
2002 696317 16.6 20.0 26.1 37.3
2003 697785 16.5 20.2 26.1 37.2
2004 705505 16.7 19.9 26.0 37.4
2005 691180 16.8 20.3 26.0 36.9
2006 702107 16.9 20.0 25.9 37.2
2007 706552 16.8 19.9 26.1 37.2
2008 708078 16.9 19.5 26.2 37.4
2009 710781 16.7 19.3 26.5 37.5
2010 704222 16.6 19.2 27.0 37.2
2011 701083 16.9 19.4 27.0 36.8
2012 698670 16.8 19.0 27.6 36.5
2013 703321 17.0 19.0 27.5 36.5
2014 697839 17.2 19.0 27.8 36.1
2015 695003 17.5 19.0 27.4 36.1
2016 708762 17.5 18.4 27.6 36.5
2017 714188 17.6 18.1 27.8 36.5
2018 713881 17.6 18.0 28.2 36.3

Another table with a lot of numbers so let’s digest the above. Total column shows total pitch count for the 30 x 162 / 2 = 2430 games in a season. Numbers in the next 4 columns are percentages of that total, which should add to 100%.

The length of a game is determined by pitch count, not by the type of pitch. Colored in tan are total pitch counts under 700K and in blue the last two years, which are the second and third highest totals in the table. It is deceptive, however, to call this a trend because 2014 and 2015 had under 700K pitch counts and 2000 had the highest pitch count in this entire table.

It could be a trend based upon the last two seasons but not based upon the last 20 just by eyeballing that table.

As a percentage of the total, foul balls have clearly increased over the past 20 years. The last 20 years have seen many new ball parks. Even Wrigley Field reduced foul territory. The above data makes sense. What can MLB do about it? Probably nothing. Teams want to place seats closer to the field in order to increase revenue, which pays the ever increasing player contracts.

The above also confirms the article’s observation of the reduction of In Play contacts which had its highest percentage of total 20 years ago and the lowest last season.  What the article didn’t mention however was that Strikes increased while Balls decreased meaning Total has been relatively constant.

Stats can be deceptive and manipulated to “prove” some narrative.  The time between pitches is probably very similar between a foul ball and a called strike, thus, the Total pitch count is all that matters.  If the last two years are indicators of what the next two years will bring then maybe there is a problem.

The problem, however, isn’t foul balls, it’s the increase in strikes. In 1998 there were 168K pitches as strikes. By 2018 that number rose to 201K, an almost 20% increase. IMHO this is due to strikeouts being overvalued in Sabermetric stats like WAR and FIP and even pitch count type ratios. A ground out or a fly out counts just as much as a strikeout, but not to Draft Kings or other kinds of fantasy teams. Pitchers have learned the value of their contracts depends on keeping Draft Kings stats high. So even though they might get a guy to ground out on one or two pitches, they go for the strikeout, which requires a minimum of three pitches. This keeps their K/9 and K/BB and whatever other nonsense ratios I’m missing high.
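The 168K and 201K figures can be recovered from the pitch count table by turning percentages back into counts. The sketch below does that for the first and last rows only. It assumes the four percentage columns are Foul, In Play, Strike and Ball in that order, which is the reading that matches the strike counts quoted above; the variable and function names are made up for illustration.

```python
# Rows copied from the pitch count table:
# year -> (total pitches, foul %, in play %, strike %, ball %)
PITCHES = {
    1998: (650813, 16.0, 20.2, 25.8, 38.0),
    2018: (713881, 17.6, 18.0, 28.2, 36.3),
}

NAMES = ("foul", "in_play", "strike", "ball")


def counts(year):
    """Convert the percentage columns for one season into pitch counts."""
    total, *pcts = PITCHES[year]
    return {name: total * pct / 100 for name, pct in zip(NAMES, pcts)}


early, late = counts(1998), counts(2018)
for name in NAMES:
    change = (late[name] / early[name] - 1) * 100
    print(f"{name:8s} {early[name]:9.0f} {late[name]:9.0f} {change:+6.1f}%")
```

Because the table percentages are rounded to one decimal, the counts are approximate, and a row can sum to 100.1% rather than exactly 100%.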

I’m tired now but perhaps in a follow up we’ll do a comparison of pitch counts between the various types of outs made.  That is all for now.  Until then ….

The Prediction Racket Part 1

The DH series requires historical event data to be compiled for 2018.  Since I have once again been lazy this off season I have been reluctant to revisit those scripts because 1) I may have forgotten how they work and 2) scripts always seem to break when sitting around for a year.

Historical scripts need to be rewritten.  The historical dataset is the foundation for simulation and required to prove or disprove their accuracy.   That’s why today I was pleasantly amused to find some distraction by this tweet.

robarthur

Troll level for @No_Little_Plans :  Expert!

What better way to get back into the baseball season than arguing over valuation systems and Rob delivers.  The first rule of the Prediction Racket is:

No one can predict the future

Unless you’re a time traveler from the future no one can possibly know what will happen.  Even time travelers can alter their future by affecting something in the past which is called the Butterfly Effect.  Second rule of the prediction racket:

Past results do not affect future results

If you roll a 6 three times in a row it doesn’t mean it’s more or less likely that you’ll roll a 6 again.  The probability is exactly the same no matter what happened in the past.  If a slot machine hasn’t paid out in a long time that doesn’t make it more likely to pay out no matter what compulsive gamblers want to believe.
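This can be checked by brute force rather than taken on faith. The sketch below assumes a fair six-sided die, enumerates every possible sequence of four rolls, and compares the chance of a 6 on the fourth roll with and without three 6s in front of it.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# All equally likely outcomes of four rolls of a fair six-sided die.
rolls = list(product(faces, repeat=4))

# Keep only the sequences that start with three sixes.
after_three_sixes = [r for r in rolls if r[:3] == (6, 6, 6)]

# Among those, how often is the fourth roll also a six?
p_conditional = Fraction(
    sum(r[3] == 6 for r in after_three_sixes), len(after_three_sixes)
)

# Chance of a six on the fourth roll with no condition on the first three.
p_unconditional = Fraction(sum(r[3] == 6 for r in rolls), len(rolls))

print(p_conditional, p_unconditional)  # prints: 1/6 1/6
```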

Third rule in the Prediction Racket:

Use the past as a template for the future

The third rule is more of a how to for those interested in being part of the Prediction Racket. The first step is to take standings from last season and adjust the teams up or down. If you’re wrong no one will remember. If you’re right you make sure everyone knows. Win Win.

Let’s look at standings from last season with a screenshot of baseball-reference.com before they update for this season.

brstandings

Now let’s look at a current screenshot of standings as reported by Baseball Prospectus using PECOTA.  A screenshot is used because this page probably will get updated and changed.

nl

They didn’t put much effort into NL West as that’s almost exactly the same.  They think ATL will be somewhat worse than last season, NYN somewhat better, PHI and MIA about the same.  No real insight except for the Cubs which is the click bait troll of this entire article.  They think NL West as a division will be about the same but NL Central will be worse.

This model uses 3 year career splits to rank teams by strength.  Below is a truncated table of top 15 MLB teams created at the beginning of 2018 when we had complete 25 man rosters.

Top 15 MLB teams April 2018

TeamID Hitters Pitchers Starters Relief Total W-L
HOU 49.53 70.04 35.28 34.76 119.57 0
CHN 49.94 55.63 36.47 19.16 105.57 0
CLE 32.33 59.88 26.13 33.75 92.21 0
WAS 36.34 55.19 40.93 14.26 91.53 0
BOS 55.55 35.24 23.90 11.34 90.79 0
LAN 17.97 65.55 48.05 17.50 83.52 0
TOR 40.74 41.30 26.12 15.18 82.04 0
NYA 30.25 51.65 17.22 34.43 81.90 0
COL 59.60 10.11 -3.68 13.79 69.71 0
NYN 25.85 34.70 25.38 9.32 60.55 0
BAL 35.79 9.31 -15.11 24.42 45.10 0
MIL 15.53 24.69 8.49 16.20 40.22 0
MIN 15.78 19.32 4.18 15.14 35.10 0
ANA 24.16 10.20 5.10 5.10 34.36 0
SFN 13.20 17.68 0.33 17.35 30.88 0

This table takes WAA career value for 2015, 2016, and 2017 and sorts by Total.  Total is the sum of Pitchers and Hitters.  Pitchers is the sum of Relief and Starters as we knew at the beginning of the 2018 season and posted here.   Since we’re from the future we know how this season turned out.  Colored in bold green are the NLDS contenders and bold blue the World Series contenders.
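The two sums described here are easy to verify against the table. This is a small sketch using three rows copied from it; the helper name and the tolerance, which only absorbs floating point noise, are assumptions for illustration.

```python
# team -> (Hitters, Pitchers, Starters, Relief, Total), copied from the table
TEAMS = {
    "HOU": (49.53, 70.04, 35.28, 34.76, 119.57),
    "CHN": (49.94, 55.63, 36.47, 19.16, 105.57),
    "COL": (59.60, 10.11, -3.68, 13.79, 69.71),
}


def is_consistent(hitters, pitchers, starters, relief, total, tol=0.005):
    """True when Pitchers = Starters + Relief and Total = Hitters + Pitchers."""
    return (
        abs(starters + relief - pitchers) < tol
        and abs(hitters + pitchers - total) < tol
    )


for team, row in TEAMS.items():
    print(team, is_consistent(*row))

# Sorting by Total (last column) gives the ranking order used in the table.
ranked = sorted(TEAMS, key=lambda team: TEAMS[team][-1], reverse=True)
print(ranked)
```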

Eight of the top twelve ranked teams made it into the playoffs with only Atlanta and Oakland missing from the top half of MLB.  Atlanta was ranked 27 out of 30 teams and they ended up winning NL East.  OAK is literally the team Michael Lewis wrote about in Moneyball and ranked #18.  They are constantly cycling new players through their system.

The above only measures career so the potential of new players doesn’t get counted. In 2015 both the Cubs and Astros were ranked at the bottom of this list. Both had a lot of new guys. The Cubs ended up in the NLCS that season and both teams rose to the top of the league with career players in subsequent years.

The above table will be reproduced for this season in April when we get a complete dataset of roster data.   Rules 1 and 2 of the Prediction Racket preclude making projections.  The above table demonstrates, however, a team with top career talent will most likely do well in the regular season.  A team that ends up in the bottom half of this list will probably not make the playoffs but as Rule 1 states, No one can predict the future.

That is all for now.  Subsequent parts to this series will be written if there are any funny tweets during Spring Training or in April when complete 25 man rosters are known and we can have some fun with PECOTA like we do with WAR.  Also, Part 2 of the DH series will be forthcoming.

Since the minor league database has been updated we’ll take a look at the new guys on the Cubs in Spring Training as well as the White Sox. The White Sox are ranked 6th in AL according to the Vegas book as to who is going to win the ALCS. PECOTA isn’t so favorable according to this:

al

Refer to Rule #1.  Until then ….

The DH argument Part 1

This will be a multi part series that explores various aspects of this DH issue.  The first question that needs to be answered is which league has better hitting pitchers.  Now that AL and NL play each other AL pitchers must bat when they play in NL parks.  My initial conjecture was NL pitchers would be better because they get more practice at the plate.  Let’s examine this.

Since this data model produces a value metric that is what we will use for this determination. The raw WAA value system used to rank individual players cannot be used because AL pitchers have more than an order of magnitude fewer plate appearances per year than NL pitchers. Since all but a few pitchers are below average hitters that would skew the numbers in favor of AL.

This is where the rate, WinPct, is needed. This model uses WinPct to place minor league player stats into context because those players typically move from league to league. WinPct provides context to the WAA weighting value. WinPct is not shown for MLB players because it is deceptive at that level.

Why can WinPct be deceptive?  For example,  a typical 26 mile marathon can be finished by the best marathon runner in a little over 2 hours making their average rate of speed to be around 13 mph.  A good runner 3 hours or a 9 mph rate; average runner 4 hours, 6+ mph rate and so on and so on.

A top runner of a mile can do it in 4 minutes or 15 mph. If you just look at rates, the mile runner runs faster than the top marathon runner. Since 15 mph is higher than 13 mph does that make the mile runner a better runner? Is a golfer who shoots 3 under par for 9 holes ( -0.333 shots/hole ) better than the golfer who shoots 3 under par for 18 holes ( -0.167 shots/hole )?

The answer is no. They could be better but you can’t tell by the rate. MLB ranks players and gives awards based upon batting average because it is/was a sideshow for baseball to garner interest for the sport. If your favorite team wasn’t doing well then you could root for your favorite player instead. Now with fantasy leagues and actual gambling sites like Draft Kings that reward certain stats over others this concept has become even more extreme.
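For reference, the running and golf rates quoted in the marathon and mile comparison work out as below. The sketch uses the standard 26.2 mile marathon distance and round finishing times, so the numbers are illustrative rather than anyone's actual results.

```python
def mph(miles, minutes):
    """Average speed in miles per hour."""
    return miles * 60 / minutes


marathon = mph(26.2, 120)  # about 2 hours for a top marathoner
mile = mph(1.0, 4)         # a 4 minute mile

print(f"marathon {marathon:.1f} mph, mile {mile:.1f} mph")
# prints: marathon 13.1 mph, mile 15.0 mph

# Same idea for golf: strokes relative to par, per hole.
nine_holes = -3 / 9
eighteen_holes = -3 / 18
print(f"{nine_holes:.3f} vs {eighteen_holes:.3f} shots/hole")
# prints: -0.333 vs -0.167 shots/hole
```

In both cases the better looking rate belongs to the shorter effort, which is why the rate alone cannot rank the two.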

That’s all fine and well but batting average or WHIP does not represent value any more than average running speed represents a runner’s ability or value as a runner. A high batting average and low ERA often do translate into value that can be ranked but the raw number itself cannot.

This model does not show rate for MLB nor does it ever rank on rates, unlike most  of Sabermetrics.  That said we must use the rate for dissimilar groups of players like  AL and NL pitchers and sometimes it’s useful to provide context for lineups, relief squads, and starting pitchers.  Tiering which has been discussed throughout however uses raw WAA weighting.

What does all of this have to do with DH?

Nothing other than to explain why in these next few exercises we will be using rates instead of raw value. First let’s explain how WinPct is calculated again. By definition:

WAA = wins – losses

Not too complicated.  It’s easy to calculate for teams and this model calculates it for players.  Players with positive WAA provide more wins to their teams than losses, vice versa for negative valued players.   The following must also be true:

Sum Team(WAA) = 0

Add all wins – losses for all teams in any league and it adds to 0. IOW, for every team that wins, a team must lose. Not too complicated! The following is also true according to this data model.

Sum Player(WAA) = 0

If you add WAA of every player who played in a season it adds to exactly 0.

Sum Player_Team(WAA) = Team(WAA)

The above states that the sum of all players who played for a team while they played on that team is equal to their real win/loss record.  The Cubs had a record of 95-68 last season which is a WAA=+27.  WAA for all players tagged CHN in 2018 will add to that number.

Therefore, Player(WAA) has the same properties as Team(WAA) where a winPct can be calculated as follows.

Win% =  0.5*WAA/(number of games played) + 0.5

For the Cubs last season that was

Win% = 0.5 * 27 / 163 + 0.5 = 0.583
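The formula is just the ordinary winning percentage written in terms of WAA: with G = W + L games, 0.5 * (W - L) / G + 0.5 = (W - L + W + L) / (2 * G) = W / G. A minimal sketch checking this for the Cubs record quoted above, using exact fractions so no rounding sneaks in (the function name is made up for illustration):

```python
from fractions import Fraction


def win_pct_from_waa(waa, games):
    """Win% = 0.5 * WAA / games + 0.5, kept exact with fractions."""
    return Fraction(1, 2) * Fraction(waa) / games + Fraction(1, 2)


wins, losses = 95, 68    # 2018 Cubs record
games = wins + losses    # 163
waa = wins - losses      # +27

pct = win_pct_from_waa(waa, games)
print(pct, round(float(pct), 3))  # prints: 95/163 0.583
```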

To calculate a player Win% the number of games played is not the actual games they play in.  Time in baseball is measured by plate appearance for hitters, innings pitched for pitchers.  Baseball has always used 9 innings to represent a game when calculating ERA.  An average game in baseball is not exactly 9 innings but it’s a close enough approximation, easy to remember and easy to calculate before there were calculators.

This model uses the constant 38.4 plate appearances to represent a game for hitters.  Javier Baez had 645 PA last season which translates into 645/38.4 = 16.8 games.  His WAA for his almost MVP season was 7.29 thus,

Javier Baez Win% = 0.5 * 7.29 / 16.8 games  + 0.5 = 0.717

and for context:

Christian Yelich Win% = 0.5 * 8.44 / 17 games + 0.5 = 0.749

The above is merely an illustration of how this is calculated. The WAA value ( 8.44 for Yelich, 7.29 for Baez ) is all that matters for ranking purposes. This model also gives Yelich MVP even though Baez led until the final week of the 2018 season.
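The player version of the calculation can be sketched the same way. The constant 38.4 and the Baez inputs come from the text above; the function name is hypothetical.

```python
PA_PER_GAME = 38.4  # plate appearances treated as one game for hitters


def player_win_pct(waa, plate_appearances):
    """Convert a hitter's WAA and plate appearances into Win%."""
    games = plate_appearances / PA_PER_GAME
    return 0.5 * waa / games + 0.5


baez = player_win_pct(7.29, 645)
print(round(645 / PA_PER_GAME, 1), round(baez, 3))  # prints: 16.8 0.717
```

Running the Yelich line with a rounded 17 games gives a slightly different third decimal than the figure shown, so the printed value presumably comes from his unrounded plate appearance total.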

Would you get to the point of all this?

OK.  We meandered a bit with some background as to how all this is calculated showing it’s not very complicated.   The next set of tables will walk through the variables used to make Win%.  First let’s look at plate appearance numbers for AL and NL pitchers throughout the years.

AL and NL Pitching Plate Appearances

YEAR AL PA NL PA
2008 637 4998
2009 642 4994
2010 638 5152
2011 621 5023
2012 605 4908
2013 345 4836
2014 332 4893
2015 333 4643
2016 361 4674
2017 329 4648
2018 311 4526

Plate appearances translate into baseball time. The above table clearly shows what we already know: NL pitchers bat far more often than AL pitchers because NL does not have DH. The number of plate appearances for both AL and NL pitchers declined from a peak around 2010 until last season. Not sure why but it is what it is. Let’s look at total pitcher hitting WAA for each league.

 AL and NL Pitching BAT WAA

YEAR AL WAA NL WAA
2008 -10.12 -73.75
2009 -9.03 -72.58
2010 -11.38 -68.54
2011 -8.95 -65.79
2012 -8.95 -66.49
2013 -5.23 -62.45
2014 -5.21 -66.13
2015 -5.08 -66.91
2016 -6.32 -61.76
2017 -4.72 -66.86
2018 -4.54 -67.20

This table does not tell you much other than pitchers bring losses to their teams from their poor hitting.  We saw in the previous table that plate appearances have gone down since 2010 yet WAA remains kind of constant.

With 15 teams in NL, pitchers contribute an average of around -4 in the win/loss column per team due to hitting. For AL it’s much less and the above shows AL pitchers have become much better hitters over the years. Can’t really tell what’s going on without doing the Win% calculation.

AL and NL Pitching BAT Win%

YEAR AL Win% NL Win%
2008 0.195 0.217
2009 0.230 0.221
2010 0.157 0.245
2011 0.223 0.249
2012 0.216 0.240
2013 0.209 0.252
2014 0.199 0.241
2015 0.207 0.223
2016 0.164 0.246
2017 0.224 0.224
2018 0.220 0.215

It must be stressed that these only include hitting stats that have nothing to do with their pitching.  The last couple of years AL and NL pitchers are more or less equal in hitting ability but very very poor.  As shown above, MVP quality hitting is above 0.700.  A textbook completely average hitter would have a WAA = 0 translating to a Win% of exactly 0.500.
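The Win% table can be rebuilt from the two tables before it. The sketch below does this for the 2008 and 2018 rows, assuming the same 38.4 plate appearances per game constant quoted for hitters also applies to pitchers at bat. The post does not say so explicitly, but the results line up.

```python
PA_PER_GAME = 38.4  # hitter constant; assumed to apply to pitchers batting too

# year -> (AL PA, NL PA, AL WAA, NL WAA, AL Win% in table, NL Win% in table)
ROWS = {
    2008: (637, 4998, -10.12, -73.75, 0.195, 0.217),
    2018: (311, 4526, -4.54, -67.20, 0.220, 0.215),
}


def win_pct(waa, pa):
    """Win% = 0.5 * WAA / (PA / 38.4) + 0.5."""
    return 0.5 * waa / (pa / PA_PER_GAME) + 0.5


for year, (al_pa, nl_pa, al_waa, nl_waa, al_tab, nl_tab) in ROWS.items():
    al, nl = win_pct(al_waa, al_pa), win_pct(nl_waa, nl_pa)
    print(f"{year}  AL {al:.3f} (table {al_tab:.3f})  NL {nl:.3f} (table {nl_tab:.3f})")
```

These two seasons agree with the table to within 0.001. Other rows may differ in the last digit, presumably because the WAA values shown are already rounded.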

The above clearly shows just how bad pitchers in general are at hitting which is one of the reasons for DH.  In order to put the above in context we must compare the above numbers to the worst hitters in each lineup.

Since AL teams have DH they normally do not make pitchers hit. For that comparison we’ll look at the 9th hitter in each lineup last year and, if I get motivated, the last ten years. The bottom of a lineup is where managers put hitters they want to have the fewest plate appearances. What kind of Win% do these players put up? We’ll see. Until then ….

Hall of Fame Part 3

In this part we’ll cover the other 3 MLB hall of fame inductees from the latest vote which can be seen here. Below are a bunch of career tables showing year by year valuations for both WAR and WAA value systems. According to this data model all 3 deserve HOF induction with the weakest being Edgar Martinez who squeaked in on his last year of eligibility.

This data model abhors tables of numbers but there is no other way to present these long careers.  Comments will be interspersed among the tables.  Order is their appearance on the HOF voting ballot according to this baseball-reference web site.

Edgar Martinez WAA

Edgar Martinez is ranked #214 of all post 1900 MLB players according to this data model which just barely gets him in.  The threshold should be somewhere between 200 and 250.  Ranking score for this data model is 1120.  WAR has him ranked much higher with a ranking score of 1974.

Year Rank WAA Name_TeamID Pos
1987 XXXXX -0.02 Edgar_Martinez_SEA 3B
1988 XXXXX -0.34 Edgar_Martinez_SEA 3B
1989 XXXXX -0.15 Edgar_Martinez_SEA 3B
1990 XXXXX -0.32 Edgar_Martinez_SEA 3B
1991 XXXXX 1.13 Edgar_Martinez_SEA 3B
1992 +034+ 5.17 Edgar_Martinez_SEA 3B-DH
1993 XXXXX -0.53 Edgar_Martinez_SEA DH-3B
1994 XXXXX 0.10 Edgar_Martinez_SEA 3B-DH
1995 +005+ 8.53 Edgar_Martinez_SEA DH
1996 +024+ 7.33 Edgar_Martinez_SEA DH
1997 +041+ 5.50 Edgar_Martinez_SEA DH
1998 +117+ 2.88 Edgar_Martinez_SEA DH
1999 +180+ 1.83 Edgar_Martinez_SEA DH
2000 +013+ 8.13 Edgar_Martinez_SEA DH
2001 +030+ 6.36 Edgar_Martinez_SEA DH
2002 XXXXX 0.82 Edgar_Martinez_SEA DH
2003 +129+ 2.77 Edgar_Martinez_SEA DH
2004 -093- -2.44 Edgar_Martinez_SEA DH
Total 46.75  1120
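The career Total is simply the sum of the yearly WAA column, which a few lines of Python confirm. The ranking score of 1120 is not reproduced here because its formula is not given in this post.

```python
# Yearly WAA for Edgar Martinez, 1987 through 2004, copied from the table
EDGAR_WAA = [-0.02, -0.34, -0.15, -0.32, 1.13, 5.17, -0.53, 0.10, 8.53,
             7.33, 5.50, 2.88, 1.83, 8.13, 6.36, 0.82, 2.77, -2.44]

career = round(sum(EDGAR_WAA), 2)
positive_years = sum(1 for waa in EDGAR_WAA if waa > 0)

print(career, positive_years)  # prints: 46.75 12
```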

Edgar Martinez WAR

Below is the extended WAR table that is available for hitters. WAR has an offensive component oWAR and a defensive component dWAR. The two cannot be added together to make WAR because WAR does not have additive properties.

Normally this model adheres to a Keep It Simple Stupid (KISS) policy, meaning the fewer entries in a table the better. The total (or whatever) WAR is used for sorting and ranking both pitchers and batters together, like what is done for this data model throughout.

Year Rank WAR oWAR dWAR PA Name_Tm Pos
1987 XXXXX 0.2 0.4 -0.2 46 Edgar_Martinez_SEA 3B
1988 XXXXX -0.1 0.1 -0.2 38 Edgar_Martinez_SEA 3B
1989 XXXXX 0.5 0.0 0.6 196 Edgar_Martinez_SEA 3B
1990 +022+ 5.5 4.2 1.5 572 Edgar_Martinez_SEA 3B
1991 +017+ 6.1 5.5 0.8 642 Edgar_Martinez_SEA 3B
1992 +011+ 6.6 7.1 -0.7 592 Edgar_Martinez_SEA 3B-DH
1993 XXXXX 0.2 0.5 -0.4 165 Edgar_Martinez_SEA DH-3B
1994 +079+ 3.0 2.4 0.5 387 Edgar_Martinez_SEA 3B-DH
1995 +006+ 7.0 7.2 -1.4 639 Edgar_Martinez_SEA DH
1996 +021+ 6.4 6.4 -1.1 634 Edgar_Martinez_SEA DH
1997 +020+ 6.2 6.1 -1.3 678 Edgar_Martinez_SEA DH
1998 +044+ 5.6 5.6 -1.4 672 Edgar_Martinez_SEA DH
1999 +046+ 4.9 4.8 -1.1 608 Edgar_Martinez_SEA DH
2000 +029+ 5.6 5.6 -1.2 665 Edgar_Martinez_SEA DH
2001 +051+ 4.8 4.8 -1.1 581 Edgar_Martinez_SEA DH
2002 +163+ 2.6 2.6 -0.8 407 Edgar_Martinez_SEA DH
2003 +117+ 3.3 3.3 -1.2 603 Edgar_Martinez_SEA DH
2004 XXXXX -0.3 -0.4 -1.0 549 Edgar_Martinez_SEA DH
Total 68.1 1974
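
A quick check of the non-additive point, using four rows copied by hand from the table above.  This is only a sketch; the helper name additive_gap is made up for illustration and is not part of this data model.

```python
# (year, WAR, oWAR, dWAR) copied from the Edgar Martinez WAR table above
rows = [
    (1990, 5.5, 4.2, 1.5),
    (1992, 6.6, 7.1, -0.7),
    (1995, 7.0, 7.2, -1.4),
    (2000, 5.6, 5.6, -1.2),
]

def additive_gap(war, owar, dwar):
    """Return (oWAR + dWAR) - WAR, rounded to one decimal place."""
    return round(owar + dwar - war, 1)

# A gap of 0.0 would mean the two components add up to WAR for that season.
gaps = {year: additive_gap(war, owar, dwar) for year, war, owar, dwar in rows}

for year, gap in gaps.items():
    print(year, gap)
# 1990 0.2
# 1992 -0.2
# 1995 -1.2
# 2000 -1.2
```

None of the four gaps is zero, and in the DH years the gap is roughly the size of dWAR itself.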

WAR gives him a ranking score of 1974, significantly higher than this data model’s 1120.  We know from his poor dWAR numbers that this is entirely due to offense, which makes him a good direct comparison between the two models.  This model shows why it took him 10 years to get in.  According to WAR he should have gotten in much sooner.
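
How the ranking score is computed is not spelled out in this post.  The totals in these tables are consistent with a simple rule though: a top 200 season adds 200 minus rank, a bottom 200 season subtracts 200 minus rank, and an XXXXX season counts for nothing.  That rule is inferred from the numbers, so treat the sketch below as an assumption rather than the actual formula.  Function names are made up for illustration.

```python
import re

def season_points(rank_token):
    """Points for one season under the inferred rule (an assumption)."""
    m = re.fullmatch(r"([+-])(\d+)[+-]", rank_token)
    if m is None:
        # "XXXXX": neither top nor bottom 200, so no points either way
        return 0
    sign = 1 if m.group(1) == "+" else -1
    return sign * (200 - int(m.group(2)))

def ranking_score(rank_tokens):
    """Sum the per-season points over a career."""
    return sum(season_points(token) for token in rank_tokens)

# Rank column of the Edgar Martinez WAR table above, 1987 through 2004
edgar_war_ranks = [
    "XXXXX", "XXXXX", "XXXXX", "+022+", "+017+", "+011+",
    "XXXXX", "+079+", "+006+", "+021+", "+020+", "+044+",
    "+046+", "+029+", "+051+", "+163+", "+117+", "XXXXX",
]

print(ranking_score(edgar_war_ranks))  # 1974
```

The result matches the 1974 in the Total row of his WAR table.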

Based upon anecdotal observation, WAR tends to overvalue hitters.  We know hitters account for 60% of the league WAR total of 1000 year after year.  This might be due to overvaluing oWAR.  Perhaps we’ll explore this further … perhaps not.  It doesn’t really matter.

Roy Halladay WAA

This data model has Roy Halladay ranked #125 out of all post-1900 MLB players, so he clearly qualifies for the HOF, and he gets in on the first ballot with 85% of the vote.  He had very bad years in 2000 and 2013 but made all that negative value back, and more, with many superb top ten years.

WAR and WAA are almost in complete agreement according to the ranking scores, the last number in each Total row.  Both systems pegged him #1 in the bottom 200 in 2000.

Year Rank WAA Name_TeamID Pos
1998 XXXXX 0.80 Roy_Halladay_TOR PITCH
1999 +129+ 2.77 Roy_Halladay_TOR PITCH
2000 -001- -9.24 Roy_Halladay_TOR PITCH
2001 +111+ 2.94 Roy_Halladay_TOR PITCH
2002 +014+ 7.58 Roy_Halladay_TOR PITCH
2003 +015+ 7.33 Roy_Halladay_TOR PITCH
2004 XXXXX 1.09 Roy_Halladay_TOR PITCH
2005 +024+ 6.15 Roy_Halladay_TOR PITCH
2006 +017+ 7.08 Roy_Halladay_TOR PITCH
2007 +061+ 3.97 Roy_Halladay_TOR PITCH
2008 +006+ 8.84 Roy_Halladay_TOR PITCH
2009 +008+ 8.42 Roy_Halladay_TOR PITCH
2010 +003+ 9.37 Roy_Halladay_PHI PITCH
2011 +006+ 8.32 Roy_Halladay_PHI PITCH
2012 -128- -1.81 Roy_Halladay_PHI PITCH
2013 -027- -4.20 Roy_Halladay_PHI PITCH
Total 59.41  1362

Roy Halladay WAR

Year Rank WAR IP Name_Tm Pos
1998 XXXXX 0.4 14.0 Roy_Halladay_TOR PITCH
1999 +160+ 2.6 149.1 Roy_Halladay_TOR PITCH
2000 -001- -2.8 67.2 Roy_Halladay_TOR PITCH
2001 +125+ 3.0 105.1 Roy_Halladay_TOR PITCH
2002 +005+ 7.4 239.1 Roy_Halladay_TOR PITCH
2003 +004+ 8.1 266.0 Roy_Halladay_TOR PITCH
2004 +179+ 2.4 133.0 Roy_Halladay_TOR PITCH
2005 +025+ 5.5 141.2 Roy_Halladay_TOR PITCH
2006 +029+ 5.2 220.0 Roy_Halladay_TOR PITCH
2007 +098+ 3.5 225.1 Roy_Halladay_TOR PITCH
2008 +019+ 6.2 246.0 Roy_Halladay_TOR PITCH
2009 +011+ 6.9 239.0 Roy_Halladay_TOR PITCH
2010 +002+ 8.3 250.2 Roy_Halladay_PHI PITCH
2011 +001+ 8.9 233.2 Roy_Halladay_PHI PITCH
2012 XXXXX 0.9 156.1 Roy_Halladay_PHI PITCH
2013 -065- -0.9 62.0 Roy_Halladay_PHI PITCH
Total 65.6  1408
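
The Total rows and the bad 2000 season can be double checked from the value columns of the two Halladay tables above.  A minimal sketch with the numbers copied by hand; the variable names are made up for illustration.

```python
FIRST_YEAR = 1998  # first season in both Halladay tables

# WAA column of the Roy Halladay WAA table, 1998 through 2013
halladay_waa = [0.80, 2.77, -9.24, 2.94, 7.58, 7.33, 1.09, 6.15,
                7.08, 3.97, 8.84, 8.42, 9.37, 8.32, -1.81, -4.20]

# WAR column of the Roy Halladay WAR table, same seasons
halladay_war = [0.4, 2.6, -2.8, 3.0, 7.4, 8.1, 2.4, 5.5,
                5.2, 3.5, 6.2, 6.9, 8.3, 8.9, 0.9, -0.9]

# Career value is just the sum of the season values
career_waa = round(sum(halladay_waa), 2)
career_war = round(sum(halladay_war), 1)

# Season with the lowest value under each system
worst_waa_year = FIRST_YEAR + halladay_waa.index(min(halladay_waa))
worst_war_year = FIRST_YEAR + halladay_war.index(min(halladay_war))

print(career_waa, career_war)          # 59.41 65.6
print(worst_waa_year, worst_war_year)  # 2000 2000
```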

Mike Mussina WAA

Mike Mussina gets voted in after 6 years with 76% of the vote.  This data model has him ranked #123, almost exactly tied with Roy Halladay above.  Based upon ranking score, WAR has his career valued much higher than all current HOF inductees.

Even though Mussina and Halladay are virtually tied in career WAA, Mussina has a much higher ranking score.  WAA is the only factor used for ranking purposes, both seasonal and year to year.  Ranking scores are only computed to compare how WAA values a player with how WAR values him.

Year Rank WAA Name_TeamID Pos
1991 +134+ 2.10 Mike_Mussina_BAL PITCH
1992 +012+ 6.66 Mike_Mussina_BAL PITCH
1993 XXXXX -0.99 Mike_Mussina_BAL PITCH
1994 +008+ 6.38 Mike_Mussina_BAL PITCH
1995 +019+ 6.38 Mike_Mussina_BAL PITCH
1996 XXXXX -1.49 Mike_Mussina_BAL PITCH
1997 +030+ 5.96 Mike_Mussina_BAL PITCH
1998 +065+ 4.79 Mike_Mussina_BAL PITCH
1999 +037+ 5.84 Mike_Mussina_BAL PITCH
2000 +037+ 5.61 Mike_Mussina_BAL PITCH
2001 +026+ 6.64 Mike_Mussina_NYA PITCH
2002 XXXXX 1.03 Mike_Mussina_NYA PITCH
2003 +055+ 4.77 Mike_Mussina_NYA PITCH
2004 XXXXX -0.44 Mike_Mussina_NYA PITCH
2005 XXXXX -0.36 Mike_Mussina_NYA PITCH
2006 +051+ 4.64 Mike_Mussina_NYA PITCH
2007 -099- -2.44 Mike_Mussina_NYA PITCH
2008 +049+ 4.51 Mike_Mussina_NYA PITCH
Total 59.59 1776
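
One way to see why the two ranking scores differ despite the near tie in career WAA is to count ranked seasons in the two WAA tables above.  A small sketch with the Rank columns copied by hand; count_ranked is a made-up helper, not part of this data model.

```python
def count_ranked(rank_tokens):
    """Return (top 200 seasons, bottom 200 seasons) for one career."""
    top = sum(1 for token in rank_tokens if token.startswith("+"))
    bottom = sum(1 for token in rank_tokens if token.startswith("-"))
    return top, bottom

# Rank column of the Roy Halladay WAA table, 1998 through 2013
halladay_ranks = [
    "XXXXX", "+129+", "-001-", "+111+", "+014+", "+015+", "XXXXX", "+024+",
    "+017+", "+061+", "+006+", "+008+", "+003+", "+006+", "-128-", "-027-",
]

# Rank column of the Mike Mussina WAA table, 1991 through 2008
mussina_ranks = [
    "+134+", "+012+", "XXXXX", "+008+", "+019+", "XXXXX", "+030+", "+065+",
    "+037+", "+037+", "+026+", "XXXXX", "+055+", "XXXXX", "XXXXX", "+051+",
    "-099-", "+049+",
]

print(count_ranked(halladay_ranks))  # (11, 3)
print(count_ranked(mussina_ranks))   # (12, 1)
```

Mussina has one more top 200 season and two fewer bottom 200 seasons, which fits his higher ranking score.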

Mike Mussina WAR

Year Rank WAR IP Name_Tm Pos
1991 XXXXX 2.2 87.2 Mike_Mussina_BAL PITCH
1992 +004+ 8.2 241.0 Mike_Mussina_BAL PITCH
1993 XXXXX 1.5 167.2 Mike_Mussina_BAL PITCH
1994 +012+ 5.4 176.1 Mike_Mussina_BAL PITCH
1995 +014+ 6.1 221.2 Mike_Mussina_BAL PITCH
1996 +086+ 3.6 243.1 Mike_Mussina_BAL PITCH
1997 +027+ 5.5 224.2 Mike_Mussina_BAL PITCH
1998 +055+ 5.0 206.1 Mike_Mussina_BAL PITCH
1999 +065+ 4.4 203.1 Mike_Mussina_BAL PITCH
2000 +026+ 5.6 237.2 Mike_Mussina_BAL PITCH
2001 +013+ 7.1 228.2 Mike_Mussina_NYA PITCH
2002 +056+ 4.5 215.2 Mike_Mussina_NYA PITCH
2003 +013+ 6.6 214.2 Mike_Mussina_NYA PITCH
2004 +177+ 2.4 164.2 Mike_Mussina_NYA PITCH
2005 +110+ 3.4 179.2 Mike_Mussina_NYA PITCH
2006 +035+ 5.0 197.1 Mike_Mussina_NYA PITCH
2007 XXXXX 1.0 152.0 Mike_Mussina_NYA PITCH
2008 +038+ 5.2 200.1 Mike_Mussina_NYA PITCH
Total 82.7  2269

Next season Clemens and Bonds will probably get in, breaking the no-PEDs seal.  In the next, and possibly final, part of this series we’ll run through a couple of interesting careers that didn’t make the 75%.  Until then ….