Top Ten MLB Players 6/23/2017

It has been 12 days since we last did this. AL and NL players, pitchers and batters alike, are ranked together. This model assigns wins to all players on the same basis: the runs they produce (batters) or the runs they don’t let score (pitchers). Both are equally important. Here are the top ten MLB players according to this data model.

Rank WAA Name_TeamID Pos
+001+ 5.80 Paul_Goldschmidt_ARI 1B
+002+ 5.75 Max_Scherzer_WAS PITCH
+003+ 5.19 Aaron_Judge_NYA RF
+004+ 4.58 Bryce_Harper_WAS RF
+005+ 4.54 Dallas_Keuchel_HOU PITCH
+006+ 4.45 Jason_Vargas_KCA PITCH
+007+ 4.45 Ryan_Zimmerman_WAS 1B
+008+ 4.28 Charlie_Blackmon_COL CF
+009+ 4.01 Cody_Bellinger_LAN LF-1B
+010+ 3.97 Clayton_Kershaw_LAN PITCH

Here is the top ten according to WAR.

Rank WAR Name_TeamID Pos  WAA Rank
+001+ 4.2 Aaron_Judge_NYA RF
+002+ 4.1 Paul_Goldschmidt_ARI 1B
+003+ 3.9 Max_Scherzer_WAS PITCH
+004+ 3.6 Jason_Vargas_KCA PITCH
+005+ 3.4 Mookie_Betts_BOS RF  XXXXX
+006+ 3.4 Mike_Trout_ANA CF  41
+007+ 3.3 Nolan_Arenado_COL 3B  16
+008+ 3.3 Dallas_Keuchel_HOU PITCH
+009+ 3.3 Jose_Altuve_HOU 2B  XXXXX
+010+ 3.2 Carlos_Carrasco_CLE PITCH  20
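As a quick cross-check of the two tables above, here is a minimal sketch that finds which names appear in both top tens. The lists are copied from the tables; the variable names are just for illustration and are not part of the data model.

```python
# Top-ten names copied from the WAA and WAR tables above.
waa_top_ten = [
    "Paul_Goldschmidt_ARI", "Max_Scherzer_WAS", "Aaron_Judge_NYA",
    "Bryce_Harper_WAS", "Dallas_Keuchel_HOU", "Jason_Vargas_KCA",
    "Ryan_Zimmerman_WAS", "Charlie_Blackmon_COL", "Cody_Bellinger_LAN",
    "Clayton_Kershaw_LAN",
]
war_top_ten = [
    "Aaron_Judge_NYA", "Paul_Goldschmidt_ARI", "Max_Scherzer_WAS",
    "Jason_Vargas_KCA", "Mookie_Betts_BOS", "Mike_Trout_ANA",
    "Nolan_Arenado_COL", "Dallas_Keuchel_HOU", "Jose_Altuve_HOU",
    "Carlos_Carrasco_CLE",
]

waa_names = set(waa_top_ten)

# Keep WAR order so the output lines up with the WAR table.
in_both = [name for name in war_top_ten if name in waa_names]
war_only = [name for name in war_top_ten if name not in waa_names]

print("In both top tens:", in_both)
print("WAR top ten only:", war_only)
```

Five names land in both lists; the other five WAR names are the ones carrying a WAA Rank entry in the table.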

The highlighted names in the above table, the ones with nothing in the WAA Rank column, are the players that both this model and WAR place somewhere in the top ten. We have Carrasco at #20 and Trout at #41. The bad outliers are Betts and Altuve, whom we have unranked, outside the top 200. WAR folds defense into its weighting factor, which is what gave Darwin Barney a WAR of 4.8 and a #39 ranking in 2012. Darwin Barney’s WAR was the inspiration for creating this data model back in 2013. We’ll get into the flaw in how Sabermetrics measures defense later because it’s a big topic. For now let’s look at the full lines of the two outlier players above.

Rank WAA BA OBP PA RBI R Name_TeamID Pos
XXXXX 0.90 0.273 0.350 317 40 45 Mookie_Betts_BOS RF
XXXXX 0.90 0.322 0.394 313 34 47 Jose_Altuve_HOU 2B

Mookie Betts had an incredible season last year. We had him ranked #5 at the end of the 2016 season and WAR had him ranked #2. This year his BA and OBP aren’t that spectacular. We have him above average at 0.90 and, coincidentally, Altuve is at the same WAA value. Altuve has a very high BA and OBP, and the runs created formula favors that immensely, which ends up in WAR’s final rating. Altuve’s high WAR is understandable. The only explanation for Betts’ high WAR is either carryover from 2016 or a Darwin Barney-like fluke defense addition.

Altuve has 34 RBIs and 47 Runs which is his run production.  He scampers around the bases more than he drives guys in but that’s just as important.  You can’t tell if the above is good or bad or what just by looking at those numbers.   You need to know team and league averages to come to the conclusion that Jose Altuve brought 0.9 wins above league average this season and that’s it.

Update 6/24/2017: The above now in italics is not entirely correct. The basic Runs Created formula also relies heavily on total bases, which is described in more detail here. If a player hits a lot of home runs, they accumulate fast in total bases, even if most of them are solo. A player who hits a double to drive in runners on first and second, scoring 2 runs, adds only 2 to his Total Bases, while a player who hits a solo home run adds 4. Many players playing for their DraftKings team want to hit home runs because it boosts RC and WAR significantly.
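The total bases arithmetic can be sketched in a few lines. This assumes the commonly cited basic Runs Created form, RC = (H + BB) x TB / (AB + BB); the two stat lines are hypothetical and chosen only to show how home runs inflate total bases, and none of this is part of the data model described in this post.

```python
def total_bases(singles, doubles, triples, home_runs):
    """Total bases: 1 per single, 2 per double, 3 per triple, 4 per home run."""
    return singles + 2 * doubles + 3 * triples + 4 * home_runs


def basic_runs_created(hits, walks, tb, at_bats):
    """Basic Runs Created, assumed here to be (H + BB) * TB / (AB + BB)."""
    return (hits + walks) * tb / (at_bats + walks)


# One two-run double versus one solo home run.
double_tb = total_bases(0, 1, 0, 0)    # 2 total bases, 2 runs driven in
solo_hr_tb = total_bases(0, 0, 0, 1)   # 4 total bases, 1 run driven in

# Two hypothetical batters with identical AB, hits and walks.
# Batter A's extra-base hits are doubles, batter B's are home runs.
at_bats, hits, walks = 100, 30, 10
tb_a = total_bases(singles=20, doubles=10, triples=0, home_runs=0)
tb_b = total_bases(singles=20, doubles=0, triples=0, home_runs=10)

rc_a = basic_runs_created(hits, walks, tb_a, at_bats)
rc_b = basic_runs_created(hits, walks, tb_b, at_bats)

print(f"double TB={double_tb}, solo HR TB={solo_hr_tb}")
print(f"batter A: TB={tb_a}, RC={rc_a:.1f}")
print(f"batter B: TB={tb_b}, RC={rc_b:.1f}")
```

Nothing in the formula knows whether anyone was on base, so batter B comes out well ahead on RC regardless of how many runs actually scored.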

Hit stats are game stats, the stats Joe Maddon needs to know when managing a game. This data model computes a value stat. WAR is also a value stat, which is why we compare the two. We don’t keep track of home runs or hits of any type. We have them if needed for proofs or disproofs, but that’s all we use them for. As mentioned above, looking at Altuve’s run production provides no information without context. That is a big problem with blindly throwing around baseball stats. Let’s look at the numbers for Nolan Arenado, whom we have at #16 and WAR has slightly higher.

Rank WAA BA OBP PA RBI R Name_TeamID Pos
+016+ 3.30 0.301 0.352 321 59 47 Nolan_Arenado_COL 3B

Arenado has 25 more RBIs than Altuve with the same number of runs. That’s why we more or less agree with WAR on him. Those extra 25 RBIs turned into real wins for the Rockies this year.

End of Update 6/24/2017

To make it even clearer why we have Altuve at 0.9, let’s take a look at HOU, which is one of the hottest teams in MLB this season. They could be World Series contenders this year.

BAT PITCH Rs Ra W L UR LR TeamID
63.7 40.1 408 288 50 24 12.1 4.1 HOU

At 50-24 they have a real WAA=26 or, as some people say, they’re 26 games above 0.500 (or average). In this data model, those 26 wins must be divvied up amongst all contributors on HOU. Let’s take a look at them. Their UR is excellent, which means HOU has very good fielding as a team (i.e. they don’t make a lot of costly errors).

Rank WAA Name_TeamID Pos
+005+ 4.54 Dallas_Keuchel_HOU PITCH
+022+ 3.04 George_Springer_HOU CF-RF
+025+ 2.96 Lance_McCullers_HOU PITCH
+031+ 2.71 Carlos_Correa_HOU SS
+047+ 2.44 Marwin_Gonzalez_HOU LF-1B-3B
+070+ 2.10 Jake_Marisnick_HOU CF
+099+ 1.70 Chris_Devenski_HOU PITCH
+103+ 1.68 Brian_McCann_HOU CR
+120+ 1.49 Will_Harris_HOU PITCH
+123+ 1.47 Brad_Peacock_HOU PITCH
+183+ 1.09 Evan_Gattis_HOU CR-DH
XXXXX 0.90 Jose_Altuve_HOU 2B
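The bookkeeping behind that table can be checked with a short sketch. It takes the team WAA as wins minus losses and adds up the player values listed above. The assumption that every HOU player's WAA sums to the team figure comes from the "divvied up" description; players not shown in the table are not included here.

```python
# HOU record from the team line above.
wins, losses = 50, 24
team_waa = wins - losses  # games above 0.500

# Player WAA values copied from the HOU table above (listed players only).
listed_waa = {
    "Dallas_Keuchel": 4.54,
    "George_Springer": 3.04,
    "Lance_McCullers": 2.96,
    "Carlos_Correa": 2.71,
    "Marwin_Gonzalez": 2.44,
    "Jake_Marisnick": 2.10,
    "Chris_Devenski": 1.70,
    "Brian_McCann": 1.68,
    "Will_Harris": 1.49,
    "Brad_Peacock": 1.47,
    "Evan_Gattis": 1.09,
    "Jose_Altuve": 0.90,
}

listed_total = round(sum(listed_waa.values()), 2)

# Whatever is left over would have to come from players not in the table,
# assuming player WAA sums to team WAA.
remainder = round(team_waa - listed_total, 2)

print(f"team WAA     = {team_waa}")
print(f"listed total = {listed_total}")
print(f"unlisted net = {remainder}")
```

The twelve listed players add up to 26.12 against a team WAA of 26, so under that assumption the rest of the roster would net out close to zero.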

This model counts real runs and assigns them to players, and those runs are then converted into estimated wins. The rankings above truly reflect the wins each of those players contributed to make HOU 26 games above 0.500. We could place a WAR value next to each of these players, add them up, and it won’t come anywhere close to reality. There is a mathematical proof to this model and it is consistent from player to player, league to league, year to year. Our ranking is correct here just like our ranking was correct for Darwin Barney in 2012.

Rank WAA BA OBP PA RBI R Name_TeamID Pos
-126- -1.87 0.254 0.299 588 44 73 Darwin_Barney_CHN 2B 2012

That -126- means #126 in the bottom 200, a list no one wants to be #1 on.