Career Rankings Part 2

Today is opening day at Wrigley Field and there’s almost an inch of snow on the ground, almost enough to think about shoveling it.  Arggghhh!  Normally we do an analysis of each Cubs series at the start, and if there are some strange shifts in the Ouija board we’ll look at that particular game and talk through it.

Most likely they’ll call this game and play a doubleheader sometime later in the season.  In the old days when they played two you only had to buy one ticket.  Those days are long gone.

Since we finished the career scripts, instead of analyzing the CHN PIT matchup, which we can’t really do because we have no current year data to crunch, let’s look at all 30 MLB teams based upon career.  The table below is the same format used for the playoff horse race last September.  The win-loss column is meaningless right now so it’s zeroed out.  Total is the sum of Pitchers and Hitters, and Pitchers is the sum of Starters and Relief.  All players are categorized by how they’re listed on the active roster.

Careers are limited to the last three years of service (i.e. what have you done for me lately).  Although Albert Pujols is clearly the highest ranking career player in baseball and most likely a unanimous first-ballot HOF pick, he’s near the end of his career.  He’s ranked #71 for his last three years, which is still very, very productive.  More on him later.
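As a rough illustration of the three-year window, here is a Python sketch using Kyle Hendricks’ yearly WAA numbers from his table further down this page.  Treating the window as the 2015 through 2017 seasons and simply summing them is an assumption, and the `window_waa` helper is a made-up name; the actual career scripts may select or weight seasons differently.

```python
# Kyle Hendricks' yearly WAA, copied from his career table on this page.
hendricks = [(2014, 2.3), (2015, -0.1), (2016, 8.9), (2017, 4.3)]

def window_waa(seasons, last_year, years=3):
    """Sum WAA over the `years` seasons ending at `last_year` (assumed rule)."""
    first_year = last_year - years + 1
    kept = [waa for year, waa in seasons if first_year <= year <= last_year]
    return round(sum(kept), 1)

full_career = round(sum(waa for _, waa in hendricks), 1)
recent = window_waa(hendricks, last_year=2017)
print(full_career, recent)
```

Under that assumption his full career of 15.4 shrinks to 13.1 once 2014 drops out of the window.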

The table below is sorted by Total value from best to worst.

TeamID Hitters Pitchers Starters Relief Total W-L
HOU 49.53 70.04 35.28 34.76 119.57 0
CHN 49.94 55.63 36.47 19.16 105.57 0
CLE 32.33 59.88 26.13 33.75 92.21 0
WAS 36.34 55.19 40.93 14.26 91.53 0
BOS 55.55 35.24 23.90 11.34 90.79 0
LAN 17.97 65.55 48.05 17.50 83.52 0
TOR 40.74 41.30 26.12 15.18 82.04 0
NYA 30.25 51.65 17.22 34.43 81.90 0
COL 59.60 10.11 -3.68 13.79 69.71 0
NYN 25.85 34.70 25.38 9.32 60.55 0
BAL 35.79 9.31 -15.11 24.42 45.10 0
MIL 15.53 24.69 8.49 16.20 40.22 0
MIN 15.78 19.32 4.18 15.14 35.10 0
ANA 24.16 10.20 5.10 5.10 34.36 0
SFN 13.20 17.68 0.33 17.35 30.88 0
ARI 10.90 13.62 19.42 -5.80 24.52 0
TEX 24.26 -1.94 -4.78 2.84 22.32 0
OAK 7.10 12.83 -4.28 17.11 19.93 0
SLN 15.17 4.10 8.72 -4.62 19.27 0
SEA -7.55 15.31 6.93 8.38 7.76 0
TBA -17.78 17.43 9.99 7.44 -0.35 0
MIA 0.65 -3.47 -13.06 9.59 -2.82 0
CIN 1.02 -4.12 -2.82 -1.30 -3.10 0
PHI 0.62 -6.06 -5.49 -0.57 -5.44 0
PIT -9.62 3.55 -0.73 4.28 -6.07 0
CHA -4.03 -4.82 -12.68 7.86 -8.85 0
KCA -25.54 7.30 -0.22 7.52 -18.24 0
ATL -11.08 -8.34 -17.33 8.99 -19.42 0
SDN -16.12 -15.16 -10.71 -4.45 -31.28 0
DET -14.45 -23.97 -11.29 -12.68 -38.42 0
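The column arithmetic in the table can be checked with a few lines of Python.  This is only a sketch using four rows copied from the table; the tuple layout and the `totals` helper are assumptions for illustration, not the actual career scripts.

```python
# Rows copied from the table above: (team, hitters, starters, relief).
rows = [
    ("WAS", 36.34, 40.93, 14.26),
    ("HOU", 49.53, 35.28, 34.76),
    ("CLE", 32.33, 26.13, 33.75),
    ("CHN", 49.94, 36.47, 19.16),
]

def totals(row):
    """Return (team, pitchers, total) for one row.

    Pitchers = Starters + Relief and Total = Hitters + Pitchers.
    """
    team, hitters, starters, relief = row
    pitchers = round(starters + relief, 2)
    total = round(hitters + pitchers, 2)
    return team, pitchers, total

# Sort best to worst by Total, the same order the table uses.
ranked = sorted((totals(r) for r in rows), key=lambda t: t[2], reverse=True)
for team, pitchers, total in ranked:
    print(f"{team} {pitchers:.2f} {total:.2f}")
```

The four rows come back in table order (HOU, CHN, CLE, WAS) with the same Pitchers and Total values.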

The Cubs are #2 in career talent, behind only Houston.  Theo Epstein is using the same model as ours.  In Part 3 of this series we’ll look at past career rankings at the beginning of various seasons and, since we are from the future, compare that to how things turned out that year.  Until then….

Career Rankings Part 1

Finally finished roster parsing and career tabulation scripts and we’re all up to date.  In this part we’re only going to show top 10 MLB players according to their total career numbers and top ten Cubs.

In subsequent parts we’ll do historical rankings as well as rank the current MLB rosters based upon career.  For now let’s look at the top ten players in MLB according to their accumulated career from when they started until the end of 2017.

Rank WAA Name_TeamID Pos
+001+ 111.0 Albert_Pujols_ANA BAT
+002+ 84.8 Miguel_Cabrera_DET BAT
+003+ 76.1 Clayton_Kershaw_LAN PITCH
+004+ 53.7 Ryan_Braun_MIL BAT
+005+ 53.6 Felix_Hernandez_SEA PITCH
+006+ 44.2 Robinson_Cano_SEA BAT
+007+ 43.4 Edwin_Encarnacion_CLE BAT
+008+ 43.2 Zack_Greinke_ARI PITCH
+009+ 41.6 Justin_Verlander_HOU PITCH
+010+ 40.8 Cole_Hamels_TEX PITCH

Those are your top ten according to this data model who are currently playing on an MLB roster.  Let’s look at the Cubs.

Rank WAA Name_TeamID Pos
+031+ 29.3 Jon_Lester_CHN PITCH
+053+ 17.9 Anthony_Rizzo_CHN BAT
+063+ 15.3 Kyle_Hendricks_CHN PITCH
+067+ 15.0 Kris_Bryant_CHN BAT
+078+ 12.6 Steve_Cishek_CHN PITCH
+081+ 12.3 Jose_Quintana_CHN PITCH
+083+ 12.2 Ben_Zobrist_CHN BAT
+084+ 11.8 Yu_Darvish_CHN PITCH
+123+ 8.0 Pedro_Strop_CHN PITCH
+167+ 5.6 Justin_Wilson_CHN PITCH

Not bad at all.  Eight guys in the top 100 out of all 30 teams.  An even distribution would put 3 or 4 on each team (100 / 30 ≈ 3.3).  When we use career data to rank MLB teams according to starters, relief, and lineups we will only count the last 3 years of service.  This will level the playing field for the young guys and knock guys like Pujols down a few notches.  That will be fodder for Part 2 of this series.  Until then….

Cubs Brewers matchup

Way too early in the season to evaluate players or team status.  Team statuses will start to make sense in about 2 weeks and we’ll start sorting players and handicapping in the second week of May.  For now let’s look at what the Ouija Board says about today.

DATE 04_06 8:10_PM CHN MIL
LINEAWAY CHN [ 0.588 ] < 0.580 >
STARTAWAY 0.00(NA) Kyle_Hendricks_CHN
LINEHOME MIL [ 0.429 ] < 0.439 >
STARTHOME 0.00(NA) Brandon_Woodruff_MIL
CHN 92 70 MIL 86 76

We call the betting market a Ouija Board because thousands of bettors all move the market in each direction.  Where that market settles is the expected probability that all these people settled upon.  Like a Ouija board settles on a phrase from the beyond, the beyond somehow comes up with an expected probability that is very accurate in most games.  How all these anonymous bettors can come up with this is a mystery.

The Cubs started at 58.8% chance of winning and dropped to 58%.  If you want to bet the Brewers you must think they have a greater than 43.9% chance of winning today.

The DeltaWAA line, the last one in the block above, uses last year’s win-loss results.  The Cubs had a real WAA of 92-70=22.  The Brewers’ real WAA was 86-76=10.  The DeltaWAA between the two teams is 22-10=12.  This gets looked up in a table and the resultant expected probability using these numbers would be 58.2% in favor of the Cubs.  Perhaps Vegas compiled the same table and that’s how they set the line today.  Who knows?
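The DeltaWAA arithmetic can be written out as a short Python sketch.  The `delta_waa` function name and the parsing of the record line are assumptions for illustration.  The lookup table that turns DeltaWAA into a probability is not shown in this post, so the sketch stops at the difference.

```python
def delta_waa(record_line):
    """Parse 'AWAY W L HOME W L' and return (away WAA, home WAA, DeltaWAA).

    Real WAA is taken as wins minus losses, as in the text.
    """
    _away, away_w, away_l, _home, home_w, home_l = record_line.split()
    away_waa = int(away_w) - int(away_l)
    home_waa = int(home_w) - int(home_l)
    return away_waa, home_waa, away_waa - home_waa

print(delta_waa("CHN 92 70 MIL 86 76"))  # (22, 10, 12)
```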

Usually we would add up lineups and relief staff and provide that data but there is no data yet.  All we can do is analyze the two starters and that’s that.  Woodruff doesn’t have an MLB career yet.  He pitched in AAA and threw 43 innings in MLB for MIL last season.  Not enough data!

We know Kyle Hendricks but here are his career numbers.

Year WAA Name_TeamID Pos Rank
2014 2.3 Kyle_Hendricks_CHN PITCH +131+
2015 -0.1 Kyle_Hendricks_CHN PITCH XXXXX
2016 8.9 Kyle_Hendricks_CHN PITCH +002+
2017 4.3 Kyle_Hendricks_CHN PITCH +047+
Total 15.4

Solid career.

In order to bet the Cubs you would need a 7-10% margin over the 58% line, which means you’d have to think they have about a 2/3 chance of winning today.  To bet the Brewers you would have to think they have better than 50/50.  Using career numbers the Cubs have a better team and we’re almost ready to present that data.  Is it 2/3 chance better?  Not sure.  With the data we have the Brewers are the underdog and the line is most likely exactly where it should be.  Both lines are a complete discard.  We do not want to gamble.
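Here is the margin rule as a small Python sketch.  Reading the 7-10% margin as percentage points added on top of the current line is an assumption, and `required_range` is a made-up helper name.

```python
def required_range(line, low_margin=0.07, high_margin=0.10):
    """Win probability range you would need to believe in before betting a line.

    The margin is assumed to be added to the current break-even line.
    """
    return round(line + low_margin, 3), round(line + high_margin, 3)

cubs_needed = required_range(0.580)     # current CHN line
brewers_needed = required_range(0.439)  # current MIL line
print(cubs_needed, brewers_needed)
```

That puts the Cubs requirement at 0.65 to 0.68, right around 2/3, and the Brewers requirement at 0.509 to 0.539, a bit better than 50/50.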

Opening Day!

Today is opening day and it’s not even April.  We count this game as being in April however.  Cubs play the Marlins in Miami.  It appears MLB has it figured out that these northern cities can be pretty cold this time of the season.  No one wants to watch or play in a baseball game when it’s below freezing.

Vegas never misses an opportunity to bet on anything and the Ouija board is running right out of the gate.  Since we won’t have any useful data to evaluate players the Ouija board is all we can look at.  Here are today’s lines for the Cubbies.

DATE 03_29 12:40_PM CHN MIA
LINEAWAY CHN [ 0.661 ] < 0.677 > 
STARTAWAY 0.00(NA) Jon_Lester_CHN
LINEHOME MIA [ 0.353 ] < 0.345 > 
STARTHOME 0.00(NA) Jose_Ureña_MIA
CHN 92 70 MIA 77 85

Each baseball game consists of two lines, one for each team.  Above are the probabilities you would need in order to break even.  If you add both lines they exceed 1 because the house plays both sides and the excess is their cut.  The house can’t lose in the long run.
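A quick Python sketch of that excess, using the opening and current lines from the block above.  The `house_cut` name is an assumption for illustration.

```python
def house_cut(away_line, home_line):
    """Amount by which the two break-even probabilities exceed 1."""
    return round(away_line + home_line - 1.0, 3)

opening = house_cut(0.661, 0.353)  # numbers in [ ]
current = house_cut(0.677, 0.345)  # numbers in < >
print(opening, current)
```

The two sides summed to 1.014 at the open and 1.022 now, so the excess is 0.014 and 0.022.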

The number in [] is the starting guess Vegas made, and the number in <> is the current line.  These will vary somewhat between bookies but we’re not here to arbitrage.  At 0.677 you need to risk roughly $210 to win $100 today.  That is a very big premium to pay, especially since we don’t know anything about any player in the regular season.
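The relation between a break-even probability and the stake can be sketched in Python.  At break-even probability p, the stake S needed to win W satisfies S / (S + W) = p, so S = W * p / (1 - p).  The `risk_to_win` helper is a made-up name for illustration.

```python
def risk_to_win(break_even, win_amount=100.0):
    """Stake for which winning `win_amount` breaks even at this probability."""
    return round(win_amount * break_even / (1.0 - break_even), 2)

print(risk_to_win(0.677))   # current CHN line
print(risk_to_win(2 / 3))   # a flat 2-to-1 price
```

At 0.677 the stake works out to about $209.60, while an even $200 to win $100 corresponds to a break-even probability of 2/3.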

Career wise Jon Lester is solid.  Here are his career numbers.

Year WAA Name_TeamID Pos Rank
2006 -0.4 Jon_Lester_BOS PITCH XXXXX
2007 -0.1 Jon_Lester_BOS PITCH XXXXX
2008 5.5 Jon_Lester_BOS PITCH +034+
2009 4.4 Jon_Lester_BOS PITCH +052+
2010 3.9 Jon_Lester_BOS PITCH +064+
2011 2.0 Jon_Lester_BOS PITCH +164+
2012 -3.8 Jon_Lester_BOS PITCH -027-
2013 0.6 Jon_Lester_BOS PITCH XXXXX
2014 4.0 Jon_Lester_BOS PITCH +010+
2014 2.4 Jon_Lester_OAK PITCH +010+
2015 2.8 Jon_Lester_CHN PITCH +090+
2016 8.0 Jon_Lester_CHN PITCH +004+
2017 0.0 Jon_Lester_CHN PITCH XXXXX
Total 29.3

Best year was 2016 with the World Champion Chicago Cubs and worst year was 2012 with the Red Sox.  Well above average career.  Soon we’ll do a series in April to put the 29.3 value into context.  If you rank MLB on career numbers Lester is around the top.  His playoff numbers are extremely high which will probably propel him into the HOF one day.

Edit: His second best year was with BOS/OAK in 2014.  Those two numbers added together (4.0 + 2.4 = 6.4) make his seasonal number.  He was ranked #10 that year and #4 in 2016.
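Combining split-season rows like the 2014 BOS/OAK pair can be sketched in Python.  The row layout and variable names are assumptions for illustration; the numbers are copied from Lester’s table above.

```python
from collections import defaultdict

# (year, team, WAA) rows copied from Jon Lester's table above.
rows = [
    (2006, "BOS", -0.4), (2007, "BOS", -0.1), (2008, "BOS", 5.5),
    (2009, "BOS", 4.4), (2010, "BOS", 3.9), (2011, "BOS", 2.0),
    (2012, "BOS", -3.8), (2013, "BOS", 0.6), (2014, "BOS", 4.0),
    (2014, "OAK", 2.4), (2015, "CHN", 2.8), (2016, "CHN", 8.0),
    (2017, "CHN", 0.0),
]

# Add up every row that shares a year so a mid-season trade counts once.
by_season = defaultdict(float)
for year, _team, waa in rows:
    by_season[year] += waa

seasonal = {year: round(waa, 1) for year, waa in by_season.items()}
best_first = sorted(seasonal, key=seasonal.get, reverse=True)
career_total = round(sum(seasonal.values()), 1)
print(seasonal[2014], best_first[:2], career_total)
```

With the two 2014 rows merged into 6.4, the seasons sort as 2016 first and 2014 second, and the career total still comes to 29.3.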

Who is Jose Urena?

Year WAA Name_TeamID Pos Rank
2015 -1.8 Jose_Urena_MIA PITCH -138-
2016 -3.8 Jose_Urena_MIA PITCH -029-
2017 2.1 Jose_Urena_MIA PITCH +162+
Total -3.5

An overall below average career but he pitched very well last season, ranking in the top 200 at #162.  Cubs may have an edge on starters but not sure about a 2-1 edge.

According to the DeltaWAA chart using team wins and losses from last year, the Cubs should have around a 0.639 chance of winning.  I’m not so sure that’s a valid measure for this new season.

tl;dr Cubs line a clear discard.  Really don’t know anything about anyone to bet the MIA line either.  Happy new baseball season!

Another baseball season soon starts anew…

…and I have done very little in the off season.  This model and handicapping need data, which means no real player analysis until May.  The betting markets start on day one so someone might have the percentages worked out.

In this model everyone starts a new season at 0.  Clayton Kershaw is ranked equal to the worst player you can imagine still on an MLB active roster.  We need data to separate them but it is possible to use career data for the first month.  There might be a way of incorporating career potential into handicapping.  In the beginning of the season all we know is career data.  By the end of the season a player needs to be judged by “what have you done for me lately” stats.

As soon as the rosters come out I will post a table like what was done for the playoffs.  The simulations need to be finished as well but we have until May for that.  Until then….