Playoff Horse Race Part 3

Now that we’re into the home stretch of the season, let’s look at the Playoff Horse Race once again according to this data model.  The table below is sorted by Total team WAA according to current team rosters.  Total is the sum of Hitters and Pitchers; Pitchers is the sum of Starters and Relief.

Because expanded rosters distort relief value, only the top 7 relievers are counted.  The model counts all Hitters and Starters, as expanded rosters do not affect those categories as much.  New guys coming from the minors start out at WAA=0.  Teams tend to label high negative pitchers as relief, which greatly distorts that measure in September.  This won’t be a problem for final playoff rosters, nor is it a problem in the other months of a season.
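The counting rule above can be spelled out in a few lines of Python.  This is only a minimal sketch, not the model's actual code: the Player record, the role labels, and the reading of "top 7" as the seven highest relief WAA values are all illustrative assumptions.

```python
from dataclasses import dataclass

RELIEF_CAP = 7  # only the top 7 relievers are counted


@dataclass
class Player:
    name: str
    role: str   # assumed labels: "hitter", "starter", or "relief"
    waa: float  # a new call-up from the minors would start at 0.0


def team_totals(roster):
    """Sum roster WAA by category, capping relief at the top RELIEF_CAP values."""
    hitters = sum(p.waa for p in roster if p.role == "hitter")
    starters = sum(p.waa for p in roster if p.role == "starter")
    # Assumption: "top 7" means the seven highest WAA values among relievers.
    relief_values = sorted(
        (p.waa for p in roster if p.role == "relief"), reverse=True
    )
    relief = sum(relief_values[:RELIEF_CAP])
    pitchers = starters + relief
    return {
        "Total": hitters + pitchers,
        "Hitters": hitters,
        "Pitchers": pitchers,
        "Starters": starters,
        "Relief": relief,
    }
```

With a cap like this, a September call-up with a large negative value who gets labeled as relief drops out of the Relief sum as long as seven better relievers are on the roster.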

Playoff Horse Race

TeamID  W-L  Total  Hitters  Pitchers  Starters  Relief
BOS      55   49.1     28.5      20.7      11.7     9.0
HOU      38   47.5     16.2      31.3      18.9    12.4
LAN      15   46.5     21.9      24.6      14.9     9.7
ATL      18   36.3     15.5      20.8      11.3     9.4
CLE      18   35.9     19.4      16.5      11.0     5.5
OAK      31   35.2     14.5      20.7       5.4    15.3
NYA      34   34.0     20.8      13.2       4.7     8.4
CHN      26   31.6     12.0      19.6       7.2    12.4
MIL      21   29.3     12.2      17.1       4.5    12.7
SLN      13   28.5     11.6      16.9      11.2     5.7
WAS       1   24.6     10.5      14.1       8.1     6.0
COL      14   21.7     13.3       8.4      -0.4     8.7
ARI       7   18.5      2.7      15.9       6.1     9.8
PHI       5   17.9      3.0      15.0       6.8     8.2
ANA      -3   14.6      6.4       8.2       1.0     7.2

The Cubs (CHN) rose in this table from Part 2 of this series due to the new way of counting relief in September rosters.  For brevity, the above table only shows the top 15 MLB teams, and all playoff contenders are now included since Colorado (COL) climbed back onto this list.

Blue teamids are those leading their divisions.  Green are wild card leaders, and tan are those still in the hunt.  There are no AL teams (other than those with a playoff spot) in the hunt right now.   Playoff teams tend to be buyers at trade deadlines while those not in contention are sellers trading high value players for future prospects.   One would expect playoff teams to top a list like this and they do.

Bold blue in the other columns are the highest among playoff contenders, regular blue second highest in that category.  This should provide an idea of what to expect during various AL and NL playoff matchups.

The Yankees’ talent still looks low compared to their W-L (WAA) value of +34.  Ironically, this is the opposite of last year, when they had a rather low (for a playoff team) real team WAA=20 yet ranked near the top in team value.  They ended up losing to a better Houston team in game 7 of the ALCS.  After manually looking at the Yankees roster, it looks like only Aaron Judge is missing, which would only propel them to around the middle of the pack, about where the Cubs are now.

The Dodgers and Oakland rank very high for teams that are not division leaders.  Except for the wild card game, which is a crap shoot, regular season roster talent is more important than regular season wins and losses.  This model spits out these tables automatically.  There is no way I can discern what caused LAN or OAK to rank so high unless I keep track of every team’s transactions.  One of the purposes of this model is not to have to do that.

There will be one more part to this series using expanded rosters and then a final part using playoff rosters, which usually all come out after the wild card games.  Right now the AL looks pretty strong and the Cubs are in the middle of the pack, again.

Playoff Horse Race: ASG to present

I had been toying with the idea of presenting the above table showing just the second half of the season.    I’m not a fan of streaks because in golf, you can’t just count the back 9 and ignore how you performed on the front 9.  Also past results don’t affect future results, only show capability.  Would it be interesting however to see what teams did only counting the back 9?

What started as something I thought would be simple to jury-rig turned complicated, and 300 lines of Perl code later we get the following.

The real halftime of an MLB season is around the end of June.  The ceremonial halftime is the All Star break.  I chose to use the All Star break date as the start date and today’s date as the end date.  The All Star break gives players time off, and they may reflect upon their season so far and have epiphanies, like what happened to the Cubs last season.  Splitting the season this way is possible because the WAA value measure generated by this model has proven additive properties.
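The additive property is what makes an interval table possible: value accrued between two dates is just the difference between two season-to-date snapshots.  Here is a rough Python sketch of that idea, not the 300 lines of Perl.  The snapshot layout (a dict of player name to season-to-date WAA) and the function names are hypothetical, and a player missing from the start snapshot is assumed to start at 0.

```python
def interval_waa(start_snapshot, end_snapshot):
    """Per-player WAA accrued between two season-to-date snapshots.

    Only players in the end snapshot (assumed to be the current roster)
    are counted.  A player absent from the start snapshot is treated as
    having 0.0 at the start date.
    """
    return {
        name: end_value - start_snapshot.get(name, 0.0)
        for name, end_value in end_snapshot.items()
    }


def team_interval_total(start_snapshot, end_snapshot):
    """Team value accrued over the interval: the sum of the player deltas."""
    return sum(interval_waa(start_snapshot, end_snapshot).values())
```

Because this works from two snapshots, it cannot tell value earned through play from value that arrived or left through roster moves.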

TeamID  W-L  Total  Hitters  Pitchers  Starters  Relief
OAK      18   16.8      7.2       9.5       2.8     6.7
BOS      17   19.9     10.3       9.6       7.8     1.8
TBA      13   14.6      0.1      14.4       6.1     8.3
SLN      11   22.0     11.4      10.6       3.3     7.3
MIL       9   11.2      4.1       7.1       3.0     4.0
HOU       9   12.2      3.2       9.0       5.3     3.7
CLE       9   13.2      5.6       7.7      -1.3     9.0
CHN       9    6.4     -2.2       8.5       3.1     5.5
COL       8   10.6      0.0      10.6       0.1    10.5
ATL       8   15.8      6.3       9.5       3.5     6.0
NYN       6    8.8      1.8       7.0       3.2     3.7
NYA       5   13.1      9.9       3.2       4.0    -0.8
LAN       5   24.0     12.9      11.1       6.2     4.9
WAS       1    9.7      5.6       4.1       2.2     1.9
PIT       0    9.8     -6.2      16.0       7.5     8.5

The above is sorted by the W-L column (WAA) because sorting on Total value over an interval produces deceptive results.  This interval only represents accrued team value, not true team value going into the playoffs as shown in the first table.

Edit for clarification 9/17/2018: Accrued team value can occur from getting rid of negative value players, acquiring positive value players through trades or DL activation, and earning it through play.  The above table does not discern this.  The W-L (WAA) column represents real accrued team value for this interval that is 100% accurate.

Oakland clearly leads the league in the second half.  Unfortunately for them Houston is in their division so OAK will be stuck playing a wild card crap shoot game.  Boston has been chugging along the second half like their first half.

According to Total roster value, the Dodgers and Cardinals improved the most during the second half, as well as in the real win/loss column.  Cubs roster value increased the least among the 15 teams in the above list followed closely by the Yankees.

Since the start date and end date of this delta are two snapshots, it’s possible players on either end could be on the DL, distorting the delta value somewhat.  I’m not sure what value the above table provides.  It uses a reduced dataset, which will increase error.

Streaks are funny and many TV announcers like to cherry pick streak intervals to further some narrative.  In the realm of TV and fandom that really doesn’t matter but it’s deceptive, purposely sometimes, if you want a true analysis of the current state of a team or player.

Streaks are used by players thinking they can beat craps, roulette, slot machines, etc. etc. which keep the hotels in places like Las Vegas filled and casino profits high.  They also get people to lose money in the stock market or crypto currency LOL.  Every time you hear JD spout some streak nonsense cover your ears because BS usually follows.

This is why the delta table above may be worthless.  A true measure of a player and a team is a complete season.  More data equals less error, and the MLB commissioner does not pick playoff teams based on who won the most games after the All Star break.  The above was an interesting illustration and the 300 lines of code to make that table may be useful for other purposes.

That is all for now.  Part 4 of this series will come in a week or so and then Part 5 will use official playoff rosters.  The wild card games will be handicapped the old-fashioned way and then the real playoff season begins.  Until then ….