Category Archives: Outside Articles

Cubs lose 5th straight 1-run game

On Sunday, they became just the second team in 100 years to get swept at home in a four game series and lose all four by a single run. They also became the first major league team since 2011 to lose five straight one-run games. It’s the first time it’s happened to the franchise since 1915.

Source: Crumbling Cubs lose 5th straight 1-run game

There are three color coded assertions above.  Let’s look at all three, starting with the assertion colored in blue.  After hearing about this 5 game streak it occurred to me that a question like this is something this data model should easily handle.  This, however, required modifying a script that counts events in historical game logs since I hadn’t envisaged this use case.

First things first.  What is the probability of losing 5 one run games in a row?  Since we have no information other than that there are two outcomes to each game, we can assume

P(lose) = 1/2 = P(win) , just like a flip of a coin.

The probability of a one run game is 0.30 using data from 1970 – 2018 , thus

P(one run game) = 3/10.

The probability of losing a one run game is P(L) * P(one run game) = 0.5 * 0.3 = 0.15.  This is the same as the probability of winning a one run game.  Thus, the probability of losing 5 one run games in a row would be 0.15^5 which is around 1/13169.
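For anyone who wants to check the arithmetic, here is a minimal Python sketch.  It only restates the assumptions above (coin flip outcome, 30% one run games, every game independent of the last).  The variable names are made up for illustration and are not from the scripts this data model uses.

```python
# Assumptions taken from the text: coin-flip outcomes, 30% one-run games,
# and independence between games.
P_LOSE = 0.5
P_ONE_RUN = 0.3

# Probability that a single game is a one-run loss.
p_one_run_loss = P_LOSE * P_ONE_RUN

# Probability that five specific consecutive games are all one-run losses.
p_five_straight = p_one_run_loss ** 5

print(f"P(one-run loss)  = {p_one_run_loss:.2f}")
print(f"P(five straight) = 1 in {1 / p_five_straight:.0f}")
```

The independence assumption is doing a lot of work here; a real team's results are not coin flips, so treat the number as a baseline only.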

Since there are 157 * 30 / 2 = 2355 possible starts to a 5 game stretch per season, we should expect one occurrence every 5 or 6 seasons (13169 / 2355 is about 5.6).  Between May 8 and May 13, 2011, Arizona lost 5 straight 1 run games, which is around what we would expect.  Before that, in September 1988, 23 seasons before 2011, Atlanta lost 6 one run games in a row, which is a larger gap than we would expect.  Distributions are never perfect, especially with small sample sizes.
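The expected gap between streaks can be estimated with a few lines of Python.  This is a rough sketch that uses the 2355 figure from the text and treats every 5 game window as an independent trial, which overlapping windows are not, so the result is a ballpark number and nothing more.

```python
# Figures from the text: about 2355 possible 5-game windows per season
# and a (0.15 ** 5) chance that any one window is five one-run losses.
windows_per_season = 157 * 30 / 2
p_streak = 0.15 ** 5

# Expected number of such streaks in one season, league wide.
expected_per_season = windows_per_season * p_streak

# Average number of seasons between occurrences.
seasons_between = 1 / expected_per_season

print(f"Expected streaks per season: {expected_per_season:.3f}")
print(f"Roughly one every {seasons_between:.1f} seasons")
```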

Let’s look at the second assertion colored in brown. Each team plays around 11 four game home stands per season or a little over half of their 81 home games.  In 100 years that would be around 1000 events where a sweep like that can happen.

P(losing 4 one run games in a row) = [P(Lose) * P(one run game)]^4 = 0.15^4 = 1/1975

The probability of going 100 years, or about 1000 events, without losing 4 one run games in a home stand is around 60%.   The probability of it not happening next season is around 99.44%.  In other words, this assertion has nothing to do with the quality of the Cubs as a team and more to do with how the pachinko ball bounces.
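Both percentages come from the same formula: the chance an event never happens in n independent tries is (1 - p)^n.  A small sketch, assuming 1000 home stands per century and 11 per season as stated above (the function name is hypothetical):

```python
def prob_no_occurrence(p_event: float, trials: int) -> float:
    """Chance that an event with probability p_event never occurs
    in `trials` independent tries."""
    return (1.0 - p_event) ** trials


# Four straight one-run losses, about 1 in 1975 under the coin-flip model.
p_sweep = 0.15 ** 4

century = prob_no_occurrence(p_sweep, 1000)   # ~1000 four-game home stands
next_season = prob_no_occurrence(p_sweep, 11)  # ~11 four-game home stands

print(f"No such sweep in 100 years: {century:.1%}")
print(f"No such sweep next season:  {next_season:.2%}")
```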

Update 10/1/2019:  In other words, had the Cubs lost a game by 2 runs in the middle of that losing streak no one would be talking about this.  Whether a team loses by 2 runs or 1 run is irrelevant, but streaks make for click bait and give sportscasters something to pontificate about.

The Cubs went 2-7 in the last 10 days of the season losing 5 games.  How many a team loses in a row or how they lose those games is irrelevant.  Had they gone 7-2 instead there would have been a 3 way scrum like last year for two playoff spots and who knows how that would have turned out.

Bottom line:  A baseball season is a marathon and the final record of a team encompasses 162 games played over 6 months, not a mere 5 in less than a week.  From all 10 parts of the Playoff Horse Race series of posts here it was clear very early that the Cubs didn’t have the horses to win an NL pennant, let alone a World Series, this year.  The Cubs remained stagnant, albeit above average, all season so they ended the season about where they should have.

End of Update

And finally, for the assertion in green.  1915 is 104 years of baseball or around 16,000 games played.  We saw above that the probability of losing 5 one run games in a row is 1/13169.  The probability of going 16,000 games without losing 5 one run games in a row is

P’ = ( 1 – (1/13169) ) ^ 16000 =~ 30%

Thus there was about a 70% probability of it happening again in the time frame since 1915.
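The 30% and 70% figures can be reproduced with a short sketch.  It assumes the rough 16,000 game count from the text and counts each game as one independent chance to start a streak, which is a simplification.

```python
# Values from the text; both are approximations.
p_streak = 0.15 ** 5        # about 1 in 13169
games_since_1915 = 16_000   # roughly 104 seasons of games

# Chance the streak never occurs, and its complement.
p_never = (1 - p_streak) ** games_since_1915
p_at_least_once = 1 - p_never

print(f"Never since 1915:       {p_never:.0%}")
print(f"At least once by now:   {p_at_least_once:.0%}")
```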

The probability of it not happening next season is about 98.8%, since ( 1 – (1/13169) ) ^ 162 =~ 0.988.  What does this have to do with the current Cubs team?  Nothing.  The Cubs simply couldn’t pull off wins at the end of this season and sometimes numbers align funny.

Wild Card handicapping and playoff coverage starting tomorrow.  Until then ….

DISCLAIMER: There are probably one or more errors in the math above.

5 storylines to follow the remainder of the Iowa Cubs season

Iowa hasn’t been to the playoffs since 2008. But that could change this season. Iowa finished the first half of the season with a 52-38 mark, tops in the Pacific Coast League’s American Northern Division. It owns a 10-game lead over the second place Omaha Storm Chasers.

Source: 5 storylines to follow the remainder of the Iowa Cubs season

Our analysis of this Iowa Cubs team here.

Cardinals To Extend Miles Mikolas

This time one year ago, Cardinals fans were unsure what to think of Mikolas, the team’s primary rotation addition last winter. At the time, Mikolas was a 29-year-old who’d never established himself in the Majors but put himself firmly on MLB radars with a brilliant three-year run for the Yomiuri Giants of Japan’s Nippon Professional Baseball.

Source: Cardinals To Extend Miles Mikolas – MLB Trade Rumors

I looked up Miles Mikolas after seeing this article mentioned on Twitter and found his career trajectory interesting.  Let’s look at his minor league career search table.

Miles Mikolas

Year  Rank    WAA    Name_TeamID        Pos    WinPct  League  Age
2018  +018+    6.11  Miles_Mikolas_SLN  PITCH  XXX     mlb
2014  -028-   -3.53  Miles_Mikolas_TEX  PITCH  XXX     mlb
2014  XXXXX    1.20  Miles_Mikolas_TEX  PITCH  0.621   aaa     25
2013  XXXXX    1.34  Miles_Mikolas_SDN  PITCH  0.599   aaa     24
2012  XXXXX    0.32  Miles_Mikolas_SDN  PITCH  XXX     mlb
2012  XXXXX    0.55  Miles_Mikolas_SDN  PITCH  0.626   aaa     23
2012  XXXXX    0.27  Miles_Mikolas_SDN  PITCH  0.599   aa      23
2011  +161+    1.78  Miles_Mikolas_SDN  PITCH  0.748   aa      22
2011  +092+    2.73  Miles_Mikolas_SDN  PITCH  0.809   aplus   22

If you track the Win% column for the minors, he was pretty good when SDN brought him up to their MLB team for 25 games of relief in 2012.  Back to AAA he went until resurfacing for TEX in 2014 with a horrible season, ranked 28 in the bottom 200, a list no one wants to top.

The article states he played 3 years in Japan between 2015 and 2017.  Japan leagues used to be compiled here but haven’t been since 2013, so there is a gap in the above table.  The Cardinals were impressed enough to gamble and it paid off handsomely last season.  Mikolas was ranked #18 according to this data model out of all players on 30 teams, both pitchers and batters.

I found the above table interesting because it clearly demonstrates that past results don’t necessarily predict future results.  His 0.600+ Win% in minors was no match for MLB hitters  in 2014, yet, he somehow figured it out.

A lot more goes into evaluating talent than looking at past stats.  Providing a lens that accurately shows the past is important.  If a team believes a player had a great season when he actually didn’t, that could be a problem, and it is part of what separates consistently winning franchises from consistently losing ones.

That is all for now.  Crunching some DH numbers for another report maybe tomorrow or the next day.  Until then ….

Foul Balls Are The Pace-Of-Play Problem Nobody’s Talking About

So why are there more foul balls?

Source: Foul Balls Are The Pace-Of-Play Problem Nobody’s Talking About | FiveThirtyEight

This is one of those fun-with-numbers articles that throws around a bunch of rate gains to support a narrative.  This data model has a historical dataset of game events from around 1960 to present.  That dataset has pitch by pitch data from 1988 to present.

The premise of the above article is that there are more foul balls now than 20 years ago.

tl;dr: Article is correct.  There are more foul balls now.

There can be many reasons why that is.  Occam’s Razor dictates that the simplest answer is probably the correct one.  The simplest reason would be that there is less foul territory in which fielders can catch foul balls.  The article mentions this as one of several possible reasons.  It mentions that foul territory decreased around 20%, and our data shows foul balls have increased around 20%.  Reduced foul territory seems like a pretty solid reason.

The next question, which is the premise of this article: Are foul balls a cause of increased game times in baseball?  That is unclear.  Let’s look at the output of this data model.  We use tables here instead of graphs.

MLB Pitch Counts from 1998 to 2018

Year Total Foul% InPlay% Strike% Ball%
1998 650813 16.0 20.2 25.8 38.0
1999 694856 16.0 20.0 25.4 38.7
2000 714344 16.3 19.7 25.4 38.6
2001 695287 16.7 20.0 26.2 37.1
2002 696317 16.6 20.0 26.1 37.3
2003 697785 16.5 20.2 26.1 37.2
2004 705505 16.7 19.9 26.0 37.4
2005 691180 16.8 20.3 26.0 36.9
2006 702107 16.9 20.0 25.9 37.2
2007 706552 16.8 19.9 26.1 37.2
2008 708078 16.9 19.5 26.2 37.4
2009 710781 16.7 19.3 26.5 37.5
2010 704222 16.6 19.2 27.0 37.2
2011 701083 16.9 19.4 27.0 36.8
2012 698670 16.8 19.0 27.6 36.5
2013 703321 17.0 19.0 27.5 36.5
2014 697839 17.2 19.0 27.8 36.1
2015 695003 17.5 19.0 27.4 36.1
2016 708762 17.5 18.4 27.6 36.5
2017 714188 17.6 18.1 27.8 36.5
2018 713881 17.6 18.0 28.2 36.3

Another table with a lot of numbers so let’s digest the above.  The Total column shows total pitch count for the 30 x 162 / 2 = 2430 games in a season.  Numbers in the next 4 columns are percentages of that total, which should add to 100%.

The length of a game is determined by pitch count, not by the type of pitch.  Colored in tan are total pitch counts under 700K and in blue the last two years, which are two of the three highest totals.   It is deceptive, however, to call this a trend because 2014 and 2015 had under 700K pitch counts and 2000 had the highest pitch count in this entire table.

It could be a trend based upon the last two seasons but not based upon the last 20 just by eyeballing that table.

As a percentage of the total, foul balls have clearly increased over the past 20 years.  The last 20 years have seen many new ball parks.  Even Wrigley Field reduced foul territory.  The above data makes sense.  What can MLB do about it?  Probably nothing.   Teams want to place seats closer to the field in order to increase revenue, which pays the ever increasing player contracts.

The above also confirms the article’s observation of the reduction of In Play contacts which had its highest percentage of total 20 years ago and the lowest last season.  What the article didn’t mention however was that Strikes increased while Balls decreased meaning Total has been relatively constant.

Stats can be deceptive and manipulated to “prove” some narrative.  The time between pitches is probably very similar between a foul ball and a called strike, thus, the Total pitch count is all that matters.  If the last two years are indicators of what the next two years will bring then maybe there is a problem.

The problem, however, isn’t foul balls, it’s the increase in strikes.  In 1998 there were 168K pitches recorded as strikes.  By 2018 that number rose to 201K, an almost 20% increase.  IMHO this is due to strikeouts being overvalued in Sabermetric stats like WAR and FIP and even pitch count type ratios.  A ground out or a fly out counts just as much as a strikeout, but not to Draft Kings or other kinds of fantasy teams.  Pitchers have learned the value of their contracts depends on keeping Draft Kings stats high.  So even though they might get a guy to ground out on one or two pitches, they go for the strikeout, which requires a minimum of three pitches.  This keeps their K/9 and K/BB and whatever other nonsense ratios I’m missing high.
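The 168K and 201K figures follow from multiplying the Total column by the strike percentage.  Here is a minimal sketch using only the 1998 and 2018 rows copied from the table above; the dictionary layout and function name are illustrative, not how the data model stores anything.

```python
# (total pitches, foul %, in-play %, strike %, ball %) copied from the table.
seasons = {
    1998: (650813, 16.0, 20.2, 25.8, 38.0),
    2018: (713881, 17.6, 18.0, 28.2, 36.3),
}


def count_from_pct(total: int, pct: float) -> int:
    """Convert a percentage of the season total into a pitch count."""
    return round(total * pct / 100)


strikes_1998 = count_from_pct(seasons[1998][0], seasons[1998][3])
strikes_2018 = count_from_pct(seasons[2018][0], seasons[2018][3])
fouls_1998 = count_from_pct(seasons[1998][0], seasons[1998][1])
fouls_2018 = count_from_pct(seasons[2018][0], seasons[2018][1])

strike_growth = strikes_2018 / strikes_1998 - 1
foul_growth = fouls_2018 / fouls_1998 - 1

print(f"Strikes: {strikes_1998} -> {strikes_2018} ({strike_growth:.1%})")
print(f"Fouls:   {fouls_1998} -> {fouls_2018} ({foul_growth:.1%})")
```

Because the table percentages are rounded to one decimal, the counts are only good to the nearest thousand or so.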

I’m tired now but perhaps in a follow up we’ll do a comparison of pitch counts between the various types of outs made.  That is all for now.  Until then ….

The Prediction Racket Part 1

The DH series requires historical event data to be compiled for 2018.  Since I have once again been lazy this off season I have been reluctant to revisit those scripts because 1) I may have forgotten how they work and 2) scripts always seem to break when sitting around for a year.

Historical scripts need to be rewritten.  The historical dataset is the foundation for simulation and required to prove or disprove their accuracy.   That’s why today I was pleasantly amused to find some distraction by this tweet.


Troll level for @No_Little_Plans :  Expert!

What better way to get back into the baseball season than arguing over valuation systems and Rob delivers.  The first rule of the Prediction Racket is:

No one can predict the future

Unless you’re a time traveler from the future no one can possibly know what will happen.  Even time travelers can alter their future by affecting something in the past which is called the Butterfly Effect.  Second rule of the prediction racket:

Past results do not affect future results

If you roll a 6 three times in a row it doesn’t mean it’s more or less likely that you’ll roll a 6 again.  The probability is exactly the same no matter what happened in the past.  If a slot machine hasn’t paid out in a long time that doesn’t make it more likely to pay out no matter what compulsive gamblers want to believe.
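Rule 2 can be checked exactly for a fair die without any simulation.  The sketch below enumerates every sequence of four rolls, which assumes a fair six sided die and independent rolls, and compares the chance of a 6 on the fourth roll with and without three 6s before it.

```python
from fractions import Fraction
from itertools import product

# Every equally likely sequence of four rolls of a fair die (6**4 of them).
sequences = list(product(range(1, 7), repeat=4))

# Sequences that start with three sixes, and those that then roll a fourth six.
after_three_sixes = [s for s in sequences if s[:3] == (6, 6, 6)]
fourth_is_six = [s for s in after_three_sixes if s[3] == 6]

# Conditional probability of a six given three sixes already rolled.
conditional = Fraction(len(fourth_is_six), len(after_three_sixes))

# Plain probability of a six on the fourth roll, ignoring history.
unconditional = Fraction(
    sum(1 for s in sequences if s[3] == 6), len(sequences)
)

print(conditional, unconditional)
```

Both fractions come out identical, which is the whole point of the rule.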

Third rule in the Prediction Racket:

Use the past as a template for the future

The third rule is more of a how-to for those interested in being part of the Prediction Racket.  The first step is to take standings from last season and adjust the teams up or down.  If you’re wrong no one will remember.  If you’re right you make sure everyone knows.  Win Win.

Let’s look at standings from last season with a screenshot of before they update for this season.


Now let’s look at a current screenshot of standings as reported by Baseball Prospectus using PECOTA.  A screenshot is used because this page probably will get updated and changed.


They didn’t put much effort into NL West as that’s almost exactly the same.  They think ATL will be somewhat worse than last season, NYN somewhat better, PHI and MIA about the same.  No real insight except for the Cubs which is the click bait troll of this entire article.  They think NL West as a division will be about the same but NL Central will be worse.

This model uses 3 year career splits to rank teams by strength.  Below is a truncated table of top 15 MLB teams created at the beginning of 2018 when we had complete 25 man rosters.

Top 15 MLB teams April 2018

TeamID Hitters Pitchers Starters Relief Total W-L
HOU 49.53 70.04 35.28 34.76 119.57 0
CHN 49.94 55.63 36.47 19.16 105.57 0
CLE 32.33 59.88 26.13 33.75 92.21 0
WAS 36.34 55.19 40.93 14.26 91.53 0
BOS 55.55 35.24 23.90 11.34 90.79 0
LAN 17.97 65.55 48.05 17.50 83.52 0
TOR 40.74 41.30 26.12 15.18 82.04 0
NYA 30.25 51.65 17.22 34.43 81.90 0
COL 59.60 10.11 -3.68 13.79 69.71 0
NYN 25.85 34.70 25.38 9.32 60.55 0
BAL 35.79 9.31 -15.11 24.42 45.10 0
MIL 15.53 24.69 8.49 16.20 40.22 0
MIN 15.78 19.32 4.18 15.14 35.10 0
ANA 24.16 10.20 5.10 5.10 34.36 0
SFN 13.20 17.68 0.33 17.35 30.88 0

This table takes WAA career value for 2015, 2016, and 2017 and sorts by Total.  Total is the sum of Pitchers and Hitters.  Pitchers is the sum of Relief and Starters as we knew at the beginning of the 2018 season and posted here.   Since we’re from the future we know how this season turned out.  Colored in bold green are the NLDS contenders and bold blue the World Series contenders.
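How the columns relate can be shown with a small sketch.  It uses three rows copied from the table above, deliberately entered out of order; the tuple layout and function name are hypothetical and only meant to show the sums and the sort, not the scripts that built the table.

```python
# (team, hitters, starters, relief) copied from the table, out of order.
rows = [
    ("CLE", 32.33, 26.13, 33.75),
    ("HOU", 49.53, 35.28, 34.76),
    ("CHN", 49.94, 36.47, 19.16),
]


def team_totals(hitters: float, starters: float, relief: float):
    """Return (pitchers, total), where pitchers = starters + relief
    and total = hitters + pitchers, rounded to two decimals."""
    pitchers = starters + relief
    return round(pitchers, 2), round(hitters + pitchers, 2)


# Sort by Total, highest first, as the table does.
ranked = sorted(rows, key=lambda r: team_totals(*r[1:])[1], reverse=True)

for team, hitters, starters, relief in ranked:
    pitchers, total = team_totals(hitters, starters, relief)
    print(team, hitters, pitchers, starters, relief, total)
```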

Eight of the top twelve ranked teams made it into the playoffs, with only Atlanta and Oakland missing from the top half of MLB.  Atlanta was ranked 27 out of 30 teams and they ended up winning the NL East.  OAK is literally the team Michael Lewis wrote about in Moneyball and was ranked #18.  They are constantly cycling new players through their system.

The above only measures career value, so the potential of new players doesn’t get counted.  In 2015 both the Cubs and Astros were ranked at the bottom of this list.  Both had a lot of new guys.  The Cubs ended up in the NLCS that season and both teams rose to the top of the league with career players in subsequent years.

The above table will be reproduced for this season in April when we get a complete dataset of roster data.   Rules 1 and 2 of the Prediction Racket preclude making projections.  The above table demonstrates, however, a team with top career talent will most likely do well in the regular season.  A team that ends up in the bottom half of this list will probably not make the playoffs but as Rule 1 states, No one can predict the future.

That is all for now.  Subsequent parts to this series will be written if there are any funny tweets during Spring Training or in April when complete 25 man rosters are known and we can have some fun with PECOTA like we do with WAR.  Also, Part 2 of the DH series will be forthcoming.

Since the minor league database has been updated we’ll take a look at the new guys on the Cubs in Spring Training as well as the White Sox.  The White Sox are ranked 6th in the AL according to the Vegas book as to who is going to win the ALCS.  PECOTA isn’t so favorable according to this:


Refer to Rule #1.  Until then ….