Category Archives: Outside Articles

An Important Moment in Baseball History Captured in a Panoramic Photo

Every U.S. president from William Howard Taft to John F. Kennedy tossed out a ceremonial first pitch from the ballpark’s stands. This was even true on that first Opening Day in 1911. Indeed, somewhere in our mystery photo, Taft is sitting in the stands, enjoying the ball game. Thankfully, his first pitch was captured in a different photo published in the Evening Star the following day:

Source: Baseball Researcher: An Important Moment in Baseball History Captured in a Panoramic Photo

Tampa Bay’s “Opener” Experiment Could Spark a Baseball Revolution

The average starter in 2018 faces just 23 batters—the lowest total ever and the latest point in a decades-long decline. That number means a starting pitcher faces a lineup two and a half times, on average. Because teams know that starters tend to perform worse as they progress later into a game, it follows that they’d restrict those third-time-through-the-order matchups to the worse hitters at the bottom of a lineup, rather than the Mike Trouts and Justin Uptons at the top.

via Tampa Bay’s “Opener” Experiment Could Spark a Baseball Revolution – The Ringer.
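The arithmetic behind the quoted "two and a half times" figure can be sketched as below. This is only a rough illustration: the 9-man lineup is standard, but the function name is made up for this example and is not part of this data model.

```python
LINEUP_SIZE = 9  # standard batting order length


def times_through_order(batters_faced, lineup_size=LINEUP_SIZE):
    """Return (ratio, full passes, leftover batters) for a starter.

    Hypothetical helper for illustration only.
    """
    full_passes, extra_batters = divmod(batters_faced, lineup_size)
    return batters_faced / lineup_size, full_passes, extra_batters


# 23 batters faced is the 2018 average quoted in the article above.
ratio, full_passes, extra_batters = times_through_order(23)
print(f"{ratio:.2f} times through the order")
print(f"{full_passes} full passes plus {extra_batters} more batters")
```

In other words, the average starter gets through the lineup twice and then sees only a handful of hitters for a third time, which is why it matters who those hitters are.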

This is some very interesting outside-the-box managerial thinking. When debugging the dataset for tiering simulations I noticed an interesting trend in relief value. In the 70s teams had very negative value for relief squads, as if they put all their washed-up pitchers there. It started increasing in the early 80s and now average relief value is very high.

On the todo list is an entry to present this data in some readable fashion that will contrast with the graphs presented in this article. Until then ….

Bill James: Judge and Altuve

Update 2/18/2018: I started writing this a couple of months ago and couldn’t finish after reading Bill James quote OPS, a very flawed baseball statistic, which is a tangent I don’t really care about. If people want to throw around these kinds of stats, that makes the results from this data model more valuable.

tl;dr This model reflects a team’s Win/Loss record based upon its players. WAR does not. This model uses the estimated Win/Loss record based upon Bill James’s own Pythagorean Expectation (PE) formula. We could, like James stated with the Yankees, adjust to real wins and losses very easily but we don’t. That is all….

——————————cut here——————————

I got directed to this article: Judge and Altuve | Articles | Bill James Online written by Bill James and there are some interesting tidbits that I need to comment on.  It’s difficult thinking about baseball in the winter and I have been putting this off.  This post will be updated throughout the winter as I think of something different to say.

The article is about the value of Judge and Altuve as MVP. This data model is clear and unambiguous: Aaron Judge is the MVP of the AL, ranked right behind Giancarlo Stanton, whom we also have as MVP of the NL. Here are our top 5 MLB players.

Rank WAA Name_TeamID Pos
+001+ 10.00 Corey_Kluber_CLE PITCH
+002+ 9.66 Giancarlo_Stanton_MIA RF
+003+ 8.92 Aaron_Judge_NYA RF-DH
+004+ 8.55 Max_Scherzer_WAS PITCH
+005+ 8.38 Paul_Goldschmidt_ARI 1B

AL and NL players, pitchers and batters, are all ranked together in this data model. Apparently Bill James agrees with the MVP voters that Altuve is AL MVP. Whatever. He has some interesting things to say in the article, which is a good read. Here’s a blurb:

The first indication that there is a problem with applying the normal and general relationship is this.   The Yankees, by the normal and general relationship, should have won 102 games, when in fact they won only 91.   That’s a BIG gap. The Yankees played poorly in one-run games (18-26) and other close games, which is why they fell short of their expected wins.   I am getting ahead of my argument in making this statement now, but it is not right to give the Yankee players credit for winning 102 games when in fact they won only 91 games.   To give the Yankee players credit for winning 102 games when in fact they won only 91 games is what we would call an “error”.   It is not a “choice”; it is not an “option”.   It is an error.

When you express Judge’s RUNS. . .his run contributions. . . when you express his runs as a number of wins, you have to adjust for the fact that there are only 91 wins there, when there should be 102.  (The Astros should have won 101 games and did win 101 games, so that’s not an issue with Altuve.)  But back to the Yankees, one way to do that is to say that the Yankee win contributions, rather than being allowed to add up to 102, must add up to 91.

He makes an assumption which is not true. WAR does not add up to anything, as we have shown here over and over. This model has the sum of Yankees players adding up to 102 games exactly, according to Bill James’s Pythagorean Expectation formula. Bill James is talking about this model, not WAR.
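For readers who have not seen it, here is a minimal sketch of the Pythagorean Expectation in its classic form, with runs squared. The exponent of 2, the 162-game schedule, and the function names are assumptions made for this example, and the run totals are made up; none of it is lifted from this model's code.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed.

    exponent=2.0 is the classic form; other exponents are sometimes used.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def expected_wins(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Scale the expected winning percentage to a full schedule."""
    return games * pythagorean_win_pct(runs_scored, runs_allowed, exponent)


# Hypothetical run totals, chosen only for illustration.
wins = expected_wins(800, 700)
print(f"Expected wins: {wins:.1f}")
```

The gap James is talking about is simply the difference between a number like this and the team's real win total.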

There is a simple method to make this adjustment in this model. We would tax NYA 11 games and ding every player according to playing time. According to our table above, Aaron Judge has a WAA=8.92. He would lose 0.6 on an adjustment and drop to 8.32. Since everyone in the league would be adjusted, the rankings could change, but not by nearly enough for Jose Altuve to move ahead.
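The adjustment described above could be sketched roughly as follows. This assumes the tax is split strictly in proportion to playing time. The playing-time units and the "rest of roster" entry are hypothetical, picked so that Judge's share works out to the 0.6 quoted above; this is not the model's actual code.

```python
def tax_by_playing_time(waa, playing_time, team_tax):
    """Subtract a team-wide tax (in wins) from each player's WAA,
    split in proportion to each player's share of playing time."""
    total_time = sum(playing_time.values())
    return {
        name: value - team_tax * playing_time[name] / total_time
        for name, value in waa.items()
    }


# Judge's WAA is from the table above; everything else is a placeholder.
waa = {"Aaron_Judge_NYA": 8.92, "Rest_of_roster_NYA": 10.0}
playing_time = {"Aaron_Judge_NYA": 60, "Rest_of_roster_NYA": 1040}

# 102 expected wins minus 91 real wins
adjusted = tax_by_playing_time(waa, playing_time, team_tax=11)
print(f"Judge after adjustment: {adjusted['Aaron_Judge_NYA']:.2f}")
```

Whatever the real playing-time shares are, the adjusted values for the team sum to exactly 11 wins less than before, which is the whole point of the tax.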

Right now I don’t want to do this. Runs are the currency that achieves wins and they are what players accumulate above or below average. We can assign run production with virtually 100% accuracy. This gets converted to wins according to Pythagorean Expectation, which gives the WAA value measure players carry from team to team when they get traded. This value measure is the same for all leagues from MLB to A+ to JPL to even little league. This model must work the same for all leagues. The disparity between PE and real wins and losses can be magnified in lower leagues, which could obscure the value of players who are only there to prove themselves and where wins and losses may not even matter to those teams.

I’m torn by this. It can easily be done with this model. It would create a split in valuations and, as with Sabermetrics, raise the question of which value is correct. I prefer the value that reflects the estimated wins and losses. In the end I don’t think it would matter that much anyway. Perhaps we’ll run some numbers and see.

The logic for applying the normal and usual relationship is that deviations from the normal and usual relationship should be attributed to luck. There is no such thing as an “ability” to hit better when the game is on the line, goes the argument; it is just luck.   It’s not a real ability.

We don’t know what causes a team to exceed or not exceed expectations. We can’t predict the future. We can only estimate it. Reality is the goal post; all estimates are a source of error. Luck has nothing to do with it.


Update 2/18/2018:  This is where I need to stop commenting.

Kyle Schwarber

The Cubs lost yesterday with Hendricks on the mound and I may have been a bit fatalistic because it’s still early in the season. We’ll get to the Cubs’ status tomorrow and it’s not all gloom and doom. Today I read an article about Kyle Schwarber. Here’s a quote from Joe Maddon.

“I don’t care about [the batting average]. I’m looking at at-bats, the process, what he’s doing for the team, getting on base. But for the guy, when he looks up at the scoreboard and sees numbers everywhere, and they evaluate themselves based on numbers, I don’t want him to do that. I want him to get back to the process.”  — Joe Maddon

This data model agrees and has been preaching this from the beginning. Runs win baseball games, not hits. Although Kyle Schwarber is playing absolutely terribly for his Draft Kings teams, he’s still not playing that badly for his real team. Today we’ll analyze Kyle Schwarber. Here is a full line for Kyle Schwarber so far.

Rank WAA BA OBP PA RBI R Name_TeamID Pos
XXXXX -0.36 0.177 0.297 192 19 21 Kyle_Schwarber_CHN LF

The Draft Kings people focus on his 0.177.  This model focuses on run production, RBIs and Runs.  Schwarber is a little underwater at -0.36 but that’s nothing he can’t overcome in just a couple of big games and he’ll be above average.  At -0.36 Schwarber right now is not helping the Cubs to be above average but he’s not hurting them that much either.  The Cubs are an average team because most everyone on the team is playing average — not just Schwarber.  Even Rizzo is hovering around average.

Rank WAA BA OBP PA RBI R Name_TeamID Pos
XXXXX 0.61 0.239 0.369 222 30 26 Anthony_Rizzo_CHN 1B

Rizzo was underwater last week and has since climbed above average. His BA isn’t Draft Kings stellar either right now. A baseball season is a marathon and the Cubs may be starting out slow compared to last season but they aren’t tanking.

Blast from the Past

Let’s look at Kyle Schwarber in 2015, when he came to the Cubs and was a major force propelling them to the NLCS. The following year, coming off a season-ending injury, he significantly helped the Cubs win a World Series. I would show a graph here but I’m still working on a script to make them so we’ll have to muddle through this analysis with tables for now.

Date(MMDDYYYY) WAA Name_TeamID Pos
06202015 0.38 Kyle_Schwarber_CHN BAT
06242015 0.73 Kyle_Schwarber_CHN BAT
07282015 1.11 Kyle_Schwarber_CHN BAT
08102015 1.85 Kyle_Schwarber_CHN CR
08152015 2.16 Kyle_Schwarber_CHN CR
08192015 2.83 Kyle_Schwarber_CHN CR-LF
08232015 3.57 Kyle_Schwarber_CHN CR-LF
08272015 3.38 Kyle_Schwarber_CHN LF-CR
08312015 3.61 Kyle_Schwarber_CHN LF-CR
09052015 3.99 Kyle_Schwarber_CHN LF-CR
09142015 4.37 Kyle_Schwarber_CHN LF-CR
10062015 3.76 Kyle_Schwarber_CHN LF-CR

In these snapshots he reaches a high of WAA=4.37 in mid-September, and he finishes the year ranked #49 among all MLB players.

Rank WAA BA OBP PA RBI R Name_TeamID Pos
+049+ 3.76 0.246 0.355 273 43 52 Kyle_Schwarber_CHN LF-CR

His BA is OK by Draft Kings standards but his run production is incredible for only 273 plate appearances.  It was his run production that won games for the Cubs, not his lack of hits.  There is a long season ahead and Schwarber is not in any hole right now.  Ignore the stat heads.  :-)

Update 6/3/2017: I’d like to add one more thought to put the above numbers into better context.  Here is Kyle Schwarber’s peak value on September 12, 2015.

Rank WAA BA OBP PA RBI R Name_TeamID Pos
+029+ 4.56 0.267 0.365 204 42 46 Kyle_Schwarber_CHN C

Here is Bryce Harper as of 6/2/2017. He is currently ranked by this data model as the second best player in MLB, behind a pitcher, making him the best hitter.

Rank WAA BA OBP PA RBI R Name_TeamID Pos
+002+ 4.09 0.328 0.438 208 43 44 Bryce_Harper_WAS RF

As explained throughout this site, WAA is calculated based upon run production and league-wide and team-wide averages. Playing time, which translates into games played, is measured through plate appearances (PA). For similar playing time, Schwarber on September 12, 2015 was better than the best hitter in MLB right now. That was then, this is now, but the above numbers show what kind of production Schwarber is capable of bringing the Cubs. He may be slumping now but when he breaks out, and when Rizzo and Bryant break out, the Cubs will put up a lot of offensive numbers that could carry them into the playoffs this year.
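To put the "similar playing time" comparison on one scale, here is a small sketch that expresses WAA per 100 plate appearances. The per-100-PA rate is only a convenience for this comparison and an assumption of this example, not a stat this model publishes. The WAA and PA figures are the ones from the two tables above.

```python
def waa_per_100_pa(waa, plate_appearances):
    """Wins above average per 100 plate appearances (illustrative rate)."""
    return 100.0 * waa / plate_appearances


# Figures taken from the tables above.
schwarber_2015 = waa_per_100_pa(4.56, 204)  # peak on September 12, 2015
harper_2017 = waa_per_100_pa(4.09, 208)     # as of 6/2/2017

print(f"Schwarber 2015: {schwarber_2015:.2f} WAA per 100 PA")
print(f"Harper 2017:    {harper_2017:.2f} WAA per 100 PA")
```

With nearly identical plate appearances, the rates rank the two players the same way the raw WAA values do.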

And one final thought:  A home run counts the same whether it’s hit 500 feet, 1000 feet, or barely makes the basket.  It counts the same whether it leaves the bat at 200 mph or 55 mph as long as it goes over the fence and the umpire twirls his hand.   There is too much hype about these irrelevant stats with respect to Schwarber and that needs to stop.

Pace Is the Most Consistent Pitching Stat

That suggests that the biggest factor affecting pace is the pitcher. Pitchers can change catchers, teams or leagues; face a mix of batters; and pitch in front of a different defensive alignment or in different contexts. Yet their pace of play stays largely the same from season to season.

This is evident in the leaderboard: Five of the 10 slowest pitchers with at least 100 innings last year were among the 10 slowest in 2012.

via Pace Is the Most Consistent Pitching Stat | FiveThirtyEight.

I’m not sure what the purpose of measuring pace, or of this analysis, is. I would expect pitchers who pitch slowly to pitch slowly from year to year because that is their style. What is the point of this article?

I found there was essentially no correlation between pace and FIP or WPA.

Assuming FIP is a valid form of measurement, he concludes that pace doesn’t even correlate with performance. Here is my disproof of the FIP formula.