Category Archives: Information

Update 4/24/2019

Under normal conditions we would be one week away from showing player rankings. Rosters would be available to talk about and Part 1 of playoff horse race would have already been published using 3 year split data.

Unfortunately none of that has happened, and most likely none of it will happen this year. It’s difficult to work on this when there isn’t a stream of live data to deal with. The off season projects of rebooting the simulator and moving everything into a formal database are almost complete, although nothing is ever truly complete in these kinds of projects.

The White Sox looked to have a very good team this year according to this data model based upon some of their off season moves. Since we don’t have roster data there isn’t any way to measure and show that.

As for the Cubs, I don’t know. Right about now we would be doing the first Cubs status for this season. If they get baseball season started at training camps we’ll start collecting data and doing reports. Until then ….

Simulation Reboot Part 3

In order to test the integrity of the database used in simulation we need to run tests.  With inaccurate data or bugs in scripts, the estimated probabilities the simulation produces will be inaccurate.  In this part we’ll look at Tier Combo data from three baseball eras (2000-2019, 1980-1999 and 1950-1969) to test the integrity of this data model.

Real team WAA, which uses real team wins and losses, the only stat in baseball that determines who makes the playoffs, was tiered in Part 2 of this series.  There is no dispute over real team WAA but there may be dispute over how this data model calculates it for players.  This exercise will demonstrate whether the player WAA and theories espoused by this data model have any merit.

A baseball season is much like any long race, such as a marathon, the Tour de France, or the Indy 500.  Everyone is equal at the start, and as the race proceeds contestants become more and more separated, so that the winners, the losers, and the mediocre become more and more defined.

Real team WAA is simply wins – losses.  This data model calculates and assigns WAA to players such that the sum of WAA for all players on a team equals that team’s wins minus losses.  In April and much of May teams are bunched together in real team WAA, and so are players, which makes tiering much more error prone.  This model now doesn’t start handicapping until day 60, which is around the third week in May nowadays.  This allows the standard deviations for lineups, starters, and relief squads used to calculate tiers to increase, meaning teams are separated enough to somewhat determine who is truly good this season and who is not.
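
The bookkeeping rule above can be sketched in a few lines of Python. The roster and its WAA values are made up for illustration. Only the rule itself (team WAA is wins minus losses, and player WAA must add up to it) comes from this post, and the function names are hypothetical.

```python
def team_waa(wins, losses):
    """Real team WAA is simply wins minus losses."""
    return wins - losses


def books_balance(player_waa, wins, losses, tol=1e-6):
    """True when the players' WAA values add up to the team's WAA."""
    return abs(sum(player_waa.values()) - team_waa(wins, losses)) <= tol


# Hypothetical three-player "roster" on a team that is 30-24 (WAA = +6).
roster = {"player_a": 4.5, "player_b": 2.5, "player_c": -1.0}
print(team_waa(30, 24))               # 6
print(books_balance(roster, 30, 24))  # True
```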

Much like marathons or Indy 500s, teams and players often crash and burn by the end of season.  This model quickly adjusts to reflect that.  Stats like batting averages do not.

There are two types of tier combos used in simulation: lineup -> starter and lineup -> relief.  Each game contains two pairs: one pair for the away team and one pair for the home team.

Tier combos are calculated by subtracting the pitching component tier number (starter or relief) from the lineup tier number.  Tier numbers are calculated by this simple formula:

Tier Number = 2 * ( WAA – league WAA average ) / league standard deviation

WAA for a lineup is the sum of player WAA for that lineup.  The league WAA average and standard deviation are calculated over a running window of the 30 teams’ last 3 lineups ( 90 lineups ).  A snapshot is taken at the beginning of each day, then averages and tier numbers for each team are calculated.

WAA for a starter relies on a single player.  WAA for relief is the sum of a relief squad.  Relief squads are estimated from event data and are pretty accurate.

Tier numbers are floating point numbers.  When subtracted to make a tier combo they get rounded up or down to make an integer.  Right now tier numbers have a range of -4 to +4 and tier combos have a range of -6 to +6.  The simulator only cares about tier combos.
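
Here is a minimal sketch of the tier number formula and the tier combo subtraction. The league sample is made up, the population standard deviation is an assumption, and so is the capping logic: this post gives the ranges ( -4 to +4 and -6 to +6 ) but not how out-of-range values are handled.

```python
from statistics import mean, pstdev


def tier_number(waa, league_avg, league_sd):
    """Tier Number = 2 * (WAA - league WAA average) / league standard deviation."""
    return 2.0 * (waa - league_avg) / league_sd


def clamp(value, low, high):
    """Limit value to the closed interval [low, high]."""
    return max(low, min(high, value))


def tier_combo(lineup_tier, pitching_tier):
    """Lineup tier minus pitching tier, as an integer from -6 to +6."""
    # Assumption: tiers are capped at +/-4 and the combo at +/-6.
    diff = clamp(lineup_tier, -4.0, 4.0) - clamp(pitching_tier, -4.0, 4.0)
    # round() goes to the nearest integer (exact .5 ties go to the even one).
    return int(clamp(round(diff), -6, 6))


# Made-up WAA sums for five lineups standing in for the 90-lineup league window.
league_lineup_waa = [-6.0, -2.0, 0.0, 2.0, 6.0]
lineup_avg = mean(league_lineup_waa)
lineup_sd = pstdev(league_lineup_waa)  # population SD is an assumption

lineup_tier = tier_number(6.0, lineup_avg, lineup_sd)
starter_tier = tier_number(1.0, 0.0, 2.0)  # hypothetical starter WAA, average, SD
print(lineup_tier, starter_tier, tier_combo(lineup_tier, starter_tier))
```

The factor of 2 in the formula is what makes one tier equal to half a standard deviation.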

The run used to make the below tables looks at all games between 6/1 and 8/31.  Tiers fluctuate too much in April and May and in September player expansion can distort roster value.  Although we may handicap games in September and late May, we’re sticking to a much narrower window for the dataset simulation draws from.

The below tables show all the tier combo sets from -6 to +6, with columns for runs per inning ( R/Inn ) and outs recorded per game ( Outs ) for both lineup -> relief and lineup -> starter.

First let’s look at the modern era from 2000-2019 which encompasses around 25,000 baseball games from 6/1 to 8/31.

2000 – 2019 Tier Combos

TC  Relief R/Inn  Relief Outs  Starter R/Inn  Starter Outs
-6 0.359 8.76 0.353 19.86
-5 0.391 8.93 0.372 19.40
-4 0.390 8.86 0.409 18.93
-3 0.415 9.07 0.432 18.47
-2 0.424 9.14 0.449 18.19
-1 0.429 9.09 0.473 17.88
0 0.442 9.21 0.488 17.71
1 0.462 9.38 0.512 17.42
2 0.470 9.30 0.514 17.47
3 0.488 9.47 0.534 17.26
4 0.490 9.37 0.546 17.05
5 0.526 9.62 0.585 16.90
6 0.561 9.60 0.600 16.87

A Tier Combo of -6 is a terrible lineup facing a very good relief squad or starter.  The opposite is true for a Tier Combo of +6.  The above shows runs per inning against starters going from 0.353 at TC = -6 to 0.600 at TC = +6, the best lineups vs. the worst starters.  Runs per inning increase by a similar amount for the lineup -> relief combos, from 0.359 to 0.561.

The number of outs for starters goes from 19.86 outs per game with the best starter facing the worst lineups down to 16.87 outs for the worst starter facing the best lineups.  Divide by 3 to get innings.  Outs per game for relief do not vary much between -6 and +6, probably because the number of outs a relief staff must pitch has more to do with the starter than with the value of the relief squad.

The number of runs given up by relief is much less than by starters which should be expected.  Tier Combo 0 is even steven between lineups and relief or starter.  The starter runs per inning is almost exactly league average for this 20 year span.

All runs counted for pitchers above are earned runs.  When determining who wins a baseball game, the commissioner counts unearned runs equally with earned runs.  This model counts and tiers unearned runs separately for use in simulation because all runs must be accounted for to make the books balance here.  A pitcher should not be blamed for runs that are not his fault, and an official scorekeeper has kept track of that for every play in every game since the beginning of baseball.

The next table will show the 1980 to 1999 era.

1980 – 1999 Tier Combos

TC  Relief R/Inn  Relief Outs  Starter R/Inn  Starter Outs
-6 0.361 8.04 0.355 20.84
-5 0.378 8.40 0.374 20.44
-4 0.392 8.11 0.390 19.99
-3 0.375 8.19 0.409 19.51
-2 0.404 8.10 0.435 19.18
-1 0.416 8.12 0.456 18.87
0 0.418 8.41 0.466 18.77
1 0.449 8.27 0.477 18.46
2 0.465 8.49 0.500 18.37
3 0.460 8.24 0.503 18.25
4 0.458 8.67 0.536 17.92
5 0.504 8.63 0.532 17.92
6 0.495 8.92 0.553 18.03

The league had 26 teams for most of this era and did not reach 30 teams until 1998, which means fewer pitchers.  A 30 team league will have around 150 starters, a 26 team league around 130.  The above shows much narrower differences between the -6 and +6 tier combos for both relief and starter, which should be expected because talent is more concentrated.

This is a problem for the simulation, and solving it is still a work in progress.  As we go back to 1950-1969 we get to 16 team leagues with around 1/2 the number of players.  It may not be possible without some kind of adjustment to pull values from a tier combo in a 24 or 16 team league when we’re handicapping a 30 team league with much higher disparity of talent.

As we go back in time starters record more outs and relief fewer.  This means we can’t simply pull a pitcher’s innings pitched and earned runs from an early era and use them directly in simulation either.

Below is a look at the Tier Combo spread from 1950 to 1969.

1950 – 1969 Tier Combos

TC  Relief R/Inn  Relief Outs  Starter R/Inn  Starter Outs
-6 0.320 7.57 0.325 22.06
-5 0.361 6.99 0.344 21.50
-4 0.358 7.14 0.356 21.13
-3 0.365 6.87 0.380 20.48
-2 0.385 7.28 0.389 20.19
-1 0.414 7.22 0.399 19.97
0 0.416 7.23 0.416 19.78
1 0.433 7.69 0.439 19.35
2 0.425 7.72 0.443 19.21
3 0.478 7.84 0.458 19.04
4 0.493 7.92 0.479 19.03
5 0.485 8.41 0.504 18.35
6 0.560 8.63 0.502 18.65

The above are averages.  The percentage of games in which the starter pitched all 9 innings is almost an order of magnitude (10x) higher than in modern era baseball.  Runs/inning against starters are even more constricted, with mostly 16 team leagues.
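
As a quick check on how constricted each era is, the sketch below takes only the -6 and +6 rows of the three tables above and computes the runs per inning spread between them. The numbers are copied from the tables. The dictionary layout and names are just for illustration.

```python
# R/Inn at tier combo -6 and +6, copied from the three tables above.
ERA_ENDPOINTS = {
    "2000-2019": {"relief": (0.359, 0.561), "starter": (0.353, 0.600)},
    "1980-1999": {"relief": (0.361, 0.495), "starter": (0.355, 0.553)},
    "1950-1969": {"relief": (0.320, 0.560), "starter": (0.325, 0.502)},
}


def spread(endpoints):
    """Difference in runs per inning between TC +6 and TC -6."""
    low, high = endpoints
    return round(high - low, 3)


spreads = {
    era: {role: spread(ends) for role, ends in roles.items()}
    for era, roles in ERA_ENDPOINTS.items()
}
for era, by_role in spreads.items():
    print(era, by_role)
```

By this measure the starter spread shrinks going back in time ( 0.247, 0.198, 0.177 ).  The relief spread does not follow the same pattern at the endpoints ( 0.202, 0.134, 0.240 ).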

In past years this data model pulled data from 1970 to the present without any alteration.  This probably introduced error, even though the model beat Vegas, albeit not by enough to advertise.

Adjustments will have to be made on an era by era basis.  There is too much variation to come up with factoring coefficients on a yearly basis.  The eras shown above were thrown together arbitrarily to fit with the logistics of rebuilding this database.  Right now I’m thinking 1920-1960, 1961-1976, 1977-1997, and 1998-2019.

The biggest factor in narrowing Tier Combo results is the number of players in a league, which is directly related to the number of teams.  1961 – 1976 went from 20 teams to 24.  The next era went from 26 to 28, and our modern era since 1998 has been at 30 teams.

The number of innings starters pitched has also declined a lot in recent years but that’s fodder for another post.

Looks like baseball season might be cancelled  <insert sad emoji>.  This model was going to get detailed box scores from mlb.com this season, which would have made regular season handicapping much more interesting because roster value, especially relief, would be far more accurate than in past seasons.  Unfortunately we may have to wait until next year.

Still working on this simulation and the baseball-handbook.com website, which will allow easy click through for any team, any player since 1900 and any game since 1920.  Until then ….

Simulation Reboot Part 2

In order to properly simulate we need to know what happened in the past by actually counting it.  The difference in pitching innings between old era baseball and modern baseball is a problem.  We can’t simply dip into a game from the 1950s and pull a starter’s earned runs and innings pitched because starters pitched far more innings then than they do now.

DeltaWAA is the difference between the away team WAA and the home team WAA, where WAA is simply wins – losses.  The league average across all deltaWAAs must equal 0 exactly because for every W a team receives, another team receives an L.  A standard deviation can still be calculated, however, which means wins and losses can also be tiered like lineups, starters, and relief squads.

A tier as defined by this data model represents 1/2 standard deviation above or below league average.  A tier combo is the away team tier minus the home team tier.  A negative tier combo means the away team is worse than the home team, and vice versa for a positive value.  In this data model tier combos are integers, each representing a set of values.
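
A small sketch of tiering wins and losses follows. The four team league and its records are hypothetical, and computing tiers from each team’s WAA with a population standard deviation is an assumption about the details. The half standard deviation tier and the away minus home subtraction are taken from the definitions above.

```python
from statistics import pstdev


def team_tiers(records):
    """Map each team to a tier number; one tier is 1/2 standard deviation of team WAA."""
    waa = {team: wins - losses for team, (wins, losses) in records.items()}
    league_avg = sum(waa.values()) / len(waa)
    league_sd = pstdev(list(waa.values()))
    return {team: 2.0 * (value - league_avg) / league_sd for team, value in waa.items()}


def delta_tier_combo(tiers, away, home):
    """Away team tier minus home team tier, rounded and capped to -6..+6."""
    return max(-6, min(6, round(tiers[away] - tiers[home])))


# Hypothetical 4-team league; every win is matched by a loss elsewhere.
records = {"AAA": (40, 20), "BBB": (35, 25), "CCC": (25, 35), "DDD": (20, 40)}
league_total = sum(wins - losses for wins, losses in records.values())
tiers = team_tiers(records)
print(league_total)                            # 0
print(delta_tier_combo(tiers, "DDD", "AAA"))   # -5
print(delta_tier_combo(tiers, "AAA", "DDD"))   # 5
```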

Below is a table showing away and home team win percentages for each tier combo set for the years 2000-2019.

Tier Combo deltaWAA 2000-2019

TC Away% Home% Games Away R Home R
-6 0.248 0.752 929 3.608 5.473
-5 0.281 0.719 1070 3.817 5.524
-4 0.314 0.686 1984 3.845 5.235
-3 0.350 0.650 2871 3.960 5.130
-2 0.373 0.627 3479 4.152 5.029
-1 0.418 0.582 4077 4.355 4.769
0 0.465 0.535 4041 4.503 4.625
1 0.491 0.509 3921 4.632 4.472
2 0.547 0.453 3335 4.819 4.288
3 0.580 0.420 2872 5.048 4.155
4 0.607 0.393 1842 5.221 4.105
5 0.640 0.360 1089 5.488 4.134
6 0.713 0.287 999 5.710 3.869

Home team wins 75% of games at tier combo -6, which one would expect.  Away team wins 71.3% of the time at tier combo +6, when they have max advantage over the home team.  The Games column shows the number of games in each tier combo set.  The last two columns show the average runs per game scored by the away and home teams, the run differential that led to the win percentages.

Tier combo 0 is even steven between the two teams according to wins and losses.  Win% at TC 0 is almost exactly equal to overall home field advantage win% as one would expect.  The above represents what actually happened the last two decades with values that will be used to test the accuracy of the new simulator.
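
The home field claim can be checked directly from the table.  The sketch below weights each tier combo’s home win% by its number of games to get an overall home win%, then compares that with the TC 0 row.  All numbers are copied from the table above.  The variable names are just for illustration.

```python
# (tier combo, home win %, games) copied from the 2000-2019 deltaWAA table above.
ROWS = [
    (-6, 0.752, 929), (-5, 0.719, 1070), (-4, 0.686, 1984), (-3, 0.650, 2871),
    (-2, 0.627, 3479), (-1, 0.582, 4077), (0, 0.535, 4041), (1, 0.509, 3921),
    (2, 0.453, 3335), (3, 0.420, 2872), (4, 0.393, 1842), (5, 0.360, 1089),
    (6, 0.287, 999),
]

total_games = sum(games for _, _, games in ROWS)
overall_home = sum(pct * games for _, pct, games in ROWS) / total_games
tc0_home = next(pct for tc, pct, _ in ROWS if tc == 0)

print(total_games)               # 32509
print(f"{overall_home:.3f}")     # 0.539
print(f"{tc0_home:.3f}")         # 0.535
```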

Not sure if or when baseball will resume.  The next part of this series will cover lineup -> starter and lineup -> relief tier combos through the various eras of baseball.  Right now we’re separating data into 1950-1969, 1970-1999, and 2000-2019.  Eventually this simulator will look back to 1920-1949.  More on this later.  Until then ….

Baseball-Handbook.com

New domain started today that currently points to this log book.  The name “baseball handbook” was inspired by Cook’s Traveler’s Handbook, one of the first travel guides.  This guide will help those interested in baseball through the labyrinth of 30 MLB teams employing thousands of players each year and the tens of thousands who have played since 1900.

This handbook employs a Keep It Simple (KISS) methodology to everything.

Baseball-handbook.com  will eventually point to a work in progress browsable web site.

The Simulation Part 5

This post will continue to explain how this simulation works.  There are no magic bullets in handicapping and this model only looks at a subset of information.  It doesn’t take into account weather, traveling schedules, righty/lefty matchups, etc. etc.

The premise of these simulations is that by looking at similar historical matchups we can better understand similar matchups in future events.

The generated probabilities are based upon player value assigned by this data model which is solely based upon run production.  Defensive stats are very subjective and are not part of this model. This model counts runs scored, runs scored against, assigns them to players, and converts that into a W-L number.  Every run is accounted for and in the end, all numbers must match.  The sum of players on a team must equal the team numbers.

Defense is very important but it involves imaginary runs: runs that should have scored but didn’t due to above average defense, and runs that scored but shouldn’t have due to sub par defense.  None of these runs are counted in box scores.  This model counts errors on a team level, which are measured by an official scorekeeper.  We know with 100% certainty how many unearned runs a team gives up.  Making errors is bad defense, so we know which teams suffer from it.

There are stats that attempt to measure defensive ability but that is a completely separate measure that cannot be integrated into this WAA measure.  WAR integrates defensive stats, which is why WAR has flaws.  It is how Darwin Barney was ranked among the top 50 players by WAR in 2012, while this model had him in the bottom 200 based upon his hitting.  There is no way whatsoever fielding can make up for that, especially on a team that lost 101 games with a team WAA = -40.  Losing is a team effort.

Results from this simulation should be used as a lens into the past to clearly see strength and weakness between two teams.  Lineups and Relief are groups of players, a starter is a single player.  These get matched up as follows:

  • AWAY Lineup –> HOME Starter
  • AWAY Lineup –> HOME Relief
  • HOME Lineup –> AWAY Starter
  • HOME Lineup –> AWAY Relief

Lineups don’t face other lineups; they face pitching, just as starters don’t face other starters.  Games are usually framed by the two starters listed prominently in game promos.  The strengths of lineups and relief are not.  This model breaks down all three aspects of each team.

Each of the above 4 bullet items ( Tier Combos ) generates an integer from +6 to -6 with reference to a lineup.  The best lineup facing the worst pitcher would be +6.  Worst lineup facing best pitcher would be -6.

There are around 100K games since 1970.   A snapshot of every player, every team, every relief squad, and every lineup is measured at the beginning of each day.  Tier Combos get assigned and we count how many runs each lineup scored that day.  That gets pushed into a distribution based upon its Tier Combo integer.
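
The counting step can be pictured as filling one bucket of observed runs per Tier Combo integer.  A minimal sketch, assuming each daily snapshot has already been reduced to a ( tier combo, runs scored ) pair.  The observations below are made up and the function name is hypothetical.

```python
from collections import defaultdict


def build_distributions(snapshots):
    """Group observed runs by tier combo.

    snapshots: iterable of (tier_combo, runs_scored) pairs, one per lineup per game.
    """
    distributions = defaultdict(list)
    for tier_combo, runs_scored in snapshots:
        distributions[tier_combo].append(runs_scored)
    return distributions


# Made-up observations, not real game data.
sample = [(-1, 3), (2, 6), (-1, 4), (0, 5), (2, 2)]
dist = build_distributions(sample)
print(dist[-1])        # [3, 4]
print(len(dist[2]))    # 2
```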

Let’s look at today’s CHN game with SEA.

processing CHN_SEA_05_01_6:40_PM -1 2 1 1
processing CHN_SEA_05_01_6:40_PM 25622 177168 24769 208265

The above processing records show -1 2 1 1, representing the 4 Tier Combos described in the above bullet list.  The first number, -1, means the AWAY Lineup (Cubs) is one tier worse than the HOME Starter.  The second number, +2, means it is two tiers better than HOME Relief.

Seattle has an exceptional top tier lineup, but Jon Lester is pitching today and he’s having another decent season, so the SEA lineup is only one tier above Lester and one tier above CHN Relief.
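
To make the ordering of those four numbers concrete, here is a sketch that parses the first processing record into named fields.  The record layout is inferred from the two sample lines shown in this post, so treat the parser and its field names as hypothetical.

```python
from typing import NamedTuple


class TierCombos(NamedTuple):
    """The four tier combos for one game, in bullet list order."""
    away_lineup_vs_home_starter: int
    away_lineup_vs_home_relief: int
    home_lineup_vs_away_starter: int
    home_lineup_vs_away_relief: int


def parse_processing_line(line):
    """Split a 'processing <game id> a b c d' record into a game id and 4 integers."""
    parts = line.split()
    game_id = parts[1]  # parts[0] is the literal word "processing"
    values = [int(part) for part in parts[2:6]]
    return game_id, TierCombos(*values)


game_id, combos = parse_processing_line("processing CHN_SEA_05_01_6:40_PM -1 2 1 1")
print(game_id)
print(combos)
```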

Lineups score runs.  How many depends upon the value of the pitching they face.  The 4 numbers in the second record show how many games are in the distribution for each lineup –> starter combo and how many innings are in the distribution for each lineup –> relief combo.  The -1 lineup –> starter combo was seen in 25622 instances since 1970.  There are 100K games and two lineup –> starter combos per game, or 200K instances.  Thus 25K represents around 1/8, or about 12%, of total instances.
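
The arithmetic can be spelled out as below.  Which of the four counts belongs to which combo is my reading of the record ( same order as the bullet list ), and 100K is the rounded game total used in this post, so the result is only approximate.

```python
# Counts from the second "processing" record for CHN at SEA.
(starter_games_away, relief_innings_away,
 starter_games_home, relief_innings_home) = (25622, 177168, 24769, 208265)

TOTAL_GAMES = 100_000          # "around 100K games since 1970" (rounded)
STARTER_COMBOS_PER_GAME = 2    # one lineup -> starter combo per team

total_instances = TOTAL_GAMES * STARTER_COMBOS_PER_GAME
share = starter_games_away / total_instances
print(f"{share:.1%}")          # 12.8%
```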

The number of instances drops off as the difference in talent increases.  For example, a +6 Tier Combo pitting the best lineup against the worst pitcher only has around 2K instances to draw from in its distribution.

The simulator runs 1 million iterations, randomly grabbing runs scored from the matching distribution for each lineup against the starter and against relief, home and away.  It counts who wins and loses, and in the end calculates a Win%.  This uses historical games that actually happened in real life.

A lineup –> starter combo lookup returns a number of runs and how many innings pitched for a starter.  Better starters will pitch longer and use less relief, vice versa for worse starters.  Lineup –> relief returns number of runs/inning.   That number will be low for bad lineups against good relief, high for the opposite.
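
Putting the two lookups together, one simulated game might look like the sketch below.  This is a simplified illustration, not the actual simulator: the distributions are tiny made-up lists, relief runs are taken as rate times innings ( so scores can be fractional ), ties are split instead of playing extra innings, and every game is treated as a full 27 outs for both sides.  All names are hypothetical.

```python
import random


def simulate_offense(rng, starter_samples, relief_samples):
    """Runs one lineup scores in a single simulated 9-inning game.

    starter_samples: historical (runs, starter_outs) pairs for the lineup -> starter combo.
    relief_samples: historical runs-per-inning values for the lineup -> relief combo.
    """
    runs, starter_outs = rng.choice(starter_samples)
    relief_outs = max(0, 27 - starter_outs)  # relief covers whatever the starter did not
    rate = rng.choice(relief_samples)
    return runs + rate * relief_outs / 3.0


def home_win_pct(away_vs_starter, away_vs_relief, home_vs_starter, home_vs_relief,
                 iterations=100_000, seed=0):
    """Estimate the home team's Win% by repeated random draws."""
    rng = random.Random(seed)
    home_wins = 0.0
    for _ in range(iterations):
        away_runs = simulate_offense(rng, away_vs_starter, away_vs_relief)
        home_runs = simulate_offense(rng, home_vs_starter, home_vs_relief)
        if home_runs > away_runs:
            home_wins += 1.0
        elif home_runs == away_runs:
            home_wins += 0.5  # assumption: ties are split instead of playing extras
    return home_wins / iterations


# Tiny made-up distributions, only to show the call.
weak = ([(1, 21), (2, 20), (3, 18)], [0.30, 0.40])
strong = ([(2, 19), (4, 17), (6, 15)], [0.50, 0.60])
print(home_win_pct(*weak, *strong, iterations=20_000))
```

The first pair of arguments is the away lineup’s two distributions and the second pair is the home lineup’s, so the call above puts the stronger offense at home.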

The foundation for everything above is the WAA value generated for players by this data model.  In Part 6 we’ll go over more examples using current game data.  Right now a rudimentary prototype page showing all current games is up here.  Still learning HTML5 and javascript.  We’ll step through in more detail, with examples, what those numbers mean.  Even though you may not agree with the EV value calculated from the differences between Vegas and TC Sim probabilities, the underlying L, S, R info is a valid representation of that team at that moment in time.

There are some issues however and those will be discussed in subsequent parts to this series.  That is all for now.  Until then ….