The Simulation Part 4

This post shows new output from the simulation that will be used here this season.  The data model cannot compile current year data until May, when there is enough of it, so in the meantime the handicapping system needed bug checking and its results needed verifying.  That work is almost complete.

In past years this log book usually showed the starting game of each Cubs series with a dump of data from the data model showing Vegas and simulation estimated probabilities and then I had to talk through the numbers.   The Vegas probabilities were the gold standard to beat.  This simulation had no proof.  Winning or losing a couple of games proves nothing.

Today we’ll run through the new Game entity which contains all relevant data from each game down to the lineups, starters, relief, who won, lines, etc.  Our simulations are based upon all games from 1970 – 2018 excluding March and April.  This comes to around 100K games, two teams per game.

Although the model has been verified against Vegas automatically, it still needs to be spot checked manually for anomalies and other things that can be exploited to make it better.  The game below was picked at random for debugging output, so I decided to just go with it here.

Here’s a game between the Pirates and Dodgers on May 8, 2017.  The format of all of this is still a work in progress.  Commentary on what every section means will be interspersed.

GAME PIT LAN 201705080

201705080 is a game number using retrosheet.org nomenclature.  A 0 attached to the YYYYMMDD date means single game.  A double header will attach a 1 or 2 depending upon which game.  Keeping track of double headers was a big problem.
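As a rough illustration of that nomenclature, here is a minimal Python sketch that splits a game number into its date and game suffix.  The names GameNumber and parse_game_number are made up for this example and are not part of the data model.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class GameNumber:
    day: date
    suffix: int  # 0 = single game, 1 or 2 = which game of a double header

    @property
    def is_double_header(self) -> bool:
        return self.suffix in (1, 2)


def parse_game_number(text: str) -> GameNumber:
    """Split a YYYYMMDDn game number into its date and game suffix."""
    if len(text) != 9 or not text.isdigit():
        raise ValueError(f"expected 9 digits, got {text!r}")
    suffix = int(text[8])
    if suffix not in (0, 1, 2):
        raise ValueError(f"unexpected game suffix {suffix}")
    # date() raises ValueError for impossible dates such as month 13
    day = date(int(text[0:4]), int(text[4:6]), int(text[6:8]))
    return GameNumber(day, suffix)


game = parse_game_number("201705080")
print(game.day.isoformat(), game.suffix, game.is_double_header)
```

Keeping the suffix as its own field means both games of a double header stay distinct even though they share a date.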

--------LINES----------
VEGAS PIT 0.345 LAN 0.688
SIM PIT 0.421 LAN 0.579
NSIM PIT 0.400 LAN 0.600
ELO PIT 0.417 LAN 0.583
DELTA PIT 0.435 LAN 0.565
---------EV------------
NSIM PIT 116 LAN 87
ELO PIT 121 LAN 85
-----------------------

In the last few years we showed Vegas lines with LINEHOME and LINEAWAY text records.  The above consolidates those records and adds the other systems we compare against.  Line records now contain:

  • Type of system
  • away team – away teamid
  • away probability – break even probability for away team
  • home team – home teamid
  • home probability – break even probability for home team
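A line record with those five fields could be read into a small structure like the one below.  This is only a sketch in Python; LineRecord and parse_line_record are hypothetical names, and the real data model may store these differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineRecord:
    system: str       # type of system, e.g. VEGAS or NSIM
    away_team: str    # away teamid
    away_prob: float  # break even probability for the away team
    home_team: str    # home teamid
    home_prob: float  # break even probability for the home team


def parse_line_record(text: str) -> LineRecord:
    """Parse one whitespace-separated line record such as 'VEGAS PIT 0.345 LAN 0.688'."""
    system, away_team, away_prob, home_team, home_prob = text.split()
    return LineRecord(system, away_team, float(away_prob), home_team, float(home_prob))


vegas = parse_line_record("VEGAS PIT 0.345 LAN 0.688")
print(vegas.away_team, vegas.away_prob, vegas.home_team, vegas.home_prob)
```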

There are 5 different systems shown under Lines:

  • VEGAS – These are probabilities derived from end of day betting lines
  • SIM – These probabilities derived from old simulation (deprecated)
  • NSIM – Probabilities derived from new simulation
  • ELO – Probabilities derived from Nate Silver’s ELO system
  • DELTA – Probabilities derived from old DeltaWAA (deprecated)
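For the VEGAS record, a betting line converts to a break even probability with the standard American moneyline formula.  The Python sketch below shows that conversion.  The moneylines +190 and -220 are assumptions chosen because they reproduce the 0.345 and 0.688 shown above; the actual lines for this game are not given in this post.

```python
def break_even_probability(moneyline: int) -> float:
    """Win probability at which a bet on this American moneyline breaks even."""
    if moneyline == 0:
        raise ValueError("moneyline cannot be zero")
    if moneyline > 0:
        # Underdog: risk 100 to win `moneyline`
        return 100 / (moneyline + 100)
    # Favorite: risk `-moneyline` to win 100
    return -moneyline / (-moneyline + 100)


# Hypothetical lines, not taken from the post
away = break_even_probability(190)
home = break_even_probability(-220)
print(f"{away:.3f} {home:.3f} {away + home:.3f}")
```

The two break even probabilities add up to more than 1.  The excess is the house's cut, which is why the VEGAS record above sums to 1.033 while the other systems sum to 1.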

SIM was the original system used in the second half of last season.  Those simulations had too much error and needed to be fixed.  DeltaWAA is the away team WAA minus the home team WAA, where WAA = W-L, the value that is the foundation of this data model.  Last season we had a table lookup that gave a probability for each DeltaWAA, derived from historical data.  That lookup has now been integrated into the new simulation, called NSIM above, which is the system that beat ELO and Vegas in Part 3 of this series.

In May when we start doing this for live games only VEGAS and NSIM will be shown.  If we can easily acquire ELO data through wget that will be included but not counting on that right now.

The EV section above shows Expected Value on a $100 bet calculated by the differences between various break even probabilities and VEGAS the house break even probability.   EV records show both away and home bets.
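The EV numbers in this example can be reproduced with a simple ratio: the expected return on a $100 stake is 100 times the system's probability divided by the Vegas break even probability.  That formula is inferred from the numbers shown above rather than taken from the model's code, so treat this Python sketch and its function names as assumptions.

```python
def expected_return(system_prob: float, vegas_prob: float, stake: float = 100.0) -> float:
    """Expected amount returned (stake included) per `stake` wagered.

    A winning bet at the Vegas break even probability returns stake / vegas_prob,
    and the system believes that happens with probability system_prob.
    """
    return stake * system_prob / vegas_prob


def is_bet(system_prob: float, vegas_prob: float, threshold: float = 115.0) -> bool:
    """True when the expected return clears the betting threshold."""
    return expected_return(system_prob, vegas_prob) > threshold


# Probabilities from the LINES section of this game
print(round(expected_return(0.400, 0.345)), round(expected_return(0.600, 0.688)))  # NSIM: 116 87
print(round(expected_return(0.417, 0.345)), round(expected_return(0.583, 0.688)))  # ELO: 121 85
```

A value of 100 means the bet is expected to break even, so anything under 100 is a losing proposition by that system's own estimate.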

In the above example, the AWAY bet is above our threshold of 115 and it happens that both ELO and NSIM are betting Pittsburgh this game.  And they both lose LOL.  There is a lot of give and take in handicapping and just because a system loses does not mean it wasn’t a good bet.  Let’s take a look at that.

AWAY PIT 000001000 --> 1
HOME LAN 60220020 --> 12

Above is a line score for this game.  We really got demolished in this game.  You can get info about how ELO works from the source.  The last part of this series showed ELO beat Vegas in our first accuracy test, a result I'm still researching.
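The line score records are easy to sanity check when hunting bugs: the inning digits should add up to the total after the arrow.  Here is a minimal Python sketch of that check, with hypothetical function names.  It assumes one digit per inning, so it would need a different format for a 10+ run inning.

```python
def runs_from_innings(innings: str) -> int:
    """Sum a string of per-inning runs, one digit per inning."""
    return sum(int(ch) for ch in innings)


def line_score_is_consistent(record: str) -> bool:
    """Check a record like 'AWAY PIT 000001000 --> 1' against its own total."""
    _side, _team, innings, _arrow, total = record.split()
    return runs_from_innings(innings) == int(total)


print(line_score_is_consistent("AWAY PIT 000001000 --> 1"))
print(line_score_is_consistent("HOME LAN 60220020 --> 12"))
```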

Let’s dive into the simulation data for this game.

--------------------------
TIERDATA PIT LAN 201705080 -3 -4 3 -1 -2 LAN
---- AWAY L -> HOME S ----> -3 ---- AWAY L -> HOME R ----> 4
AWAY LINEUP -1.89 PIT --> -1.73
HOME STARTER 0.48 Alex_Wood_LAN 24.7 --> 0.85
HOME RELIEF 2.54 LAN --> 2.11
---- HOME L -> AWAY S ----> 3 ---- HOME L -> AWAY R ----> -2
HOME LINEUP 3.21 LAN --> 1.82
AWAY STARTER -0.36 Trevor_Williams_PIT 11.7 --> -0.74
AWAY RELIEF 2.86 PIT --> 2.52

The first two numbers in the TIERDATA record ( -3 , -4 ) are what we call tier deltas.  They are integer differences between away lineup and home starter, and between away lineup and home relief.  They range from -6 to +6 and are discrete integers.  A +6 means the best lineup (+3) against the worst starter (-3) or ALS = +3 – (-3).

The second two numbers are the opposite: home lineup against away starter, and home lineup against away relief.  Pluses in all these numbers mean the lineup team is favored; minuses mean the lineup team is not favored in this category.  Clear as mud?

This gets very confusing and I got confused writing this.  These reports were produced to debug the output.  Because it’s so confusing bugs could be introduced or things get assigned backwards.

Since WAA has additive properties the value of a lineup and relief staff is merely the sum of player WAAs.    Each day these, along with single starter values, get calculated for each team at the beginning of the day — much like what we’ll see at the beginning of each day starting this May.  Averages and standard deviations are taken among all 30 MLB teams for that day and tiering is assigned to each group or starter.

The numbers in brown above are measured in 1/2 standard deviations away from the mean +/-.   An AWAY lineup-starter tier would be calculated like this:

ALS = Lineup - Starter = -1.73 - 0.85 = -2.58 = -3 ( rounded down for negative )

The bold blue -3, the AWAY lineup starter combo, is the first number in the TIERDATA record used for simulation.  The other three combo numbers (ALR, HLS, and HLR) are calculated similarly, with their numbers shown above.  The hard part is taking snapshots of 100K games and curating those numbers, the foundation of which rests upon the WAA player value generated by this data model.
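The tiering arithmetic can be sketched in a few lines of Python.  This is an illustration, not the simulator's code.  The function names are hypothetical, the use of a population standard deviation is a guess, and the rounding rule (nearest integer, ties away from zero) is an assumption that happens to match all four tier deltas for this game.  Clamping to -6..+6 is also an assumption based on the stated range.

```python
import math
from statistics import mean, pstdev


def half_sd_score(value: float, league_values: list[float]) -> float:
    """Distance of `value` from the league mean, in units of 1/2 standard deviation."""
    half_sd = pstdev(league_values) / 2  # assumption: population SD over all teams
    return (value - mean(league_values)) / half_sd


def tier_delta(lineup: float, pitching: float) -> int:
    """Integer tier delta between a lineup and the starter or relief it faces."""
    diff = lineup - pitching
    # Assumption: round to the nearest integer, ties away from zero
    rounded = int(math.copysign(math.floor(abs(diff) + 0.5), diff))
    return max(-6, min(6, rounded))  # assumption: clamp to the stated -6..+6 range


# Half-SD values after the arrows in the TIERDATA breakdown for this game
als = tier_delta(-1.73, 0.85)   # away lineup vs home starter
alr = tier_delta(-1.73, 2.11)   # away lineup vs home relief
hls = tier_delta(1.82, -0.74)   # home lineup vs away starter
hlr = tier_delta(1.82, 2.52)    # home lineup vs away relief
print(als, alr, hls, hlr)  # -3 -4 3 -1, matching the TIERDATA record
```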

The last number is DeltaWAA which we talked about last season.

DeltaWAA = Away WAA – Home WAA

Here is DeltaWAA for this game.

---- DELTAWAA -----------> -2
DELTAWAA PIT 14 17 LAN 17 14 (-6 -2)

A team WAA is simply W-L which is -3 for PIT , + 3 for LAN.  The numbers highlighted in brown show this deltaWAA and a calculated tier which is another discrete integer between -21 and +21 .  DeltaWAA for this game is -6 which favors HOME team and that gets assigned to a -2 tier in the simulator.
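The DeltaWAA arithmetic for this game, as a short Python sketch with hypothetical function names.  Only the delta itself is computed here.  The mapping from a delta of -6 to a tier of -2 is not described in this post, so the sketch leaves it out.

```python
def team_waa(wins: int, losses: int) -> int:
    """Team WAA is simply wins minus losses."""
    return wins - losses


def delta_waa_from_record(record: str) -> int:
    """Compute away WAA minus home WAA from a DELTAWAA record.

    Expects 'DELTAWAA <away> <W> <L> <home> <W> <L> ...' and ignores
    anything after the seventh field.
    """
    fields = record.split()
    away_waa = team_waa(int(fields[2]), int(fields[3]))
    home_waa = team_waa(int(fields[5]), int(fields[6]))
    return away_waa - home_waa


print(delta_waa_from_record("DELTAWAA PIT 14 17 LAN 17 14 (-6 -2)"))  # -6
```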

The simulation takes these numbers and runs a Monte Carlo simulation of 1 million games, counting wins and losses and converting that into an expected win percentage or probability.
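The counting step of a Monte Carlo run looks roughly like the Python sketch below.  The per-game model is the real work and is not described in this post, so stand_in_game is a placeholder that simply lets the away team win 40% of the time.  Every name here is hypothetical.

```python
import random
from typing import Callable


def estimate_win_probability(
    play_game: Callable[[random.Random], bool],
    n_games: int = 1_000_000,
    seed: int = 0,
) -> float:
    """Simulate n_games and return the fraction the away team wins."""
    rng = random.Random(seed)  # seeded so a debugging run can be repeated
    wins = sum(1 for _ in range(n_games) if play_game(rng))
    return wins / n_games


def stand_in_game(rng: random.Random) -> bool:
    """Placeholder game model (assumption): away team wins 40% of the time."""
    return rng.random() < 0.40


# Fewer games than a full run, to keep the example quick
print(round(estimate_win_probability(stand_in_game, n_games=200_000), 3))
```

With a million games the sampling error on a probability near 0.4 is roughly 0.0005, small enough that it does not blur the third decimal place much.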

ELO, NSIM and Vegas all had the Dodgers favored in this game.  The above breakdown shows why.  Even so, NSIM's 0.400 and ELO's 0.417 for Pittsburgh were much higher than Vegas' 0.345, so the underdog became a betting opportunity.  Irrational exuberance for home games in Los Angeles has been observed but is fodder for another time.

The above shows the ingredients that go into a simulation producing a handicapping probability that can be compared to Vegas lines.  There are many more variables that this model does not take into account which may influence the outcome.  The purpose of posting all the variables used is so people view the bet/no-bet decisions here critically.  It may be possible to add a way to adjust inputs on the fly if you disagree with the model's input.  There are no guarantees when it comes to handicapping future events.

That is all for now.  The above will become a standard feature this season.  Might have to break out of this WordPress format though to properly show this in a more intuitive manner.  Until then ….