The Ouija Board is once again expanding. It is called a Ouija Board here because markets are influenced by many people, resulting in an outcome that comes very close to an actual probability, as if it were controlled by the beyond or the future. We can measure error in the market by comparing its lines to actual results (because we’re from the future now). The market, or the house line, is very accurate in most cases.
The purpose of this handicapping model is to validate the player and team evaluation math behind this data model. If this model is a more accurate representation of the past than Sabermetrics, then our handicapping should have less error than the market, which means the market can be exploited to gain a percentage edge on the house, much like counting cards in blackjack. The house only sets the opening line; the market, all the people pushing and pulling on that Ouija Board, sets the line adjustments during the day.
In Part 4 of this series a league wide table will be introduced showing Expected Value instead of probabilities. This concept can get very complicated but our use case is rather simple. We only have a single probability and a single value so our equation looks like this:
EV = P(win) * Value
In past Ouija Board sections we have been eyeballing percentages. Our margin was 0.07 over the break even probability in order for a line to become a betting opportunity. The break even probabilities in all Ouija Board sections have an expected value of $100 on a $100 bet. This means that in the long run you break even betting a line where your expected probability equals the break even probability. It makes no sense to bet if you only break even, less so if your expected value is less than your bet. In all games in Vegas your expected value on every bet is less than your bet. The house always gets a cut, which pays for all the flickering lights.
Although adding a rough margin to probabilities was fine for eyeball estimates to see if we’re in the right ballpark (pun not intended), it’s not proper to use that kind of math in an algorithm. We must manipulate expected values, not probabilities which are ratios between 0 and 1.
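To see why a flat probability margin is a poor fit for an algorithm, here is a minimal Python sketch. It assumes Value means the total returned on a win (stake plus winnings), and the payout figures are hypothetical, not real lines. The function names are made up for illustration.

```python
def expected_value(p_win: float, payout: float) -> float:
    """EV = P(win) * Value, with Value taken as the total returned on a win."""
    return p_win * payout


def break_even_probability(bet: float, payout: float) -> float:
    """Probability at which the expected value equals the amount bet."""
    return bet / payout


BET = 100.0
MARGIN = 0.07  # the eyeball probability margin from earlier Ouija Board sections

# Hypothetical total payouts (stake plus winnings) on a $100 bet; not real lines.
for payout in (150.0, 200.0, 250.0):
    p_even = break_even_probability(BET, payout)
    ev_with_margin = expected_value(p_even + MARGIN, payout)
    print(f"payout ${payout:.0f}: break even at {p_even:.3f}, "
          f"EV at break even + {MARGIN:.2f} = ${ev_with_margin:.2f}")
```

The same 0.07 margin is worth a different number of dollars at each payout, so a threshold stated in expected value treats favorites and underdogs consistently where a probability margin does not.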
A simple example of expected value is flipping a coin. Heads and tails are two equal outcomes so each has a probability = 0.500 or 1/2. If you bet $1 on heads or tails your expected value of that bet would be:
EV = P(your pick) * $(1+1) = 0.5 * $2 = $1
Not too complicated. Since your expected value on a $1 bet is exactly $1, if you played this bet a trillion times you would, on average, end up losing nothing and gaining nothing. It would be a complete waste of time. If you had a loaded coin where you increase the probability of your pick to say 0.600 instead of 0.500 then your EV would be:
EV = 0.600 * $2 = $1.20
This means for every bet you will average $0.20 profit and if you play this a trillion times you’ll be very rich with virtual certainty.
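The coin flip arithmetic can be checked with a short Python sketch. The simulation part is an illustration added here, not part of the handicapping model. It uses 100,000 simulated bets rather than a trillion, and the function names and seed are arbitrary choices.

```python
import random


def expected_value(p_win: float, payout: float) -> float:
    """EV = P(win) * Value, where Value is the total returned on a win."""
    return p_win * payout


def average_profit(p_win: float, bet: float, payout: float,
                   n_bets: int, seed: int = 0) -> float:
    """Simulate n_bets independent bets and return the average profit per bet."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0.0
    for _ in range(n_bets):
        total -= bet                 # pay the stake
        if rng.random() < p_win:     # the pick comes up with probability p_win
            total += payout          # collect stake plus winnings
    return total / n_bets


fair_ev = expected_value(0.500, 2.0)    # fair coin, $1 bet returns $2 on a win
loaded_ev = expected_value(0.600, 2.0)  # loaded coin
simulated = average_profit(0.600, 1.0, 2.0, n_bets=100_000)

print(f"fair coin EV:   ${fair_ev:.2f}")
print(f"loaded coin EV: ${loaded_ev:.2f}")
print(f"simulated average profit per $1 bet (loaded coin): ${simulated:.3f}")
```

The simulated average should land near the $0.20 per bet that the formula predicts, and it gets closer as the number of bets grows.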
Like a flip of a coin, the probability behind a baseball game is just as simple. There are two possible outcomes, home team wins or away team wins — that’s it. If you knew nothing of either team or where they’re playing you would have to assume each team has a probability of 0.500 — like a flip of a coin.
We have more information, however. Home field advantage is historically 0.540 for the home team and 0.460 for the visiting team. If you knew nothing other than which team is home and away, you could assume the home team probability is 0.540 and the away team probability is 0.460.
Just knowing home/away is not good enough however. There are differences in win/loss records and differences in starting pitching, lineups, and relief. These are somewhat independent of each other: the win/loss record has some influence on the value of the talent playing in a single game, but we don’t know how much. Teams with great records have been known to slide into oblivion at various points in the season and vice versa. This demise or rise would be due to the makeup of the talent on the team. This is where Tier Combo simulations come into play, and they are the basis for our handicapping.
Update 7/13/2018: There is a relationship between win/loss record differences (deltaWAA) and the Tier Combo simulations based upon player talent. That relationship is unknown right now.
As has been explained in the many matchup posts here, deltaWAA is a table lookup that represents the differences in wins and losses between the two teams. DeltaWAA is double what most people call games behind. We chose a dataset from 1970 – 2016 and counted wins and losses based upon deltaWAA and derived a table of win/loss percentages from it. So if deltaWAA is say 10 the higher team should have a probability greater than 0.500. That is one way of handicapping two teams.
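As a rough illustration of how such a lookup table could be derived, here is a Python sketch. It assumes deltaWAA is the difference in (wins minus losses) between the two teams, which is one reading of the description above. The game records are made up for the example and the function names are hypothetical; the real table comes from the 1970 – 2016 dataset.

```python
from collections import defaultdict


def delta_waa(wins_a: int, losses_a: int, wins_b: int, losses_b: int) -> int:
    """Difference in (wins - losses) between two teams; twice the games-behind figure."""
    return (wins_a - losses_a) - (wins_b - losses_b)


def build_lookup(games):
    """Map each deltaWAA value to the share of games won by the team with the better record.

    `games` is an iterable of (delta_waa, better_team_won) pairs.
    """
    tally = defaultdict(lambda: [0, 0])  # deltaWAA -> [wins by better team, games played]
    for delta, better_team_won in games:
        tally[delta][0] += 1 if better_team_won else 0
        tally[delta][1] += 1
    return {delta: wins / played for delta, (wins, played) in sorted(tally.items())}


# Made-up records for illustration only; not taken from the historical dataset.
toy_games = [(10, True), (10, True), (10, False), (10, True), (2, True), (2, False)]
lookup = build_lookup(toy_games)

# A 50-40 team against a 45-45 team is 5 games ahead, so deltaWAA is 10.
print(delta_waa(50, 40, 45, 45))
print(lookup)
```

With decades of real games behind it, each deltaWAA bucket would hold enough results for its win percentage to serve as a probability estimate.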
The Tier Combo simulations take a snapshot of value for each day of those 46 years and run the tiering calculations that are done each day for this current season. Historical data based on the combination of talent between the two teams are turned into a distribution which is used in simulation to estimate win/loss percentage — which is an expected probability for that game.
Which probability is correct: deltaWAA or Tier Combo? The two could be independent or somewhat dependent. A team that enjoys a high deltaWAA should have high value talent playing. If it doesn’t, then perhaps the Tier Combo simulation results should take precedence.
The following is a first draft of a table showing all the expected values on $100 bets for each game today, 7/12/2018. The Cubs start a new series tomorrow with SDN so more will be explained then. I don’t like showing tables with a lot of numbers, and the table below will be consolidated in the future.
Expected Values for 7/12/2018
Away | Home | Away simEV | Home simEV | Away dWAA | Home dWAA | dWAA Fav | dWAA Pct
---|---|---|---|---|---|---|---
ARI | COL | 92 | 103 | 112 | 83 | ARI | 0.565
NYA | CLE | 93 | 103 | 115 | 78 | NYA | 0.619
TBA | MIN | 92 | 105 | 107 | 88 | TBA | 0.582
PHI | BAL | 105 | 90 | 139 | 56 | PHI | 0.711
LAN | SDN | 90 | 110 | 94 | 103 | LAN | 0.619
MIL | PIT | NEW | GUY | STARTER | FOR | MIL |
SEA | ANA | 100 | 95 | 121 | 74 | SEA | 0.619
TOR | BOS | 111 | 90 | 87 | 104 | BOS | 0.664
OAK | HOU | 126 | 83 | 110 | 91 | HOU | 0.619
WAS | NYN | 81 | 128 | 89 | 111 | WAS | 0.616
We’re not going to get into how the above was calculated until we do the Cubs tomorrow and can show it in detail for an individual game. The expected values in bold blue show two possible betting opportunities. Currently $120 is our EV threshold. This may be relaxed in the future. We only bet on simulation results, never deltaWAA. DeltaWAA EVs are used to wave off bets. Simulations are based upon the entire spectrum of teams with different win/loss records. If there are extreme differences in wins/losses it is better to play it safe and take a pass. This also may change as we gather more data.
Highlighted are two possible betting opportunities: Oakland and the Mets. Each has an EV greater than 120 based upon Tier Combo simulations. The brown colored numbers represent their corresponding EV based upon deltaWAA. OAK drops to 110 and NYN drops to 111. If you average the two, both EVs fall under 120 (118 for OAK and 119.5 for NYN), making both a wave off. This means there are no betting opportunities today. It also means we can’t lose.
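The wave off rule described above can be sketched in a few lines of Python. This is one reading of the rule, not the actual implementation. The function name is hypothetical, and treating an average of exactly 120 as a bet is an assumption, since the text only says a wave off happens when the average falls under the threshold.

```python
EV_THRESHOLD = 120.0  # current EV threshold on a $100 bet


def evaluate(sim_ev: float, dwaa_ev: float, threshold: float = EV_THRESHOLD) -> str:
    """Classify one side of a game as 'no bet', 'wave off', or 'bet'."""
    # Only the simulation EV can open a betting opportunity; deltaWAA never does.
    if sim_ev <= threshold:
        return "no bet"
    # The deltaWAA EV is only used to wave off a candidate.
    average = (sim_ev + dwaa_ev) / 2
    # Assumption: an average exactly at the threshold still counts as a bet.
    return "wave off" if average < threshold else "bet"


# (team, simulation EV, deltaWAA EV) taken from the 7/12/2018 table.
sides = [
    ("OAK", 126, 110),
    ("NYN", 128, 111),
    ("PHI", 105, 139),
    ("SDN", 110, 103),
]

for team, sim_ev, dwaa_ev in sides:
    print(f"{team}: simEV={sim_ev} dWAA EV={dwaa_ev} -> {evaluate(sim_ev, dwaa_ev)}")
```

PHI shows why the order of the checks matters: its deltaWAA EV of 139 is well over the threshold, but its simulation EV of 105 is not, so it never becomes a candidate.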
Since MIL is starting a pitcher (Wade Miley) without enough innings this season, the entire game is tossed for lack of information. A starter is one of the three important factors in the Tier Combo simulations. Games do not get tossed when there are new guys in a lineup or relief.
That is all for now. This is still a work in progress and the above EV table will become simpler to visualize in the future. Tomorrow CHN starts a series with SDN, so a new and expanded Ouija Board will be introduced to hopefully make more sense of the above table. Until then ….