Category Archives: Study Results

Lineup Relief Table Part 2

We recalculated the lineup-relief tables using innings pitched instead of games, which is a more accurate measure. The table below is reformatted to show the lineup-relief tier pair, innings pitched per game, and the average runs per inning scored by the lineup (equivalently, given up by the relief staff).

Lineup-Relief IP/Game Avg Runs/Inning
1-1 2.77 0.481
1-2 2.79 0.493
1-3 2.84 0.500
1-4 2.73 0.522
1-5 2.91 0.540
2-1 2.83 0.438
2-2 2.75 0.476
2-3 2.81 0.468
2-4 2.90 0.501
2-5 2.88 0.502
3-1 2.72 0.416
3-2 2.75 0.440
3-3 2.73 0.454
3-4 2.76 0.477
3-5 2.83 0.495
4-1 2.77 0.415
4-2 2.72 0.426
4-3 2.78 0.438
4-4 2.72 0.466
4-5 2.78 0.458
5-1 2.77 0.388
5-2 2.65 0.413
5-3 2.63 0.407
5-4 2.61 0.421
5-5 2.69 0.467

I’m not sure the innings pitched per game column means anything for each lineup-relief pair. The average runs range from a low of 0.388 for the worst lineup against the best relief (5-1) to a high of 0.540 for the best lineup against the worst relief (1-5); both rows are highlighted in green above. This should be expected, and the range is rather large and should provide for interesting results in simulation.

Below is the table above condensed to five rows, running from best lineup vs. worst relief down to worst lineup vs. best relief, which makes it easier to see the trend.

Lineup-Relief IP/Game Avg Runs/Inning
1-5 2.91 0.540
2-4 2.90 0.501
3-3 2.73 0.454
4-2 2.72 0.426
5-1 2.77 0.388

Average runs scored goes down as worse lineups face better relief squads, as we would expect. The data looks correct so far. It’s possible that the best lineup against the worst relief has the highest IP/Game because the best lineups knock out starters faster than worse lineups, making relief pitchers pitch more innings regardless of value.

Since we’re here let’s do this for lineup-starters as well.  Same table format as above.

Lineup-Starter IP/Game Avg Runs/Inning
1-1 6.81 0.471
1-2 6.38 0.533
1-3 5.99 0.567
1-4 5.84 0.575
1-5 5.87 0.602
2-1 6.74 0.466
2-2 6.44 0.484
2-3 6.00 0.540
2-4 5.85 0.555
2-5 5.90 0.571
3-1 6.85 0.426
3-2 6.52 0.466
3-3 6.09 0.510
3-4 5.93 0.534
3-5 5.95 0.542
4-1 6.86 0.419
4-2 6.49 0.450
4-3 6.13 0.484
4-4 5.92 0.521
4-5 5.95 0.521
5-1 6.94 0.400
5-2 6.70 0.410
5-3 6.18 0.469
5-4 6.07 0.470
5-5 6.02 0.502

The difference between the two extremes in lineup-starter combos is around 0.2 runs per inning. For lineup-relief combos that difference is around 0.15 runs per inning. The innings pitched per game column shows that higher-tier pitchers pitch more innings, which should be expected. The high is 6.94 for the 5-1 lineup-starter pair and drops to 5.87 for the 1-5 pair (the lowest value in the table is 5.84, for the 1-4 pair).
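The two spreads quoted above can be checked directly from the 1-5 and 5-1 rows of each table. This is a minimal sketch; the variable names are illustrative only and are not part of the original study.

```python
# Avg runs per inning for the two extreme tier pairs, copied from the tables above.
starter = {"1-5": 0.602, "5-1": 0.400}
relief = {"1-5": 0.540, "5-1": 0.388}

# Spread = best lineup vs. worst pitching minus worst lineup vs. best pitching.
starter_spread = starter["1-5"] - starter["5-1"]
relief_spread = relief["1-5"] - relief["5-1"]

print(f"lineup-starter spread: {starter_spread:.3f} runs per inning")
print(f"lineup-relief spread:  {relief_spread:.3f} runs per inning")
```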

Below is a condensed version of the above table.

Lineup-Starter IP/Game Avg Runs/Inning
1-5 5.87 0.602
2-4 5.85 0.555
3-3 6.09 0.510
4-2 6.49 0.450
5-1 6.94 0.400

The trend in average runs follows what we expect: the best lineups facing the worst starters score the most runs, and scoring decreases as starter value increases and lineup value decreases. We will use the inning numbers for simulation.
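One way the inning numbers might feed a simulation is to multiply innings pitched per game by runs per inning to get expected runs for each segment of a game. The sketch below does that for the 3-3 rows. Adding the starter and relief segments for the same tier pair is an assumption made here for illustration, not necessarily how the simulation will combine them, and the function name is hypothetical.

```python
# (IP per game, avg runs per inning) for the 3-3 rows of the tables above.
starter_33 = (6.09, 0.510)
relief_33 = (2.73, 0.454)

def runs_per_game(ip_per_game, runs_per_inning):
    """Expected runs in one game segment: innings pitched times runs per inning."""
    return ip_per_game * runs_per_inning

starter_runs = runs_per_game(*starter_33)
relief_runs = runs_per_game(*relief_33)
total_runs = starter_runs + relief_runs

print(f"vs. starter: {starter_runs:.2f}")
print(f"vs. relief:  {relief_runs:.2f}")
print(f"total:       {total_runs:.2f}")
```

The two segments land close to the 3.11 and 1.24 runs per game quoted for the 3-3 pair in Part 1, which suggests the per-inning and per-game tables are consistent with each other.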

That is all for now.  The next step is running simulations.

Lineup Relief Table Part 1

Finally got the first lineup-relief results. Doing this required recompiling the entire daily database. In order to estimate relief value you have to know the current set of relievers on a team for each day. Rosters were estimated using the event data from retrosheet.org.

After knowing the roster we can separate relief staffs into our tier system and run that against lineups.  Here are the average runs scored by the lineup for each lineup – relief pair.

Lineup-Relief # Games Avg Runs
1-1 2085 1.33
1-2 2354 1.37
1-3 5810 1.42
1-4 2661 1.43
1-5 3253 1.57
2-1 1828 1.24
2-2 1872 1.31
2-3 4292 1.32
2-4 1740 1.45
2-5 2068 1.45
3-1 5501 1.13
3-2 5238 1.21
3-3 11466 1.24
3-4 4328 1.32
3-5 5105 1.40
4-1 2815 1.15
4-2 2670 1.16
4-3 5226 1.22
4-4 1993 1.27
4-5 2270 1.27
5-1 2696 1.08
5-2 2638 1.10
5-3 5174 1.07
5-4 1835 1.10
5-5 2000 1.25

Column 1 is the lineup – relief tier combo. Column 2 is the number of games in this tier combo for this study. Avg Runs is the number of runs scored against relief per game.

3-3 is a completely average lineup against an average relief staff. The lineup starter combo table published here shows a 3-3 pair has an average of 4.6 runs scored per 9 innings. If you divide by games instead of innings pitched that number is 3.11 per game. Adding 1.24 from the 3-3 row above places the average runs scored at about 4.35 per game, which is almost the real average.

The rows highlighted in green show that the range from worst lineup against best relief to best lineup against worst relief goes from a low of 1.08 to a high of 1.57. This isn’t very much of a range, probably because relief pitchers pitch far fewer innings than starters, only about a third of each game. It might make more sense to use runs per inning pitched in these tables, since Tier 1 starters will pitch more innings per game than Tier 5 starters. We know the innings pitched for both starters and relievers, so we’ll redo these tables later.

Below is a schmoo running from best lineup vs. worst relief to worst lineup vs. best relief, made by selecting rows from the above table.

Lineup-Relief # Games Avg Runs
1-5 3253 1.57
2-4 1740 1.45
3-3 11466 1.24
4-2 2670 1.16
5-1 2696 1.08

Update: I’m a little behind on all of this. There is an interesting article from Bill James that we will look at, and we also need to analyze the MVP, Cy Young, and best reliever AL/NL picks, which happened a while ago. Like All Star picks these are voted on by people who have biases. tl;dr We’re 3/4 on MVP, 2/2 picking Cy Young and 2/2 on best reliever for each league. The only outlier is Jose Altuve, which will tie in nicely with the Bill James article on WAR. More to come. We’re almost done with the lineup-relief combo tables and ready for simulations, and then for running the results against the real daily lines. Until then….

Lineup Starter Combo Part 4

In part 2 of the lineup starter combo series we showed a win% for each pair. Since a starter rarely pitches a complete game, it may not be appropriate to use raw wins, even though raw wins are a 100% accurate measure of performance. The next level down in a runs-based model is, of course, runs.

For each of the 86,000 games in the dataset we know how many runs a pitcher gave up which is exactly how many runs a lineup gets credit for.  The following table shows average runs scored for each lineup-starter tier pair.  Tiers are introduced in part 1.

Lineup-Starter # Games Avg Runs
1-1 1939 4.3
1-2 1605 4.7
1-3 4861 5.1
1-4 1848 5.2
1-5 2015 5.4
2-1 1607 4.3
2-2 1231 4.5
2-3 3868 4.9
2-4 1408 5.1
2-5 1404 5.2
3-1 4747 3.9
3-2 3599 4.3
3-3 10544 4.6
3-4 3804 4.9
3-5 3761 5.0
4-1 2294 3.8
4-2 1714 4.1
4-3 4908 4.5
4-4 1715 4.7
4-5 1629 4.8
5-1 2129 3.7
5-2 1605 3.8
5-3 4651 4.3
5-4 1662 4.3
5-5 1574 4.7

The tan highlights show two disparate groups: the best lineup and the worst lineup, each paired with every starter tier. For the best lineup, average runs go from 4.3 against the best starter (1-1) to 5.4 against the worst starter (1-5). It is expected that runs should increase as the starter value decreases. The same pattern can be seen in the 5-1 to 5-5 rows above.

Highlighted in green is average lineup vs. average starter which makes up a significant plurality of the games in this dataset.   If 4.6 is considered the average (it is close) then runs above or below average can be calculated.
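Runs above or below average for each pair can be sketched as below, using 4.6 as the baseline as suggested above. The games-weighted mean is included only as a check on how close 4.6 is to the overall average; weighting by games is an assumption, and the variable names are illustrative.

```python
# (games, avg runs) per lineup-starter tier pair, copied from the table above.
table = {
    "1-1": (1939, 4.3), "1-2": (1605, 4.7), "1-3": (4861, 5.1), "1-4": (1848, 5.2), "1-5": (2015, 5.4),
    "2-1": (1607, 4.3), "2-2": (1231, 4.5), "2-3": (3868, 4.9), "2-4": (1408, 5.1), "2-5": (1404, 5.2),
    "3-1": (4747, 3.9), "3-2": (3599, 4.3), "3-3": (10544, 4.6), "3-4": (3804, 4.9), "3-5": (3761, 5.0),
    "4-1": (2294, 3.8), "4-2": (1714, 4.1), "4-3": (4908, 4.5), "4-4": (1715, 4.7), "4-5": (1629, 4.8),
    "5-1": (2129, 3.7), "5-2": (1605, 3.8), "5-3": (4651, 4.3), "5-4": (1662, 4.3), "5-5": (1574, 4.7),
}

BASELINE = 4.6  # the 3-3 value, treated in the text as the average

# Runs above (+) or below (-) the baseline for every tier pair.
runs_vs_average = {pair: round(avg - BASELINE, 1) for pair, (_, avg) in table.items()}

# Games-weighted mean of the Avg Runs column, to see how close 4.6 really is.
total_games = sum(games for games, _ in table.values())
weighted_avg = sum(games * avg for games, avg in table.values()) / total_games

for pair in ("1-5", "3-3", "5-1"):
    print(pair, f"{runs_vs_average[pair]:+.1f}")
print(f"games-weighted average: {weighted_avg:.2f}")
```

With these inputs the weighted mean comes out a little under 4.6, consistent with the "it is close" remark.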

What does this mean? We don’t know. It shows our allocation into tiers may be correct. The above table only shows starters. In the next part we’ll cover lineup-relief, where relief staffs are also placed into tiers. Until then….

The Playoff Season Part 1

This is the first of a multi-part series describing the playoff season, which can be very different from the regular season. Mediocre players become superstars and superstars can play like mediocre players in the playoff season.

This data model treats all playoff games from 1903 – present as a single season.  This creates a large pool of playoff data to draw averages from.  We treat all playoff games as of equal stature.  There are many more playoff games in the modern era so guys like Babe Ruth and Lou Gehrig will suffer but that’s just how it has to be.   This data pool is still very small even with all playoff games.

In part 1 we’ll illustrate the size of this pool by showing the same kind of deltaWAA lookup table we compiled for the regular season. The regular season dataset from 1970-present consists of 86,000 games. There are only about 1,500 playoff games in the playoff season dataset.

Here is the deltaWAA table for playoffs from 1903-2016. DeltaWAA is the difference between the two teams’ regular season win/loss records.

Category # Games Total % Home % Away %
1-3 271 0.531 0.527 0.536
4-6 213 0.526 0.555 0.495
7-9 183 0.541 0.552 0.531
10-12 156 0.564 0.543 0.594
13-15 137 0.474 0.487 0.459
16-18 88 0.523 0.533 0.512
19-21 71 0.380 0.385 0.375
22-24 58 0.466 0.417 0.545
25-27 38 0.579 0.636 0.500
28-30 24 0.708 0.786 0.600
31-33 28 0.464 0.231 0.667
34-36 16 0.312 0.500 0.000
37-39 10 0.500 0.833 0.000
40-42 20 0.600 0.538 0.714
43-45 7 0.714 0.667 0.750
46-inf 159 0.516 0.511 0.521

The number of games tapers off as the difference in win/loss records (deltaWAA) increases, which should be expected. When we get into very low numbers of games, outliers are going to affect the measured win%. The three total win% values highlighted in tan show a wide variation. Obviously a team that is 35 wins ahead of its opponent doesn’t really have only a 0.312 chance of winning. Outliers are a problem when there isn’t much data, so let’s consolidate.

Below is a consolidated version of the above table creating larger sets of games.

Category # Games Total %
1-9 667 0.532
10-21 452 0.500
22-inf 360 0.522
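The consolidation can be reproduced by summing games within each wider deltaWAA range and taking a games-weighted average of the Total % column. This is a sketch that assumes the consolidated win% was formed that way; the function and variable names are hypothetical.

```python
INF = float("inf")

# (low, high, games, total win%) rows from the detailed playoff table above.
rows = [
    (1, 3, 271, 0.531), (4, 6, 213, 0.526), (7, 9, 183, 0.541),
    (10, 12, 156, 0.564), (13, 15, 137, 0.474), (16, 18, 88, 0.523),
    (19, 21, 71, 0.380), (22, 24, 58, 0.466), (25, 27, 38, 0.579),
    (28, 30, 24, 0.708), (31, 33, 28, 0.464), (34, 36, 16, 0.312),
    (37, 39, 10, 0.500), (40, 42, 20, 0.600), (43, 45, 7, 0.714),
    (46, INF, 159, 0.516),
]

# The three wider categories used in the consolidated table.
buckets = [(1, 9), (10, 21), (22, INF)]

def consolidate(rows, buckets):
    """Merge detailed rows into wider buckets with a games-weighted win%."""
    merged = []
    for low, high in buckets:
        members = [r for r in rows if r[0] >= low and r[1] <= high]
        games = sum(r[2] for r in members)
        win_pct = sum(r[2] * r[3] for r in members) / games
        merged.append((low, high, games, win_pct))
    return merged

consolidated = consolidate(rows, buckets)
for low, high, games, win_pct in consolidated:
    print(f"{low}-{high}: {games} games, {win_pct:.3f}")
```

Because the detailed win% values are already rounded to three decimals, the recomputed figures can differ from the published ones in the last digit.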

What does this table tell us? Not sure. Between 10-21 the win% drops to even steven, which doesn’t make sense. The 22 – infinity category is almost the same as the 1-9 category. This may indicate that a better regular season win/loss record slightly increases win%, to perhaps 0.52 – 0.53, but that the advantage stays constant as deltaWAA grows. This means that, on average, the team with the best record in MLB only has a slight advantage over the team with the worst record in MLB during the playoff season, according to the compilation of all historical playoff game results.

A slight advantage is better than no advantage.  During the regular season our lineup-pitcher combo tables seemed secondary to the deltaWAA table.  According to the above findings, the opposite may be true in the playoffs.

We’ll pursue this more later.  Lineup/pitcher combo tables part 3 coming.  Until then….

Lineup Starter Combo Tables Part 3

Lineup starter combos are not independent from the real deltaWAA tables.  Real deltaWAA is calculated by

deltaWAA = | realWAA(home) – realWAA(away) |
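A minimal sketch of the formula above. It assumes realWAA is simply wins minus losses, which is not spelled out in this post; the function names and the two win/loss records are made up for illustration.

```python
def real_waa(wins, losses):
    """Real WAA for one team. Assumption: wins minus losses."""
    return wins - losses

def delta_waa(home_record, away_record):
    """Absolute difference in real WAA between the home and away teams."""
    return abs(real_waa(*home_record) - real_waa(*away_record))

# Hypothetical (wins, losses) records, not taken from any real season.
home = (95, 67)
away = (80, 82)
print(delta_waa(home, away))
```

Because of the absolute value, the result is the same whichever team is at home; the lookup table then gives the probability for the team with the higher WAA.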

The table posted here will provide a probability for the team with the higher WAA if there is no other information available. We have other information in the value of starters and lineups, which we have discussed throughout this season. In each game there are two lineup starter combinations, one for each team. A very good team with a high WAA will also have either high value starters or high value lineups or both.

Today we’ll run some numbers to find out the lineup starter composition of teams in each real WAA tier. Tiers are categorized into 5 groups.

  1. average + 1 standard deviation
  2. average + 1/2 standard deviation
  3. average
  4. average – 1/2 standard deviation
  5. average – 1 standard deviation.

We can do this for real team wins and losses, which gives a real WAA. This is the most accurate measure in baseball and the only measure the Commissioner of MLB looks at when determining who goes to the playoffs. Taking the average and the standard deviation of real WAA should get us 5 more or less equal sets. Tier 1 is the best. The Dodgers would be tier 1, as would perhaps WAS and HOU.
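The five groups could be assigned along the following lines. The exact cutoffs are an assumption: this sketch treats the 1 and 1/2 standard deviation marks in the list as boundaries, and the sample WAA values are invented.

```python
from statistics import mean, pstdev

def tier(value, avg, sd):
    """Map a value to tier 1 (best) through 5 (worst).

    Assumed cutoffs: avg + sd, avg + sd/2, avg - sd/2, avg - sd.
    """
    if value >= avg + sd:
        return 1
    if value >= avg + sd / 2:
        return 2
    if value > avg - sd / 2:
        return 3
    if value > avg - sd:
        return 4
    return 5

# Hypothetical real WAA values for a handful of teams.
sample_waa = [46, 20, 8, 0, -4, -16, -30]
avg, sd = mean(sample_waa), pstdev(sample_waa)
for waa in sample_waa:
    print(waa, tier(waa, avg, sd))
```

With cutoffs like these the middle tier covers a full standard deviation, so on roughly normal data it would hold more teams than the outer tiers.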

The next step is evaluating the strength of a team’s lineup and starter. We have each separated into 5 tiers. Suppose we ignore which of the two a tier belongs to, meaning a 2 lineup with a 3 starter is equivalent to a 3 lineup with a 2 starter, and look only at the sum of the two tiers. That sum can take 9 different values, which we’ll label 1 through 9 to make things simple.

The calculation of getting one through nine is simple.

Strength of team = lineup tier + starter tier  – 1

A 1-1 team would have a strength of 1 + 1 – 1 = 1. A 5-5 team would have a strength of 5 + 5 – 1 = 9. Not too complicated.

Now that this is set up, I present you two tables. Here is the first table; comments will follow.
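The strength formula and the way the 25 tier pairs collapse into 9 values can be sketched as follows; the function and variable names are illustrative.

```python
from itertools import product

def strength(lineup_tier, starter_tier):
    """Strength of team = lineup tier + starter tier - 1 (1 is best, 9 is worst)."""
    return lineup_tier + starter_tier - 1

# Group all 25 (lineup, starter) tier pairs by their strength value.
pairs_by_strength = {}
for lineup, starter in product(range(1, 6), repeat=2):
    pairs_by_strength.setdefault(strength(lineup, starter), []).append((lineup, starter))

for value in sorted(pairs_by_strength):
    print(value, pairs_by_strength[value])
```

Middle strengths are reachable by more pairs than the extremes (five pairs give a strength of 5, while only 1-1 gives 1 and only 5-5 gives 9), which is worth keeping in mind when reading the game counts in the table.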

realTier Strength #Games % for realTier
1 1 1519 0.085
1 2 2304 0.130
1 3 5384 0.303
1 4 3598 0.202
1 5 3359 0.189
1 6 1109 0.062
1 7 471 0.026
1 8 35 0.002
1 9 4 0.000
2 1 450 0.028
2 2 1029 0.063
2 3 3652 0.223
2 4 3625 0.222
2 5 4547 0.278
2 6 1929 0.118
2 7 964 0.059
2 8 123 0.008
2 9 29 0.002
3 1 345 0.009
3 2 884 0.023
3 3 5071 0.133
3 4 6940 0.182
3 5 11378 0.298
3 6 7126 0.187
3 7 4897 0.128
3 8 1182 0.031
3 9 375 0.010
4 1 35 0.002
4 2 139 0.009
4 3 915 0.059
4 4 1689 0.109
4 5 3987 0.256
4 6 3502 0.225
4 7 3621 0.233
4 8 1191 0.077
4 9 466 0.030
5 1 27 0.001
5 2 65 0.004
5 3 450 0.025
5 4 951 0.052
5 5 3327 0.184
5 6 3842 0.212
5 7 5612 0.310
5 8 2465 0.136
5 9 1387 0.077

That’s a lot of numbers and here is an abbreviated table.

realTier Avg Strength
1 3.58
2 4.30
3 5.02
4 5.78
5 6.43

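Above are the average strengths for each real wins tier. As the tier number rises from 1 to 5 (worse records), the average strength number rises too (weaker lineup starter combos), at an almost constant rate of roughly 0.7 per tier.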
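The abbreviated table can be reproduced from the first table as a games-weighted average of strength within each real tier. This sketch assumes that is how the averages were formed; the names are illustrative.

```python
# Games per strength value (1 through 9) for each real tier, from the first table.
games_by_tier = {
    1: [1519, 2304, 5384, 3598, 3359, 1109, 471, 35, 4],
    2: [450, 1029, 3652, 3625, 4547, 1929, 964, 123, 29],
    3: [345, 884, 5071, 6940, 11378, 7126, 4897, 1182, 375],
    4: [35, 139, 915, 1689, 3987, 3502, 3621, 1191, 466],
    5: [27, 65, 450, 951, 3327, 3842, 5612, 2465, 1387],
}

def average_strength(games):
    """Games-weighted mean of strength values 1..9."""
    total = sum(games)
    return sum(s * g for s, g in enumerate(games, start=1)) / total

avg_strength = {t: average_strength(g) for t, g in games_by_tier.items()}
for t, value in avg_strength.items():
    print(t, f"{value:.2f}")
```

The step between adjacent tiers stays between about 0.65 and 0.76, which is the near-constant rate noted above.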

That is all for today and I’m not sure what all this means.  Cubs start a new series with NYN tomorrow so we’ll do a series analysis.  Until then….