Post Season 2021 Updated

This model treats all post season games from 1903 to the present as a single season. Players like Babe Ruth are ranked together with every other post season player, whether they played this season, last season, or any season before.

Post season data was recently recompiled to include 2021, so Kris Bryant and Anthony Rizzo now show up on teams other than CHN. What follows is a lot of inside baseball on how this data model is put together, written mostly for when I have to compile post season data next year. End of season data starts with retrosheet.org and ends with table upon table in the yearly and end of year databases used to display the more than 10M pages on this site.

Every year this is a challenge because I forget how to use the scripts that close out a season. The first step is to download the post season event data from retrosheet.org and unzip it. Then we run

./parse_events.pl post_season

This script figures out a lot of things and makes a .event file, which gets reordered into a .daily file using ./order_events.pl. The resulting data gets made into a .csv file using

./mkeventcsv.pl post_season

This makes the event table csv file. Once that file is read into post_season.db we can run

  • ./mkebox.pl post_season - makes the box score tables
  • ./nextgen_runs.pl post_season - makes the table that contains all run records
  • ./mkplayoffs.pl 2021 - makes a bunch of other tables

Then cat the post season .ROS files into year.post_season.roster and pass that file to ./mkeroster.pl post_season roster_file to make the enhanced roster csv.
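For next year's me, here is a minimal Python sketch of that cat step. The function name concat_rosters and the directory and output paths are made up for illustration; the only thing taken from the notes above is that .ROS files get combined into one roster file.

```python
from pathlib import Path


def concat_rosters(ros_dir, out_path):
    """Concatenate every .ROS file in ros_dir into one roster file.

    Returns the number of .ROS files that were combined.
    """
    ros_files = sorted(Path(ros_dir).glob("*.ROS"))
    with open(out_path, "w", newline="") as out:
        for ros in ros_files:
            text = ros.read_text()
            out.write(text)
            # Guard against a file that lacks a trailing newline so two
            # roster rows never run together on one line.
            if text and not text.endswith("\n"):
                out.write("\n")
    return len(ros_files)
```

Calling concat_rosters("post_season_2021", "2021.post_season.roster") (both paths hypothetical) would produce the file to hand to ./mkeroster.pl.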

Once the erosters are made, run ./db_day_ebox.pl post_season, which compiles the entire post season by reading the event table updated above. The generated _pitch and _bat csv files can then be imported into their tables, and that’s all there is to it.
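The csv import itself can be sketched in a few lines of Python. This is only a sketch under two assumptions that the notes above do not confirm: that post_season.db is a SQLite file, and that each csv starts with a header row naming its columns. The function name import_csv and the table name in the usage note are hypothetical.

```python
import csv
import sqlite3


def import_csv(db_path, table, csv_path):
    """Load a headered CSV file into a SQLite table, creating it if needed.

    Returns the number of rows inserted.
    """
    with open(csv_path, newline="") as fh:
        reader = csv.reader(fh)
        header = next(reader)                  # assumed: first row is column names
        rows = [row for row in reader if row]  # skip blank lines

    # Quote identifiers so column names taken from the header are safe.
    cols = ", ".join('"' + name.replace('"', '""') + '"' for name in header)
    marks = ", ".join("?" for _ in header)
    quoted_table = '"' + table.replace('"', '""') + '"'

    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(f"CREATE TABLE IF NOT EXISTS {quoted_table} ({cols})")
            conn.executemany(f"INSERT INTO {quoted_table} VALUES ({marks})", rows)
    finally:
        conn.close()
    return len(rows)
```

Something like import_csv("post_season.db", "bat", "post_season_bat.csv") would then load one file. Values are stored as text here, so a real table with typed columns should be created ahead of time instead.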

Once the database is updated, every player’s data will cover their whole post season career, from when they started through the present (2021). Some players will go up in the rankings, others down.

When baseball moves to 12 and 14 team playoffs this model will ignore all wild card level games. In the 16 team era, which makes up half of all MLB seasons, there was only a single 7 or 9 game series each year between two teams. Now there are a single game, a 5 game series, and two 7 game series featuring 10 teams. This season will be 3 game, 5 game, 7 game, and 7 game series, and subsequent seasons will have more.

Including the single game wild card was no big deal, but in the future it’s probably better not to think of these games as playoffs and to limit our playoff season to DS, LC, and WS. This still leaves modern players overwhelming historical players, but the effect will be limited and somewhat consistent with past years.

Expanding playoffs dilutes the talent pool of players making it into the playoffs. Ideally it might have been better to limit playoffs to League Championships and World Series, but we didn’t, and the expanded player set for Divisional Series didn’t affect rankings much. So for now we’ll keep all past wild card games as well as whatever it was that counted as playoffs in 2020.

Future wild card series games will still be listed in the playoff section. We just won’t count player stats generated from those games, because including such lesser teams would let players pad their playoff stats against easy competition. A one game Wild Card doesn’t make a difference. Three and five game series will.
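As a rough sketch of that rule, the listed-versus-counted decision could look like the Python below. The dict keys "year" and "level", the level codes, and the function name are all hypothetical, and the 2021 cutoff is my reading of "past wild card games" above, not something the scripts are known to use.

```python
# Series levels that always count toward player rankings.
COUNTED_LEVELS = {"DS", "LC", "WS"}

# Assumed cutoff: wild card games through this season are kept.
LAST_COUNTED_WC_YEAR = 2021


def counts_toward_rankings(game):
    """Decide whether one game's stats should feed the rankings.

    `game` is a dict with hypothetical keys "year" and "level", where
    level is one of "WC", "DS", "LC", "WS".
    """
    if game["level"] in COUNTED_LEVELS:
        return True
    # Past wild card games stay in; future ones are listed but not counted.
    return game["level"] == "WC" and game["year"] <= LAST_COUNTED_WC_YEAR
```

Every game would still be shown in the playoff section; only the games where this returns True would feed player stats.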

BTW: There is an issue with Mariano Rivera. We counted 14 runs, 12 earned, in his 141 post season IP, and Google says he gave up only 13 runs with 11 earned.

Since Rivera, the #1 player in post season according to this data model, gave up so few runs, it was simple to check them all, and this model was right. He gave up 14 runs, 12 earned. Even if a run like the one that scored on those two wild pitches were counted as unearned, it would still count against Rivera as a run. I don’t see any errors in the event data either.

Not that it makes much of a difference for Rivera. He goes from a 0.70 ERA according to most of Google to a 0.77 ERA here in post season, which is insignificant. You can peruse for yourself by clicking here.
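For anyone who wants to check the arithmetic, ERA is earned runs per nine innings, so the two figures fall straight out of the earned run counts and the 141 IP quoted above. The function name era is just for illustration.

```python
def era(earned_runs, innings_pitched):
    """Earned run average: earned runs allowed per nine innings."""
    return 9 * earned_runs / innings_pitched


# Rivera's post season line, using the 141 IP quoted above.
print(f"{era(12, 141):.2f}")  # 12 earned runs (this model) -> 0.77
print(f"{era(11, 141):.2f}")  # 11 earned runs (the other count) -> 0.70
```

One earned run over 141 innings moves the ERA by about 0.06.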