Team status Part 2

The purpose of this post is to explain the BAT and PITCH fields in the team status lines that will be used here throughout the year.  This isn’t very complicated.  These fields exist to cut down on clutter in the stats.  The following table lists the top 9 teams in baseball ranked by their combined BAT+PITCH, which is the TOTAL column.

TOTAL BAT PITCH W L TeamID 04-19-2017
26.5 13.2 13.3 10 5 NYA
16.5 14.2 2.3 10 6 ARI
16 8.6 7.4 9 6 CIN
16 3.7 12.3 8 8 LAN
12.9 -3.4 16.3 7 7 MIN
12.4 15.1 -2.7 9 5 WAS
9.1 5.7 3.4 8 7 NYN
8 4.7 3.3 8 7 CHN
7.5 1.2 6.3 10 5 HOU

This data runs through the games of 4/19/2017.  BAT and PITCH are calculated as follows:

BAT(RAA) = Rs(Team) – Rs(Team Average) – LR
PITCH(RAA) = Ra(Team Average) – Ra(Team) – UR

Rs = Runs scored
Ra = Runs scored against
LR = Lucky Runs above average
UR = Unearned runs above average

UR = (Total League Unearned Runs)/(number of teams) – UR(Team)
LR =  LR(Team) – (Total League Lucky Runs)/(number of teams)

The number of teams is currently 30; in the first half of the 20th century it was 16.  UR(Team) is the total unearned runs a team has incurred.  LR(Team) is likewise the team’s total lucky runs.
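The formulas above can be sketched in a few lines of Python.  This is only an illustration, not the scripts that make these tables: the `TeamLine` type, the `bat_pitch` function, and the four-team league are all made up for the example, and the numbers are not real data.

```python
from dataclasses import dataclass


@dataclass
class TeamLine:
    rs: float  # Rs: runs scored
    ra: float  # Ra: runs scored against
    lr: float  # LR(Team): total lucky runs for the team
    ur: float  # UR(Team): total unearned runs the team has incurred


def bat_pitch(team, league):
    """Return (BAT, PITCH) in runs above average for one team."""
    n = len(league)
    # Rs(Team Average) = Ra(Team Average) = R(League) / (number of teams)
    avg_runs = sum(t.rs for t in league) / n
    # LR = LR(Team) - (Total League Lucky Runs) / (number of teams)
    lr_above = team.lr - sum(t.lr for t in league) / n
    # UR = (Total League Unearned Runs) / (number of teams) - UR(Team)
    ur_above = sum(t.ur for t in league) / n - team.ur
    bat = team.rs - avg_runs - lr_above
    pitch = avg_runs - team.ra - ur_above
    return bat, pitch


# Hypothetical four-team league; league runs scored equal league runs against.
league = [
    TeamLine(rs=80, ra=55, lr=2, ur=3),
    TeamLine(rs=70, ra=65, lr=1, ur=5),
    TeamLine(rs=60, ra=70, lr=1, ur=4),
    TeamLine(rs=50, ra=70, lr=0, ur=8),
]

print(bat_pitch(league[0], league))  # (14.0, 8.0)
```

Summed over the whole league, BAT comes out to zero and so does PITCH, which is what "above average" should mean.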

UR and LR are necessary to balance the books, and they do affect team BAT and PITCH.  Lucky Runs are runs that score on something like a wild pitch or balk, where no one gets an RBI.  There are very few of these, but they still count in determining who wins a game and need to be accounted for.  To keep things simple, just ignore them for now.  The scripts that make these tables keep track of them and also run integrity checks in case the daily stat dataset is corrupt.

Since every run scored is also a run scored against some team:

Rs(Team Average) = Ra(Team Average) = R(League)/(number of teams)

A team that scores a lot of runs will have a high BAT(RAA), and a team that lets a lot of runs score against it can have a negative PITCH(RAA), and vice versa.  A completely average team will have BAT=0 and PITCH=0, and therefore BAT+PITCH=0.  In the above table:

TOTAL = BAT+PITCH = Rs(Team) – Ra(Team)

The above is commonly referred to as the team run differential.  This model separates batting and pitching because that makes it clearer, with a single number for each, where a team’s strength lies.  UR and LR were left out of the above to show the gist of the calculation.
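As a quick check, the rows of the table can be run through a few lines of Python to confirm that TOTAL is BAT+PITCH and to see which side each team leans on.  The values are copied from the table above; the `strength` helper is a made-up name for this example and is not part of the scripts that make these tables.

```python
# (TeamID, TOTAL, BAT, PITCH) copied from the 04-19-2017 table
rows = [
    ("NYA", 26.5, 13.2, 13.3),
    ("ARI", 16.5, 14.2, 2.3),
    ("CIN", 16.0, 8.6, 7.4),
    ("LAN", 16.0, 3.7, 12.3),
    ("MIN", 12.9, -3.4, 16.3),
    ("WAS", 12.4, 15.1, -2.7),
    ("NYN", 9.1, 5.7, 3.4),
    ("CHN", 8.0, 4.7, 3.3),
    ("HOU", 7.5, 1.2, 6.3),
]


def strength(bat, pitch):
    """Name the larger of the two fields for a team."""
    return "BAT" if bat > pitch else "PITCH"


for team, total, bat, pitch in rows:
    # Allow for rounding to one decimal place in the published table.
    assert abs(total - (bat + pitch)) < 0.05
    print(team, total, strength(bat, pitch))
```

MIN and WAS have nearly the same TOTAL, but MIN gets there on pitching with below average batting and WAS does the opposite.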

Run differential is used in calculating Pythagorean Expectation, a formula invented long ago by Bill James to link runs to wins.  It is used in this data model, which will be explained more when we have enough data to start ranking players.  The only columns in the above table that matter when MLB decides who goes to the playoffs are W and L.  This early in the season there are some wild swings, which is why ranking players now makes no sense.
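For reference, the classic form of Pythagorean Expectation can be sketched as follows.  This assumes the original exponent of 2; the exponent and any adjustments this data model uses are not covered in this post, and the run totals below are made up for the example.

```python
def pythagorean_expectation(rs, ra, exponent=2.0):
    """Expected winning percentage from runs scored (rs) and runs against (ra).

    Classic form: rs^x / (rs^x + ra^x), with x = 2 assumed here.
    """
    return rs**exponent / (rs**exponent + ra**exponent)


# Hypothetical team: 80 runs scored, 60 runs scored against.
print(round(pythagorean_expectation(80, 60), 3))  # 0.64
```

A team with equal runs scored and runs against comes out at .500, which lines up with BAT+PITCH=0 for a completely average team.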

That is all for now. Until next time ….