Probability Distribution of Batter Counts

This post describes the answer to a question that can only be answered with event data.

What is the Probability Distribution of Batter Counts?

One day I wondered which count a batter most often ends a plate appearance on, whether the result is an out, a hit, or a walk.  One of the fields in event data records the final batter count of each plate appearance.  There are 12 possible batter counts, from 0-0 to 3-2.  The script simply runs through all event records, tabulates each count, and divides by the total number of plate appearances.  I chose to use only the 2012 season, although I could have included more.  I think one season's data set is deep enough to produce a pretty accurate probability distribution.  This distribution can only be derived from event data.
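The tabulation described above can be sketched in a few lines of Python. This is only an illustration, not the original script: it assumes the final counts have already been pulled out of the event records as two-character strings such as "32" (balls, then strikes), and the function name `count_distribution` and the sample list are hypothetical.

```python
from collections import Counter


def count_distribution(final_counts):
    """Return {count: (total, probability)} for an iterable of final counts.

    Each item is assumed to be a two-character string such as "32"
    (balls then strikes). That format is an assumption for illustration.
    """
    totals = Counter(final_counts)      # tabulate each type of count
    n = sum(totals.values())            # total number of plate appearances
    return {ct: (k, k / n) for ct, k in totals.items()}


# Hypothetical sample: one final count per plate appearance.
sample = ["12", "32", "00", "12", "22", "01", "12", "32"]
dist = count_distribution(sample)
for ct, (total, p) in sorted(dist.items(), reverse=True):
    print(f"{ct}  {total:>3}  {p:.2f}")
```

With real data, the sample list would be replaced by the final-count field read from every event record in the season.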

The table below shows the probability distribution over the 12 counts.  The P column sums to 1, up to rounding.  The most common final batter count is 1-2, followed closely by 2-2 and then 3-2.  This makes sense, since a batter must have two strikes in order to strike out.  The counts 3-0 and 2-0 are the least likely.  This distribution can be used for betting purposes.

Count    Total       P
32       23099    0.13
31        8351    0.05
30        3828    0.02
22       25468    0.14
21        9677    0.05
20        4554    0.02
12       26943    0.15
11       16145    0.09
10       12716    0.07
02       16026    0.09
01       17128    0.09
00       20291    0.11
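
As a quick check, the P column can be recomputed from the Total column. The sketch below copies the totals from the table, with keys written as balls then strikes. The variable names are made up for illustration, and the two-strike share is an extra figure derived from the same totals, not one reported above.

```python
# Totals copied from the table above; keys are balls then strikes.
totals = {
    "32": 23099, "31": 8351, "30": 3828,
    "22": 25468, "21": 9677, "20": 4554,
    "12": 26943, "11": 16145, "10": 12716,
    "02": 16026, "01": 17128, "00": 20291,
}

n = sum(totals.values())                      # plate appearances in the table
probs = {ct: k / n for ct, k in totals.items()}

# Share of plate appearances that end on a two-strike count.
two_strike = sum(p for ct, p in probs.items() if ct[1] == "2")

print(n)                                      # 184226
print(max(probs, key=probs.get))              # 12
print(round(two_strike, 3))                   # 0.497
```

Roughly half of all plate appearances in the table end on a two-strike count, which fits the observation that 1-2, 2-2, and 3-2 are the three most common final counts.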