FIP: The Proof

I have been meaning to do this for quite some time and the recent thread
over FIP here stirred my curiosity. I finally finished the proof or
disproof of FIP using brute force analysis.

FIP is purported to predict the future. Since we know the past, and we
know that past’s future, it is relatively simple to measure the error in
FIP: how far off it was in predicting a pitcher’s real-life ERA the next
season.

FIP is calculated as follows:

FIP = (13*HR + 3*BB - 2*K)/IP + constant

The constant sets the league-average FIP equal to the league-average ERA.
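As a rough sketch of the formula above, here it is in Python. The
function names are my own, innings are assumed to be plain decimals (35.1
in a box score means 35 1/3), and deriving the constant from league
totals is an assumption about how the 3.18 figure would be obtained.

```python
def fip_raw(hr, bb, k, ip):
    """FIP without the constant: (13*HR + 3*BB - 2*K) / IP."""
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    return (13 * hr + 3 * bb - 2 * k) / ip


def fip_constant(league_era, league_hr, league_bb, league_k, league_ip):
    """Constant that lines league-wide FIP up with league-wide ERA.

    Assumption: computed from league totals for the season.
    """
    return league_era - fip_raw(league_hr, league_bb, league_k, league_ip)


def fip(hr, bb, k, ip, constant):
    """Full FIP: the raw value shifted onto the ERA scale."""
    return fip_raw(hr, bb, k, ip) + constant
```

Because strikeouts carry a minus sign, fip_raw can come out negative for
a high-strikeout pitcher; adding the constant moves it back onto the ERA
scale.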

For example, picking a year at random, 2010: FIP without the constant
ranged from -1.3 to +1.3 once the top ten outliers on each end are
sliced off. FIP can go negative because strikeouts are subtracted.

The constant counters this. For 2010 it is 3.18, which makes the actual
range of FIP 1.9 to 4.5. In 2010 there were 302 pitchers who pitched
more than 35 innings; I set that cutoff to eliminate very part-time
players.

The methodology for measuring error is rather simple. A delta is the
difference between a pitcher’s FIP in year n and his ERA in year n+1.

delta = abs(FIP - ERA(n+1))

abs stands for absolute value, so whether ERA lands above or below FIP
we add up how far off it is. FIP=3 and ERA(n+1)=4 gives delta=1, the
same as FIP=3 and ERA(n+1)=2.

Add up all the deltas and divide by number of players, 302 in 2010, and
we get an average error.
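The delta-and-average step can be sketched in a few lines of Python. The
function name and the two parallel lists (one prediction and one
next-season ERA per pitcher) are assumptions for illustration, not the
actual script behind the numbers.

```python
def mean_abs_error(predictions, next_eras):
    """Average of abs(prediction - ERA(n+1)) over all pitchers."""
    if not predictions or len(predictions) != len(next_eras):
        raise ValueError("need one next-season ERA per prediction")
    deltas = [abs(p - e) for p, e in zip(predictions, next_eras)]
    return sum(deltas) / len(deltas)
```

Missing high and missing low by the same amount count the same, so a
predictor can’t hide its misses by having them cancel out.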

Before presenting the data for 2010 and then for every season since
1980, we need something to compare FIP to. I chose three different
systems, as follows:

1) Random: Throw all player FIPs into a hat. Each player picks a
random one to use as their FIP.
2) ERA: It has been asserted that FIP is a better predictor than ERA
even though no one ever claimed ERA to be a predictor. We’ll see.
Calculate delta using ERA instead of FIP.
3) Average ERA: Just give all players the average ERA for the season,
which in 2010 was 4.08, and calculate deltas around that.
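Here is a minimal sketch of how the three comparison systems could be
scored next to FIP. The names are hypothetical, the lists are assumed to
hold one value per pitcher in the same order, and the hat draw is
modeled as a shuffle (drawing without replacement), which is my
assumption about how the picking works.

```python
import random


def mean_abs_error(predictions, next_eras):
    """Average of abs(prediction - ERA(n+1)) over all pitchers."""
    return sum(abs(p - e) for p, e in zip(predictions, next_eras)) / len(next_eras)


def compare_systems(fips, eras, next_eras, league_era, seed=0):
    """Return the average error of FIP and the three comparison systems."""
    # 1) Random: shuffle the FIPs so each pitcher draws one from the hat.
    rng = random.Random(seed)
    hat = list(fips)
    rng.shuffle(hat)
    # 3) Average ERA: every pitcher gets the same league-wide number.
    flat = [league_era] * len(next_eras)
    return {
        "FIP": mean_abs_error(fips, next_eras),
        "Rand": mean_abs_error(hat, next_eras),
        # 2) ERA: this season's ERA used as the prediction.
        "era": mean_abs_error(eras, next_eras),
        "avgera": mean_abs_error(flat, next_eras),
    }
```

The Rand number depends on the shuffle, so it will move a little from
run to run unless the seed is fixed.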

OK. If all this is as clear as mud we can proceed to results.

For 2010 we have the following error results:

year |  #  | FIP  | Rand | era  |avgera
2010 | 302 | 0.89 | 1.05 | 0.91 | 0.88

As you can see, FIP has a 0.89 average error predicting the next season’s
ERA. FIP beats random by 0.16 and ERA by 0.02, but it is 0.01 worse than
just using 4.08 for everyone.

That’s all fine and well for one year, so let’s go back to 1980 and run
every season through 2013. Cutting to the chase, here are the average
error results for the entire lot:

years     | FIP  | Rand | era  |avgera
1980-2013 | 0.95 | 1.03 | 0.99 | 0.87

FIP can’t beat the average ERA system (0.95 vs. 0.87), which means it’s
not very useful as a predictor. It barely beats random (1.03) and
regular ERA (0.99).

Therefore, the assertion that FIP is a more accurate predictor of future
ERA than simple ERA is true, but barely.

Since FIP can’t beat simple average ERA, it fails as a prediction tool.
I’m somewhat surprised that it beat random, though not by much.

Here is the data for the individual years. 2005 and 1984 are left out
because I discovered some corruption in their csv files that happened
last April and that I don’t feel like chasing down right now. The strike
years, 1981 and 1994, are also left out. I can provide logs of the
yearly detail data if anyone is interested.

year |  #  | FIP  | Rand | era  |avgera
1980 | 213 | 0.80 | 0.89 | 0.90 | 0.80
1982 | 223 | 0.82 | 0.94 | 0.89 | 0.76
1983 | 239 | 0.77 | 0.89 | 0.84 | 0.70
1985 | 233 | 0.85 | 0.88 | 0.93 | 0.70
1986 | 257 | 0.93 | 0.98 | 0.93 | 0.80
1987 | 254 | 0.96 | 0.94 | 0.92 | 0.84
1988 | 244 | 0.87 | 0.99 | 0.96 | 0.85
1989 | 267 | 0.81 | 0.91 | 0.85 | 0.73
1990 | 239 | 0.87 | 0.95 | 1.02 | 0.83
1991 | 257 | 0.86 | 1.01 | 0.97 | 0.90
1992 | 273 | 1.00 | 0.97 | 1.06 | 0.83
1993 | 266 | 1.09 | 1.08 | 1.06 | 0.89
1995 | 282 | 1.05 | 1.06 | 1.02 | 0.90
1996 | 285 | 0.96 | 1.09 | 0.97 | 0.92
1997 | 304 | 1.01 | 1.10 | 1.03 | 0.89
1998 | 330 | 0.96 | 1.01 | 0.98 | 0.81
1999 | 306 | 1.02 | 1.08 | 1.09 | 0.88
2000 | 317 | 0.96 | 0.94 | 1.01 | 0.87
2001 | 319 | 0.98 | 1.12 | 0.93 | 0.91
2002 | 306 | 1.00 | 1.20 | 1.07 | 0.96
2003 | 304 | 0.97 | 1.08 | 1.04 | 0.86
2004 | 300 | 0.98 | 1.12 | 1.06 | 0.94
2006 | 333 | 1.01 | 1.07 | 1.03 | 0.93
2007 | 296 | 0.97 | 1.11 | 1.08 | 0.98
2008 | 301 | 1.02 | 1.06 | 1.15 | 0.92
2009 | 324 | 0.98 | 0.96 | 0.97 | 0.89
2010 | 302 | 0.89 | 1.08 | 0.91 | 0.88
2011 | 303 | 0.99 | 1.17 | 1.00 | 0.93
2012 | 320 | 0.95 | 0.98 | 0.94 | 0.83
2013 | 307 | 0.94 | 1.00 | 0.99 | 0.88
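If you want to roll the yearly rows up into a single figure yourself,
one way is to weight each season by its pitcher count. This is only a
sketch: the function name is made up, and weighting by the # column is
an assumption about how the seasons get combined. The yearly errors
above are already rounded to two places, so a roll-up from this table
may land a hair off the 1980-2013 line.

```python
def pooled_error(rows):
    """Combine (pitcher_count, average_error) pairs into one average."""
    total_pitchers = sum(count for count, _ in rows)
    if total_pitchers == 0:
        raise ValueError("no pitchers to average over")
    return sum(count * err for count, err in rows) / total_pitchers


# FIP column for the first two seasons in the table above.
print(round(pooled_error([(213, 0.80), (223, 0.82)]), 2))
```

Feeding in all thirty rows for one column gives that column’s overall
average under the same weighting.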