I have been meaning to do this for quite some time, and the recent thread
over FIP here stirred my curiosity. I finally finished the proof or
disproof of FIP using brute force analysis.

FIP is purported to be able to predict the future. Since we know the
past and we know that past’s future, it is relatively simple to measure
the error in FIP: how far off the predictor was from the real-life ERA
of the next season.

FIP is calculated as follows:

FIP = (13*HR + 3*BB - 2*K)/IP + constant

The constant is chosen so that the average FIP equals the average ERA.
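Here is a minimal Python sketch of the formula above, for anyone who wants to follow along. The function names and the three stat lines are hypothetical, made up for illustration. It also assumes the constant is simply the plain (unweighted) mean ERA minus the mean FIP-without-constant over the pitchers in the sample; a league-total version would work the same way.

```python
from statistics import mean

def raw_fip(hr, bb, k, ip):
    """FIP without the constant: (13*HR + 3*BB - 2*K) / IP."""
    return (13 * hr + 3 * bb - 2 * k) / ip

def fip_constant(raw_fips, eras):
    """Shift that puts the mean raw FIP on top of the mean ERA.

    Assumption: unweighted means over the pitchers in the sample.
    """
    return mean(eras) - mean(raw_fips)

# Hypothetical stat lines (HR, BB, K, IP, ERA); not real pitchers.
pitchers = [
    (20, 50, 200, 200.0, 3.00),
    (25, 60, 120, 180.0, 4.50),
    (10, 30, 60, 90.0, 4.00),
]

raws = [raw_fip(hr, bb, k, ip) for hr, bb, k, ip, _ in pitchers]
const = fip_constant(raws, [era for *_, era in pitchers])
fips = [r + const for r in raws]  # final FIP values, centered on mean ERA
```

A pitcher with lots of strikeouts and few walks or home runs gets a negative raw value, which is why the constant is needed to bring FIP onto the ERA scale.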

For example, picking a year at random, 2010: FIP without the constant
had a range of -1.3 to +1.3 if you slice off the top ten outliers on
each end. FIP can go negative because strikeouts are subtractive.
The constant counters this. For 2010 it is 3.18, making the actual
range of FIP 1.9 to 4.5. In 2010 there were 302 pitchers who pitched
more than 35 innings; I used that cutoff to eliminate very part-time
players.

The methodology for measuring error is rather simple. A delta is the
absolute difference between a pitcher’s FIP and his ERA in year n+1.

delta = abs(FIP - ERA(n+1))

abs stands for absolute value, so whether ERA lands above or below FIP
we add up how much it is off. FIP=3 with ERA(n+1)=4 gives delta=1, the
same as FIP=3 with ERA(n+1)=2.

Add up all the deltas and divide by the number of players (302 in 2010)
to get the average error.
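The error measure described above is just a mean absolute error. A small sketch, with a hypothetical function name and the two toy cases from the text:

```python
def average_error(predictions, next_year_eras):
    """Mean of abs(prediction - ERA(n+1)) over all pitchers."""
    deltas = [abs(p - e) for p, e in zip(predictions, next_year_eras)]
    return sum(deltas) / len(deltas)

# The two toy cases from the text: FIP=3 against ERA(n+1)=4 and ERA(n+1)=2.
# Both deltas are 1, so the average error is 1.0.
toy_error = average_error([3.0, 3.0], [4.0, 2.0])
```

Because of the absolute value, misses in opposite directions do not cancel each other out.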

Before presenting the data for 2010 and then for every season back to
1980, we need something to compare FIP to. I chose three different
systems, as follows:

1) Random: Throw all player FIPs into a hat. Each player picks a
random one to use as their FIP.

2) ERA: It has been asserted that FIP is a better predictor than ERA,
even though no one ever claimed ERA was a predictor. We’ll see.
Calculate delta using the current season’s ERA instead of FIP.

3) Average ERA: Just give all players the average ERA for the season,
which in 2010 was 4.08, and calculate deltas around that.
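The three comparison systems plus FIP itself can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the exact script behind the numbers in this post: the function names and the toy numbers are hypothetical, the "hat" is modeled as a shuffle (each FIP is drawn exactly once), and the average ERA is taken as the plain mean of the sample's ERAs.

```python
import random
from statistics import mean

def average_error(predictions, next_year_eras):
    """Mean of abs(prediction - ERA(n+1)) over all pitchers."""
    return sum(abs(p - e) for p, e in zip(predictions, next_year_eras)) / len(predictions)

def compare_systems(fip, era, next_era, seed=0):
    """Average error of FIP and the three baselines for one season."""
    rng = random.Random(seed)       # fixed seed so the run is repeatable
    hat = list(fip)
    rng.shuffle(hat)                # 1) Random: FIPs drawn from a hat (assumed without replacement)
    avg = mean(era)                 # 3) Average ERA: same number for everyone (assumed plain mean)
    return {
        "FIP": average_error(fip, next_era),
        "Rand": average_error(hat, next_era),
        "era": average_error(era, next_era),   # 2) ERA: this season's ERA as the predictor
        "avgera": average_error([avg] * len(era), next_era),
    }

# Hypothetical three-pitcher season; not real data.
results = compare_systems(
    fip=[3.0, 4.0, 5.0],
    era=[2.5, 4.5, 5.0],
    next_era=[3.5, 4.0, 4.5],
)
```

The Rand column depends on the draw, so it will move a little from run to run unless the seed is fixed.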

OK. If all this is as clear as mud we can proceed to results.

For 2010 we have the following error results:

year | #   | FIP  | Rand | era  | avgera
2010 | 302 | 0.89 | 1.05 | 0.91 | 0.88

As you can see, FIP has a 0.89 average error predicting the next season’s
ERA. FIP is better than random (1.05), very slightly better than era
(0.91), but very slightly worse than just using 4.08 for everyone (0.88).

That’s all fine and well for one year. Let’s go back to 1980 and run
every year until now.

To cut to the chase, here are the average error results for the entire lot:

years     | FIP  | random | era  | avgera
1980-2013 | 0.95 | 1.03   | 0.99 | 0.87

FIP can’t beat the average era system, which means it’s not very useful
as a predictor. It barely beats random and regular era.

Therefore, the assertion that FIP is a more accurate predictor of future
ERA than simple ERA is true, but barely.

Since FIP can’t beat simple average era, it fails as a prediction tool.

I’m somewhat surprised it beat random, but not by much.

Here is the data for all the individual years. 2005 and 1984 are left
out because I discovered some corruption in their csv files, which
happened last April and which I don’t feel like chasing down right now.
The strike years (1981 and 1994) are also left off. I can provide logs
of the yearly detail data if anyone is interested.

year | #   | FIP  | Rand | era  | avgera
1980 | 213 | 0.80 | 0.89 | 0.90 | 0.80
1982 | 223 | 0.82 | 0.94 | 0.89 | 0.76
1983 | 239 | 0.77 | 0.89 | 0.84 | 0.70
1985 | 233 | 0.85 | 0.88 | 0.93 | 0.70
1986 | 257 | 0.93 | 0.98 | 0.93 | 0.80
1987 | 254 | 0.96 | 0.94 | 0.92 | 0.84
1988 | 244 | 0.87 | 0.99 | 0.96 | 0.85
1989 | 267 | 0.81 | 0.91 | 0.85 | 0.73
1990 | 239 | 0.87 | 0.95 | 1.02 | 0.83
1991 | 257 | 0.86 | 1.01 | 0.97 | 0.90
1992 | 273 | 1.00 | 0.97 | 1.06 | 0.83
1993 | 266 | 1.09 | 1.08 | 1.06 | 0.89
1995 | 282 | 1.05 | 1.06 | 1.02 | 0.90
1996 | 285 | 0.96 | 1.09 | 0.97 | 0.92
1997 | 304 | 1.01 | 1.10 | 1.03 | 0.89
1998 | 330 | 0.96 | 1.01 | 0.98 | 0.81
1999 | 306 | 1.02 | 1.08 | 1.09 | 0.88
2000 | 317 | 0.96 | 0.94 | 1.01 | 0.87
2001 | 319 | 0.98 | 1.12 | 0.93 | 0.91
2002 | 306 | 1.00 | 1.20 | 1.07 | 0.96
2003 | 304 | 0.97 | 1.08 | 1.04 | 0.86
2004 | 300 | 0.98 | 1.12 | 1.06 | 0.94
2006 | 333 | 1.01 | 1.07 | 1.03 | 0.93
2007 | 296 | 0.97 | 1.11 | 1.08 | 0.98
2008 | 301 | 1.02 | 1.06 | 1.15 | 0.92
2009 | 324 | 0.98 | 0.96 | 0.97 | 0.89
2010 | 302 | 0.89 | 1.08 | 0.91 | 0.88
2011 | 303 | 0.99 | 1.17 | 1.00 | 0.93
2012 | 320 | 0.95 | 0.98 | 0.94 | 0.83
2013 | 307 | 0.94 | 1.00 | 0.99 | 0.88