Stat SPaz: BABIP, part 1 - 3/17/09
The Starting Line
Stat SPaz: BABIP, part 1
by Evan "the Censor" Dickens
spaz -noun Slang.
1. a grotesquely awkward person.
2. an eccentric person.
3. someone who enjoys starting pitching stats way too much.
Starting pitchers are generally drafted in fantasy leagues to help you win four categories: wins, ERA, strikeouts, and WHIP. Of course, most of these standard fantasy categories are terrible measurements of a pitcher's actual talent level and potential to contribute to your team, which is why hardcore SP dorks like me use a whole array of peripheral sabermetric stats to isolate true pitching talent from the lies that basic statistics can tell.
This is the first edition of an ongoing column I will be writing on those peripheral stats. You will learn which ratios identify talent, and which identify luck. The first statistic we will be addressing is, most definitely, the latter: batting average on balls in play, or BABIP.
BABIP is fairly easy to define. It is similar to a normal batting average; however, home runs are removed from both the numerator and the denominator, and the denominator also discards strikeouts and adds back sacrifice flies. As an example, I'll take a random game from Johan Santana's game log: 8/12 against the Nationals. Santana faced 30 batters in 7 innings, giving up 8 hits, 1 HR, and 2 BB, while striking out 6. BABIP is calculated as hits minus HR (8 - 1 = 7) divided by AB minus K and HR (28 - 6 - 1 = 21), which equals a .333 BABIP, as opposed to the actual BA given up, which would be 8 hits / 28 AB = .286.
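For anyone who wants to check the arithmetic, here is a minimal Python sketch of the calculation described above. The function name `babip` and its parameter names are my own labels for illustration, and I'm assuming zero sacrifice flies in the Santana game since none are mentioned.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies=0):
    """Batting average on balls in play, following the definition in the text.

    Home runs come out of both halves of the ratio; the denominator also
    drops strikeouts and adds back sacrifice flies.
    """
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play, BABIP is undefined")
    return (hits - home_runs) / balls_in_play


# Santana's 8/12 line from the text: 8 H, 1 HR, 28 AB, 6 K.
# Sacrifice flies are assumed to be 0 (not stated in the text).
santana_babip = babip(hits=8, home_runs=1, at_bats=28, strikeouts=6)
santana_ba = 8 / 28  # plain batting average allowed, for comparison

print(f"BABIP: {santana_babip:.3f}")  # prints "BABIP: 0.333"
print(f"BA:    {santana_ba:.3f}")     # prints "BA:    0.286"
```

Plug in a full season line instead of a single game and you get the season BABIP numbers discussed in the rest of the column.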
So that's the calculation. Now the importance: BABIP is the flukiest, least skill-related SP statistic you will see. It is calculated the same way for hitters, but it is not nearly as fluky or regression-prone for them. It has been demonstrated time and time again that all starting pitchers regress to a BABIP close to .300. It is natural to think this wouldn't hold for certain types of pitchers, so let's look at some 2008 statistics (limited to ERA qualifiers) to see whether it bears out. For now I'm covering up all the names.
First, let's look at the numbers. Of the 88 ERA qualifiers (at least 162 IP), the lowest BABIP posted was .245. If regression holds true, this pitcher was extremely lucky with the distribution of balls in play and is likely to see many more of those balls in play turn into base hits. Beware a big downswing coming. The highest BABIP posted was .366, which indicates a serious run of bad luck and a pitcher who is probably at least a bit better than his numbers and could have some value at a cheap price. The median BABIP was .302, which is right in line with where regression should take these pitchers. In most cases, what you see is what you get.
There are some natural questions to ask when looking for holes in the use of BABIP to identify fluky pitchers. Wouldn't BABIP naturally swing for SPs based on their home run rate, since home runs are not counted? Wouldn't high-strikeout pitchers have a higher BABIP? Wouldn't a higher groundball rate lead to a lower BABIP? Let's look at the top five and bottom five in these categories in 2008:
Top 5 (lowest) HR/9 rates BABIP: .313, .305, .306, .297, .311 = .306 average BABIP
Bottom 5 (highest) HR/9 rates BABIP: .326, .317, .289, .301, .309 = .308 average BABIP
Top 5 K/9 rates BABIP: .313, .306, .328, .323, .306 = .315 average BABIP
Bottom 5 K/9 rates BABIP: .345, .308, .289, .315, .327 = .317 average BABIP
Top 5 GB% BABIP: .297, .287, .308, .306, .273 = .294 average BABIP
Bottom 5 GB% BABIP: .280, .306, .290, .304, .283 = .293 average BABIP
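The group averages above are easy to verify. Here is a short Python sketch that recomputes each one from the listed BABIPs and prints the gap between the top and bottom group in each category. The dictionary layout and the variable names are just my own choices for illustration.

```python
# BABIPs copied from the six lists above, paired as (top 5, bottom 5).
groups = {
    "HR/9": ([.313, .305, .306, .297, .311], [.326, .317, .289, .301, .309]),
    "K/9": ([.313, .306, .328, .323, .306], [.345, .308, .289, .315, .327]),
    "GB%": ([.297, .287, .308, .306, .273], [.280, .306, .290, .304, .283]),
}


def average(values):
    """Plain arithmetic mean of a list of BABIPs."""
    return sum(values) / len(values)


# Map each category to (top average, bottom average, absolute gap).
results = {}
for category, (top, bottom) in groups.items():
    top_avg, bottom_avg = average(top), average(bottom)
    results[category] = (top_avg, bottom_avg, abs(top_avg - bottom_avg))
    print(f"{category:5s} top {top_avg:.3f}  bottom {bottom_avg:.3f}  "
          f"gap {results[category][2]:.3f}")
```

In every category the top and bottom groups land within about two points of BABIP of each other, which is the whole point of the comparison.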
Although each of these samples is small and fairly fluky in its own right, together they demonstrate that even the largest differences in these seemingly relevant categories mean absolutely nothing when it comes to BABIP. What really matters is one thing: fluky luck.
There is plenty of other published research on the luck-driven nature of BABIP, but I think enough of a case is made here without belaboring the point. If we can all agree that BABIP is a stat that can identify luck, then the next step is applying it to 2008 seasons and seeing how it can identify the undervalued and overvalued SPs. And that will be the subject of my next column!
See you soon,
~Evan the Censor