Between velocity, command, movement, and stamina, a pitcher needs so many different skills that it’s important to be able to quantify them. However, that doesn’t mean that all of those abilities are created equal. We’ve already discussed how much more valuable a pitcher’s “stuff” is than their command, but today we’ll take it a step further: we’ll determine which metrics should be used to evaluate a pitcher.

In this study, the goal is to discover which metrics are the most stable. A front office wants players who it believes will perform well on a yearly basis, and in order to feel good about a pitcher’s prospects, it’s best that they rate well on stable metrics. If a pitcher’s success is instead tied to less stable metrics, they are far less reliable and will likely struggle with inconsistency. To assess the stability of each metric, I plotted how each qualifying pitcher fared in a statistic from one year to the next, going back to 2015 to acquire a large enough sample size.
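The year-to-year comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the exact procedure used for this study: the function name `year_to_year_r2`, the `(pitcher, season, value)` row layout, and the sample rows are all hypothetical. It pairs each pitcher’s value in one season with his value in the next and returns the coefficient of determination (r^2) of those pairs.

```python
from collections import defaultdict


def year_to_year_r2(rows):
    """Return r^2 between a stat in season N and the same stat in season N+1.

    rows: iterable of (pitcher, season, value) tuples. This layout is an
    assumption made for the sketch, not the study's actual data format.
    """
    # Group each pitcher's values by season.
    by_pitcher = defaultdict(dict)
    for pitcher, season, value in rows:
        by_pitcher[pitcher][season] = value

    # Build (season N, season N+1) pairs; non-consecutive seasons are skipped.
    xs, ys = [], []
    for seasons in by_pitcher.values():
        for season, value in seasons.items():
            if season + 1 in seasons:
                xs.append(value)
                ys.append(seasons[season + 1])

    n = len(xs)
    if n < 2:
        raise ValueError("need at least two consecutive-season pairs")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("r^2 is undefined when either season has no variance")

    # Squared Pearson correlation equals r^2 for a one-variable linear fit.
    return (sxy * sxy) / (sxx * syy)


# Made-up ground-ball rates, purely to show the call shape.
sample_rows = [
    ("Pitcher A", 2017, 0.52), ("Pitcher A", 2018, 0.50),
    ("Pitcher B", 2017, 0.41), ("Pitcher B", 2018, 0.43),
    ("Pitcher C", 2017, 0.47), ("Pitcher C", 2018, 0.45),
]
stability = year_to_year_r2(sample_rows)
```

A value near 1 means last year’s number tells you a lot about this year’s; a value near 0 means it tells you almost nothing.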

With pitchers, there are generally two different categories of statistics. There are the “peripheral” statistics, such as strikeouts, walks, and home runs allowed, and then there are the “run prevention” metrics. Essentially, the run prevention metrics take certain peripherals into account to compute a statistic based on the runs a pitcher is, or should be, allowing per nine innings, so the two go hand in hand with one another.

To start, we’ll examine the stability of each of the main peripheral statistics. For this exercise, they’ll be split into the following categories: batted ball, luck, and strikeout/walk ability.

**Batted Ball**

For the batted ball peripherals, the results are extremely mixed. Interestingly, pitchers generally have a good amount of control (r^2=64%) over the trajectory of the ball that the opposing batter puts in play. However, they have far less control over how hard the contact they allow is, which makes some sense, since it’s largely dependent on the quality of hitter they are facing. What fascinates me, though, is how unstable line drive rate is on a year-to-year basis. To hit a line drive, a hitter essentially needs to connect at the ideal launch angle, and that can happen against any particular pitcher. Therefore, if a pitcher had a very low line drive rate one year, he can probably expect that percentage to jump up significantly the following year.

**Luck**

I categorized home runs allowed per nine innings (HR/9), batting average allowed on balls in play (BABIP), and left-on-base percentage (LOB%) as “luck” peripherals; there are a lot of variables that go into how a pitcher fares in these statistics. For example, home runs can be dependent on the stadium they’re pitching in, BABIP relies a lot on luck and the speed of the batter, and LOB% is predicated on a pitcher consistently being able to work through traffic. So, for that reason, it should surprise nobody that all of these metrics are incredibly unstable, with coefficients of determination below 10%. In other words, less than 10% of how a player rates in these statistics can be explained by their performance in these same metrics from the previous season. HR/9, BABIP, and LOB% shouldn’t be used as predictive statistics.

**Strikeout/Walk Ability**

Now, we’ve reached the category of peripherals that has the most year-to-year stability. Looking primarily at swinging strike rate (SwStr%), K/9, and the contact rates, pitchers are able to control the outcomes in which the ball isn’t put in play. Meanwhile, the fact that walks are less stable than strikeouts fits with our past discovery that “stuff” is more valuable than command, even though walks are more reliable than some of the other statistics we’ve gone over. When it comes to missing bats, though, pitchers are less susceptible to change on a yearly basis.

Okay, so trajectory of contact and missing bats appear to be the most stable aspects of pitching from year to year. So, what does that tell us about the merits of each of the main run-prevention metrics? Let’s take a closer look:

Since allowing runs has a lot to do with being on the right side of some of the unstable “luck” and batted-ball metrics, it should come as no surprise that earned run average (ERA), which only takes into account runs allowed, isn’t a very reliable statistic. Meanwhile, although fielding independent pitching (FIP) only takes into account the three true outcomes to predict what a pitcher’s ERA should be, allowing home runs is predicated a lot on luck, so that’s not the best statistic either. On the other hand, expected FIP (xFIP) gives pitchers a normal HR/FB rate, which would explain why it’s far more stable. The same goes for Skill Interactive ERA (SIERA): it’s similar to xFIP, but takes into account trajectory of contact and values strikeouts even higher. Meanwhile, win probability added (WPA) is dictated by the number of pressure situations a pitcher faces, so it’s obviously not a predictive metric. Regardless, it’s clear that xFIP and SIERA are the top two run-prevention metrics, and which one a person uses is really up to preference.
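To make the FIP versus xFIP difference concrete, here is a rough sketch using the commonly published forms of the two formulas. The function names and the example stat line are made up for illustration, and both the FIP constant and the league HR/FB rate change every season, so the values passed in below are placeholders rather than real league figures.

```python
def fip(hr, bb, hbp, k, ip, constant):
    """Fielding independent pitching from the three true outcomes.

    constant: season-specific value that puts FIP on the ERA scale
    (supplied by the caller; it changes every year).
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def xfip(fly_balls, bb, hbp, k, ip, constant, league_hr_per_fb):
    """Same as FIP, but with home runs replaced by an expected total:
    the pitcher's fly balls times the league-average HR/FB rate."""
    expected_hr = fly_balls * league_hr_per_fb
    return (13 * expected_hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical stat line: 30 HR allowed on 200 fly balls (15% HR/FB).
# The 3.10 constant and 10% league HR/FB rate are assumed placeholders.
pitcher_fip = fip(hr=30, bb=50, hbp=5, k=200, ip=200.0, constant=3.10)
pitcher_xfip = xfip(fly_balls=200, bb=50, hbp=5, k=200, ip=200.0,
                    constant=3.10, league_hr_per_fb=0.10)
```

In this made-up case the pitcher’s xFIP comes out lower than his FIP, because xFIP assumes his unusually high HR/FB rate falls back toward the league rate. That is exactly why it leans less on the unstable home run numbers.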

You should use xFIP or SIERA if you’re looking for one run-prevention metric to get a quick read on a pitcher’s quality. However, when projecting their success for an upcoming season, I highly recommend digging deep into their peripherals. If I’m investing in a pitcher, I’d want it to be someone who induces a lot of ground balls and misses a lot of bats, as those are the two stable skills year to year. Even then, though, it’s clear that pitchers are much more subject to volatility than hitters, so I’d be very careful investing in one sole pitcher; having as much pitching depth as possible may be the way to go.

Still, if front offices are targeting pitching and are looking to get good value on the marketplace, they should search for pitchers who suffered from a high BABIP and HR/9, or a low LOB%, as those are some of the more volatile statistics on a yearly basis. A pitcher like Max Fried of the Braves, for example, had ideal strikeout and walk numbers, but his ERA (4.02) sat over 4.00 due to having the second-highest BABIP allowed among qualified starting pitchers. At the same time, his teammate, Mike Soroka, appears to be a budding star, as he rated out well in the unstable statistics, and thus was able to post a 2.68 ERA. However, long-term, I’d be much more comfortable betting on Fried, who actually was a better pitcher in terms of xFIP and SIERA than Soroka, as he’s already demonstrated the ability to succeed in the more dependable metrics.

As baseball becomes more of a three-true-outcome game, it’s more likely that pitchers’ ERAs will start to match their actual quality. For now, though, there is still a misconception about what teams should look for in a pitcher, but if they choose to rely on pitchers who are talented in the more stable areas (controlling the trajectory of contact and striking hitters out), they are more likely to feel good about their investment.
