MLB Deep Dive: How To Use Expected Stats Properly

“This player is having a good year, but his xwOBA is slightly lower than his wOBA, so he’s going to get worse.”

This is a common phrase you’ll hear within the baseball analysis community. To be fair, with the data made available to us, it’s easy to come to conclusions like this. However, it’s not always about the data made available to us, but the analysis that comes from it.

“Wait, Justin, slow down, xwOBA?” To better grasp how this “problem” of data analysis came to fruition, let us go back in time.

Starting in 2015, through Baseball Savant, the public was provided with Statcast metrics for MLB players. Among these stats were exit velocity, launch angle, hard-hit rate, pitch velocity, sprint speed, and, to be honest, practically anything that can be measured! It’s a fabulous website that provides very useful information, and we should be exceptionally grateful to have it.

The most popular metrics on the website, however, are its expected stats: expected batting average (xBA), expected weighted on-base average (xwOBA), expected on-base percentage (xOBP), expected slugging percentage (xSLG), and expected isolated power (xISO). Essentially, these statistics are what you’d expect based on the name; they indicate what a player’s “true talent level” is based on the quality of his contact, frequency of contact, and, depending on the batted ball, sprint speed.

With that in mind, this would appear to be a gold mine on the surface. With the ability to know what numbers a player deserves to have, we should be able to separate his talent level from outside circumstances, and, thus, better predict future performance. Yet, that actually isn’t the case:

NOTE: Data Collected From 2015-2019

Expected Stats Are Not More Predictive of Future Success Than Actual Stats

[Charts: Batting Average vs xBA; On-Base Percentage vs xOBP; Weighted On-Base Average vs xwOBA; Slugging Percentage vs xSLG; Isolated Power vs xISO]

As you can see from the coefficients of determination (r^2), a player’s success in an upcoming season is better explained by his actual numbers from the past season than by his expected numbers in every key area. While the gap is minimal, if expected statistics were meant to help us project a player’s future production, you’d expect their relationship with next-season results to be the stronger one. As it turns out, though, this was never the goal of expected statistics, according to Tom Tango, Senior Database Architect of Stats for MLB Advanced Media, in a 2018 article from Jonathan Judge at Baseball Prospectus:

“Tango then stressed that the expected metrics were only ever intended to be descriptive, that they were not designed to be predictive, and that if they had been intended to be predictive, they could have been designed differently or other metrics could be used.”

For what they are intended to do, which is to separate a player from extenuating circumstances and tell his story differently, expected stats have accomplished their job. Asking them to do more than that (predict future success) would be asking too much given the purpose for which they were designed, which is evident in the data.
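For anyone who wants to rerun this kind of comparison, here is a minimal sketch of the r^2 calculation in Python. The variable names and the tiny samples are made up purely for illustration (they are not real players, and they are not the 2015-2019 data used above); the idea is simply to measure how much of next season’s wOBA is explained by last season’s wOBA versus last season’s xwOBA.

```python
def r_squared(xs, ys):
    """r^2 for a simple linear fit of ys on xs (the squared Pearson correlation)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant sample has no defined correlation")
    return (cov * cov) / (var_x * var_y)


# Hypothetical, made-up values for five hitters (illustration only).
woba_last_season = [0.310, 0.345, 0.298, 0.372, 0.325]
xwoba_last_season = [0.322, 0.331, 0.305, 0.360, 0.318]
woba_next_season = [0.315, 0.338, 0.301, 0.365, 0.330]

print("wOBA  -> next wOBA r^2:", r_squared(woba_last_season, woba_next_season))
print("xwOBA -> next wOBA r^2:", r_squared(xwoba_last_season, woba_next_season))
```

With real data, you would build the three lists from consecutive-season pairs for every qualified hitter and compare the two printed values.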

Using Peripheral Metrics

Another potential problem with expected stats is that, similar to Wins Above Replacement (WAR), any metric that tries to tell the complete story of a player on its own is going to lack the context needed for proper analysis. When projecting future success and analyzing a player’s ability, how the number is generated is more useful.

For projecting batting average, it appears that quantity of contact matters much more than quality. While there was virtually no correlation between hard-hit rate and batting average, and just a small correlation between sweet spot rate and batting average (r^2= 0.04), the correlation between strikeout rate and batting average (r^2= 0.21) was much larger. Considering that this gap only grows larger when projecting future batting average, based on the stability of strikeout rate, it would appear that strikeout rate is the best peripheral metric to use. Why would this be the case? Well, if luck on batted balls is volatile, having as many chances as possible to get a hit likely raises your floor, and it’s also worth noting that softer-hit balls, such as flares and bloopers, end up as hits much more often than typical quality of contact would lead you to believe.

We’ve known that walk rate is a very stable statistic, so it’s not surprising that it remained a much better predictor of future on-base success than batting average, which is much more volatile and subject to luck. Meanwhile, power in and of itself is generally stable, but the key statistic to look at is barrel rate. There is quite a strong relationship between barrel rate and future home runs (r= .609), and this only becomes stronger when looking at power output just in that season (r= .775). Whereas exit velocity and launch angle tell you very little on their own, barrel rate combines the two by looking at the rate of balls hit hard AND at optimal angles. In my opinion, it’s the premier metric available on Baseball Savant.

With that in mind, when trying to project a player’s overall success, combining plate discipline and barrel% would appear to be the best way to go about it without factoring in past success. In fact, Eno Sarris of The Athletic wrote recently that chase rate and barrel rate improvements were a great way to assess the legitimacy of a player’s “breakout” in a given season; it’s a relatively easy rule of thumb that can give you an edge over your opponents in fantasy leagues.

Meanwhile, sprint speed and home plate to first base times were equally predictive of stolen bases, though there’s obviously much more to the puzzle when it comes to stealing bases than simply speed.

A Note on Pitcher xStats

There has been much more coverage on the predictive nature of expected statistics for pitchers, but to provide background:

  • Fielding Independent Pitching (FIP) estimates a pitcher’s ERA based on strikeouts, walks, and home runs allowed, the three true outcomes he should be able to control.
  • xFIP is very similar to FIP, but substitutes a standardized (league-average) home run to fly ball rate, as home runs allowed tend to be much more volatile.
  • Skill Interactive ERA (SIERA) is very similar to the former two, but adjusts for the type of ball in play.
  • xERA is essentially a pitcher’s xwOBA allowed.
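To make the difference between the first two concrete, here is a rough Python sketch of the commonly published FIP and xFIP formulas. Note that the published formula also counts hit-by-pitches alongside walks. The `constant` and `league_hr_per_fb` defaults are placeholder assumptions: the real FIP constant is recalculated every season to put FIP on the ERA scale, and the league home run to fly ball rate changes from year to year.

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding Independent Pitching.

    `constant` is a placeholder; the real value is set each season so that
    league FIP matches league ERA.
    """
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def xfip(fly_balls, bb, hbp, k, ip, league_hr_per_fb=0.105, constant=3.10):
    """xFIP: same as FIP, but home runs are replaced by fly balls multiplied
    by a league-average HR/FB rate (placeholder default here)."""
    expected_hr = fly_balls * league_hr_per_fb
    return fip(expected_hr, bb, hbp, k, ip, constant)


# Hypothetical pitcher line, for illustration only.
print(fip(hr=20, bb=50, hbp=5, k=200, ip=200))
print(xfip(fly_balls=180, bb=50, hbp=5, k=200, ip=200))
```

The only thing that changes between the two functions is where the home run total comes from, which is exactly why xFIP is less sensitive to a fluky home run season.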

For more on the predictiveness of these metrics, I recommend checking out this fabulous piece by Dan Richards at Pitcher List. To summarize, though, FIP has the best correlation to ERA in a given season, but when looking at overall predictiveness, xFIP and SIERA remain slightly superior. Yet, at the end of the day, the premier metric when predicting future success is none other than strikeout-to-walk ratio. It may seem old-fashioned to continue to simply look at K-BB ratio, but with quality of contact being much more volatile for a pitcher and subject to the hitter, you can get a great idea of the quality of a pitcher simply from the number of strikeouts he has versus the number of walks he allows. Oftentimes, simplicity should be welcomed!
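As a quick illustration of how little is needed here, this small Python sketch computes two common ways of expressing the strikeout and walk relationship: the plain K/BB ratio, and K-BB% (strikeouts minus walks, per batter faced). The function names and the sample pitcher line are my own, made up for the example.

```python
def k_bb_ratio(strikeouts, walks):
    """Strikeouts divided by walks."""
    if walks == 0:
        raise ValueError("ratio is undefined with zero walks")
    return strikeouts / walks


def k_minus_bb_pct(strikeouts, walks, batters_faced):
    """Strikeout rate minus walk rate, as a fraction of batters faced."""
    if batters_faced <= 0:
        raise ValueError("batters faced must be positive")
    return (strikeouts - walks) / batters_faced


# Hypothetical pitcher: 200 strikeouts, 50 walks, 800 batters faced.
print(k_bb_ratio(200, 50))
print(k_minus_bb_pct(200, 50, 800))
```

The ratio version breaks down when a pitcher has no walks in a small sample, so the per-batter-faced version is usually easier to work with early in a season.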

How To Project For The Future?

If expected statistics don’t have much of a predictive nature, how do we project a player’s future success?

This may seem overly simplistic, but really, it all comes down to projecting from past performance! There are plenty of projections (ZiPS, Steamer, THE BAT X) that do an excellent job of forecasting a player’s production based on what he’s done in the past, and if you want to make projections on your own, I recommend following Tom Tango’s Marcel projection system. Essentially, it uses a weighted average of a player’s production over the past three seasons, regresses to the mean, applies an age adjustment, and has continued to prove as predictive as other projection systems.
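Here is a simplified Python sketch of that Marcel-style recipe for a single rate stat such as wOBA. The specific numbers are assumptions based on the commonly cited version of Marcel (5/4/3 season weights, regression toward the league average with 1,200 plate appearances, and a small age adjustment centered on age 29), so treat them as adjustable defaults rather than the official implementation.

```python
def marcel_rate(seasons, league_rate, age, weights=(5, 4, 3), regression_pa=1200):
    """Project a rate stat from up to three past seasons.

    seasons: list of (rate, plate_appearances), most recent season first.
    league_rate: league-average value of the same stat.
    All default parameters are assumptions, not official values.
    """
    # Weighted average of past production, weighted by recency and playing time.
    weighted_sum = sum(w * rate * pa for w, (rate, pa) in zip(weights, seasons))
    weighted_pa = sum(w * pa for w, (_, pa) in zip(weights, seasons))

    # Regress to the mean by blending in league-average plate appearances.
    regressed = (weighted_sum + regression_pa * league_rate) / (weighted_pa + regression_pa)

    # Age adjustment: small boost for players under 29, small penalty over 29.
    per_year = 0.003 if age > 29 else 0.006
    return regressed * (1 + (29 - age) * per_year)


# Hypothetical hitter: three seasons of wOBA, most recent first.
history = [(0.360, 620), (0.335, 580), (0.320, 450)]
print(marcel_rate(history, league_rate=0.320, age=27))
```

Notice that a player with little playing time gets pulled much closer to the league average, which is how the system handles small samples without any special cases.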

Once you have a large enough sample size on a player, his previous batting average, on-base percentage, weighted on-base average, home runs, and stolen bases are the best way to estimate what he will accomplish in an upcoming season. There’s a lot of variability in surface-level statistics from one year to the next, but if you’re taking a weighted average of multiple seasons, that matters much less. Remember, baseball features a 162-game season, and is thus an inherently large-sample-size sport.

What To Do In a Given Season?

In a given season, a player may be on the cusp of a breakout, making projections much less useful, even if they attempt to adjust with rest-of-season projections.

We are gaining new information on players, and this is where expected stats can be more useful. However, since the main curiosity generally involves grasping the legitimacy of a breakout season, looking at the stable peripherals we talked about helps with that. If a player is striking out less, chasing less, walking more, has seen his barrel rate increase, or has increased his speed, those are generally strong indicators of production in various categories. For pitchers, looking at who is generating called strikes and whiffs would be the most beneficial.
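If you keep these peripherals in a spreadsheet or a small script, a check like the following Python sketch can flag which of them have moved in the right direction for a hitter. The field names are hypothetical, and the check only looks at the direction of the change, not whether the change is large enough to be meaningful.

```python
# Direction in which each hitter peripheral counts as an improvement.
# The keys are hypothetical field names chosen for this example.
BETTER_WHEN = {
    "k_rate": "lower",
    "chase_rate": "lower",
    "bb_rate": "higher",
    "barrel_rate": "higher",
    "sprint_speed": "higher",
}


def improved_peripherals(previous, current):
    """Return the peripherals that moved in the favorable direction."""
    improved = []
    for stat, direction in BETTER_WHEN.items():
        if stat not in previous or stat not in current:
            continue  # skip anything we do not have for both periods
        delta = current[stat] - previous[stat]
        if (direction == "lower" and delta < 0) or (direction == "higher" and delta > 0):
            improved.append(stat)
    return improved


# Hypothetical hitter, last season versus this season so far.
last_year = {"k_rate": 0.25, "chase_rate": 0.32, "bb_rate": 0.08, "barrel_rate": 0.06}
this_year = {"k_rate": 0.20, "chase_rate": 0.27, "bb_rate": 0.08, "barrel_rate": 0.10}
print(improved_peripherals(last_year, this_year))
```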

Most of all, though, the key is to assess why the breakout is happening. Has the hitter made a swing change? Is this part of his expected developmental track? Has he adopted a new approach? Is a pitcher experiencing a velocity increase? Has he changed his pitch mix? These are the kinds of changes a projection system can miss, and, most importantly, they indicate that a player’s true talent level has changed. Taking that into account is critical and surely can give you an edge over your opponents.

Overview

So, what should we take away from this?

While expected statistics are very interesting descriptive metrics, they aren’t meant to be predictive of future success. Every statistic is great if it is used in the way it was meant to be, and these metrics are no different.

At the end of the day, every baseball player is unique. Some hitters naturally run higher batting averages on balls in play, while some pitchers naturally suppress home runs better. While we haven’t been able to pin down why this is the case, that mystery might just come down to the beauty of baseball! Over time, these skills show up as a consistent trend, and once a player has built up a large enough sample size, there’s no reason to expect regression to suddenly hit him.

Conversely, when a player first comes up to the major-league level and projections have a much higher degree of error, it is much more useful to look at expected statistics and be more skeptical of players who go against the grain in how they accumulate production. These unusual skills take a lot of time to become statistically significant, so you always want to account for the wider range of outcomes with inexperienced players. Still, for players with enough of a sample size, projections based on past performance and age adjustments should be the foundation of a draft valuation board and set clear “tiers” in terms of value. From there, if you want to rank players within tiers differently based on new information that might lead to the projection being wrong, go ahead! An objective methodology with subjective adjustments that add needed context is generally the best way to assess future value without any fatal flaws that one may eventually regret.

At the end of the day, baseball is a large-sample-size sport, so, unlike in football, where peripheral metrics such as PFF grades predict future success better than past production does, we don’t have that problem, outside of a few clear outlier seasons. When analyzing candidates for regression, looking at track record, rather than xStats, would be better. xStats mainly help with players who don’t have 2+ years of production, and even then, minor-league stats and peripheral numbers (barrel%, plate discipline) might be more useful. As the late Dennis Green, former NFL head coach of the Vikings and Cardinals, once said, “They are who we thought they were.” Hopefully, in the future, we can apply this methodology to baseball players in pursuit of more contextualized and thorough analysis.
