Talking Chop Baseball Analysis Primer: Fielding

MLB: Washington Nationals at Atlanta Braves Dale Zanine-USA TODAY Sports

Where hitting is straightforward, fielding is anything but. With recent developments, the landscape of fielding value has actually gotten more fraught, rather than clearer or easier to interpret. But, it’s still a huge chunk of value, so let’s dive in.

UZR and DRS

The first thing to know is that there are two main defensive metrics currently in use: UZR and DRS. There are some others, too, such as Baseball Prospectus’ Fielding Runs Above Average (FRAA), but for the most part, you’ll see defensive aptitude discussed in terms of one of these two. Both UZR and DRS are pretty similar in terms of what they strive to accomplish, though they don’t get there entirely the same way. The main thing to remember when comparing the two is that UZR tends to be more average-y, while DRS tends to be more aggressively negative or positive. However, that’s more of a tendency than a rule -- there are definitely cases where DRS for a player is near zero, and UZR is a big number in one direction or another. UZR and DRS will generally agree with each other, because they’re using the same body of actual in-game plays to make their measurements and calculations, but not always. Where they disagree strongly is always an interesting case study.

So, how do these things actually work? It’s actually pretty complicated, and to do it complete justice would essentially be re-typing the UZR primer currently available on Fangraphs. So, instead, the below is a paraphrased, overly simplified, illustrative-only version.

The metrics start with the basic idea of whether a ball in play was converted into an out, or fell for a hit (or was an error that resulted in the batter reaching base). Before it’s known whether any given ball in play is an out or a hit, it has two possibilities: it either carries the run value of an out (positive for the fielder, negative for the batter), or the run value of whatever type of hit it’s likely to be (negative for the fielder, positive for the batter). (A note here that while this run value is NOT xwOBA, hopefully it essentially becomes xwOBA in the future; the defensive metrics were developed separately from, and before, xwOBA, but the general principle of finding the average run value for a ball hit to a certain place is the same.) Therefore, when a fielder makes a play, he gets a credit, based on the difference between the run value of the out and the run value of the hit. When he fails to make a play, he gets a debit, based on the difference between the run value of the hit and the run value of the out. However, this isn’t the only factor in play, because difficulty matters too. If a ball is normally transformed into an out 40 percent of the time, but the fielder transforms it into an out, he gets 60 percent of the credit of the run value differential. If it’s normally transformed into an out 99 percent of the time (an easy, routine play), the fielder only gets 1 percent of the credit of the run value differential, or essentially, no value at all. Similarly, if a ball is normally transformed into an out 40 percent of the time, but the fielder doesn’t make an out on it, he only suffers 40 percent of the debit of the run value differential. So missing a hard play doesn’t change the defensive metric much, but goofing an easy play is deadly.
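
If it helps to see that credit/debit logic written out, here’s a minimal sketch in Python -- the run values and probabilities are made-up round numbers for illustration, not the actual UZR or DRS parameters.

```python
# A minimal sketch of the range credit/debit logic, with made-up run values.

def range_runs(out_prob, made_play, rv_hit=0.55, rv_out=-0.27):
    """Runs credited (+) or debited (-) to the fielder for one ball in play.

    out_prob: how often an average fielder converts this ball into an out.
    rv_hit / rv_out: run value (to the offense) of the likely hit vs. an out.
    """
    diff = rv_hit - rv_out  # runs saved by turning the likely hit into an out
    if made_play:
        return (1 - out_prob) * diff  # harder plays earn more of the credit
    return -out_prob * diff           # easier plays missed cost more

print(range_runs(0.40, True))   # +0.49: made a 40% play, gets 60% of the credit
print(range_runs(0.99, False))  # -0.81: botched a 99% play, eats almost all of it
```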

At its heart, that’s it. Basically, there are two big things that the defensive metrics reflect: (1) the value in turning non-outs into outs, with balls more likely to be extra-base hits providing more value if caught, or subtracting more value if not caught; and (2) the harder the play is to make, the more credit the fielder gets, and vice versa. This structure is used for the “range” component of both DRS and UZR, but many of the other components use similar principles. Specifically, these components are:

UZR

  • RngR - this is range, as described above
  • DPR - double plays converted per opportunity, basically an added measure of how often the fielder is able to transform balls hit to him into double plays (if the situation allows for one)
  • ErrR - the process for these is basically similar to range, except that errors are assumed to be made on easy-to-field balls, and as such, almost the full value differential is debited. Functionally, all this really means is just a separate category to track how often a fielder goofs relative to his peers in terms of runs, but still using the principle that balls more likely to be extra-base hits are costlier errors than surefire singles
  • ARM - only for outfielders, this score quantifies the run value of the player’s throwing arm, in terms of extra bases prevented. The methodology is similar to range, i.e., the metric considers, for each batted ball, how likely existing runners are to take an extra base, and then determines whether the extra base was taken or not (and whether an out was recorded on the play, I believe). That way, fielders get credit not only for outfield assists, but also for having the type of arm that dissuades runners from trying to take extra bases in the first place. (If it didn’t do this, guys with surprisingly good arms would get better ARM scores than guys with known cannons that never get tested.)

DRS

  • rPM - same idea as RngR in UZR
  • rARM - same idea as ARM in UZR
  • rGDP - same idea as DPR in UZR
  • rGFP - the main difference in what’s tracked/not tracked between the two, this is a relatively subjective metric that awards players run value for certain weird plays otherwise hard to quantify in a rigid system, such as cutting a ball off in the gap to prevent a hustle double, or some kind of heads-up relay throw. For most players, this is really only a marginal effect on their overall defensive value.

The UZR and DRS scores tell you some version of “what actually happened,” although they’re less focused on straight-up reporting that “X percent of balls hit to the player were turned into outs” as they are on giving you the run value associated with that player’s defensive performance. Again, UZR and DRS might differ. If they differ a lot, the implication for a player’s WAR might be substantial. For example, Ender Inciarte had around +7 UZR in 2018, but +17 DRS. fWAR currently uses only UZR as an input (though Fangraphs’ player pages present both UZR and DRS data), so Inciarte’s 2.9 fWAR for 2018 includes approximately 0.7 wins from his superior defense relative to his center-field peers (he also gets a separate boost from playing center field in the first place; more on that later). But, if we swapped that 0.7 for the 1.7 implied by DRS, he’d have closer to 3.9 fWAR instead, going from a solidly above-average player to a great one. Even if you split the difference, he ends up around 3.4 fWAR. I bring this up only to say that if you want a fuller snapshot of player value, it’s worth it to peek at both UZR and DRS and consider what the player’s value would be under each. (A note here that nominally, the main difference between fWAR and bWAR for position players is indeed whether UZR or DRS is used; however, the DRS reported by Baseball-Reference is not always the same as that on the Fangraphs player pages. So it’s up to you whether you calculate a guy’s DRS-fWAR to compare with his UZR-fWAR, or just compare bWAR and fWAR for position players; either way, it’s all just “more information” with which to contextualize player value.)
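
If you want to do that swap yourself, it’s just arithmetic, using the usual rule of thumb of roughly 10 runs per win. A quick sketch, with Inciarte’s 2018 numbers rounded:

```python
# Swapping DRS in for UZR in a player's fWAR, at ~10 runs per win.

RUNS_PER_WIN = 10.0  # rule-of-thumb conversion; the exact value varies by year

def fwar_with_drs(fwar, uzr, drs):
    return fwar + (drs - uzr) / RUNS_PER_WIN

print(fwar_with_drs(2.9, uzr=7, drs=17))  # ~3.9 -- Inciarte 2018, DRS version
print(fwar_with_drs(2.9, uzr=7, drs=12))  # ~3.4 -- splitting the difference
```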

So, if UZR and DRS tell you, to some extent, what actually happened, is there an xUZR or xDRS that tells you what’s likely to happen in the future? Unfortunately, not really! All you’ve really got to go on are the measures themselves, and their component pieces. There’s also another issue: when a player is batting, if he’s healthy all year, he gets 600 or more PAs. When a player is fielding, if he’s healthy all year, he might get a chance to make 300 plays if he’s a shortstop on a very groundball-focused staff, but is more likely to end up with chances on around 200-250 plays (most shortstops, third basemen, second basemen, and center fielders), around 150 plays (most left fielders and right fielders, some first basemen), or 100 to 150 plays (many first basemen). That means that a single season’s worth of offensive data contains a much greater sample (at least twice as big, and oftentimes three times as big or more) than a season’s worth of defensive data for a player. Put another way: if a player had only 200 PAs and you wanted to wait-and-see a full season before determining how good he was offensively, that same logic means you’d need to wait for at least two (and more like three) seasons of defensive data. This is generally why the rule of thumb is “wait for three seasons of defensive data.”

Unfortunately, players also age in real time, so you often can’t afford to wait. In these cases, some basic regression to the mean might be helpful -- simply assume that whatever you don’t have data for is average. For example, if a fielder is +10 in one season, assume that his other two seasons are +0, such that his three-year average ends up being 10/3, or +3.3. This seems pretty harsh, but it prevents your forward-looking assumptions from being too aggressive.
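
Here’s that padding logic as a quick sketch:

```python
# Regressing a short defensive track record toward average (zero) by
# padding out to three seasons' worth of data.

def regressed_fielding(season_runs, target_n=3):
    padded = list(season_runs) + [0.0] * max(0, target_n - len(season_runs))
    return sum(padded) / len(padded)

print(round(regressed_fielding([10.0]), 1))       # 3.3 -- one +10 season on record
print(round(regressed_fielding([10.0, 5.0]), 1))  # 5.0 -- two seasons known
```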

Because they are run-weighted and adjust for difficulty of the play, UZR and DRS are better than other fielding measures that don’t do either of these things. However, as useful as they are, the evolution of defensive play in baseball is also making them less useful over time, mainly due to one thing: defensive positioning. The thing is, UZR and DRS base their underlying inputs on “buckets,” which are basically slices of the field where the ball goes. For each bucket and batted ball type (and possibly some other criteria, like hangtime for flies, whether the infield plays in or back, and whether there’s a shift on), there are data on the likelihood of the ball being an out, and the run value of said ball. What’s missing, though, is exactly how far the fielder had to travel to get to the ball. That creates kind of an issue!
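
To make the “bucket” idea concrete, here’s an entirely invented sketch of the kind of lookup table involved -- note that nothing in the key says where the fielder was actually standing:

```python
# An invented slice of a "bucket" table: keyed by field slice and batted
# ball type, with a league-wide out probability and the run value of the
# hit if the ball drops. Nothing here knows the fielder's starting spot.

BUCKETS = {
    ("shortstop hole", "ground ball"): (0.55, 0.50),
    ("up the middle", "ground ball"):  (0.90, 0.50),
    ("left-center gap", "fly ball"):   (0.35, 1.10),
}

out_prob, rv_hit = BUCKETS[("up the middle", "ground ball")]
# A shifted shortstop who can't reach this "easy" 90% ball takes the same
# debit as one who was standing right there and booted it.
```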

To see why, consider a team that tinkers around with infield defensive positioning a lot. To the extent this helps its fielders record more outs (and I’m not sure why a team would employ defensive positioning that didn’t help in this regard), the UZR and DRS of its fielders will increase. But, let’s say that there’s a situation where the shortstop is playing up the middle, and there’s an end-of-the-bat routine roller that skips through where the shortstop usually plays. That play gets made, say, at least 90 percent of the time. But in this case, the shortstop didn’t make it. He gets the debit for allowing a single on an easy ball, but it wasn’t his fault: he was positioned where the batter was more likely to hit it. Before defensive alignments got very batter-specific, UZR and DRS made more sense, because the underlying “likelihood of being fielded” data were more general -- balls in certain areas were more consistently likely/unlikely to be turned into outs, because alignments were standard and the distance to travel to a given ball was similar for most fielders. These days, that isn’t the case, but UZR and DRS are still the same. Is there a solution? Yes, but not one that’s immediately publicly available. One way to think about defense separately from UZR and DRS is to take the xwOBA of all balls hit near a fielder, and look at the resulting wOBA on them. This requires a few extra steps to convert it into value, but it’s theoretically doable. However, it gets difficult because “hit near a fielder” isn’t an easy thing to query. Hopefully this is the sort of thing Statcast rolls out publicly sooner rather than later. With that said, though…
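
As a sketch of what that xwOBA-based approach might look like -- with a hypothetical dataset and a hand-waved `near_fielder` flag, since defining that is exactly the hard part:

```python
# A sketch of the xwOBA-vs-wOBA idea for a single fielder. The dataframe
# columns and the near_fielder flag are hypothetical; actually defining
# "hit near a fielder" is the hard, unsolved part.

import pandas as pd

def woba_saved(batted_balls: pd.DataFrame) -> float:
    """xwOBA minus actual wOBA on balls near the fielder (positive = good)."""
    near = batted_balls[batted_balls["near_fielder"]]
    return near["xwoba"].mean() - near["woba"].mean()

# Converting this wOBA gap into runs would then follow the usual
# wOBA -> wRAA-style scaling.
```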

Statcast Defensive Metrics: OAA and CPA

What Statcast does have available right now is really useful: a measure of defensive quality for outfielders (only). The way these work is kind of like how UZR and DRS work, but also kind of like how xwOBA works. Basically, Statcast captures fielder starting position, as well as the hangtime of the ball and the direction the fielder has to travel to get to it. The combination of these things allows for a calculation that says, for each batted ball to the outfield, how likely it is to be caught given how far away the fielder is, how long the ball hangs up, and which direction the chase for the ball is in. That actually makes the calculation of Outs Above Average (OAA) really simple: if you have, for each batted ball, a likelihood of it being caught, then a fielder that catches a ball with a 60 percent likelihood of being caught adds 0.4 outs to his ledger; a fielder that fails to catch a ball with a 90 percent likelihood of being caught loses 0.9 outs. CPA (Catch Probability Added) is the same concept, but more of an average than a counting stat: it just averages the catch probability of all balls to the fielder, and then compares that to the actual catch rate (1s for catches and 0s for misses) that the fielder recorded. If the actual rate is higher, the fielder added catch probability; if the actual rate is lower, he gave some away. In addition, OAA is published on a directional basis, letting you see whether a particular fielder is especially good or bad at moving in, back, left, or right.
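
Both calculations are simple enough to sketch in a few lines, using made-up catch probabilities:

```python
# OAA and CPA from a list of (catch_probability, was_caught) pairs.
# The probabilities here are invented for illustration.

balls = [(0.60, True), (0.90, False), (0.25, True), (0.99, True)]

# OAA: counting stat -- outs actually recorded minus outs expected.
oaa = sum(int(caught) - prob for prob, caught in balls)

# CPA: rate stat -- actual catch rate minus average catch probability.
actual_rate = sum(int(caught) for _, caught in balls) / len(balls)
expected_rate = sum(prob for prob, _ in balls) / len(balls)
cpa = actual_rate - expected_rate

print(f"OAA: {oaa:+.2f}")  # +0.26
print(f"CPA: {cpa:+.3f}")  # +0.065 (caught 75% vs. 68.5% expected)
```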

Right now, OAA is not run-weighted. In reality, preventing doubles is more useful than preventing singles, but OAA counts a fractional out that removes a single the same as a fractional out of the same magnitude that removes a likely double. Further, OAA is really only about range; it doesn’t say anything about a player’s arm. Run-weighting OAA shouldn’t be too hard, but it isn’t available yet. One approach would be to assign the average run value of a fly ball out versus a fly ball hit and use that as a factor by which OAA gets multiplied to transform it into runs. Alternatively, since CPA is essentially done on the same basis as comparing wOBA and xwOBA, the “wOBA differential” for a fielder translates to runs really easily, on the same basis that wOBA can be converted to wRAA and wRC+, which are then easily translated to runs. Even with these missing pieces, however, OAA is really helpful for gauging outfielder quality. Right now, if you look at UZR or DRS, you don’t really know to what extent the number is telling you “this fielder is really good” versus “this fielder was positioned really well to make outs without having to run as far.” OAA takes that out of the equation, because positioning is directly considered. If an outfielder is positioned well, such that he has a 100 percent chance of getting to every ball hit to him based on distance and hangtime, then he’s not going to be able to accrue any positive OAA, since the expectation is that he’ll catch everything. This is really different from UZR, where weird-but-prescient positioning can give a player a huge boost by catching a would-be triple that’s otherwise caught only like ten percent of the time based on where it was hit.
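
The naive run-weighting idea is nearly a one-liner; the ~0.8 runs figure below is an assumed round number for the average gap between a fly ball hit and a fly ball out, not a published constant:

```python
# Naive run-weighting of OAA: multiply by an assumed average run gap
# between a fly ball falling for a hit and being caught. The 0.8 is a
# made-up round number, not a published constant.

RUN_GAP_PER_FLY_OUT = 0.8

def oaa_runs(oaa):
    return oaa * RUN_GAP_PER_FLY_OUT

print(oaa_runs(5.0))  # a +5 OAA season ~ +4 runs under this assumption
```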

There’s also not quite enough of a track record to know how predictive OAA and CPA are, as these stats are pretty new. Fundamentally, OAA should be free of most noise, because it measures an underlying input (player speed/route efficiency) rather than something prone to random variation. But given that fielders only get so many fly balls to deal with per year, and there are all sorts of small variations below the surface (fatigue, nagging injury, sunlight/glare, etc.), it’s not super-clear right now whether OAA is “sticky” and useful for future projection, or just a good descriptive stat.

The Positional Adjustment and Def

The positional adjustment is probably one of the most annoying concepts, because unlike most things in this primer, its use and indications can be kind of counterintuitive. Basically, here’s the idea: there’s hitting value, there’s baserunning value, and there’s fielding value. Each of these is compared to “average.” For hitting, this value is based on hitting better than or worse than average. For baserunning, it’s the same. And for fielding, your UZR/DRS/etc. is once again compared to league average… except that for this last one, because of how UZR and DRS are calculated (remember the “buckets”), a player is compared to his peers at the same position, because those are the only ones also getting chances on similarly-hit balls. However, this is kind of an issue for comparing players to one another. A center fielder is responsible for way more of the field than a first baseman, so it doesn’t make sense to say, “This center fielder is two runs better than his peers at center field, and this first baseman is two runs better than his peers at first base, so their defense is equally valuable.” And, indeed, based on limited data about players who switch positions, there is a hierarchy of “value you get just for playing a position,” which is approximately as follows:

  • Catcher: +12.5 runs (or about 1.25 wins) per 162 games
  • Shortstop: +7.5 runs (or about 0.75 wins) per 162 games
  • Second base, third base, center field: +2.5 runs (or about 0.25 wins) per 162 games
  • Left field, right field: -7.5 runs (or about -0.75 wins) per 162 games
  • First base: -12.5 runs (or about -1.25 wins) per 162 games
  • Designated hitter: -17.5 runs (or about -1.75 wins) per 162 games
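
Applying these per-162 numbers is straightforward; here’s a quick sketch that prorates by games played (the actual WAR implementations prorate a bit more precisely, by innings at each position):

```python
# The per-162 positional adjustments from the list above, prorated by
# games played at the position.

POS_ADJ_PER_162 = {
    "C": 12.5, "SS": 7.5, "2B": 2.5, "3B": 2.5, "CF": 2.5,
    "LF": -7.5, "RF": -7.5, "1B": -12.5, "DH": -17.5,
}

def positional_runs(position, games):
    return POS_ADJ_PER_162[position] * games / 162

print(positional_runs("SS", 81))   # +3.75 -- half a season at shortstop
print(positional_runs("1B", 162))  # -12.5 -- a full season at first
```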

Note that these numbers are purposefully made “smooth” as general estimates. In 2015, Jeff Zimmerman updated the original analyses and found something a little different (https://www.fangraphs.com/tht/re-examining-wars-defensive-spectrum/) -- a more compressed spectrum; the specific figures are in the linked piece.

Under this alternative set of positional adjustments, there’s less overall spread and players derive a little less value just from being listed at a given position. These updated adjustments aren’t currently used in WAR, but if you wanted to use them, that’s fine. They’ll only move the needle about half a win (except I guess for designated hitters) at the most, and that was already the baseline level of uncertainty in WAR anyway, so it’s not a huge deal.

In any case, here’s the reason why the positional adjustment is a little weird: it’s based directly on the relative difficulty of different positions. So, say you have two outfielders, Glovey McField and Fieldy McGlove. Glovey currently plays left field, where he’s a +5 defender relative to his peers. Fieldy currently plays center field, where he’s dead average (+0). Based on the original positional adjustments, we know that corner outfield is essentially 10 runs “easier” than center field, meaning that moving from corner outfield to center field will knock 10 runs off a player’s fielding, because the “peer group” to whom he’s being compared changes when he moves positions. So, if Glovey was a +5 left fielder, he’d be a -5 center fielder. But, remember, the positional adjustment is still here. So when we think about defense (above/below average) plus positional adjustment, Glovey was +5 with a -7.5 positional adjustment, or -2.5. If he starts playing center, he’ll be -5 with a +2.5 positional adjustment, or still -2.5. For Fieldy, it’s the opposite: he was +0 in CF, which suggests +10 in LF. But he goes from 0 plus 2.5 to 10 minus 7.5, and his defensive value is also unchanged. The main principle here is that the positional adjustment already reflects how hard one position is relative to another, so moving positions doesn’t change a player’s overall value, it just changes how well he fields that position (with the positional adjustment making up any difference).
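
Spelled out as arithmetic, the Glovey/Fieldy example looks like this:

```python
# The Glovey/Fieldy arithmetic: fielding runs plus positional adjustment
# is unchanged by a position switch, as long as the adjustments hold.

POS_ADJ = {"LF": -7.5, "CF": 2.5}
LF_TO_CF = POS_ADJ["CF"] - POS_ADJ["LF"]  # CF is ~10 runs "harder" than LF

# Glovey: +5 in LF, projected -5 in CF -- same total either way.
print(5 + POS_ADJ["LF"])               # -2.5
print((5 - LF_TO_CF) + POS_ADJ["CF"])  # -2.5

# Fieldy: +0 in CF, projected +10 in LF -- also a wash.
print(0 + POS_ADJ["CF"])               # +2.5
print((0 + LF_TO_CF) + POS_ADJ["LF"])  # +2.5
```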

The thing is, you can get kind of crazy with this. Imagine a lumbering first baseman (-12.5 positional adjustment) who manages to be average defensively relative to other first basemen. Now stick him in center field. Per the positional adjustment, he’d gain 15 runs from the move, and he’d lose 15 runs defensively having to run around the outfield. Do you buy that? Maybe for some first basemen! 15 runs below average is really bad defensively, so maybe it makes sense that that’s what a first baseman would do. But if you don’t buy it, and you think that a first baseman would be more like 30 runs below an average center fielder if forced to play center field, then the positional adjustment doesn’t hold and isn’t really doing its job.

There’s no great guidance here. My general assumption is that the positional adjustment is based on actual data, so there’s no reason to assume it’s invalid or inapplicable unless there’s something player-specific that really warrants consideration. In general, I’m hard-pressed to think what such a player-specific consideration might be, but one example could be a fairly good infielder (+2.5 adjustment for second base or third base) who has no footspeed. This player would be expected to be 10 runs better in the corner outfield than at second or third, but if his lack of footspeed prevents him from being a good outfielder, that 10-run difference won’t hold.

None of this changes actual accrued player value, though -- it’s more of a thought experiment about how much a player’s value changes if he moves position. (My thought experiment conclusion: not much, unless you have some really good, player-specific reason to think it will.) So, with that in mind, actual accrued player defensive value is basically the sum of two things: (1) the player’s UZR (or DRS, or other run-weighted “defense above average” measure); plus (2) the player’s positional adjustment. Sum these up, and you get what Fangraphs calls “Def.”

Fangraphs’ Def uses UZR and the original positional adjustments. Since it’s an easy sum, you can calculate your own Def using DRS or some other defensive measure, or an alternative set of positional adjustments. But you do need to include both components; if you skip the second, you make first basemen equivalent to center fielders. Just remember to be consistent: it doesn’t make sense to use DRS for one player and UZR for another. Maybe just average them, or present a series of ranges, as necessary.

Def is handy because it’s expressed as a number that can be either positive or negative. Like WAR itself, it’s a one-stop shop for defensive quality shorthand (again, it’s descriptive, not a stat directly meant to be used for forecasting, though). If a guy has a Def of zero, he’s basically an average MLB defender, once you take position and actual fielding quality into account. A guy with a Def of, say, +5 could either be a mediocre-ish shortstop or a really good defender at an easier position. He may even be both at the same time! A guy with a Def of, say, -5 could either be an awful defender at a hard position (like a really bad shortstop), or a decent corner outfielder. Or, again, maybe both.

In any case, Def and fielding value in general are easy to futz around with as far as player value goes -- they’re already directly on a runs basis. So if you have an average hitter and baserunner who you’re going to pencil in as a corner outfielder, and you want to know how good of a defender he has to be to be an average MLB player, the answer is pretty simple. You start at 2 WAR, and you lose 7.5 runs for the positional adjustment. So, the player will need to be a +7.5 fielder in a corner outfield spot to make up those 7.5 runs and get back to 2 WAR. If you have a first baseman with an average bat, he starts at 2 WAR and loses 1.25 WAR for the positional adjustment, meaning he’s going to have to be really, really good defensively to clamber back up to average. If you’re wondering why the market for decently-hitting corner players with defensive limitations has largely dried up in recent years, this is a good hint. Defense matters, and defensive runs are just like offensive runs, for the most part.
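
If you want to play with these tradeoffs yourself, here’s a rough sketch of how the runs stack up. The 20 runs of replacement level for a full-time player is an assumed round number chosen so that an otherwise-average player lands at about 2 WAR; the exact fWAR constants differ slightly.

```python
# Stacking the runs components of position-player value. REPLACEMENT_RUNS
# is an assumed round number, not the exact fWAR constant.

RUNS_PER_WIN = 10.0
REPLACEMENT_RUNS = 20.0

def rough_war(batting, baserunning, fielding, pos_adj):
    return (batting + baserunning + fielding + pos_adj + REPLACEMENT_RUNS) / RUNS_PER_WIN

print(rough_war(0, 0, 7.5, -7.5))    # 2.0 -- corner OF needs a +7.5 glove
print(rough_war(0, 0, 0.0, -7.5))    # 1.25 -- same player with an average glove
print(rough_war(0, 0, 12.5, -12.5))  # 2.0 -- first baseman needs +12.5
```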

When you’re doing all this, though, just remember that positioning matters too, and what you’re reading in a defensive stat might be a team positioning skill as much as the player’s own instincts, range, and arm. It’s a complex world out there, but it’s too important to ignore. It just requires prudence and perseverance.

Catcher Defense

Catcher defense is entirely its own beast. For one, there’s no general UZR data for catchers. As far as what catcher defense should even consider, well, even that in and of itself is a very fraught question. So, what exactly does a catcher do?

A catcher controls the running game. This is included in catcher defensive metrics! The way to think about this is much like how to think of the arm metric for outfielders: it’s not just throwing guys out that matters, but preventing the attempt as well.

A catcher minimizes extra bases on balls that get past him. This is included in catcher defensive metrics! The measures for this probably aren’t that sophisticated, but the general idea is that there’s an average rate of wild pitches and passed balls per pitch (or inning, or game) caught, so catchers that allow fewer of these get some added value by not allowing as many free bases as average.

A catcher is responsible for some small degree of fielding plays. This one… may or may not be included, depending on the defensive metric. However, it’s relatively few plays. I’m not sure this would move the needle much even if it were rigorously accounted for. But perhaps it would.

A catcher is responsible for receiving throws and blocking the plate. This one probably isn’t included, except potentially as a part of the rGFP metric in DRS. This doesn’t come up much, and in most cases, the throwing is on the outfielder or relay man (even if there are a few cases where catchers are really responsible for getting or failing to get the tag down each season).

A catcher is responsible for framing pitches (at least until we get robot umpires and automated strike zones). This one, here’s where it gets dicey. For background on the issue, you may want to refer here: https://www.talkingchop.com/2018/11/23/18107273/mlb-framing-war-gap-catchers-pitchers-tyler-flowers-yasmani-grandal. The basic idea is that right now, all value is based on play outcomes. A hitter did or didn’t reach base; a fielder did or didn’t make an out. But pitch framing is kind of weird: it essentially requires us to move beyond the “outcome” idea for value, and start giving credit for changing counts even before an outcome is recorded. Further, league-wide WAR is already set at 1,000 per year, so if we start giving catchers more value for framing, we need to reduce someone else’s value. Would we just re-balance catcher defense? Would we move some pitcher value to catchers, since catchers are helping them? What about where framing leads to weak contact later in the count, giving a fielder an easier play? It’s a complex topic, and until there’s a great solution, pitch framing is going to be the one place where WAR has a gap: we know pitch framing is real and matters, but WAR doesn’t include it. It’s basically the one place where, if pressed for time, looking only at WAR in and of itself won’t tell you everything you need to know about a catcher’s value to his team in terms of production on the field.

So, what to do? Well, there are a few things. Two places currently publish good catcher framing data: one is Statcorner, the other is Baseball Prospectus. The latter has its own WAR-type stat, WARP, which already incorporates catcher framing. So, if you’re okay with using the other aspects of WARP for player evaluation, you can just look at WARP leaderboards to compare catchers, inclusive of framing skill, with other players. Otherwise, to compare two catchers to one another, or do something else that includes framing, you’ll need to add framing value manually to a catcher’s bWAR or fWAR. Luckily, both Statcorner and Baseball Prospectus provide their framing data on a runs basis, so you don’t need to convert anything to anything else. Just add it up. Very recently, Fangraphs also added a DRS-based framing metric to its player pages for catchers, called rSZ. You can use this too; it works just like the other DRS components.
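
The “just add it up” part, as a sketch:

```python
# Manually folding framing runs into a catcher's WAR, at ~10 runs per win.
# framing_runs would come from Statcorner, Baseball Prospectus, or rSZ.

RUNS_PER_WIN = 10.0

def war_with_framing(war, framing_runs):
    return war + framing_runs / RUNS_PER_WIN

print(war_with_framing(2.0, 15.0))  # a 2.0-WAR catcher with +15 framing runs -> 3.5
```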

In addition to pitch framing, a catcher may (or may not) have some responsibility in terms of game-calling. This is too nebulous to really quantify. If a catcher really calls a bad game, then why doesn’t his team just take the ability out of his hands, or give him a binder that helps him call a better game? It’s not really a physical skill, nor some kind of huge mental lift, so I’m not sure this really needs to be quantified, any more than a pitcher should be further penalized for throwing one too many fastballs in a sequence if he shook the catcher off. At times, there have been attempts to quantify catcher success by calculating “catcher ERA” or similar, which is really an attempt to attribute the pitcher’s success to his catcher. Unfortunately, catcher ERA variants have generally been something like a junk stat: while you can calculate them, they don’t really tell you anything other than which catcher happened to be catching when a pitcher did well or poorly. Catcher ERA-type stats bounce up and down wildly based on pitcher quality; if there’s some kind of actual quantitative effect on a pitcher’s stats that can be traced to any specific catcher (beyond his contributions to framing, fielding, controlling the running game, etc.), it’s not really self-evident right now. That doesn’t mean it isn’t present, but it does mean that adding more to catcher value based on “catcher ERA” seems fraught right now -- if only because, again, if you credit a catcher for how well his pitcher pitches, then don’t you need to take that credit away from the pitcher?

Errors, Total Zone, and Other Fielding Considerations

This primer has walked through the ways that defensive value is currently evaluated. What this section is about, however, is the other defensive stuff that hasn’t been mentioned until now, and why it’s not (as) awesome.

  • Errors and fielding percentage. This way of evaluating defense is probably the worst one available, and might be worse than just assuming all players are average at defense relative to their peers at a certain position. There are two big issues with metrics derived from errors. First, errors are (for some reason) charged only when a fielder gets to a ball and botches his motions in some way; a fielder is almost never charged with an error for failing to reach a ball at all. This means that the more balls a fielder gets to, the greater his potential for errors, yet penalizing players for getting to more balls makes no sense. Second, there’s no weighting in error counts or fielding percentage. A throwing error that lets a runner reach first base is way less bad than a botched cutoff that turns a single into three bases, or dropping a ball deep in the outfield. Some scoring calls, like passed balls, are also way too discretionary -- any passed ball could really just be scored a wild pitch or vice versa.
  • Total Zone. Total Zone is kind of a proto-defensive metric that uses play-by-play data to do the same things that UZR and DRS do. However, it’s based off of coarser play-by-play information (think the stuff you get on MLB Gameday, ESPN, or Yahoo! Sports if you can only follow along with the game rather than watching it), not detailed batted-ball tracking. The UZR/DRS era only started in the early 2000s, when baseball data companies started to actually collect, track, and score information on each ball in play. For run-weighted fielding data before this period, Total Zone is more or less the only game in town. On the one hand, this means that we can still assign some degree of defensive value to player-seasons before 2002. On the other hand, Total Zone is even more erratic than UZR and DRS. For example, Lonnie Smith was generally considered (and rated by Total Zone) as a pretty poor defensive outfielder, except for a random +23 Total Zone mark in 1989, which he rode to an 8.1 fWAR season (wow!). He never again hit double-digit Def values (positional adjustment plus Total Zone), but had two double-digit negative Def values in his career. For player-seasons before 2002, you don’t really have much of a choice other than to use Total Zone, which is already baked into Def and fWAR. Just be mindful that what you’re seeing is a little cruder than defensive value for player-seasons after that.
  • Gold Gloves. Gold Gloves are awarded by voting, so they fail more or less every meaningful test of a useful metric. While the awards process does purport to include an analytic component, this component is both a tiny chunk of deciding who gets the award, and done in a pretty strange manner that dilutes its usefulness, by lumping together multiple defensive metrics of various levels of quality, as opposed to just using the best ones.

What about the minor leagues? Fangraphs presents wRC+ and other stats for the minors, so what about defensive metrics? Unfortunately, there’s no UZR, DRS, OAA, or similar stuff for minor league players (yet?). This is basically a gap in our collective knowledge base. One possible solution: data provided by Clay Davenport (for example, here: http://claydavenport.com/stats/webpages/2018/2018pageBALrealALL.shtml). These data are also provided for major leaguers, and they don’t always align with UZR or DRS that well. (For example, in 2018, Johan Camargo was one of the better-rated third basemen in baseball, with the second-highest DRS, the fifth-highest UZR, and the second-highest UZR on a rate basis. But the Clay Davenport data have him as a -7 third baseman!) That definitely doesn’t mean they should be ignored (and what else are you going to use for minor league defensive data, anyway?), but it’s just another thing to be aware of.

tl;dr takeaway for fielding - fielding is very complicated but very important, and there’s a lot of room for improvement. UZR and DRS are the best metrics we have right now to value fielding, but they don’t help us forecast defense as well as we’d potentially like. For catchers, don’t forget framing! Also, don’t forget that because the positional adjustment exists, a player won’t magically gain more value just by moving positions.