The amazing thing about working on a project is that sometimes you find a road you hadn't expected, out of curiosity you take it, and you wind up finding something much more interesting than what you were originally working on. Such was the case a few weeks ago when, while researching another topic, I stumbled across S.A.F.E.
S.A.F.E. stands for Spatial Aggregate Fielding Evaluation and is a method for evaluating range and defense based on probability. It was developed right here in our backyard by a University of Pennsylvania statistics professor, Shane Jensen.
Intrigued, I approached Dr. Jensen about an interview, and he was gracious enough to accept.
-----------------
Tell me a little bit about yourself, your professional background, your interest in baseball, your favorite team, and what prompted you to work on SAFE.
Well, I've always been a huge sports fan, but I grew up in Western Canada, so I didn't follow baseball nearly as much as hockey and other sports. My first big baseball moment was the Joe Carter walk-off home run (sorry!) while I was in college, but it wasn't until I moved down to Boston that I began to follow baseball actively. By actively, I really mean obsessively...I fell in love with the Red Sox and the game as a whole. The reason I had moved down to Boston was to do a Ph.D. in statistics at Harvard University, and it was thrilling to immerse myself in a sport that is so quantitative in nature. After I was hired here at the University of Pennsylvania, I started an informal sports group within the department of statistics. We were then contacted by ESPN and they gave us a grant to study the use of sophisticated statistical models for measuring performance in baseball. SAFE was one of the methods that grew out of our research.
With the advent of technologies to analyze play-by-play data, particularly batted ball type and location, we've seen a sharp increase in the quality and sophistication of defensive statistics in baseball. Recently, you introduced the S.A.F.E. system. Can you describe it in a nutshell?
Sure. The central idea of SAFE is to use smooth curves to estimate the probability of making a fielding out given the type, location and speed of a ball in play. We estimate a different smooth curve for every player as well as an average curve across all players. We are using high-resolution ball-in-play data from Baseball Info Solutions that allows us to estimate these curves precisely enough to quantify differences between players in terms of their fielding success rate at each coordinate in the field. We then total these individual differences across all locations and ball-in-play types to give an overall measure of each player's fielding performance relative to the average player. In that total, we weight each batted ball by its frequency and run consequence in order to express SAFE in terms of expected runs saved or cost relative to the average player.
Let's say I'm not a mathematician [actually, you can really say that, I'm not a mathematician], what does that mean?
It's not as complicated as we made it sound! The smooth curves we calculate for each player tell us, at each point in the field, the probability that the player makes a successful fielding play. However, we want to compare players to a common baseline, and so we also calculate the probability that the average player makes a successful play for each point in the field. We then take the difference between these two probabilities, which gives us the amount of gain or cost of that particular player at each point in the field. As an example, say that for a particular point in the field, Manny Ramirez has a 65% chance of making a catch, whereas the average left fielder has a 75% chance of making a catch at that same point in the field. This means that Manny is costing the Red Sox a 10% chance of a successful play for balls hit to that point in the field, which means that for every 10 balls hit to that point in the field, Manny is expected to miss one more catch than average.
What is the consequence of this expected missed catch? Well, we calculate the run consequence of a missed catch at that point in the field by looking at the relative frequency of singles, doubles and triples that result from balls hit to that point in the field, and use these frequencies to calculate the expected number of runs that would score as a consequence of an extra missed catch in that spot. We do this same calculation for all spots on the field, and then we can total up the runs cost/saved for each player across all points in the field. However, when we calculate this total, we don't want to treat every point in the field equally, since there are far more balls in play hit to certain spots in the field relative to other spots. Therefore, when we do the total, we weight each point in the field based on the relative frequency of balls hit to that point in an entire season. This final total gives us the expected runs saved or cost by a fielder over an entire season.
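As an editorial aside, the calculation Dr. Jensen describes can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the SAFE code itself: the names `FieldPoint`, `run_consequence` and `runs_saved`, the per-hit run values, and every number in the example grid are made up for the sketch. It also assumes the frequency weight is simply the number of balls hit to a point over a season.

```python
from dataclasses import dataclass

# Hypothetical run values per hit type; illustrative only, not SAFE's values.
RUN_VALUE = {"single": 0.5, "double": 0.8, "triple": 1.1}


def run_consequence(hit_mix):
    """Expected runs from one extra missed play at a point, given the
    share of singles, doubles and triples that result there."""
    return sum(share * RUN_VALUE[hit] for hit, share in hit_mix.items())


@dataclass
class FieldPoint:
    p_player: float          # chance the player makes the play at this point
    p_average: float         # chance the average fielder makes the play
    hit_mix: dict            # share of each hit type when the play is missed
    balls_per_season: float  # assumed weight: balls hit here in a season


def runs_saved(points):
    """Frequency-weighted total of (player - average) * run consequence."""
    return sum(
        (pt.p_player - pt.p_average)
        * run_consequence(pt.hit_mix)
        * pt.balls_per_season
        for pt in points
    )


# Made-up three-point "field". The first point mirrors the 65% vs. 75%
# example: over 10 balls, that is one extra missed catch.
points = [
    FieldPoint(0.65, 0.75, {"single": 0.5, "double": 0.5}, 10),
    FieldPoint(0.95, 0.95, {"single": 1.0}, 40),
    FieldPoint(0.30, 0.20, {"double": 0.5, "triple": 0.5}, 5),
]

total = runs_saved(points)
print(f"Expected runs saved vs. average: {total:.3f}")
```

With these invented inputs the fielder comes out slightly below average: the one extra miss at the busy first point costs more than the extra plays at the rarely hit third point gain back. A point where the player matches the average contributes nothing, no matter how many balls are hit there.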
Shouldn't a player get extra credit for getting to a ball that most players wouldn't get to?
Yes, and actually I think that SAFE implicitly has this property, since we are always comparing individual fielders to the average fielder at their position. If a particular fielder has a 50% chance of making a successful play at a location where the average fielder only has a 10% chance of success, then that fielder will be rewarded substantially in our SAFE calculations. In contrast, if a particular fielder has a 97% chance of making a successful play at a location where the average fielder only has a 95% chance of success, then that fielder will still be rewarded, but clearly not as much because the difference from average isn't as dramatic. The ability to consistently make successful plays that the average fielder doesn't is more highly rewarded by our system.
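To put numbers on that comparison, here is a small editorial sketch in Python. The function name `extra_plays` and the choice of 100 chances are assumptions made for illustration, and the sketch leaves out the run-value and frequency weighting that SAFE applies on top.

```python
def extra_plays(p_player, p_average, chances=100):
    """Plays made beyond what the average fielder would make,
    over a given number of chances at one location."""
    return (p_player - p_average) * chances


# The two cases from the answer above, per 100 balls hit to the same spot.
hard_spot = extra_plays(0.50, 0.10)  # average fielder rarely gets there
easy_spot = extra_plays(0.97, 0.95)  # almost everyone makes this play

print(f"Hard spot: {hard_spot:.0f} extra plays per 100 chances")
print(f"Easy spot: {easy_spot:.0f} extra plays per 100 chances")
```

Both fielders are credited, but the one who reaches balls the average fielder misses earns roughly twenty times the credit per chance.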
What went into your decision to use batted ball velocity over batted ball type (i.e., liner, fliner, fly ball, grounder, etc.)?
Actually, we use both batted ball type and batted ball velocity in our SAFE model. When we estimate our smooth curves for each player, we estimate a different curve for liners, fly balls, and grounders. In fact, the methodology for grounders is quite different from fly balls and liners. For fly balls, liners, and pop-ups, we need to estimate curves in two dimensions, since players can move in any direction to make a play. For grounders, we only need to estimate a curve in one dimension, since infielders only need to move to the left or right to make a play on grounders (except for very slow grounders, such as bunts). There is more detail about the differences between our treatment of grounders and fly balls on the SAFE website.
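For readers who think in code, the one-dimension versus two-dimension distinction can be sketched as follows. This is an editorial illustration only: the interview does not say what functional form the SAFE curves take, so the logistic shape, the function names, and the `intercept` and `slope` parameters are all assumptions, and batted ball velocity is left out for brevity.

```python
import math


def logistic(x):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def grounder_out_prob(angle, fielder_angle, intercept, slope):
    """One-dimensional curve (assumed logistic): the chance of an out
    depends only on how far left or right of the fielder the ball is."""
    offset = abs(angle - fielder_angle)
    return logistic(intercept - slope * offset)


def fly_ball_out_prob(x, y, fielder_x, fielder_y, intercept, slope):
    """Two-dimensional curve (assumed logistic): the fielder can move in
    any direction, so the chance of an out depends on both X and Y."""
    distance = math.hypot(x - fielder_x, y - fielder_y)
    return logistic(intercept - slope * distance)


# Made-up parameters: a grounder 2 vs. 10 degrees away from the fielder,
# and a fly ball landing 3 units over and 4 units back from the fielder.
near = grounder_out_prob(32, 30, intercept=3.0, slope=0.2)
far = grounder_out_prob(40, 30, intercept=3.0, slope=0.2)
fly = fly_ball_out_prob(3, 4, 0, 0, intercept=3.0, slope=0.1)

print(f"Grounder near: {near:.2f}, grounder far: {far:.2f}, fly ball: {fly:.2f}")
```

In this simplified version both curves are symmetric around the fielder. A fitted curve would not have to be.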
Have you examined 2006 data? If not, do you plan to do so?
We are very interested in seeing the 2006 data! Unfortunately, our 2002-2005 data was made available through our previous collaboration with ESPN, and our contract with them has expired. I'm hoping that we can get access to the 2006 data in the near future, since having more years of data would better enable us to discover trends over time in individual players.
From the 2002-2005 data, what player performances surprised you? Anyone better than you thought? Anyone come out worse than you thought?
I was pretty surprised that Trot Nixon comes out as the top right fielder, and by a substantial margin. Having watched him many times at Fenway, my own perception was that he was an above average fielder who hustled on every play, but certainly did not seem to be the top player at that position. I think that one mediating factor in his performance is the strange geography of right field at Fenway Park, which is not directly taken into account by our SAFE model. Thus far, we have only taken into account park geography in a couple of extreme cases, such as the Green Monster, but we are planning to incorporate all differences in outfield geography into our models in the near future.
In terms of players that do surprisingly poorly in our measure, I would have to single out Derek Jeter and Bobby Abreu, both of whom have won Gold Gloves but are close to the worst fielders at their positions according to SAFE. Even with my inherent Red Sox bias, I have always thought of Derek Jeter as a decent (though certainly overhyped) shortstop. However, according to SAFE, he comes out as the second worst fielding shortstop across the 2002-2005 seasons despite having won two Gold Gloves in that time span. Bobby Abreu also performs quite poorly despite having been awarded a Gold Glove in 2005. However, many discussions with Phillies fans over the last couple of years suggest that perhaps this SAFE result isn't that surprising.
What's next for SAFE, where do you plan to take it from here?
Well, I would like to get a hold of the data for the 2006 season and add it into our analysis. There are also several modifications I would like to see implemented in our system. As I mentioned above, it would be great to incorporate differences in outfield geography into the model. Even though it probably wouldn't make much difference for most players, in some extreme cases (like Fenway Park) differences in outfield shape could have substantial effects on positioning and success of fielders. I would also like to examine the influence of grass vs. astroturf surfaces, especially on infielders. I was surprised to see that Orlando Cabrera was a slightly below average shortstop according to SAFE, but how much influence did those years of playing in Olympic Stadium in Montreal have on that result?
Have you generated individual graphs for each player? If so, may we see an example?
I can show a couple of plots that illustrate the different methodology we used for fly balls vs. grounders. In the left-hand side of the plot below, we see a two dimensional plot of the probability of making a catch for the average centerfielder. The plot is rotated so that we can show both the X (moving left or right) and Y (moving forward and back) dimensions. The curve is roughly symmetric, though if you stare at it long enough, you can see that the curve is slightly different for fielders moving back (increasing values of Y) versus moving forward (decreasing values of Y). The right-hand side of the plot is the curve for an individual center-fielder (Darin Erstad) and you can see certain differences when you compare his curve to the average curve. For example, Erstad seems to be more successful than average on balls near to where he is standing (the curve is flatter at the top).
For grounders hit to infielders, we have a simpler picture, because we only measure distance to the left or right of the shortstop. In the plot below, we compare three shortstop curves from the 2005 season. The black curve is the average shortstop, the blue curve is the worst shortstop (Michael Young) and the green curve is the best shortstop (Adam Everett). The X axis is the angle of the ball in play from the third base line, and the red line is where the shortstop is standing. The first thing we notice is that the curve is different for moving to the left or the right, which makes sense, since all shortstops have their gloves on the left hand, and so can get to more grounders moving to the left. We also see that all three of these curves are very similar at the top, which implies that there is little difference between the best and worst shortstops when the ball is hit right at them. The big differences come on balls hit at a farther distance away from them, in which case Adam Everett gets to a substantially higher percentage of ground balls than the average shortstop, whereas Michael Young does substantially worse.
-----------------
Clearly, the SAFE system is quite similar to David Pinto's Probabilistic Model of Range (PMR), published at Baseball Musings, though it appears there are some differences in the way data is recorded and compared.
What's interesting about SAFE is that it takes the probabilities of a successful play and goes a step further by estimating the runs saved or cost by each play. For instance, from 2002-2005, SAFE estimates that Aaron Rowand, the top-ranked center fielder over that time, saved the White Sox 20.56 runs over an average player. Over the same span, Chase Utley saved the Phillies 5.81 runs over the average second baseman, despite limited duty at second base in 2003, 2004, and the first part of 2005. On the other hand, Derek Jeter, the whipping boy of objective defensive measures, cost the Yankees a shade over 9 runs.
Player performances can then be totaled at the team level. For the 2002-2005 data set, the White Sox averaged 15 runs per year above a team of average fielders, while the Yankees came in over 35 runs a season below the average.
-----------------
In the next few days or weeks, I hope to follow up with Dr. Jensen. In particular, I am hoping he can answer some questions about Aaron Rowand's defensive positioning. If you leave a comment below, I'll be sure to pass that along as well.





