Finally, my last post on the four statistics that I studied on the Twins’ 2009 season to see how good they were at moving runners around the bases (which I call advanced bases percentage). I’ve mentioned several times how my friend Steven didn’t completely agree with my methods, but this one is about as close as I think I’ll ever get to satisfying him without changing my data. I said before that this one is similar to slugging percentage, and now I’ll explain why.

I’m sure you already know this, but I’m going to reiterate it anyway. Slugging percentage is a player’s total bases divided by his at-bats, so a single is one base, double is two, triple is three, and home run is four. So yes, a hitter with a bunch of singles could match a hitter that only gets an occasional home run, but the batting averages would be much different. This last statistic is very similar. I decided to measure the number of bases that a runner advanced during each hitter’s plate appearance. Steven’s biggest complaint was that I was crediting hitters if they advanced a runner even if they made an out, and this stat partially addresses that problem. A hitter could have a respectable average if he made a “productive” out all the time, but the hitters that could consistently advance a runner more than one base (almost always possible only by getting a hit) would rate better. Here’s what I found:

What does this mean? In the case of Joe Mauer (who is once again the leader), whenever he hit with a runner on base, he moved that runner an average of .676 total bases. If there were two runners on base, then each runner advanced an average of .676 bases as well.* I was a bit surprised that Orlando Cabrera rated so high, but I’ll address that in a second. I was also interested that Jose Morales had such a poor average, but I wasn’t surprised considering how low his slugging percentage (.361) and isolated power (.050) were last year, despite his high batting average (.311).

* *Stupid error I made last night. If there’s a runner on 1st base and the batter hits a single that advances the runner to 3rd, the hitter gets credit for 2 bases advanced. That makes a bases advanced percentage of 2.000 (yes, that’s higher than 100%, but just bear with me here). Now if there are runners on 1st and 2nd and the hitter walks, then he also gets credit for 2 bases advanced. However, this time the bases advanced percentage is only 2/2 = 1.000, because he averaged one advanced base per baserunner. This should clear up the error I made earlier.*
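The two cases in that footnote can be written out as a small Python sketch. The `PlateAppearance` record and the `bases_advanced_pct` function are hypothetical names made up for illustration, and pooling several plate appearances as total bases advanced divided by total runners on base is an assumption about how the per-hitter averages were built, not something taken from the spreadsheet itself.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class PlateAppearance:
    """Hypothetical record of one plate appearance with runners on base."""
    runners_on: int      # baserunners on base when the hitter came up
    bases_advanced: int  # total bases those runners advanced during the PA


def bases_advanced_pct(pas: Iterable[PlateAppearance]) -> Optional[float]:
    """Average bases advanced per baserunner (assumed: pooled totals)."""
    pas = list(pas)
    runners = sum(pa.runners_on for pa in pas)
    if runners == 0:
        return None  # no runners on base, so the stat is undefined
    return sum(pa.bases_advanced for pa in pas) / runners


# Runner on 1st, single moves him to 3rd: 2 bases for 1 runner.
single_first_to_third = PlateAppearance(runners_on=1, bases_advanced=2)
# Runners on 1st and 2nd, walk moves each up one: 2 bases for 2 runners.
walk_first_and_second = PlateAppearance(runners_on=2, bases_advanced=2)

print(bases_advanced_pct([single_first_to_third]))  # 2.0
print(bases_advanced_pct([walk_first_and_second]))  # 1.0
```

Both hitters get credit for 2 bases advanced, but the walk is divided across two baserunners, which is why it rates 1.000 instead of 2.000.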

Now, what existing statistics best predicted a hitter’s advanced base percentage?

Not surprisingly, three statistics that dealt with runners on base during a hitter’s plate appearance best correlated with the number of bases those runners advanced. A hitter’s OPS with runners on base best predicted this, and that .894661 was also the highest correlation I got for any two statistics. This is also partially why Orlando Cabrera had such a high rating, because his AVG/OBP/SLG line with runners on base of .314/.338/.415 was better than his overall line of .284/.316/.389…

…although when there was a runner on 1st base, it doesn’t hurt that O-Cab averaged nearly a base advanced per runner (41/46, or .891 bases per runner on 1st base), and a runner on 1st was also his most common situation with a runner on base. This also explains why I ranked Denard Span over Jason Kubel: Span’s average was .00034 higher.

I have a week-long break from school, so I’m considering doing this whole thing over again, but this time I’ll take Steven’s suggestion by no longer rewarding a hitter if he advances a baserunner by making an out. However, I might make exceptions if that out was due to a sacrifice fly. I can’t give a timeline of when you can expect that to be finished, partially because the Twins are finally coming alive during this offseason and because there’s something else that I find to be really cool that I want to take a look at (thanks to some research being done at The Hardball Times).

January 31, 2010 at 3:15 pm

Very interesting series here. I was wondering what, if any, application you see for these statistics over more conventional and SABR-slanted stats? How do you see this being used in evaluating players?

January 31, 2010 at 10:42 pm

Well I had typed about half of my response, bumped the touch pad, and accidentally hit the back button which deleted my half-response :-P So here goes my attempt to repeat what I had already typed (with additions!).

Obviously, I was looking at how good hitters were at advancing and scoring baserunners. Unfortunately, I only looked at the Twins, so the sample size was far too small. For the 2010 season, I’m planning on keeping a daily record of every team, so I can better understand what constitutes the difference between “good” and “bad.” I’ll also include the adaptations that I’ve mentioned (probably far too many times) that Steven recommended, and I’ll show the difference between his method and mine to see how hitters fared when making an out. Like I said at the end of this post, I’m debating whether I should include sac flies as being positive or negative.

1. (Runners scored w/ RISP % and total runners scored %): My thought was to expand on the belief that RBI is an overrated statistic, since some hitters get more opportunities than others to drive in runners, even if they have the same number of RBI. With r-values of .772 and .827, there was some correlation (very understandable) but they weren’t great. I think I should look at each hitter’s total number of plate appearances with runners on base (or total number of runners on base) to see who was most opportunistic in 2009, since looking at correlation statistics isn’t completely helpful to me.

2. (Runners advanced % and bases advanced %): This wasn’t necessarily to challenge anything that existed, because I didn’t think there was anything that looked at this. I wanted to see if hitters like Nick Punto could theoretically help their team by making “productive” outs even if their batting averages, OBP, SLG, etc. weren’t very good. By Steven’s argument, this was a flawed thing to study because making outs does not help your team in any situation,* so I believe that when I redo all of this by saying “productive” outs actually hurt your team, the numbers will better correlate with a hitter’s OBP, SLG, OPS, etc. So in summary, I was hoping to make a breakthrough when I designed this stat, since I didn’t think anyone anywhere had ever looked at stuff like this.

Personally, I think these two are the most interesting things to look at. If a hitter could consistently replicate his numbers over several years, then I might be able to argue that these statistics could aid in optimal lineup construction for a team. There’s data suggesting that only the hitters in the lineup matter, not where they hit, though.

* My problem was that I looked at a scoring expectation matrix that only accounted for the probability of scoring a single run, whereas Steven looked at one that predicted the total number of runs a team would score. My matrix suggested that there were ways to make productive outs, but I forgot that my source was only concerned with situations where a team needed one run. Steven’s matrix suggested that there is never a time when an out is productive. TwinsFoghorn argued that sac flies can be considered productive since they score a run, but I don’t think Steven has offered a retort to this yet.

I hope this was helpful.

February 1, 2010 at 9:09 am

Thank you for taking the time to type out such a lengthy response (twice). I guess my main sentiment is that these more or less seem to reflect a player’s WPA, rather than providing a basis for potentially challenging a player’s current valuation. Can you give me an example where a player’s WPA would not correlate in general to all of these statistics, or where the bases advanced % won’t correlate to SLG?

I have a very weak background in statistics, so it is quite possible I’m being foolish and am completely off base. Regardless of whether these really allow one to challenge old assumptions, they do help provide SOME justification for Punto, which is kind of amazing in and of itself :P

February 1, 2010 at 3:00 pm

No problem. When checking for correlations of data, I did include WPA. This was actually done prior to Steven suggesting that my data was flawed because of how I was rewarding hitters. WPA does what Steven suggests, but to a greater degree, where specific scenarios and results during the game have varying values. My stuff treated two similar results but in different scenarios (1st inning RBI single vs. 8th inning RBI single) as being equal, when in actuality they are not. WPA accounts for this. Since my statistics didn’t account for that, unsurprisingly there wasn’t a great correlation between WPA and my statistics. For a refresher, the r-values for WPA were…

Runners scored w/ RISP %: .486

Total runners scored %: .572

Advanced runners %: .485

Advanced bases %: .656

Preferably, these values would be between .9 and 1, as that would signal a very strong correlation between the two. But since these are between .48 and .66, it means that WPA has some positive correlation (as WPA goes up, each of these statistics usually goes up as well) but it’s not guaranteed.

Steven’s suggestion for me follows WPA more closely, so I’m fully expecting their correlations to be much stronger when I finally get that data put together.
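For anyone who wants to check r-values like the ones quoted above, here is a minimal Python sketch, assuming they are ordinary Pearson correlation coefficients (the usual meaning of "r", though the post doesn't say which formula was used). The function name `pearson_r` and the toy numbers at the bottom are made up for illustration and are not Twins data.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a sample is constant")
    return cov / math.sqrt(var_x * var_y)


# Toy, made-up inputs: ys is exactly 2 * xs, so r should be 1.
toy_x = [1.0, 2.0, 3.0, 4.0]
toy_y = [2.0, 4.0, 6.0, 8.0]
print(pearson_r(toy_x, toy_y))  # 1.0
```

In practice each list would hold one value per hitter, for example WPA in one and advanced bases % in the other. A result near 1 means the two move together almost in lockstep, while values like .48 to .66 point to a looser positive relationship.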

March 3, 2010 at 7:18 pm

[...] The results for Twins hitters and their success rates of moving runners up at least one base. 4. Moving Them Around: The results for Twins hitters and the average number of bases they advanced a [...]