Tuesday, May 26, 2009

Win Probability Added

Win Probability Added (WPA) is a statistic that attempts to measure not just what a player did, but also the context in which he did it. Based on data from games throughout baseball history, the chance of the player's team winning is calculated before and after his at bat, and the difference is the player's win probability added for that at bat. Of course, the statistic has its problems: the win probability estimates are based on previous games and may not be entirely accurate because of changing environments (more offense nowadays, etc.), and not all situations have large enough sample sizes. In addition, the statistic only measures a player's batting contributions to the team, and it can penalize a player for another player's mistake. For instance, if a baserunner gets thrown out trying to advance an extra base on a hit, that counts against the hitter (stolen bases and caught stealing count for the baserunner, though). Despite these flaws, it is a relatively reliable statistic: a quick look at the WPA leaderboard shows many of the league leaders in other offensive categories, with Raul Ibanez currently topping the list at 3.22.
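To make the before-and-after idea concrete, here is a minimal Python sketch. The function names and the win probabilities in the example game are made up for illustration; real WPA relies on a historical win expectancy table that is not reproduced here.

```python
def play_wpa(wp_before, wp_after):
    """WPA credited to the batter for a single plate appearance."""
    for p in (wp_before, wp_after):
        if not 0.0 <= p <= 1.0:
            raise ValueError("win probabilities must be between 0 and 1")
    return wp_after - wp_before


def game_wpa(plate_appearances):
    """Sum per-play WPA over a batter's plate appearances in one game."""
    return sum(play_wpa(before, after) for before, after in plate_appearances)


# Hypothetical game: (win probability before, win probability after)
# for each of a batter's three plate appearances.
example_game = [(0.50, 0.47), (0.45, 0.52), (0.40, 0.65)]
print(round(game_wpa(example_game), 3))
```

An out that barely moves the team's chances costs the hitter very little, while a hit in a tight late-inning spot can be worth a large fraction of a win, which is exactly the context sensitivity described above.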

A quick regression of WPA on batting runs above average (BRAA) also gives an R-squared value of .744, which means that batting runs explains 74.4% of the variation in WPA. Since batting runs is not affected by context, much of the variability in WPA is explained simply by the difference in offensive production between players, but an important part of the variation is left to the context in which that production occurs. WPA is more of a descriptive statistic than other advanced statistics, since nearly every study on the subject has found that clutch hitting is not a repeatable skill. Therefore, it is useful in telling what a player has done, but not as much for predicting what a player may do in the future.
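For anyone who wants to run a similar check, here is a rough sketch of how R-squared falls out of a one-variable regression. It is not the exact calculation behind the .744 figure, and the BRAA and WPA values below are invented placeholders, not real player data.

```python
def r_squared(x, y):
    """R-squared for a simple least-squares regression of y on x.

    With one predictor and an intercept, R-squared equals the squared
    Pearson correlation between x and y.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("both samples must vary")
    return sxy ** 2 / (sxx * syy)


# Invented placeholder values, one pair per hypothetical player.
braa = [-5.0, 0.0, 4.0, 10.0, 15.0]
wpa = [-0.6, 0.1, 0.2, 1.1, 1.3]
print(round(r_squared(braa, wpa), 3))
```

Whatever share of the variation BRAA does not explain (25.6% in the regression above) is what is left over for context and noise.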

With all that out of the way, here's a look at how the Twins' hitters have fared thus far this season:

The three lefties unsurprisingly lead the way for the Twins, with Justin Morneau, Joe Mauer, and Jason Kubel all accounting for at least 1.4 wins thus far this season. Of course WPA is a counting statistic, so it would be more accurate to divide it by plate appearances, as the following list shows:

The WPA/PA statistic is multiplied by 100 in order to present more easily understood numbers, so it's really WPA per 100 plate appearances. With this new calculation, Joe Mauer is head and shoulders above everyone else. His 1.41 easily outdistances the second-place Jason Kubel, who sits at 0.97 WPA/100PA. In fact, despite missing the first month of the season, Mauer ranks in the top 30 overall in WPA and already leads all catchers in that category. Once you start to adjust for the fact that Mauer is a Gold Glove catcher, you really begin to see just how out of this world his performance has been so far this year. Most of the other players don't see much of a change in their WPA or rank on the team, though it is surprising to see Brian Buscher in the same realm as Morneau, considering his .666 OPS and .309 wOBA pale in comparison to Morneau's 1.082 OPS and .446 wOBA. Buscher doesn't even have one monster game that boosts his overall WPA total, as his largest total for one game is .208, while Morneau has three games in which he added over .3 wins.
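The rate conversion itself is simple, and a short Python sketch shows it. The player labels, WPA totals, and plate appearance counts here are hypothetical and are not the Twins' actual numbers.

```python
def wpa_per_100_pa(wpa, plate_appearances):
    """Convert a season WPA total into WPA per 100 plate appearances."""
    if plate_appearances <= 0:
        raise ValueError("plate_appearances must be positive")
    return 100.0 * wpa / plate_appearances


# Hypothetical players: (label, season WPA, plate appearances).
players = [
    ("Player A", 1.2, 120),
    ("Player B", 1.5, 200),
    ("Player C", -0.3, 150),
]

# Rank by the rate stat rather than the raw counting total.
ranked = sorted(players, key=lambda p: wpa_per_100_pa(p[1], p[2]), reverse=True)
for label, wpa, pa in ranked:
    print(label, round(wpa_per_100_pa(wpa, pa), 2))
```

In this made-up example, Player B has the larger raw total but Player A ranks first once playing time is taken into account, which is the same effect that moves a hitter who missed time up the list.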

In total, there have been six Twins who have had at least one game with a WPA of +0.3: Morneau (3 times), Kubel (twice), Matt Tolbert, Joe Crede, Delmon Young, and Alexi Casilla. Despite his phenomenal overall WPA, Mauer's highest single-game WPA this season is only .263, in Saturday's game against the Brewers. Surprisingly, Casilla and Young have produced 2 of the 4 single-game WPAs over 0.4, with Kubel accounting for the other 2. The highest single-game WPA thus far was Kubel's on April 21st, when he accounted for nearly a full win (.921) by hitting for the cycle against the Angels, including a go-ahead grand slam in the 8th that increased the Twins' chances of winning by 70 percentage points. However, the greatest single play this season as measured by WPA was Alexi Casilla's walk-off single against the Mariners on April 7th, which raised the Twins' chances of winning from 27% to (obviously) 100%, and credited the previously 0-4 Casilla with adding over half a win in that one game.


  1. After trying to get some clarification from you, I think that WPA is an interesting statistic, but I can't get over the fact that the statistic is only supposed to measure the batting performance, yet the hitter is the one who is being penalized, or credited, for another player's performance in a completely different aspect of the game. I'm sure it's a reliable statistic but I would just feel more comfortable using figures from statistics that did not depend so much on context. But I'm so sabermetrician and I'm still a little confused so I probably don't see the full virtues of WPA.

  2. I'm assuming you meant you're "no" sabermetrician.

    But I agree; I don't like that it's so affected by context. A great example is Alexi Casilla's game that is mentioned in the post. By all accounts, that wasn't a very good game (1-5 with the hit being a single) and certainly wasn't the best individual performance of the year. Yet, it's graded as that by WPA.

  3. To Cortne: Yes, that is a flaw of WPA, but the vast majority of plays do not involve these in-between baserunning decisions. In other words, on most balls in play, all runners would do the same thing.

    To Twin #1: That's why I said that it is more descriptive than predictive. Those first four outs were relatively unimportant because they came earlier in the game and without runners on base, so they didn't decrease the Twins' chances of winning that much, especially since most at bats end with the batter getting out. Also, you misread my post a little bit: Casilla's walk-off was the play with the highest WPA this season, but his WPA for that game was second behind Kubel's cycle.

  4. @ Twin #1: way to put me on blast

    @ Twin #2: I can't really seem to reconcile my lack of understanding with the good aspects of the statistic. Plus I'm probably oversimplifying this a lot.

  5. OK, I did misread that. But still, it's the second-best game, which I don't like. I know we've talked about it, and you agree that it's not really a very good stat, but it is interesting to look at.

    And Cortne- what does that mean??

