
Tuesday, August 30, 2011

Expected Points Added (EPA) Leaders Through Week 3

Below are the Expected Points Added (EPA) leaders for the EPL through week 3. The week 1 leaders can be found in an earlier post here. To reiterate, EPA weights goals by how much they matter to the team's chance of winning the game. The idea is that a go-ahead goal in the 90th minute is worth more than the 5th goal in a 5-0 win.


Some interesting things to point out...

  • While Rooney has 5 goals this season, Welbeck's 2 goals have actually been more beneficial to United. In fact, Rooney doesn't even make the top 15 list above, because most of his goals came in the recent blowout of Arsenal.
  • Dzeko gets to the top of the list by scoring frequently and in important situations. His average goal weight is a solid .51 expected points added, but it is the fact that he has scored 6 goals that puts him at the top.
  • It's still early in the season. Arteta is third on the list with only 1 goal (a late game-winning goal). Soon we'll start to see the top dominated by players who have scored a lot, and in important situations.

Sunday, August 21, 2011

Expected Points Added (EPA) Data Through EPL Week 1

Before the season I promised to post Expected Points Added (EPA) totals after each week of the season. Here are the EPA totals from week 1. If you don't know what EPA is, check out a full explanation here.

To summarize it very basically, EPA is the total measure of how much each player's goals add to his team's expected points total. That is why you see some EPAs of 0 below: these players scored goals that added nothing to their team's expected points total. For example, if a team is up 3-0 and a player scores a 4th in the 90th minute, the goal technically does not add to the team's chance of winning, because the team was already almost certain to win.

Average Goal Weight (AGW) is just EPA divided by the number of goals a player has scored. This measures how important, on average, a player's goals are. It can show us that a player consistently scores clutch goals (high AGW) or that they are scoring useless goals in blowouts (low AGW).
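As a minimal sketch of the two definitions above, the snippet below sums per-goal weights into EPA and divides by the number of goals to get AGW. The function names and the goal weights are made up for illustration; they are not taken from the actual week 1 data.

```python
def epa_total(goal_weights):
    """EPA: the sum of the expected points each goal added."""
    return sum(goal_weights)


def average_goal_weight(goal_weights):
    """AGW: EPA divided by the number of goals (None if no goals)."""
    if not goal_weights:
        return None
    return epa_total(goal_weights) / len(goal_weights)


# Hypothetical players (illustrative weights only, not real data).
clutch_scorer = [1.0, 0.5]          # two goals scored in close games
blowout_scorer = [0.0, 0.0, 0.0]    # three goals after the result was settled

print(epa_total(clutch_scorer), average_goal_weight(clutch_scorer))
print(epa_total(blowout_scorer), average_goal_weight(blowout_scorer))
```

In this made-up case the player with fewer goals ends up with both the higher EPA and the higher AGW, which is the kind of gap the two numbers are meant to expose.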


Dzeko has the highest EPA, thanks to his go-ahead goal in the 57th minute. This equated to a little more than a point for City. Klasnic, Muamba, and Silva all scored goals that added no expected points for their teams.

If you have any questions, feel free to ask in the comment section. I'll be super busy this week between moving into my apartment at school and 3-a-days for preseason, but I'll try to keep some posts coming.

Wednesday, August 3, 2011

Refining The Win Probability Statistic


Last year I was planning to go to the Sloan Sports Conference but ended up not being able to make it. I was thinking about it again this year, and I decided it wouldn't be a bad idea to submit something for this year’s conference. At first I wasn’t going to, but why the hell not? Might as well go for it, I guess.

My win probability added statistic has generated some interest, and I think it gives some pretty interesting insight, so I’ve been working on expanding it. If you have no idea what win probability added is, check out my first post on win probability and another on win probability added. Anyway, thus begins my quest to refine and expand the win probability added statistic for submission to the sports conference. Comments, criticisms, and suggestions are very much appreciated and would help make it a lot better.

The first change I made was to rename the statistic, which followed from a simple fix. The problem with “win probability added” is that it doesn’t necessarily calculate the win probability added. That’s a little bit problematic. For example, if two teams are tied in the 90th minute, the win probability under my old calculations was .333 for both teams. This doesn’t really make sense, because each team has close to a 0% chance of winning the game, not 1/3. The quirk comes from modeling the statistic after the similar calculation in professional baseball. My fix for the problem is extremely simple: multiply all the values by 3. This changes the statistic from win probability added to expected points added, which makes much more sense: a win is worth 3 points and a draw 1, so two teams tied in the 90th minute each expect about 1 point. If a player scores a go-ahead goal in the 90th minute, the Expected Points Added (easier to write EPA from now on) is going to be almost 2. If a player scores a tying goal in the 90th minute, the EPA would be almost 1. Much simpler and easier this way (I originally got the idea from @11tegen11’s similar analysis).
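Here's a rough sketch of the rescaling, assuming the old values sit on a 0-to-1 scale and that a goal's EPA is the expected points after the goal minus the expected points before it. The function names and the 90th-minute values are my own placeholders, not numbers from the real tables.

```python
POINTS_FOR_WIN = 3  # league points for a win; a draw earns 1


def to_expected_points(old_value):
    """Rescale an old 0-to-1 'win probability' value to expected points."""
    return POINTS_FOR_WIN * old_value


def goal_epa(old_before, old_after):
    """Expected points added by a goal: value after minus value before."""
    return to_expected_points(old_after) - to_expected_points(old_before)


# Hypothetical old-scale values for the 90th minute (illustrative only).
tied = 1 / 3      # level game: roughly 1 point expected
ahead = 0.99      # one goal up: close to 3 points expected
behind = 0.01     # one goal down: close to 0 points expected

print(round(to_expected_points(tied), 2))   # level game
print(round(goal_epa(tied, ahead), 2))      # go-ahead goal
print(round(goal_epa(behind, tied), 2))     # tying goal
```

With these placeholder inputs the go-ahead goal comes out just under 2 and the tying goal just under 1, matching the description above.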

After this, I noticed the graphs were not nice, easy curves. Even though I took a big sample of games (about 10 years' worth), there wasn’t enough data to give a nice curve. To fix this, I created lines of best fit for each game situation. The home and away graphs for each minute and goal differential are below. Before, a few situations didn’t give a realistic expected point total because they occurred so rarely (like a 2 goal lead in the 5th minute). Making the nice smooth curves fixes this problem. It also allows me to use equations to calculate EPA instead of the annoying process of referencing a massive Excel chart.
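To illustrate the smoothing idea, here is a minimal sketch that fits a straight line by ordinary least squares to noisy expected-points values for one game situation, then evaluates the fitted equation instead of looking a value up in a table. I'm assuming a straight line purely for simplicity (the real curves may need a different shape), and the data points and function names are invented.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def expected_points_at(minute, slope, intercept):
    """Evaluate the fitted line, clamped to the possible range 0..3."""
    return max(0.0, min(3.0, slope * minute + intercept))


# Invented, noisy expected points for a home team leading by one goal.
minutes = [10, 20, 30, 40, 50, 60, 70, 80, 90]
raw_values = [2.10, 2.25, 2.18, 2.40, 2.38, 2.55, 2.70, 2.66, 2.90]

slope, intercept = fit_line(minutes, raw_values)
print(round(expected_points_at(5, slope, intercept), 2))    # minute with little data
print(round(expected_points_at(85, slope, intercept), 2))
```

The fitted equation gives a value for any minute, including ones where the raw sample is too thin to trust on its own.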





I think there are a lot of possible paths to take from here. I’m going to recalculate the top goal scorers’ EPA using the equations. It won’t change much, but it’ll be nice to have some continuity because I’ll be calculating EPA week by week for every goal next EPL season.

I’m also working on creating a database of the top goal scorers in the EPL over the last 10 years, with their goal totals and their EPA for each year. Looking at goals and EPA over time will hopefully give some insight into clutch goal scoring (or the lack thereof). If some players consistently have very high EPAs and some players consistently have low EPAs, it could be an indicator of clutch goal scoring in football.

Like I said before, I’d love comments and suggestions on where to go next, whether on the blog, via Twitter, or even by email.