Wednesday, August 31, 2011

The 2011-2012 Regression Based Analysis

I really want to beat Justin. Not just "sort of" beat Justin. I want to wipe that ACC smirk off his face and really give him the business.

The bad news is that RBA is up to 73.2% accuracy since 2000 while TFG is at 74.0%, implying that Justin gets approximately 5.3 more games right per year than I do. However, I have been making significant strides in closing the gap between expected accuracy and actual accuracy. As of today, the difference between my predicted and actual performance is down to 0.2%, meaning that I miss 1.3 games per year more than I think I should. This shows up as an improvement in the confidence metric, implying that I may yet beat him in the TFD pick 'em pool.
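For anyone checking the arithmetic, here is a minimal sketch of how an accuracy gap turns into games per year. The season size of 660 games is an assumption back-solved from the numbers above (0.8% working out to about 5.3 games); it is not a figure from the actual data set, and the function name is made up for illustration.

```python
# Assumed number of games per season. This is back-solved from the post's
# own numbers (0.8% ~ 5.3 games), not taken from the real data set.
GAMES_PER_SEASON = 660


def games_per_year(accuracy_gap_pct, games=GAMES_PER_SEASON):
    """Convert an accuracy gap in percentage points into games per season."""
    return accuracy_gap_pct / 100.0 * games


# Gap between TFG (74.0%) and RBA (73.2%).
tfg_vs_rba = games_per_year(74.0 - 73.2)

# Gap between RBA's predicted and actual accuracy (0.2%).
expected_vs_actual = games_per_year(0.2)

print(round(tfg_vs_rba, 1))
print(round(expected_vs_actual, 1))
```

With a different season size the game counts scale proportionally, so the 660 only matters for matching the rounded figures.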

The major update to RBA comes in the form of conference versus non-conference home field advantage. Teams that play each other every year are less fazed by harsh environments, whereas teams that infrequently travel into opposing stadiums are more rattled. Based upon my data since 2000, home field advantage is worth approximately 1.9 PPH in conference games. In contrast, home field advantage is worth 3.6 PPH in non-conference games.
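The split can be sketched roughly as follows. Only the two PPH values come from the numbers above; the function names, the use of `None` for independents, and the idea of simply adding the advantage to a strength difference are all assumptions for illustration, not a description of how RBA actually applies it.

```python
# Home field advantage values quoted in the post, in PPH.
CONFERENCE_HFA_PPH = 1.9
NON_CONFERENCE_HFA_PPH = 3.6


def home_field_advantage(home_conference, away_conference):
    """Pick the home field advantage based on the type of game.

    Conferences are plain strings. None marks an independent team, which
    is assumed here never to be playing a conference game.
    """
    is_conference_game = (
        home_conference is not None
        and home_conference == away_conference
    )
    if is_conference_game:
        return CONFERENCE_HFA_PPH
    return NON_CONFERENCE_HFA_PPH


def expected_margin(home_strength, away_strength, home_conference, away_conference):
    """Hypothetical additive model: strength difference plus home advantage."""
    advantage = home_field_advantage(home_conference, away_conference)
    return home_strength - away_strength + advantage


# Two evenly matched teams: the home team is favored by the advantage alone.
print(expected_margin(20.0, 20.0, "ACC", "ACC"))
print(expected_margin(20.0, 20.0, "ACC", "SEC"))
```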

I experimented with a lot of different home field advantage metrics throughout the summer, including per-team advantage, linear regression by strength, and linear regression by strength difference. These features generally dropped accuracy by 1% or more. Justin suggests that attendance makes a difference, but I don't have a way to get these numbers into the algorithm.

One of the ideas that Justin and I have thrown around is changing predictors when we detect a lot of mispredictions. I took a shot at this by pruning history when encountering N consecutive mispredictions and had very poor results. In general, I don't tend to miss many consecutive picks. Furthermore, by the time we detect the change (presumably due to injury, suspension, or whatever), the change has corrected itself, meaning that we miss even more games later.
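A rough sketch of what streak-based pruning could look like is below. The details are assumptions: the history is taken to be one team's games in order, and on hitting N straight misses everything before the streak is thrown away. The actual experiment may have pruned differently, and the function name is hypothetical.

```python
def prune_history(results, n):
    """Drop older games once n consecutive mispredictions are seen.

    results: list of (game_id, was_correct) pairs, oldest first.
    n: number of consecutive misses that triggers a prune (must be >= 1).
    Returns the game ids that remain in the history.
    """
    if n < 1:
        raise ValueError("n must be at least 1")

    kept = []
    streak = 0
    for game_id, was_correct in results:
        kept.append(game_id)
        streak = 0 if was_correct else streak + 1
        if streak == n:
            # Assumed policy: keep only the n games in the missed streak.
            kept = kept[-n:]
            streak = 0
    return kept


season = [("g1", True), ("g2", False), ("g3", False), ("g4", True)]
print(prune_history(season, 2))
```

The weakness described above is visible even here: the prune only fires after the Nth miss, so whatever caused the streak may already be over by the time the older games are discarded.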

In summary, there have been major experiments but only minor tweaks in the RBA algorithm during the summer of 2011. However, I'm relatively confident that RBA will keep it close during the 2nd Annual TFD College Football Pick 'Em.