Sioux from afar

January 27, 2007

A quick look at the effects of the new RPI formula

Jim Dahl

Who's #1 in RPI? Most of you would probably guess Minnesota, and under this year's RPI, you're right. However, if the NCAA hadn't changed the RPI formula for the 2006 season, St. Cloud would be. (How many guessed them?)

Obviously, changing the RPI formula is going to change the rankings. This post tries to dig into the details a little to figure out how.

RPI formula change

One of the biggest changes in the NCAA hockey selection criteria this year was the reweighting of RPI.

| Formula | Win percentage | Opponents' win percentage | Opponents' opponents' win percentage |
|---|---|---|---|
| New | .25 | .21 | .54 |
| Pre-2006 | .25 | .50 | .25 |
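
To make the reweighting concrete, here is a minimal sketch of RPI as a weighted sum of the three components. It assumes the new weights are .25/.21/.54 and the pre-2006 weights are .25/.50/.25, which is what the RPI figures later in this post imply. The function name and the sample team are made up for illustration.

```python
# Weights are (win%, opp%, opp-opp%).
# Assumption: new = .25/.21/.54 and pre-2006 = .25/.50/.25.
NEW_WEIGHTS = (0.25, 0.21, 0.54)
OLD_WEIGHTS = (0.25, 0.50, 0.25)


def rpi(win_pct, opp_pct, opp_opp_pct, weights):
    """Weighted sum of the three RPI components."""
    w_win, w_opp, w_opp_opp = weights
    return w_win * win_pct + w_opp * opp_pct + w_opp_opp * opp_opp_pct


# Hypothetical team: .600 win%, .550 opp%, .500 opp-opp%.
team = (0.600, 0.550, 0.500)
print(f"old formula: {rpi(*team, OLD_WEIGHTS):.4f}")
print(f"new formula: {rpi(*team, NEW_WEIGHTS):.4f}")
```

Because this made-up team has a higher opp% than opp-opp%, it scores lower under the new weights even though nothing about its record changed.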

NCAA explanation

The NCAA official rationale for the change is:

Over the last couple of years, the committee observed that the current RPI discourages teams from playing certain opponents because playing and winning against these opponents would significantly lower their RPI. The committee noticed that for many teams it was truly better to not play a contest at all than to play certain teams and win. These negative games (games that are won, but causes the RPI to decrease) may cause teams and conferences to reduce the number of games played outside of their leagues or to severely limit the ability of some teams to get quality non-conference games on their schedule. Another flaw with the current percentages is that a team could have wins against two teams that are ahead of it in the RPI, yet receive fewer points for one of the teams it defeated that is higher than the other team it defeated. During the 2005-06 season, there were 117 games out of 941 where teams would have been better off not playing than winning because of its negative effect on the RPI.

After working with the software developer of the RPI, the committee believes that the new formula can reduce the number of negative-impact games, while at the same time not change the order of teams in the RPI. The modification simply reduces the number of negative-impact games and rewards teams for competing. Had the recommended RPI been used in 2005-06, the number of negative games would have been reduced from 117 to four.

Their goal was to reduce the number of "negative" wins without changing the order of teams in the RPI. Then what's the point? Why tweak the formula if it's not going to change the order?

Everyone knows that a new formula is actually going to change the order. By taking a very small sample (two years) and carefully choosing a new formula that doesn't affect the order for that sample, the NCAA could more easily sell the change by claiming it doesn't change the outcome (again, though, what would be the point?).

The other criterion used to choose the new formula was minimizing the number of "negative" wins over the two-year sample. Of course the new formula is going to do that, for reasons I explain below, but not because of the carefully chosen percentages. Any shift in weight from opp% to opp-opp% is going to reduce the number of negative wins over the entire sample. Choosing the weights that minimized the number of "negative" wins for those two years is arbitrary at best, and the formula is unlikely to experience that level of success in future years.

Actual impact of the change

First, let's note an important attribute of the components of RPI. The opponents' win percentage (opp%) is an average. The opponents' opponents' win percentage (opp-opp%) is an average of that average. Therefore, the opp% is going to have a higher variance, and broader range, than opp-opp%.

As of today, the opp% ranges from .4170 to .6268 (a spread of .2098). The opp-opp% ranges from .4418 to .5615 (a spread of only .1197).
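
The "average of an average" effect can be illustrated with a toy simulation. This is only a sketch under loud assumptions: the league size, the number of opponents, the random win percentages, and the random (non-reciprocal) schedules are all invented, and real RPI bookkeeping, such as excluding a team's own games from its opponents' records, is ignored.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable
NUM_TEAMS = 60          # hypothetical league size
NUM_OPPONENTS = 30      # hypothetical number of distinct opponents per team

# Assumption: each team's win% is drawn at random, not derived from results.
win_pct = [rng.uniform(0.2, 0.8) for _ in range(NUM_TEAMS)]

# Each team gets a random set of opponents (never itself).
schedule = [
    rng.sample([t for t in range(NUM_TEAMS) if t != team], NUM_OPPONENTS)
    for team in range(NUM_TEAMS)
]

# opp% is the average of the opponents' win%.
opp_pct = [statistics.mean(win_pct[o] for o in opps) for opps in schedule]
# opp-opp% is the average of the opponents' opp%: an average of averages.
opp_opp_pct = [statistics.mean(opp_pct[o] for o in opps) for opps in schedule]


def spread(values):
    """Distance between the largest and smallest value."""
    return max(values) - min(values)


for name, values in [("win%", win_pct), ("opp%", opp_pct), ("opp-opp%", opp_opp_pct)]:
    print(f"{name:9s} stdev={statistics.pstdev(values):.4f} range={spread(values):.4f}")
```

Each round of averaging pulls the values toward the league mean, so the spread shrinks from win% to opp% and again from opp% to opp-opp%.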

Comparing the old RPI to new RPI, the changes are as expected:

Who does worse?

Teams with a significantly better opp% than opp-opp% do worse. At first you might think those will be teams whose opponents got high win percentages playing weak competition. However, given the high number of conference games each team plays, opp-opp% is actually largely determined by conference membership. Keeping in mind the significantly higher variance of opp% vs. opp-opp%, the teams with significantly higher opp% than opp-opp% tend to be those who played opponents with a very high win%. Let's go to the numbers:

| Team | opp% | opp-opp% | RPI/rank, old formula | RPI/rank, new formula |
|---|---|---|---|---|
| Michigan State | .5638 | .5034 | .5727/#4 | .5552/#7 |
| Mankato | .6268 | .5197 | .5452/#12 | .5141/#25 |
| Wisconsin | .5843 | .5263 | .5337/#15 | .5169/#21 |
| Michigan Tech | .5584 | .5192 | .5247/#21 | .5134/#26 |
| Ohio State | .5516 | .4987 | .5155/#24 | .5001/#33 |
| Northeastern | .5474 | .5046 | .5144/#25 | .5020/#31 |
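
The size of each drop can be checked directly. Assuming the pre-2006 weights are .25/.50/.25 and the new weights are .25/.21/.54, the win% term cancels and the change works out to .29 times the gap between opp% and opp-opp%. The sketch below runs that check on the six teams listed above; the variable names are mine.

```python
# (team, opp%, opp-opp%, old RPI, new RPI), copied from the table above.
rows = [
    ("Michigan State", 0.5638, 0.5034, 0.5727, 0.5552),
    ("Mankato",        0.6268, 0.5197, 0.5452, 0.5141),
    ("Wisconsin",      0.5843, 0.5263, 0.5337, 0.5169),
    ("Michigan Tech",  0.5584, 0.5192, 0.5247, 0.5134),
    ("Ohio State",     0.5516, 0.4987, 0.5155, 0.5001),
    ("Northeastern",   0.5474, 0.5046, 0.5144, 0.5020),
]

# Assumption: .29 of the total weight moved from opp% (.50 -> .21)
# to opp-opp% (.25 -> .54), while the win% weight stayed at .25.
SHIFTED_WEIGHT = 0.50 - 0.21

for team, opp, opp_opp, old_rpi, new_rpi in rows:
    predicted_drop = SHIFTED_WEIGHT * (opp - opp_opp)
    actual_drop = old_rpi - new_rpi
    print(f"{team:15s} predicted={predicted_drop:.4f} actual={actual_drop:.4f}")
```

The predicted and listed drops agree to within rounding for all six teams, which is why the gap between opp% and opp-opp% is the whole story here.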

Not surprising to see some WCHA and CCHA teams in there. They play pretty strong conference schedules, which can lead to a high opp%. However, why Wisconsin and not Minnesota? Why Michigan State and not Michigan? Again, as explained above, high opp%.

Number of negative wins

As explained above, the change should reduce the number of negative wins, as the NCAA hoped. However, that's not because their carefully calculated weightings have some special meaning in the analysis of college hockey schedules. Rather, it's simply because they increased the importance of opp-opp% relative to opp%.
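
Here is a rough sketch of how a single win can be "negative" under one weighting and not the other. Everything in it is hypothetical: the 14-6 team, its schedule numbers, and the weak opponent are invented, opp% and opp-opp% are treated as simple per-game averages, and the weights are assumed to be .25/.50/.25 (old) and .25/.21/.54 (new).

```python
OLD = (0.25, 0.50, 0.25)  # assumed pre-2006 weights (win%, opp%, opp-opp%)
NEW = (0.25, 0.21, 0.54)  # assumed new weights


def rpi(wins, games, opp_sum, opp_opp_sum, weights):
    """RPI from running totals; the two sums are per-game totals."""
    w_win, w_opp, w_opp_opp = weights
    return (w_win * wins / games
            + w_opp * opp_sum / games
            + w_opp_opp * opp_opp_sum / games)


def change_from_win(wins, games, opp_sum, opp_opp_sum,
                    foe_win_pct, foe_opp_pct, weights):
    """RPI after one more win against the given opponent, minus RPI before."""
    before = rpi(wins, games, opp_sum, opp_opp_sum, weights)
    after = rpi(wins + 1, games + 1,
                opp_sum + foe_win_pct, opp_opp_sum + foe_opp_pct, weights)
    return after - before


# Hypothetical 14-6 team with opp% .550 and opp-opp% .500 over 20 games.
team = dict(wins=14, games=20, opp_sum=20 * 0.550, opp_opp_sum=20 * 0.500)
# Hypothetical weak opponent: win% .300, but its own opp% is a typical .500.
foe = dict(foe_win_pct=0.300, foe_opp_pct=0.500)

old_delta = change_from_win(**team, **foe, weights=OLD)
new_delta = change_from_win(**team, **foe, weights=NEW)
print(f"old weights: {old_delta:+.4f}")
print(f"new weights: {new_delta:+.4f}")
```

In this made-up case the win drags opp% down enough to outweigh the win% gain under the old weights, but not under the new ones, because the opponent's poor record now counts for much less than the schedule it played.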

The NCAA boasted that the new percentages would have resulted in only 4 negative wins in the previous season (not shocking, since they were chosen to minimize that number). At this point in this season there are 9. That's a significant reduction from the 117 for the previous year under the previous formula, but it demonstrates that choosing the peculiar percentages that absolutely minimized the number over the past two seasons was a bit of a waste. Why not a rounder .25-.25-.50?

I wouldn't be surprised to see this number fall a little from 9 before the end of the regular season, but expect any such gains to be wiped out in the conference tournaments.

Conclusion

Though "strength of schedule" still makes up 75% of RPI, it's now heavily weighted toward the lower-variance opp-opp%. That diminishes the importance of strength of schedule relative to win%. As the NCAA text above hints was a goal, teams from top conferences now have less incentive to build a strong schedule and more incentive to build one that maximizes their win%.
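
As a back-of-the-envelope check on that claim, the sketch below multiplies each schedule component's observed range (from the figures quoted earlier in this post) by its weight. It is a crude upper bound, since no team sits at the extreme of both components at once, and it assumes weights of .25/.50/.25 (old) and .25/.21/.54 (new). The function names are mine.

```python
OLD = (0.25, 0.50, 0.25)  # assumed pre-2006 weights (win%, opp%, opp-opp%)
NEW = (0.25, 0.21, 0.54)  # assumed new weights

# Observed ranges quoted earlier in the post.
OPP_RANGE = 0.6268 - 0.4170
OPP_OPP_RANGE = 0.5615 - 0.4418


def schedule_swing(weights):
    """Largest RPI difference the two schedule components could produce."""
    _, w_opp, w_opp_opp = weights
    return w_opp * OPP_RANGE + w_opp_opp * OPP_OPP_RANGE


def win_pct_needed(weights):
    """Win% difference that would offset the full schedule swing."""
    return schedule_swing(weights) / weights[0]


for label, weights in [("old", OLD), ("new", NEW)]:
    print(f"{label}: schedule swing={schedule_swing(weights):.4f}, "
          f"offsetting win% gap={win_pct_needed(weights):.4f}")
```

By this rough measure, the most that schedule strength can move a team's RPI shrinks under the new weights while the win% weight stays the same, which is the shift in incentives described above.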