It's generally assumed that Arizona, UCLA and Washington will go dancing out of the Pac-10 this year, and that unless an interloper wins three or four straight in Los Angeles this week, that will be it. It's likewise assumed that USC and Washington St. sit on the outer edges of the bubble, with a chance to play their way in. But not Cal. Why is that? Well, it's also the time of year for an endlessly confusing stream of RPI stats, arranged in every way possible to argue, defend, and explain why Team A is a lock, Team B is on the bubble and Team C is NIT bound.
I was reminded of why I hate the RPI while reading a series of posts on Jon Wilner's College Hotline, beginning with his experiences as a part of the NCAA's mock selection committee. He describes how the Ratings Percentage Index is used:
I’ve written many times before that the RPI is just one tool the committee uses, but I just experienced that in person. Yes, it is available: Each team’s overall RPI, its RPI rank, its non-conference RPI … they’re all available with a click of the mouse. And yet in hours and hours of discussion and analysis today, an individual team’s RPI rankings came up only a few times. What we did look at, though, is what the NCAA refers to as the RPI quadrants: How a particular team performed against opponents with RPIs of 1-50 … 51-100 … 101-200 … and 200+.
Now, before I begin my ranting and complaining, let me be clear: This isn't intended to be a hit piece on Wilner. He's merely the messenger, describing the process used during the mock selection meetings and therefore presumably used when the actual committee meets to create the 68-team bracket. I'm guessing he has no particular problem with the RPI, but that's neither here nor there. Ultimately his opinion on the matter is meaningless. What matters is the selection committee, and evidently the committee values the RPI quite highly. Here's why that bothers me:
Note: All RPI data referenced below is as of March 7th and will assuredly change over the coming week
The RPI is just one potential tool to rank teams, and shouldn't necessarily be valued more than any other
Maybe I'm mistaken, and the selection committee has a bunch of other metrics it values as highly as the RPI, but much of what I've read doesn't indicate that it does. And I think that's flawed. For one thing, you could calculate an RPI-type score in all kinds of different ways. The actual formula for the RPI is this:
RPI = (Winning Percentage * 0.25) + (Opponent's Winning Percentage * 0.50) + (Opponent's Opponent's Winning Percentage * 0.25)
That's three different variables, each given a subjective weight. You could hypothetically value each input equally, weight winning percentage the highest, or use really any combination.
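To make the weighting point concrete, here's a minimal sketch in Python. The 0.25/0.50/0.25 split is the actual NCAA formula quoted above; the equal weighting and the sample winning percentages are hypothetical, purely to show that the final number depends on an arbitrary choice of weights.

```python
def rpi(wp, owp, oowp, weights=(0.25, 0.50, 0.25)):
    """Combine a team's winning percentage (wp), its opponents'
    winning percentage (owp), and its opponents' opponents' winning
    percentage (oowp). Default weights are the NCAA's actual formula."""
    w1, w2, w3 = weights
    return w1 * wp + w2 * owp + w3 * oowp

# Hypothetical team: .700 record against a decent schedule
standard = rpi(0.700, 0.550, 0.500)                        # NCAA weights
equal = rpi(0.700, 0.550, 0.500, weights=(1/3, 1/3, 1/3))  # equal weights
```

With these invented inputs, the NCAA weighting yields 0.575 while the equal weighting yields about 0.583; different weight choices would shuffle teams' relative rankings accordingly.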
And then think about the variables that you might consider worth recognizing. How about adding a component rewarding road wins and penalizing home losses? Maybe you want to weight recent results higher because how a team is playing at the end of the season is more important to you than the beginning? What about margin of victory?
You're probably thinking 'Norcalnick, is this just an excuse to ram Kenpom rankings down our throats again?' While I do think Kenpom is about as good a model as any I'm aware of, I don't think it's necessarily an ideal metric for deciding tournament teams on its own. I love that it's tempo-free and recognizes the importance of margin of victory. But at some point you have to grade teams on wins and losses, which the RPI does by analyzing nothing but various winning percentage numbers. That doesn't mean the RPI should be weighted as heavily as it apparently is by the selection committee, when there are so many evaluation metrics out there that are arguably just as accurate at deciding which team is "best."
The RPI 'quadrants' are incredibly arbitrary
I'd like to know how it was decided that 1-50, 51-100, 101-200 and 200+ was the best way to analyze what a team has accomplished. I can only assume that humanity's love for round numbers led to the system, because it just as easily could have been 1-43, 44-79, 80-172 and 173+.
I mean, are we really supposed to believe that a loss to Virginia Commonwealth (RPI 49) should be considered that much differently than a loss to Princeton (51)? And that a win over Kansas (1) is equal to a win over Marshall (50), or to make this more relevant to Cal's schedule, Boston College (44)? Interestingly, as of now a loss to Arizona St. (152) is a bad loss, but it's not nearly as awful looking as a loss to Oregon St. (232), a dreaded 200+ loss! We Pac-10 fans know there's not much of a difference between losing in Corvallis or losing in Tempe, but if you're just looking at the RPI quadrants it's potentially a huge distinction.
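The cliffs in that system are easy to see in code. Here's a sketch of the quadrant cutoffs as described above, checked against the March 7th RPI ranks just quoted; the function name is my own invention.

```python
def quadrant(rpi_rank):
    """Map an opponent's RPI rank to its NCAA quadrant (1-4),
    using the 1-50 / 51-100 / 101-200 / 200+ cutoffs."""
    if rpi_rank <= 50:
        return 1
    elif rpi_rank <= 100:
        return 2
    elif rpi_rank <= 200:
        return 3
    else:
        return 4

# Fifty spots apart in the rankings, but identical on paper:
assert quadrant(1) == quadrant(50)    # Kansas == Marshall
# One spot apart, but a full quadrant apart on paper:
assert quadrant(49) != quadrant(51)   # VCU vs. Princeton
# Similar losses, but a dreaded 200+ cliff between them:
assert quadrant(152) != quadrant(232) # Arizona St. vs. Oregon St.
```

Every cutoff creates a cliff where a one-spot difference in rank matters enormously, while a forty-spot difference within a bucket matters not at all.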
But the worst part is a distinction that is made about how you're supposed to look at the RPI, a distinction that in a way forces the use of quadrants:
For some reason, an individual team's exact RPI ranking isn't relevant
A team's actual RPI ranking, rather than relying on arbitrary quadrants, synthesizes every result on a team's schedule into a final number. That number is supposed to represent the value of what the team has and hasn't accomplished, and it does so in a way that isn't nearly as biased and random as the quadrants. But evidently it's not used.
Requoting from Wilner above:
And yet in hours and hours of discussion and analysis today, an individual team’s RPI rankings came up only a few times.
When discussing Pac-10 tournament prospects recently he was even more explicit:
Arizona won’t be given a seed because of its RPI. In other words, just because the Cats are No. 18 in the RPI this week doesn’t mean they would be a No. 5 seed if the brackets were announced today. It simply doesn’t work that way. In my two-day experience at the Mock Selection Seminar, we never discussed a team’s actual RPI.
Now, I agree that a team shouldn't be handed a certain seed because of their RPI, but it's completely illogical to just ignore an actual ranking. So UCLA, for example, gets the credit of having beaten the #18 team in the RPI, but Arizona doesn't get credit for actually being that team? In what world does that make any sense? Ken Pomeroy made this exact point last year, and it has a bearing on Cal both this year and last year:
And if the NCAA insists on using RPI, then use the team’s ranking directly. If New Mexico is getting credit for beating the 23rd best team when they have a win over Cal, it doesn’t follow that you aren’t allowed to assume that Cal is the 23rd best team when you evaluate the Bears.
Last year Cal was practically accused of gaming the system by scheduling a deceptively difficult non-conference slate that somehow 'tricked' the computers into overvaluing the Bears. Four high-value losses to Kansas, Syracuse, Ohio St. and New Mexico, along with wins over a number of solid mid-majors with RPIs between 50 and 150, somehow meant that Cal's high RPI was an artifact that should be ignored in favor of their bad W/L record against the RPI top 50. If Cal had actually been treated as the 23rd best team, that would have translated to a 6 seed, and Cal's NCAA journey might have been a little different last year.
And this distinction is again relevant this year. As of right now Cal has an RPI of 65, higher than that of Washington St. (73) and USC (67). Now, I've just described why the RPI maybe isn't so awesome, so I'm not trying to tell you that this means definitively that Cal is better than WSU and USC and should be considered for the NCAA ahead of them. But I do find it odd that the Bears aren't at all mentioned as being on the fringe of the bubble while the other two teams are. Once again Cal has scheduled a deceptively difficult non-conference schedule, but the Bears don't appear to be getting much credit for it.
Now, if Cal loses in the Pac-10 tournament I'm not going to scream in frustration when the Bears don't receive an NCAA invitation. I don't think Cal has the resume to make a compelling case for their inclusion in the field, no matter what criteria I might use. And more importantly, it's impossible for me to muster any sympathy for a team missing out on the NCAA tournament when they can barely scratch a .500 record in their conference. Each year a coach with an 8-8 conference record complains about getting left out of a competition attempting to decide the best team in the nation, and they sound absurdly dumb and self-serving when they do so.
Now, I've seen various quotes and anecdotal evidence indicating that perhaps the selection committee isn't nearly as dogmatic about the RPI as you'd think. Evidently rankings like Kenpom and Sagarin are at least distributed to the committee members, along with piles of other information. I certainly hope that's the case, and that all of my complaints above only need to be directed at fans and media members who misuse the data. Of course, the only way to know how any of it is used is to sit in while the committee deliberates the weekend before Selection Sunday.
In any case, the next time somebody uses the RPI as the sole data point in an argument over who is in and who is out, take it with a grain of salt.