March Madness: Statisticians quantify entry biases

By examining historical data, statisticians in the College of Science at Virginia Tech have quantified biases that play a role in granting Division I at-large basketball teams inclusion in the NCAA March Madness Tournament.

Assistant professors Leanna House and Scotland Leman found that in addition to the standard Ratings Percentage Index (RPI) used by the 10-member selection committee, biases such as the team’s marquee and the strength of its schedule are also factors.

“We wanted to quantify how much bias there is for bubble teams,” Leman said. So-called “bubble teams” are those that do not have an automatic bid but are still considered candidates for an invitation to the tournament. Usually about 30 teams fall into this category.

One bias for bubble teams, House and Leman found, was consideration of the marquee (or pedigree) of the team. For instance, a team that historically has an outstanding record and is usually included in the tournament has that fact in its favor.

“Having a rich history of a spot in the tournament will ‘break the tie,’” House said.

She and Leman found that inclusion probabilities were much higher for marquee teams. For example, in the 2009-10 season, the bias of not being a marquee team lowered Virginia Tech’s chances of receiving an at-large bid from 0.83 to 0.31. During the 1999-2000 season, the marquee bias increased the University of North Carolina’s chances from 0.32 to 0.85.
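One way to read the size of these shifts is to convert each probability to odds and compare them. The short sketch below does only that arithmetic on the four probabilities quoted above; the helper names `odds` and `odds_ratio` are illustrative, and this is not the researchers' model.

```python
def odds(p):
    """Convert a probability into odds in favor (p against 1 - p)."""
    return p / (1.0 - p)


def odds_ratio(p_before, p_after):
    """Factor by which the odds change when a probability moves from p_before to p_after."""
    return odds(p_after) / odds(p_before)


# Probabilities quoted in the article: (before, after) the marquee effect.
cases = {
    "Virginia Tech, 2009-10": (0.83, 0.31),
    "North Carolina, 1999-2000": (0.32, 0.85),
}

for label, (before, after) in cases.items():
    ratio = odds_ratio(before, after)
    print(f"{label}: odds {odds(before):.2f} -> {odds(after):.2f} (x{ratio:.2f})")
```

A ratio below 1 means the odds of a bid fell, and a ratio above 1 means they rose.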

“UNC’s marquee status during that season had a substantial influence on the committee’s decision,” Leman said. “Of that, I’m sure.”

The statisticians also explored the influence a team’s schedule has on its RPI in addition to its record. Using a hypothetical model, Leman and House determined that the more powerhouse teams a bubble team plays in a season, regardless of whether it wins or loses those games, the better its chances of winning a bid to the tournament.

“Of course scheduling is a complex process and involves a lot of negotiation,” Leman said. “But in cases where a coach is able to select to play a powerful team or a smaller, less powerful team, it is better to pick the power team. The rule of thumb is: the more powerhouse teams, the better.”
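The scheduling effect can be illustrated with the commonly cited basic RPI weighting: 25 percent a team's own winning percentage, 50 percent its opponents' winning percentage, and 25 percent its opponents' opponents' winning percentage. The sketch below is a simplification under those assumed weights. It ignores the home and road adjustments the NCAA has applied in men's basketball, the function name and the two sets of numbers are hypothetical, and it is not the model Leman and House used.

```python
def basic_rpi(wp, owp, oowp):
    """Simplified RPI: 25% own winning pct, 50% opponents', 25% opponents' opponents'."""
    return 0.25 * wp + 0.50 * owp + 0.25 * oowp


# Hypothetical bubble team, two hypothetical schedules (not data from the study).
# Soft schedule: more wins, but against weaker opponents.
soft = basic_rpi(wp=0.70, owp=0.45, oowp=0.50)
# Tough schedule: fewer wins, but against stronger opponents.
tough = basic_rpi(wp=0.60, owp=0.60, oowp=0.55)

print(f"soft schedule RPI:  {soft:.4f}")
print(f"tough schedule RPI: {tough:.4f}")
```

Because opponents' results carry three times the weight of a team's own record in this formula, the tougher schedule produces the higher rating here even with a lower winning percentage.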

At the beginning of each March Madness decision-making process, the selection committee is provided documentation that contains season statistics and the RPI for each team. Other measures of team strength are excluded.

“The RPI accounts for known, quantitative biases in raw winning percentages that may impact their ratings, but it has been shown repeatedly that raw winning percentages per team are not adequate for ranking teams,” Leman said. “Tournament decisions made for teams with only moderately high RPIs (bubble teams), until now, were not clear.”

Leman and House say their research was motivated by a chance meeting with Virginia Tech head basketball coach Seth Greenberg in a restaurant in the spring of 2010. At that time, Virginia Tech had not won a bid for the tournament. Greenberg suggested that he would like to know how tournament decisions are made for at-large teams. The two statisticians, along with graduate assistants John Szarka and Hayley Nelson, stepped up to the challenge and have presented their conclusions just in time for this year’s March Madness to begin.

“We don’t want to create, improve, or validate a ranking system,” House said. “Our goal was simply to evaluate how the selection committee has chosen teams for the tournament in the past.”

YouTube feature in which Assistant Professors Leanna House and Scotland Leman demonstrate their Bayesian visual analytics methodology: http://www.science.vt.edu/media/statistics-house-leman-video.html
