Want to win the NCAA basketball office pool this year?
Then consider the predictions made by Georgia Tech’s Logistic Regression Markov Chain (LRMC) method, a computer ranking system that has historically been more accurate than the NCAA’s own Ratings Percentage Index.
LRMC predicts this year’s NCAA Final Four matchups will most likely be Kentucky vs. Michigan St. and Ohio St. vs. Kansas, with Kentucky beating Ohio St. for the championship.
Other predictions by the system include:
- Texas, Belmont and N.C. State are the underdogs most likely to pull off an upset in the first round.
- California, N.C. State, Belmont and Texas could be this year’s “Cinderella” teams; they are the most likely double-digit seeds to make it to the Sweet 16.
- Michigan St. will be the No. 1 seed with the toughest second-round matchup.
- Wichita St. vs. Indiana and New Mexico vs. Louisville are other intriguing potential second-round matchups: LRMC says they will be close games, even though each pits a lesser-known team against a better-known one.
- The West Region, led by Michigan St., is the deepest of the four regions.
“Kentucky is the likely champion because they’ve won almost all their games,” said Joel Sokol, an operations research professor at Georgia Tech who developed LRMC with colleagues. “They’ve won by convincing margins at home and on the road against very good teams, and they’ve done it all against a strong schedule, including Kansas, North Carolina, Indiana and Florida.”
Since the 2003 season, LRMC has correctly predicted the outcomes of more NCAA tournament games than competing ranking systems and major polls.
In 2010, for example, LRMC correctly predicted the winners of 51 out of 64 NCAA tournament games, outperforming more than 50 of the top ranking sites. In 2008, the system correctly predicted the Final Four, both finalists and the eventual champion, as well as several upsets in earlier rounds.
Georgia Tech Operations Research Professors Sokol and George Nemhauser and Statistics Professor Paul Kvam developed the LRMC method, along with Math Professor Mark Brown of the City College of New York.
The system looks at the results of all the college basketball games played during the season. Specifically, it examines which team won, which team lost, where the game was played and the winning team’s margin of victory. The researchers then run that data through several mathematical models (empirical Bayes, logistic regression and a Markov chain) to determine the ranking of teams.
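For readers curious how a logistic regression and a Markov chain can fit together, the following Python sketch illustrates the general idea; it is not the researchers' actual model. It assumes a logistic function with made-up coefficients (`A` and `B`) that turns a home team's margin into the chance that the home team is the better side, then lets an imaginary voter hop from team to team according to those chances. Teams where the voter spends the most time rank highest. The empirical Bayes step is omitted, and the function names, coefficients and team names are all hypothetical.

```python
import math

# Hypothetical coefficients, chosen only for illustration. The real LRMC
# values are fitted to historical data and are not given in this article.
A, B = 0.1, -0.2


def p_home_better(home_margin):
    """Logistic estimate that the home team is the better team."""
    return 1.0 / (1.0 + math.exp(-(A * home_margin + B)))


def rank_teams(games, iterations=1000):
    """Rank teams by the steady state of a Markov chain built from game results.

    games: list of (home_team, away_team, home_margin) tuples, where
    home_margin is negative if the home team lost.
    """
    teams = sorted({t for home, away, _ in games for t in (home, away)})
    idx = {t: i for i, t in enumerate(teams)}
    n = len(teams)

    # Count games per team so each game gets equal weight in a team's row.
    played = [0] * n
    for home, away, _ in games:
        played[idx[home]] += 1
        played[idx[away]] += 1

    # Transition matrix: from a team, pick one of its games at random, then
    # move to (or stay with) whichever side is judged better.
    P = [[0.0] * n for _ in range(n)]
    for home, away, margin in games:
        h, a = idx[home], idx[away]
        r = p_home_better(margin)
        P[h][h] += r / played[h]
        P[h][a] += (1.0 - r) / played[h]
        P[a][h] += r / played[a]
        P[a][a] += (1.0 - r) / played[a]

    # Power iteration toward the steady-state distribution.
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]

    return sorted(zip(teams, pi), key=lambda pair: -pair[1])


# Made-up results for three fictional teams (not real data).
sample_games = [
    ("Team A", "Team B", 10),
    ("Team B", "Team C", 5),
    ("Team A", "Team C", 15),
    ("Team C", "Team A", -8),
]
ranking = rank_teams(sample_games)
for team, score in ranking:
    print(f"{team}: {score:.3f}")
```

In this toy schedule, Team A wins every game it plays and ends up on top. The sketch assumes every team is connected to every other through the schedule; a real system would also have to handle isolated teams and fit its coefficients to data.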
Yet even with the best formula, it’s impossible to predict a perfect bracket, Sokol said.
About one-quarter of all tournament games are affected by upsets, injuries or last-second, buzzer-beating baskets. Such was the case last year when only one top seed made it to the regional finals. This human factor is where the LRMC predictions can falter.
Still, LRMC’s odds aren’t bad.
According to a study of historical data just completed by the research team, LRMC is significantly better at predicting NCAA Tournament games than almost all of the other ranking systems, such as Sagarin’s predictor, Pomeroy’s ranking, Las Vegas Favorite and the NCAA’s RPI.
Sokol, Nemhauser and Kvam are professors in the H. Milton Stewart School of Industrial & Systems Engineering in Georgia Tech’s College of Engineering.