NCAA pool is worth $15,000—if you can write code


Ryan Boesch enters a March Madness pool with family members every year. But after developing a knack for algorithms at Stanford University, he stumbled upon "the nerd way" to compete.

"I'm the type of person who loves playing cards but would have more fun writing the model for code to compete against someone else's code," said Boesch, a graduate student in electrical engineering.

Boesch and other tech-savvy sports fans have jumped into alternative March Madness pools where participants pit their best data-driven computer models against each other. A contest held by data science site Kaggle—which will award $15,000 in prize money—has already drawn entries from 273 teams. Their algorithms will attempt to predict this year's college basketball championship, which starts next week.

The odds of nailing down the tournament outcome are steeply stacked against even the best data scientists and statisticians. Still, participants say the March Madness problem is an entertaining diversion from their day jobs.

"This is more of a fun side project for bragging rights in the office among the nerds," said Will Cukierski, a data scientist at Kaggle, which is holding its second March Machine Learning Mania.

Traditional March Madness pools draw a crowd despite the difficulty of picking the tournament's outcome. Last year, roughly 50 million Americans participated in March Madness pools, according to a survey from MSN.

Big sets of data have become more accessible as March Madness has turned into a cultural phenomenon, said Scott Turner, who runs the Net Prophet college basketball blog. That in turn has led to more amateur programmers attempting to project the tournament's outcome.

"The two trends have crashed together to create a lot of interest in predicting the tournament by computer program," Turner told CNBC in an email.

Turner now manages the March Machine Madness competition, which is in its sixth year but has never drawn the type of interest Kaggle's contest has. In both contests, participants practice machine learning. They make algorithms based on data sets of their choice.

Competitors in Kaggle's contest, which will run on Hewlett-Packard's Haven data platform this year, are currently testing their models against the previous four tournaments. They'll construct an algorithm for this year's tournament after the field is announced on Sunday.

Sign-ups have already surged past the 250 that entered last year, and Kaggle expects more entries before the tournament starts. Participants range from data scientists and statisticians to college students and developers.

"You don't just have to be data scientists anymore to harness big data," Jeff Veis, vice president of marketing for HP's data business, told CNBC.

Sports analytics' best & brightest

In his day job, Monte McNair crunches numbers to inform scouting and coaching decisions for the NBA's Houston Rockets. He has fine-tuned a March Madness algorithm that he first made as part of his college thesis.

McNair, director of basketball operations for the Rockets, hosts his own tournament website for fun. Last year, he won the Machine Madness contest and entered the Kaggle competition.

In the Kaggle contest, data-driven algorithms assign a probability that each team would win in a game against every other eligible squad. While machine models generally predict more accurately than the average office bracket, the randomness of the tournament wreaks havoc on even the best statistical models.
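To make the idea concrete, here is a minimal sketch of what "a probability for every possible matchup" can look like in code. It is not any contestant's actual model: the team names, the strength ratings, the `scale` parameter and the logistic formula are all assumptions chosen purely for illustration.

```python
import math
from itertools import combinations

# Hypothetical strength ratings, invented for illustration only.
ratings = {"Team A": 12.0, "Team B": 8.5, "Team C": 3.0, "Team D": -1.5}

def win_probability(rating_a, rating_b, scale=10.0):
    """Map a rating gap to the probability that team A beats team B.

    Uses a logistic curve; `scale` is an assumed tuning knob that controls
    how quickly a rating gap turns into a lopsided probability.
    """
    return 1.0 / (1.0 + math.exp(-(rating_a - rating_b) / scale))

def pairwise_predictions(ratings):
    """Return {(team_a, team_b): P(team_a beats team_b)} for every pair."""
    return {
        (a, b): win_probability(ratings[a], ratings[b])
        for a, b in combinations(sorted(ratings), 2)
    }

predictions = pairwise_predictions(ratings)
for (a, b), p in predictions.items():
    print(f"{a} over {b}: {p:.3f}")
```

With four teams the sketch produces six matchups. A real entry would cover every eligible squad and would likely draw its ratings from historical game data rather than made-up numbers.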

The likelihood of correctly picking even 75 percent of games is "astronomically low," said Michael Lopez, who was on last year's winning team. His team's model won while getting 73 percent of games right.

Lopez, an assistant professor of statistics at Skidmore College, said he believes that it was more luck than skill that earned his team the first-place finish and $15,000 prize, which he split with a teammate. He plans to enter again this year, when $10,000 will go to the first-place team, and the second-place team will receive $5,000.

Lopez won't be disappointed if he can't replicate last year's results.

"People recognize that if you're in this contest, you need to get really lucky," he said. "If I get a lottery ticket and don't win, should I be mad?"