Is big data behind scoring drought in professional sports? And your business?
As spring training brings the familiar sounds of baseball, and the annual renewal of foolish optimism that this might be the Cubs' year, Major League Baseball is hoping for something even more dramatic -- more runs. From anyone.
Baseball is in a crisis not seen since the 1960s. Pitchers ran circles around hitters last year, with runs per game and batting averages at decades-long lows. There was an epidemic of defensive 2-1 ball games -- this at a time when baseball is struggling to remain popular with younger, supposedly attention-span-challenged fans.
But it's not just baseball. The National Hockey League has an offense problem, too. The game's biggest star, Sidney Crosby, has only 20 goals three-quarters of the way through the season. Goals per game have shrunk since the 2005-2006 season. And in the NBA, hot-shot scoring has also declined. In the 2007-2008 season, there were 27 players who averaged more than 20 points per game. Today there are 15.
What in the wide-wide-world of sports is going on here? If you own spreadsheet software, you know that advanced analytics are the biggest change to hit professional sports in the past decade. As Michael Lewis explained in his book "Moneyball: The Art of Winning an Unfair Game" that popularized the revolution, sports franchises will do almost anything to get a leg up.
Geeks with video cameras track everything now. Baseball has its spray charts. Defensive shifts based on those charts are so effective that some critics have suggested banning them. Hockey has its Corsi and Fenwick, which measure shot attempts during ice time. The National Basketball Association now uses PPP, or points per possession.
But a funny thing is happening on the way to refining these sports -- big data has chosen sides. Moneyball tactics seem to help the defense more than the offense. The tiny tweaks and refinements suggested by nerds are simply better at stopping players than enabling them.
It's a lot harder to find and exploit defensive weakness than offensive weakness. There's a lot more available data on what offenses are trying to accomplish than on what defenses are trying to suppress. To play a little loose with an aphorism, it's a lot easier to criticize than create.
You might have noticed that glaringly omitted from this sports discussion is the National Football League, and you probably know why. Scoring is up in the NFL. And by any measure, it's never been more popular. That's because the NFL has constantly instituted rule changes that favor offensive creativity. No hitting quarterbacks, no clutching and grabbing wide receivers. Offensive players are assured plenty of space to create highlight-reel moments. Who would you rather be -- the NFL or Major League Baseball?
What does this have to do with your business? Businesses are projected to spend nearly $40 billion on big data technology this year, according to collaboration site Wikibon.org, most of it with the idea of Moneyball-ing their companies.
It seems like a no-brainer -- run a few spreadsheets, find a few million dollars. But I think there's a flaw in big data that's big enough to drive a slap shot through. As in sports, big data helps defense more than offense. That might mean companies are spending a lot of money so they can be penny-wise and pound-foolish.
In the corporate world, playing defense means things like limiting overtime and shrinking health care benefits costs. Offense means finding new markets and inventing new products. Big data is great at optimizing work schedules to minimize labor costs, but not nearly as good at giving employees extra time to tinker with a potentially profitable idea.
So big data brings you misguided cost savings like the now-infamous Starbucks scheduling software. It drove workers nuts by creating "clopenings," which scheduled employees to close stores at night as well as open them the next morning. Starbucks changed its practices after the New York Times reported on how the scheduling affected workers' lives.
Economist Tim Harford, author of the book "The Undercover Economist Strikes Back," has been a critic of big data because its users often seem to forget that no matter how large a dataset is, it's still subject to sample bias that leads to errors. It remains true that 5,000 carefully selected survey takers provide better results than a billion random Google searchers. And he thinks sample bias might be part of why data helps defense more than offense.
"Data analytics are excellent at finding subtle historical patterns that might then be exploitable. They are much less useful at suggesting something radically new, or producing a response to something new," he said. "Analytics favor the optimiser, the tweaker, but usually not the radical disruptor. Analytics help Google and Facebook optimize their services, but they didn't really help Jony Ive and Steve Jobs create the iPhone."
Big data can also be gamed. In fact, it's nearly always gamed. When NBA players found out that assists were worth more in contract arbitration hearings than points, you saw a lot more players making a gratuitous extra pass on a fast break.
"We need to be very careful about the hidden bias in any data project," Harford said. "Some things are easy to measure and some things aren't. We've known this for a long time, but we keep making the same mistake."
What's missing from this discussion so far might be the hardest thing of all to quantify -- happiness. Sometimes, it's easier to describe its opposite. Employees get pretty unhappy when they see bean counters coming. It almost always means something is about to get taken away -- free lunches, or office space, or a benefit. No wonder they get defensive. We can all agree offense is more fun, be it home runs, goals or casual Fridays.
Happiness has a value, of course, to both workers and their companies. A few studies have attempted to quantify this, but they are relatively squishy. Harford takes a stab at it with the Starbucks example.
"In the short run, (this is) terrible for the employees. In the long run, it's not smart business for Starbucks. They may suffer turnover and recruitment problems and in the end have to pay more to compensate for a problem caused by a 'cost-saving' algorithm," he said "If you impose lots of costs on workers for tiny saving on your (balance sheet), that's not only evil capitalism, it's incompetent capitalism."
This doesn't mean all data -- or all cost savings -- are bad, of course. You can't win without playing both offense and defense. But it's important to know that big data can, in fact, lie. And so can people selling it. It would be wise to know that big data is often just an expensive way to pinch pennies. Firms should be particularly careful about subtle bias in the big data projects they pick.
"The database or spreadsheet often shuts down the conversation when it should open up the conversation," Harford said.
A great way to start that conversation is simply to ask: Are we playing defense or offense?