Hadoop goes primetime with Hortonworks IPO

Get ready to hear this word a whole lot in 2015: Hadoop.

Silicon Valley has long been buzzing about it, more than $1 billion in investment flooded into related start-ups this year, and tech giants like Intel and Google are betting on it.

But what is it? The stock market will get its first real taste on Friday, when Hortonworks, a seller of Hadoop technology, debuts on the Nasdaq under ticker symbol HDP. The Palo Alto, California-based company raised $100 million Thursday night in its initial public offering, selling 6.25 million shares at $16 apiece.

The stock jumped as much as 52 percent to $24.35, giving the company a market capitalization of just over $1 billion.
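
For readers who want to check the math, here is a quick sketch of the arithmetic behind those figures. It uses only the numbers reported above; the variable names are ours, and the result is a back-of-the-envelope check rather than anything from the company's filings.

```python
# Figures as reported in the article
shares_sold = 6_250_000      # shares sold in the IPO
ipo_price = 16.00            # offering price per share, in dollars
intraday_high = 24.35        # highest price reached on the first trading day

# Gross proceeds are simply shares sold times the offering price
gross_proceeds = shares_sold * ipo_price

# The first-day "pop" is the gain over the offering price, as a percentage
pop_pct = (intraday_high - ipo_price) / ipo_price * 100

print(f"Gross proceeds: ${gross_proceeds:,.0f}")
print(f"Peak first-day gain: {pop_pct:.1f}%")
```

The proceeds come to the $100 million raised, and the peak gain rounds to the 52 percent cited.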

Hadoop is a challenging subject for anyone who doesn't spend their days thinking about enterprise software. Here's an oversimplified explanation of Hadoop and how it's grown into something meaningful and potentially massive.

[Image: Hadoop the elephant. Source: Doug Cutting]

In the early-to-middle part of the last decade, engineers at Google and Yahoo were developing software to improve search results and better understand all the crazy amounts of information that queries were producing. That technology became part of an open source project, which later became Hadoop. The code is housed at the Apache Software Foundation.

Over time, Hadoop spread well beyond search and became a central tool for some of the biggest consumer Internet companies, including Facebook and LinkedIn, to make sense of the tons of unstructured data flowing through their networks and sitting across many machines. Companies using Hadoop internally would contribute their own code to the project.
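
To make "data sitting across many machines" a little more concrete, here is a toy sketch of MapReduce, the programming model Hadoop is best known for. It is an illustration only: it runs in a single Python process, it does not use Hadoop's actual API, and the sample text and function names are made up for the example. The idea is that each machine counts words in its own chunk of data (the "map" step), and the partial results are then grouped and summed (the "reduce" step).

```python
from collections import defaultdict

def map_phase(chunk):
    """Emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]

def shuffle(pairs):
    """Group emitted values by word, as a framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Sum the counts collected for each word."""
    return {key: sum(values) for key, values in grouped.items()}

def word_count(chunks):
    pairs = []
    # On a real cluster each chunk would be mapped on its own machine;
    # here we simply loop over them in one process.
    for chunk in chunks:
        pairs.extend(map_phase(chunk))
    return reduce_phase(shuffle(pairs))

# Hypothetical text standing in for data spread across three machines
chunks = ["search hadoop search", "hadoop data", "data data search"]
counts = word_count(chunks)
print(counts)
```

Because each chunk is processed independently before the results are combined, the same approach can in principle be spread over many more machines as the data grows.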

Then the business opportunities started to become apparent. Imagine if e-retailers knew more about their customers, or if medical companies could better predict a patient's risk for specific diseases, or if advertisers could more precisely target the right audience.

"The Hortonworks opportunity is to enable Hadoop to be an enterprise viable platform for managing all of this data within the enterprise," Chief Executive Officer Rob Bearden said in the online roadshow before the IPO.

But other start-ups got there first. In 2008, a group of data nerds from Facebook, Google, Yahoo and Oracle came together to start Cloudera and began evangelizing the benefits of Hadoop to anyone who would listen.

MapR was formed a year later to also find commercial applications for Hadoop. And in mid-2011, Yahoo joined with venture firm Benchmark to form Hortonworks as an independent company that would "enable Apache Hadoop to meet the growing market demand and become the big data management and analysis platform of choice for the industry," according to a press release at the time.

Fast forward to late 2014, and "big data" has gone from being a niche idea to a wildly overused phrase. But when it comes to Hadoop, the concept makes all the sense in the world.

That's because the old data center software, created by companies like Oracle, IBM and Microsoft, was designed for the pre-Internet and certainly pre-smartphone days. Big companies still have all that gear in their facilities, but they need tools that can handle 10 times, 100 times, even 1,000 times that amount of capacity. And then they need software to analyze it, which is Hadoop's primary role.

Allied Market Research estimates that sales of Hadoop-related software will climb to $50.2 billion annually by 2020, from $1.5 billion in 2012.

Investments in Hadoop companies jumped more than five-fold this year to $1.28 billion from $236.8 million in 2013, according to research firm CB Insights. Back in 2010, the number was $48.5 million.

The bulk of this year's capital—$900 million—went to Cloudera, thanks largely to a $740 million investment by Intel, which forged a strategic partnership with the start-up and bought an 18 percent equity stake. Hortonworks raised $150 million, with one-third coming from Hewlett-Packard, and MapR brought in $110 million, mostly from Google.

Because the software is open source and free, with code being contributed from developers big and small, the only way these companies can profit is by adding enough value to the raw technology to make enterprises open their wallets.

"While that seems like a problem, it's offset by the advantage of a global community of people that work on that stuff and move much faster than any vendor could," said Mike Olson, co-founder and chief strategy officer at Palo Alto-based Cloudera.

Olson said the early team at Cloudera spent their days educating people about the promise of Hadoop while developing services on top of it. By 2013, the story had shifted dramatically and customers were buying big packages. This year it's taken off.

"People are moving into substantial production," Olson said. "It's not a science experiment anymore." Cloudera's customers include eBay, Qualcomm and Orbitz.

We still don't know how Cloudera's or MapR's finances look, because neither has filed to go public. Hortonworks' numbers are out there for all to see, and there's plenty of room for skepticism.

Revenue more than doubled in the first nine months of the year to $33.4 million, but the company's net loss shot up to $86.7 million from $48.4 million a year earlier. Even for growth-hungry Wall Street, those numbers are challenging.

For the patient investor who wants an early taste of Hadoop, this is the story Hortonworks is selling: Those income statement numbers don't mean much. The reason is that for every big sale, Hortonworks has to invest in first teaching the customer about Hadoop, then helping with the deployment, then training users.

That all goes in the cost bucket, but the revenue gets recognized over an extended period of time because it's a subscription business. Furthermore, 43 percent of revenue in the first three quarters came from professional services, which include low-margin offerings like consulting and training, with the rest from highly profitable subscriptions.

Over time, as the ecosystem matures and the company can focus more on development, 80 percent of sales should come from subscriptions, Hortonworks says.

"Most of the heavy lifting is still done by us," Chief Financial Officer Scott Davidson said in the roadshow.

MapR co-founder and CEO John Schroeder says that while he's happy to see public market excitement for Hadoop, he's nervous about potentially being "guilty by association."

According to Schroeder, 90 percent of MapR's revenue is subscription licenses, and the business is much less capital intensive than Hortonworks, even as revenue is more than doubling annually. MapR, based in San Jose, California, said earlier this week that it now has more than 700 customers. Hortonworks has 233.

Next year is shaping up to be a land grab as the three contenders spend their new capital to expand aggressively with the aim of landing deep-pocketed clients. Jai Das, a managing director at Sapphire Ventures (formerly SAP Ventures), also expects the market to start consolidating.

As in the previous generation of database wars, when Oracle bought up its competition, Das says there will be some combinations among the existing players, and that larger tech companies could potentially buy their way into the space.

"If two or three go public, one will be left standing and the others get acquired over time," said Das, who's not an investor in any of them. "One of them has to win."