- Databricks, whose software helps companies process large data sets for analysis and use in applications, has over $500 million in cash.
- The start-up made CNBC's 2020 Disruptor 50 list after reaching $200 million in annualized recurring revenue last year.
- During the coronavirus pandemic, a South Carolina health system used Databricks to predict which patients to prioritize.
Several times over the last three years, the leaders of data software start-up Databricks performed exercises to imagine the implications of an economic collapse.
"The salaries people make, the growth expectations, the career aspirations — it just seemed like it was overheated," Databricks CEO Ali Ghodsi said. The company ranks No. 36 on CNBC's 2020 Disruptor 50 list of innovative companies, announced Tuesday.
The exercises helped convince Ghodsi to raise venture financing in each of the last two years, valuing Databricks most recently at $6.2 billion. In the middle of 2019, the company pulled out of a planned $120 million deal for premium office space in its hometown of San Francisco. It also held off on real estate enhancements in the U.K.
Now, during a recession caused by a global pandemic, Databricks hasn't had to shed any of its more than 1,300 employees, while technology peers like Airbnb, Uber, Glassdoor and many others have downsized. Databricks' actions have put it in a strong position, as customers have turned to its software to strengthen their operations with artificial intelligence and to process massive amounts of data that traditional databases weren't built to handle.
"If this crisis lasts five or six years, we could just go on and continue hiring and be unaffected by this," said Ghodsi, adding that the company has over $500 million in the bank after raising nearly $900 million.
In the third quarter of 2019, revenue topped $200 million on an annualized basis, compared with over $100 million in 2018. Ghodsi declined to provide updated figures or say if the company is profitable. Databricks wants to be ready to go public in 2021, and Ghodsi said there's plenty of investor demand.
Companies depend on Databricks to store their data and clean it up so employees can analyze it and wrap it into applications.
The technology draws on Spark, a piece of open-source software for processing data that originated in 2009 at the University of California, Berkeley. It was developed as a "toy project" to show the power of Mesos, software designed to operate clusters of computing workloads across many servers, said Alexy Khrabrov, a general manager at the Linux Foundation who co-founded the first Spark meetup group in 2012.
As Spark was emerging, software engineers were looking for new ways to store large quantities of data instead of relying on traditional databases. Spark's breakthrough was its ability to hold data in computer memory to speed up processing tasks. To commercialize the technology, Ghodsi co-founded Databricks in 2013, along with Spark creator Matei Zaharia and five others. The next year, the company boasted that, using Spark, it had sorted 100 terabytes of data in a record 23 minutes.
Ghodsi said more people have been using Databricks software since the Covid-19 outbreak.
Data scientists have become more dependent on the product at the Medical University of South Carolina, said Matt Turner, the school and health system's chief data officer. After the group established an online screening service for South Carolina residents with coronavirus symptoms, data scientists there trained an AI model within two weeks to predict positive cases. The health system prioritized contacting the highest-risk patients and getting them tested.
"I know there's multiple medical centers right now that are trying to implement what we've done," Turner said.
For Databricks, remote work hasn't been a problem. Salespeople are pitching the product over Zoom and have generated a long list of leads from the 50,000 people who registered for the company's annual conference, scheduled for later this month. Demand is so strong that Databricks continues to hire through the downturn, including at a new engineering office in Toronto, the company said.
The use cases have become more sophisticated over the years. Nationwide uses the software to improve insurance pricing with AI and lots of data on claims, and Regeneron uses it to help engineers more quickly prepare data for analysis, which then speeds up drug discovery.
Databricks' software is available only as a cloud-based service inside Amazon Web Services and Microsoft Azure. That leaves out Alibaba, Google and others. The company will bring its software to other clouds when it sees major customer demand, Ghodsi said.
In 2017, Microsoft introduced Azure Databricks, and Microsoft sales reps promote the service to customers. The deal, which preceded the announcement of Microsoft's investment in Databricks by more than a year, has made a significant contribution to revenue, Ghodsi said.
Rishi Jaluria, an analyst at DA Davidson, puts it in starker terms: "I would say Databricks may not be here without the Microsoft partnership."