Not long ago, most investors had to wait until companies reported quarterly earnings to get a sense of how they were performing. Today, the ubiquity of data collection and the Internet of Things, coupled with advances in computing and machine learning, means early insights are that much closer and easier to access.
A hedge fund no longer has to send scouts to retail stores for a headcount of shoppers; instead, satellite photos of cars in mall parking lots are analyzed at scale. Bots comb through millions of emailed receipts to harvest insights on shopping trends.
For years, so-called alternative data was the purview of a small fringe of hardcore believers, not the mainstream of investing. The creative, sometimes quirky strategies took traditional investors aback and were often of interest more for idle chatter than actual portfolios. But alternative data has since matured. Investments by heavyweights like JPMorgan Chase and Goldman Sachs have lent it institutional backing. Bloomberg recently announced that it would sell alternative datasets through its terminal.
Nowhere was this more apparent than at the recent Quandl Data Conference in Manhattan. Now in its third year, the event brings together data providers and data seekers for talks on strategy, technical infrastructure and the legal issues that arise when alternative data meets compliance. Founded in Toronto in 2012, Quandl makes a business out of finding and vetting previously unknown datasets and ferreting out useful and actionable insights. Broadly defined, alternative data is anything outside of the traditional financial statements and fundamental insights used by investors.
Some practitioners argue that investors have always used "alternative" data for prediction: Babylonians supposedly used the water level of the Euphrates River to forecast which crops would do well in a given year. Contemporary alternative data practitioners use more than water levels. Sophisticated data gathering, cleaning and analysis have exploded in recent years, and there are over 400 alternative data providers, according to AlternativeData.org, a site run by YipitData, one of those providers.
Still, it's early days.
"We're on the second batter of the second inning," said Adena Friedman, president and chief executive of Nasdaq. Further evidence that alternative data is becoming not so alternative: Quandl itself was purchased by Nasdaq last December.
The flourishing of alternative data has coincided with two interrelated phenomena: the expansion of data collection and the arrival of technology that can process those massive amounts of data into something usable. The expanding Internet of Things has increased manyfold the amount of data we generate on a daily basis.
Every action — from parking at your local mall to an oil tanker chugging across the ocean — is recorded and harvested for economic trends and consumer anomalies. At the same time, increased computational power allows analysts and data scientists to gather insights from that trove of data at a pace previously unheard of. Machine intelligence allows human operators to find insights in unstructured data like text transcripts.
Lastly, talk of the appetite for alternative data has trickled down to the smallest company with a computer network. Firms no longer see their databases as mere customer rolls but as untapped veins of potential consumer insights, just waiting for discovery. Companies with successful business models serving ads to consumers can find additional revenue streams by selling so-called "exhaust data" to third parties.
"Every company is now a data company," said Abraham Thomas, chief data officer at Quandl. "There's no transaction that doesn't leave a trace."
Now that early mainstream adopters are dipping their toes in, there's greater emphasis on datasets with a proven track record, like credit card data and geolocation. Experts say the more idiosyncratic datasets that were the hallmark of alternative data are of less interest to hedge funds today. As more companies move to make their data offerings more robust, the infrastructure required to run massive data operations is becoming more important.
There are thorny legal questions that have yet to be answered.
For one thing, hedge funds need to be careful about what data they use to avoid running afoul of insider trading regulators. The other major legal issue centers on personal privacy. Each credit card swipe that goes into a dataset showing purchase activity is a potential area of concern for privacy advocates. Credit card terms of service generally include clauses allowing anonymized collection, but legislation could change that.
The U.S. has a lot of work to do in that regard, and has to catch up to Europe's data-protection laws, Friedman said at the Quandl conference. Legislation coming online in states like California is meant to make it easier for consumers to control the data being collected about them, and could be the future of data privacy law in the U.S.