How big data could help stop the Ebola outbreak

It was March when the World Health Organization first confirmed that Ebola had broken out in Guinea. Just six months later, it has become the largest outbreak of the disease ever recorded.

A picture taken on June 28, 2014 shows a member of Doctors Without Borders (MSF) putting on protective gear at the isolation ward of the Donka Hospital in Conakry, where people infected with the Ebola virus are being treated.
Cellou Binani | AFP | Getty Images

Ebola has so far infected more than 6,500 people and claimed the lives of more than 3,000 around the world, according to the latest numbers from the WHO and the U.S. Centers for Disease Control and Prevention. That puts the fatality rate around 47 percent. Cases in Liberia are doubling every 15 to 20 days; in other countries, such as Sierra Leone and Guinea, they are doubling every 30 to 40 days.
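
To make the arithmetic behind those figures concrete, here is a minimal Python sketch. It is an illustration only, not a forecast: the function names are hypothetical, the inputs are the rounded "more than" figures quoted above (which give roughly 46 percent rather than the 47 percent cited, presumably because the latter rests on exact counts), and the projection assumes the doubling time stays constant, which real outbreaks rarely oblige.

```python
import math


def case_fatality_ratio(deaths: float, cases: float) -> float:
    """Crude ratio of reported deaths to reported cases."""
    return deaths / cases


def daily_growth_rate(doubling_days: float) -> float:
    """Continuous daily growth rate implied by a given doubling time."""
    return math.log(2) / doubling_days


def projected_cases(current: float, doubling_days: float, days_ahead: float) -> float:
    """Cases after `days_ahead` days, ASSUMING the doubling time never changes."""
    return current * 2 ** (days_ahead / doubling_days)


# Rounded figures from the article; the real counts are slightly higher.
print(f"Crude fatality ratio: {case_fatality_ratio(3000, 6500):.1%}")

# Compare the fast and slow ends of the doubling times quoted above.
for doubling_days in (15, 20, 30, 40):
    rate = daily_growth_rate(doubling_days)
    print(f"Doubling every {doubling_days} days is about {rate:.1%} growth per day")

# Hypothetical starting point of 100 cases, 60 days ahead.
print(projected_cases(100, 15, 60))  # four doublings
print(projected_cases(100, 30, 60))  # two doublings
```

The gap between the last two lines is the point: halving the doubling time does not double the caseload after two months, it quadruples it.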

According to Peter Piot, Director of the London School of Hygiene and Tropical Medicine, the unprecedented nature of the outbreak is making it impossible to determine how and where it is going to spread next, a real problem for the humanitarian agencies concerned with ensuring care reaches those who need it most.

With official channels under strain — Liberia has just 200 recorded doctors to care for a population of 4 million — emerging technologies are becoming increasingly important in the fight against the disease and attempts to stop the epidemic in its tracks.

While big data analytics is often championed by the private sector, its potential use by humanitarian workers operating in disaster zones has long been overlooked.

The technology enables vast swathes of information to be aggregated and filtered from a diverse array of sources whilst removing irrelevant information along the way. Banks are using it to massively increase the accuracy of their fraud-detection measures, while pharmaceutical companies are using it to develop life-saving new drugs.

In disaster zones, real-time analytics that process huge amounts of data can help pinpoint previously unanticipated trends, reduce the spread of disease and, in doing so, limit the number of deaths. Used in this way, big-data technologies might one day become an essential tool in the relief worker's arsenal.

An essential part of this is being able to collate unstructured data as soon as it is produced, by any number of organizations across the globe, a process known as multi-center ingest. Combined with the vast quantities of public information already accessible via the internet, big data can help ensure those working in hazardous environments are able to stay on top of ever-changing situations.

For instance, one mobile carrier has provided researchers with access to anonymized data gleaned from cellphones in Senegal. This provides a window into regional population movements that could help predict the spread of Ebola.

With an incubation period of between two and 21 days, during which victims may not know they are infected, determining where people are going and where they have been is essential to containing Ebola. As such, the model created using the cellphone data, combined with the latest reports from the World Health Organization, is helping to offer clues about where to focus preventive measures and distribute health care.
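
As a rough illustration of how movement data and case reports might be combined, the Python sketch below ranks regions by a simple "importation risk" score: the number of travelers arriving from each origin, weighted by the share of that origin's population currently infected. This is not the researchers' actual model, which the article does not describe. The function name, the scoring rule, the region labels and every number are hypothetical.

```python
from collections import defaultdict


def importation_risk(flows, active_cases, population):
    """Rank destinations by expected infected arrivals per day.

    flows:        {(origin, destination): travelers per day}
    active_cases: {region: reported active cases}
    population:   {region: number of residents}

    Score for a destination = sum over origins of
    travelers * (active cases in origin / population of origin).
    This is an assumed, deliberately simple scoring rule.
    """
    risk = defaultdict(float)
    for (origin, destination), travelers in flows.items():
        if origin == destination:
            continue  # movement within a region imports nothing new
        prevalence = active_cases.get(origin, 0) / population[origin]
        risk[destination] += travelers * prevalence
    # Highest-risk destinations first.
    return sorted(risk.items(), key=lambda item: item[1], reverse=True)


# Entirely made-up example data for three regions.
flows = {("A", "B"): 100, ("A", "C"): 10, ("B", "C"): 50}
active_cases = {"A": 50, "B": 0}
population = {"A": 1000, "B": 1000, "C": 1000}

ranking = importation_risk(flows, active_cases, population)
for region, score in ranking:
    print(f"Region {region}: {score:.2f} expected infected arrivals per day")
```

In this toy example region B ranks first because it receives far more travelers from the one region with reported cases. A real model would also have to account for the incubation period, under-reporting and the biases of cellphone data.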

Using information gleaned from a wide range of sources, such as social media, hospital updates and flight records, authorities are able to develop unprecedented insights into where and how to respond. Not only could this help save lives, it could also ensure that resources are allocated where they are needed most.

In spite of its potential, big data is not without its critics: some dismiss it as the plaything of the marketer, a useful tool for clever advertising but little else. Indeed, many continue to lambast big data for its notable early failures in tracking the spread of disease.

An article in Science magazine earlier this year undermined the findings of Google Flu Trends, a project that began in 2009 and aimed to identify flu outbreaks from information gathered by looking at search queries alone.

Google had claimed to be able to track the spread of flu across America more quickly than the CDC and without the need for any medical results. But the research featured in Science found that the tech giant had overestimated the number of cases for four years running compared with slower-to-collate data from official channels.

In the case of the Ebola outbreak, similar misinterpretations could lead public health officials down the wrong path and only increase the spread of the disease. For instance, past observations pointed to Ebola occurring only in limited instances within rural communities, but the current outbreak has affected a wide range of populations, spreading as it has from villages to urban centers. While big data can help provide more detailed modeling, the models are just reference points: a human eye is still needed to synthesize and interpret the information accurately.

As countries affected by the epidemic struggle to contain the disease despite numerous border closures, evacuations and mass quarantines, their already fragile economies are being crippled. Some have suggested the economic impact of the virus could end up killing as many people as Ebola itself.

And while the ambitions of Google's skunkworks project are laudable, it's the world's governments that are best placed to use data to provide aid workers with the means to track disease.

Those working to fight the epidemic should use every tool at their disposal to help halt its spread. Used in the right way, big data can help authorities intervene much more effectively.

But we need to take a forward-looking approach that captures and utilizes as much data as possible, interpreting it smartly to better understand the lifestyle, environment and behavior of those affected. The CDC predicts up to 1.4 million people could be affected by January next year. Let us hope that using big data as one of the tools to fight this epidemic means it won't come to that.

Commentary by David Richards, the co-founder and CEO of big data firm WANdisco. Follow him on Twitter @DavidRichards.