Picture this: Your doctor prescribes a new medication, but once you start taking it, you begin to feel a little off. While the smart thing to do is call the doctor or pharmacist, the more common action today is to hop on the Web and see if you can figure out what's going on.
As it turns out, that self-diagnosing and hypochondriac-like behavior could help save people's lives.
Researchers at Microsoft Research Labs, in conjunction with Stanford University, have found that Web searches can help the FDA and pharmaceutical companies discover previously unknown dangerous drug interactions. And the FDA is welcoming the help.
"We are currently monitoring research in this area and are also engaged in an exploratory pilot project to assess the value of user-generated digital data for post-market safety assessment of certain FDA regulated products including drugs, devices, biologics, vaccines and dietary supplements," said FDA spokeswoman Andrea Fischer. "We are excited about the possibilities of using this type of data to improve or speed detection of significant issues."
In 2011, Dr. Russ Altman, chairman of the Stanford bioengineering department, led a project that found patients taking paroxetine (an antidepressant, better known by its brand name Paxil) and pravastatin (a cholesterol-lowering drug marketed as Pravachol) could develop hyperglycemia. Later, as he shared a milkshake and discussed the findings with his former classmate Dr. Eric Horvitz, a distinguished scientist and managing co-director at Microsoft Research, the two theorized that the reaction could have been spotted much earlier.
"We thought 'We wonder if you can go to the Web?'" said Horvitz. "With a certain methodology, we can see who is searching for side effects like fatigue, slow healing of sores and blurry vision. ... We did an analysis that would show if you're interested in both drugs, you 're more likely to search with a curiosity about symptoms of hyperglycemia than if you're taking just one of those drugs."
"We can take signals from traditional FDA databases and add them to the signals coming from the Wild West of the Web and show that together you can do better."
The new prescription
Looking at anonymized and consensual search data from Bing, Google and Yahoo that was gathered a year before FDA findings, they learned their theory was correct. And now the FDA is taking a greater interest in what's being called Web-scale pharmacovigilance—a drug safety study aimed at finding previously unknown adverse effects.
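The comparison Horvitz describes can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' actual method: the UserLog record, the simple substring matching and the toy logs are all assumptions made up for the example, and the symptom terms are just the three he mentions.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Symptom phrases taken from Horvitz's quote; a real study would use a far
# larger, clinically reviewed vocabulary (assumption).
SYMPTOM_TERMS = {"fatigue", "slow healing of sores", "blurry vision"}


@dataclass
class UserLog:
    """Hypothetical anonymized search history for one user."""
    searched_paroxetine: bool
    searched_pravastatin: bool
    queries: List[str]


def searched_symptom(log: UserLog) -> bool:
    """True if any query mentions one of the symptom phrases."""
    return any(term in q.lower() for q in log.queries for term in SYMPTOM_TERMS)


def symptom_rate(logs: List[UserLog]) -> float:
    """Fraction of users in the group who searched for a symptom."""
    if not logs:
        return 0.0
    return sum(searched_symptom(log) for log in logs) / len(logs)


def compare_groups(logs: List[UserLog]) -> Tuple[float, float]:
    """Return (rate among both-drug searchers, rate among one-drug searchers)."""
    both = [l for l in logs if l.searched_paroxetine and l.searched_pravastatin]
    one = [l for l in logs if l.searched_paroxetine != l.searched_pravastatin]
    return symptom_rate(both), symptom_rate(one)


# Invented toy data, for illustration only; not real search logs or results.
TOY_LOGS = [
    UserLog(True, True, ["paxil dosage", "blurry vision causes"]),
    UserLog(True, True, ["pravachol side effects"]),
    UserLog(True, False, ["paxil and fatigue"]),
    UserLog(False, True, ["pravachol price"]),
    UserLog(True, False, ["paxil reviews"]),
    UserLog(False, True, ["statin diet"]),
]
```

Calling compare_groups(TOY_LOGS) returns the two rates side by side. A higher rate in the both-drug group would be the kind of signal the researchers were looking for, though a real analysis would also need to test whether the gap could be due to chance.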
"This data has proven useful for similar health-related purposes such as disease detection and outbreak surveillance, so we are cautiously optimistic there will be value in the post-market safety setting, while also preparing for unique challenges involved in analyzing these data," Fischer said. "It is important to recognize that potential challenges include determining the sensitivity of such approaches (e.g. actual ability to detect a problem) and the specificity of the results, that is whether the signal detected is truly due to the product in question or they are associated for other, unknown reasons."
While the government organization has not established a formal relationship with Microsoft with regard to the study, the two met in September to discuss the findings.
"There has been a lot of interest at the FDA in combining these methods with traditional sources like these slower-paced reports that health-care professionals fill out when patients come in and complain," Horvitz said. "We can take signals from traditional FDA databases and add them to the signals coming from the Wild West of the Web and show that together you can do better. They complement each other."
This is hardly the first time Web searches have been linked to improved health. Several groups have built online flu trackers, using big data from the Web to chart the spread of influenza around the country each winter.
The biggest of these is Google Flu Trends, which relies on Web search queries for its data. The problem with that method is that when the media hypes a flu outbreak, as it did last winter, the number of searches rises artificially, which can skew the data.
Another nonprofit service, named Flu Near You, tracks the flu through user-submitted symptoms (and recently launched an app to expand its database), while Sickweather scans social media sites for public, geo-tagged postings that indicate illness.
Tweet what ails you
Social media may seem an unusual place to track disease on a national level, but according to a 2011 report from Johns Hopkins University, it's surprisingly accurate. Flu tracking on Twitter was in line with the CDC's findings, but could be obtained as much as two weeks earlier.
"We have investigated a variety of public health data that can be automatically extracted from Twitter," the report read. "These results suggest that public health officials and researchers can both replace current more expensive and time consuming methods and expand the range of what can be easily measured, including both new sources and types of information."
Sharing health data does raise some privacy issues, though, and some people will refuse to open up their medical records, even anonymously. Altman said he understands this hesitancy, but added that research will slow without access to broad data sets.
"People are understandably concerned about privacy," he said at the "Atlantic Meets the Pacific" conference earlier this year. "[However,] there are these information altruists who are willing to share their data and I think we should take advantage of that. .... If people close down that data source, it would come at a great price to society in our ability to make discoveries."
Horvitz said early successes in mining big data for medical advances, like flu tracking and pharmacovigilance, are encouraging, but he suspects there's a lot more that can be done.
"We've just barely scratched the surface in harnessing the potential of big data," he said. "I think we're going to see the rise of using data in medical records. ... We think we can look at large-scale patterns and understand things."
So far, Microsoft Research Labs has a track record to back that up. In April, the group found a way to predict the risk of future cholera epidemics by combing through 70 years of New York Times headlines, Web data about countries and information about per capita income and water supplies.
"This analysis told us that if there is an extreme drought followed by a flood, it predicts a headline about cholera breaking out," Horvitz said.
The group has also used data drawn from electronic health records to develop systems that predict the likelihood of hospital readmissions as well as hospital-acquired infections. (Several medical facilities now use those models.)
While the team continues to work on tracking unknown negative drug interactions, other efforts are underway, including several new projects leveraging data from the Web. Horvitz is eager to see what's next.
"I've been pretty impressed with the Web's large-scale ability to give us interesting signals," he said. "And I'm very excited about methods that might 'regularize' it into a reliable sensor."
—By Chris Morris, Special to CNBC.com.