Research has gone digital, and medical science is no exception. As the novel coronavirus continues to spread, for instance, scientists searching for a treatment have drafted IBM's Summit supercomputer, the world's most powerful high-performance computing facility, according to the Top500 list, to help find promising candidate drugs.
One way of treating an infection could be with a compound that sticks to a certain part of the virus, disarming it. With tens of thousands of processors spanning an area as large as two tennis courts, the Summit facility at Oak Ridge National Laboratory (ORNL) has more computational power than 1 million top-of-the-line laptops. Using that muscle, researchers digitally simulated how 8,000 different molecules would interact with the virus — a Herculean task for your typical personal computer.
"It took us a day or two, whereas it has traditionally taken months on a normal computer," said Jeremy Smith, director of the University of Tennessee/ORNL Center for Molecular Biophysics and principal researcher in the study.
Simulations alone can't prove a treatment will work, but the project was able to identify 77 candidate molecules that other researchers can now test in trials. The fight against the novel coronavirus is just one example of how supercomputers have become an essential part of the process of discovery. The $200 million Summit and similar machines also simulate the birth of the universe, explosions from atomic weapons and a host of events too complicated — or too violent — to recreate in a lab.
The current generation's formidable power is just a taste of what's to come. In 2021, Aurora, a $500 million Intel machine currently under installation at Argonne National Laboratory, will herald the long-awaited arrival of "exaflop" facilities capable of a billion billion calculations per second (five times more than Summit), with others to follow. China, Japan and the European Union are all expected to switch on similar "exascale" systems in the next five years.
These new machines will enable new discoveries, but only for the select few researchers with the programming know-how required to efficiently marshal their considerable resources. What's more, technological hurdles lead some experts to believe that exascale computing might be the end of the line. For these reasons, scientists are increasingly attempting to harness artificial intelligence to accomplish more research with less computational power.
"We as an industry have become too captive to building systems that execute the benchmark well without necessarily paying attention to how systems are used," says Dave Turek, vice president of technical computing for IBM Cognitive Systems. He likens high-performance computing record-seeking to focusing on building the world's fastest race car instead of highway-ready minivans. "The ability to inform the classic ways of doing HPC with AI becomes really the innovation wave that's coursing through HPC today."
Just getting to the verge of exascale computing has taken a decade of research and collaboration between the Department of Energy and private vendors. "It's been a journey," says Patricia Damkroger, general manager of Intel's high-performance computing division. "Ten years ago, they said it couldn't be done."
While each system has its own architecture, Summit, Aurora, and the upcoming Frontier supercomputer all represent variations on a theme: they harness the immense power of graphics processing units (GPUs) alongside traditional central processing units (CPUs). A GPU can carry out more simultaneous operations than a CPU can, so leaning on these workhorses has let Intel and IBM design machines that would have otherwise required untold megawatts of energy.
That computational power lets Summit, which is known as a "pre-exascale" computer because it runs at 0.2 exaflops, simulate a single supernova explosion in about two months, according to Bronson Messer, the acting director of science for the Oak Ridge Leadership Computing Facility. He hopes that machines like Aurora (1 exaflop) and the upcoming Frontier supercomputer (1.5 exaflops) will get that time down to about a week. Damkroger looks forward to medical applications. Where current supercomputers can digitally model a single heart, for instance, exascale machines will be able to simulate how the heart works together with blood vessels, she predicts.
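As a rough sanity check on those figures, the sketch below scales Summit's supernova run by raw speed alone. It assumes "about two months" means 60 days and that runtime shrinks in exact proportion to peak speed, which real codes rarely achieve. The constant and function names are illustrative, not taken from any facility's software.

```python
# Back-of-the-envelope scaling, assuming runtime is inversely proportional
# to peak speed (an idealization; real simulations rarely scale this cleanly).

SUMMIT_EXAFLOPS = 0.2   # from the article
SUMMIT_RUN_DAYS = 60.0  # assumption: "about two months" taken as 60 days


def ideal_runtime_days(machine_exaflops: float) -> float:
    """Days for the same supernova run on a faster machine, under perfect scaling."""
    return SUMMIT_RUN_DAYS * SUMMIT_EXAFLOPS / machine_exaflops


aurora_days = ideal_runtime_days(1.0)    # Aurora: 1 exaflop
frontier_days = ideal_runtime_days(1.5)  # Frontier: 1.5 exaflops

print(f"Aurora:   about {aurora_days:.0f} days")
print(f"Frontier: about {frontier_days:.0f} days")
```

Under these assumptions the run drops to roughly 12 days on Aurora and 8 on Frontier, in line with Messer's hope of about a week.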
But even as exascale developers take a victory lap, they know that two challenges mean the add-more-GPUs formula is likely approaching a plateau in its scientific usefulness. First, GPUs are strong but dumb: they are best suited to simple operations such as arithmetic and geometric calculations that they can crowdsource among their many components. Researchers have written simulations to run on flexible CPUs for decades, and shifting to GPUs often requires starting from scratch.
"The real issue that we're wrestling with at this point is how do we move our code over" from running on CPUs to running on GPUs, says Richard Loft, a computational scientist at the National Center for Atmospheric Research, home of Cheyenne, a CPU-based machine ranked 44th on the Top500 list. "It's labor intensive, and they're difficult to program."
Second, the more processors a machine has, the harder it is to coordinate the sharing of calculations. For the climate modeling that Loft does, machines with more processors better answer questions like "what is the chance of a once-in-a-millennium deluge," because they can run more identical simulations simultaneously and build up more robust statistics. But they don't ultimately enable the climate models themselves to get much more sophisticated.
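The statistical payoff of running many identical simulations can be sketched with a toy ensemble. The code below assumes each run is an independent simulated year with a 1-in-1,000 chance of the deluge. The function names and that probability are illustrative stand-ins, not part of any real climate model.

```python
import math
import random


def estimate_chance(n_runs: int, true_chance: float = 0.001, seed: int = 0) -> float:
    """Fraction of toy simulated years in which the rare deluge occurs."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_runs) if rng.random() < true_chance)
    return hits / n_runs


def relative_error(n_runs: int, chance: float = 0.001) -> float:
    """Standard error of the estimated chance, as a fraction of the chance itself."""
    return math.sqrt((1.0 - chance) / (chance * n_runs))


for n_runs in (1_000, 100_000):
    print(n_runs, estimate_chance(n_runs), f"{relative_error(n_runs):.0%}")
```

With 1,000 runs the expected uncertainty is about as large as the chance being measured. With 100,000 runs it falls to roughly 10 percent. Each hundredfold increase in runs cuts the error tenfold, which is why more processors sharpen the statistics without making any single model more detailed.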
For that, the actual processors have to get faster, a feat that bumps up against what's physically possible. Faster processors need smaller transistors, and current transistors measure about 7 nanometers. Companies might be able to shrink that size, Turek says, but only to a point. "You can't get to zero [nanometers]," he says. "You have to invoke other kinds of approaches."
If supercomputers can't get much more powerful, researchers will have to get smarter about how they use the facilities. Traditional computing is often an exercise in brute-forcing a problem, and machine learning techniques may allow researchers to approach complex calculations with more finesse.
Take drug design. A pharmacist considering a dozen ingredients faces countless possible recipes, each with varying amounts of every compound, which together could take a supercomputer years to simulate. An emerging machine learning technique known as Bayesian optimization asks: does the computer really need to check every single option? Rather than systematically sweeping the field, the method helps isolate the most promising drugs by implementing common-sense assumptions. Once it finds one reasonably effective solution, for instance, it might prioritize seeking small improvements with minor tweaks.
In trial-and-error fields like materials science and cosmetics, Turek says that this strategy can reduce the number of simulations needed by 70% to 90%. Recently, for instance, the technique has led to breakthroughs in battery design and the discovery of a new antibiotic.
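One common way to implement this idea, sketched here as an assumption rather than as the method any of the researchers quoted actually use, pairs a Gaussian-process model of the results so far with an "upper confidence bound" rule for picking the next trial. The simulate function is a made-up stand-in for an expensive simulation, and the kernel width, kappa and budget are illustrative values.

```python
import math


def rbf(a, b, length=0.15):
    """Squared-exponential kernel: nearby doses are assumed to score similarly."""
    return math.exp(-((a - b) ** 2) / (2 * length ** 2))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def posterior(xs, ys, x_new, noise=1e-6):
    """Gaussian-process mean and standard deviation at x_new, given past trials."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k = [rbf(a, x_new) for a in xs]
    mean = sum(ki * wi for ki, wi in zip(k, solve(K, ys)))
    var = rbf(x_new, x_new) - sum(ki * vi for ki, vi in zip(k, solve(K, k)))
    return mean, math.sqrt(max(var, 0.0))


def simulate(dose):
    """Hypothetical expensive simulation; its best score sits at dose = 0.62."""
    return math.exp(-((dose - 0.62) ** 2) / 0.02)


def ucb(x, tried, scores, kappa):
    """Upper confidence bound: predicted score plus a bonus for uncertainty."""
    mean, std = posterior(tried, scores, x)
    return mean + kappa * std


def bayes_opt(candidates, budget, kappa=2.0):
    """Run at most `budget` simulations, choosing each one by the UCB rule."""
    tried = [candidates[0], candidates[-1]]  # start from the two extremes
    scores = [simulate(x) for x in tried]
    while len(tried) < budget:
        untried = [x for x in candidates if x not in tried]
        nxt = max(untried, key=lambda x: ucb(x, tried, scores, kappa))
        tried.append(nxt)
        scores.append(simulate(nxt))
    best = max(range(len(tried)), key=lambda i: scores[i])
    return tried[best], scores[best], len(tried)


candidates = [i / 50 for i in range(51)]  # 51 possible doses between 0 and 1
best_dose, best_score, n_runs = bayes_opt(candidates, budget=12)
print(best_dose, best_score, n_runs)
```

The search runs only 12 simulations out of 51 possible doses. How close it lands to the true best depends on the assumed kernel width and on kappa, which trades exploring untested doses against refining the best one found so far.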
Fields like climate science and particle physics use brute-force computation in a different way, by starting with simple mathematical laws of nature and calculating the behavior of complex systems. Climate models, for instance, try to predict how air currents conspire with forests, cities, and oceans to determine global temperature.
Mike Pritchard, a climatologist at the University of California, Irvine, hopes to figure out how clouds fit into this picture, but most current climate models are blind to features smaller than a few dozen miles wide. Crunching the numbers for a worldwide layer of clouds, which might be just a couple hundred feet tall, simply requires more mathematical brawn than any supercomputer can deliver.
Unless the computer understands how clouds interact better than we do, that is. Pritchard is one of many climatologists experimenting with training neural networks—a machine learning technique that looks for patterns by trial and error—to mimic cloud behavior. This approach takes a lot of computing power up front to generate realistic clouds for the neural network to imitate. But once the network has learned how to produce plausible cloudlike behavior, it can replace the computationally intensive laws of nature in the global model, at least in theory. "It's a very exciting time," Pritchard says. "It could be totally revolutionary, if it's credible."
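The train-then-replace idea can be illustrated with a toy example. The sketch below is not Pritchard's model: expensive_cloud_model is a hypothetical stand-in for a costly cloud simulation, and the tiny one-hidden-layer network, its size and its learning rate are all assumptions chosen to keep the example short.

```python
import math
import random


def expensive_cloud_model(humidity):
    """Hypothetical stand-in for a costly high-resolution cloud simulation."""
    return 1.0 / (1.0 + math.exp(-10.0 * (humidity - 0.6)))


def init_network(hidden, rng):
    """One hidden layer of tanh units with small random starting weights."""
    return {
        "w1": [rng.uniform(-1, 1) for _ in range(hidden)],
        "b1": [rng.uniform(-1, 1) for _ in range(hidden)],
        "w2": [rng.uniform(-1, 1) for _ in range(hidden)],
        "b2": 0.0,
    }


def predict(net, x):
    """Return the network's output and its hidden-layer activations."""
    hidden = [math.tanh(w * x + b) for w, b in zip(net["w1"], net["b1"])]
    return sum(w * h for w, h in zip(net["w2"], hidden)) + net["b2"], hidden


def mean_squared_error(net, data):
    return sum((predict(net, x)[0] - y) ** 2 for x, y in data) / len(data)


def train(net, data, epochs=2000, lr=0.02):
    """Full-batch gradient descent on the mean squared error."""
    n, size = len(data), len(net["w1"])
    for _ in range(epochs):
        g_w1, g_b1, g_w2, g_b2 = [0.0] * size, [0.0] * size, [0.0] * size, 0.0
        for x, y in data:
            out, hidden = predict(net, x)
            d_out = 2.0 * (out - y) / n
            g_b2 += d_out
            for j in range(size):
                g_w2[j] += d_out * hidden[j]
                d_hidden = d_out * net["w2"][j] * (1.0 - hidden[j] ** 2)
                g_w1[j] += d_hidden * x
                g_b1[j] += d_hidden
        for j in range(size):
            net["w1"][j] -= lr * g_w1[j]
            net["b1"][j] -= lr * g_b1[j]
            net["w2"][j] -= lr * g_w2[j]
        net["b2"] -= lr * g_b2
    return net


# Up-front cost: run the "expensive" model to generate training examples.
data = [(i / 19, expensive_cloud_model(i / 19)) for i in range(20)]

net = init_network(hidden=4, rng=random.Random(0))
loss_before = mean_squared_error(net, data)
train(net, data)
loss_after = mean_squared_error(net, data)
print(loss_before, loss_after)
```

After training, calling predict(net, humidity) stands in for the expensive function. Whether such an imitation is trustworthy outside the conditions it was trained on is exactly the credibility question Pritchard raises.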
Companies are preparing their machines so researchers like Pritchard can take full advantage of the computational tools they're developing. Turek says IBM is focusing on designing AI-ready machines capable of extreme multitasking and quickly shuttling around huge quantities of information. The Department of Energy contract for Aurora, meanwhile, is Intel's first that specifies a benchmark for certain AI applications, according to Damkroger. Intel is also developing an open-source software toolkit called oneAPI that will make it easier for developers to create programs that run efficiently on a variety of processors, including CPUs and GPUs.
As exascale and machine learning tools become increasingly available, scientists hope they'll be able to move past the computer engineering and focus on making new discoveries. "When we get to exascale that's only going to be half the story," Messer says. "What we actually accomplish at the exascale will be what matters."