A Deeper Look at Deep Potential Molecular Dynamics
The Summit Supercomputer at Oak Ridge National Laboratory is one of the world’s fastest, capable of reaching 200 petaflops, or 200,000 trillion calculations per second. In 2020, it was the perfect vehicle to take a machine learning-based protocol created at Princeton for a test drive, to open it up and see what it could do.
Suffice it to say that the method, developed by Roberto Car, the Ralph W. *31 Dornte Professor of Chemistry, and his team at Chemistry in Solution and at Interfaces (CSI), was well matched to the track. Called Deep Potential Molecular Dynamics (DPMD), it achieved record molecular dynamics simulations on Summit, the first to reliably model from quantum mechanics the movement of 100 million atoms over a few nanoseconds.
Molecular dynamics combined with the accuracy of quantum mechanics is an invaluable tool for scientific discovery, providing scientists with a greater understanding of how matter behaves on the atomic scale. DPMD can predict that behavior at extreme conditions and at unprecedented size and timescales that are otherwise difficult to access experimentally.
This achievement opens new avenues for advancements in chemistry, physics, the geo- and planetary sciences, and drug discovery.
Other machine learning-based protocols have been introduced for molecular dynamics simulations over the years. DPMD is more effective because it does not require ad hoc adjustments, and it minimizes the cost of learning by doing it on the fly.
For this work, Car and his team won the 2020 Gordon Bell Prize for achievement in supercomputing.
This week Car also received, in a virtual ceremony, the 2021 Benjamin Franklin Medal in Chemistry with longtime collaborator Michele Parrinello of ETH Zurich.
These are the latest in a long series of awards that puts the exclamation point on Car’s accomplishments as a physical chemist. He has consistently delivered novel tools that allow the rest of us to exploit the extraordinary gains in computing over the past few decades.
Explaining DPMD, Car said: “One of the aims of computational physics and chemistry for a long time has been to predict – from computer simulations of the movement of atoms in a given system – their properties: what kind of stable structure they have, what kind of crystalline structures exist, at what temperature and pressure crystals melt, how certain environments facilitate chemical reactions. That has long been a dream.
“The difficulty is that, as famously remarked by one of the fathers of quantum mechanics, P. A. M. Dirac, the equations to do this are far too complicated,” added Car, also an associated faculty member with the Department of Physics. “Therefore, the problem is to find approximations that are good enough to solve these equations. That has been a constant struggle for many years.”
Enter DPMD.
GRAD STUDENT LEADS THE WAY
The foundational methodology for DPMD was developed by Linfeng Zhang, a graduate student advised by both Car and Weinan E, a professor in the Department of Mathematics. Zhang’s protocol, developed with additional collaborators including Jiequn Han, a student of E’s, draws together two technological advances: machine learning and advanced algorithms.
“Roberto is application-driven, I am algorithm-driven. The combination, plus the talent and drive of Linfeng and other students involved, made DPMD successful,” said E, who has been collaborating with Car since they both arrived on campus in 1999. “Roberto is a pioneer in ab initio molecular dynamics. I would say that DPMD made ab initio molecular dynamics a powerful, practical tool.”
The project had its origins in discussions between Car, Zhang, and E in 2016 about the opportunities artificial intelligence brings to computational science and, simultaneously, the limitations that prevent scientists from taking full advantage of them.
“At the time, we knew nothing about the solution to the challenges, but our discussions kept going,” said Zhang, who earned his Ph.D. in 2020 and is now a research scientist at the Beijing Institute of Big Data Research. “I started my journey in both worlds of molecular modeling and machine learning, without any expectation that one day I would come up with a solution or a new algorithm that integrates the two fields.
“My daily activity on this project was to learn, explain, and discuss the same concept using different languages,” he added. “I could think about this problem by considering how I translate the successes of machine learning in image recognition and natural language processing, etc., to the field of molecular modeling.”
STARTING WITH AB INITIO
DPMD proceeds from a simulation tool pioneered back in 1985 by Car and Parrinello called ab initio molecular dynamics, an efficient computational method that simultaneously traces the evolution of the atoms and of the electrons responsible for chemical bonding, using suitable equations of motion adjusted on the fly, or step by step. The ab initio method has been called a “virtual telescope” for looking at atomic movement.
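To make "step by step" concrete, here is a minimal sketch of how a molecular dynamics code advances atoms in time. It is a toy, not the Car-Parrinello method: it moves a single particle in a harmonic potential using the standard velocity Verlet scheme and ignores the electrons entirely. The potential, the parameter values, and all function names are assumptions chosen for illustration.

```python
import math

def harmonic_force(x, k=1.0):
    """Force from a toy potential E(x) = 0.5 * k * x**2 (an illustrative assumption)."""
    return -k * x

def velocity_verlet(x, v, mass, dt, n_steps, force=harmonic_force):
    """Advance one particle step by step and return the list of (x, v) states."""
    trajectory = [(x, v)]
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass  # half-step velocity update
        x = x + dt * v_half               # full-step position update
        f = force(x)                      # recompute the force at the new position
        v = v_half + 0.5 * dt * f / mass  # complete the velocity update
        trajectory.append((x, v))
    return trajectory

def total_energy(x, v, mass=1.0, k=1.0):
    """Kinetic plus potential energy of the toy oscillator."""
    return 0.5 * mass * v * v + 0.5 * k * x * x

dt, n_steps = 0.01, 1000
traj = velocity_verlet(x=1.0, v=0.0, mass=1.0, dt=dt, n_steps=n_steps)
e_start = total_energy(*traj[0])
e_end = total_energy(*traj[-1])
x_exact = math.cos(n_steps * dt)  # analytic solution for these initial conditions
print(f"energy drift: {abs(e_end - e_start):.2e}")
print(f"position error vs. exact: {abs(traj[-1][0] - x_exact):.2e}")
```

In a real simulation the force comes from the electronic structure of the whole system rather than from a one-line formula, which is where most of the cost lies.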
What Zhang did was couple ab initio molecular dynamics with machine learning to drive the efficiency of simulations to even higher levels. Where chemists and physicists could previously track a few hundred atoms at a time, or a few thousand with a really powerful computer, DPMD routinely extends simulations to million-atom systems over tens of nanoseconds.
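The general idea of learning a cheap model from expensive reference calculations can be sketched in a few lines. This is not the DPMD algorithm, which trains deep neural networks on quantum mechanical data. Here a Lennard-Jones formula stands in for the expensive calculation, the two descriptors are hand-picked so that the fit is exact, and the model is fitted by ordinary least squares. All names and values are hypothetical.

```python
def reference_energy(r):
    """Stand-in for an expensive quantum calculation (assumption: Lennard-Jones)."""
    sr6 = (1.0 / r) ** 6
    return 4.0 * (sr6 * sr6 - sr6)

def features(r):
    """Hypothetical hand-picked descriptors of a two-atom configuration."""
    return (r ** -12, r ** -6)

def fit_linear_model(samples):
    """Least-squares fit of E ~ a*f1 + b*f2 by solving the 2x2 normal equations."""
    s11 = s12 = s22 = t1 = t2 = 0.0
    for r, energy in samples:
        f1, f2 = features(r)
        s11 += f1 * f1
        s12 += f1 * f2
        s22 += f2 * f2
        t1 += f1 * energy
        t2 += f2 * energy
    det = s11 * s22 - s12 * s12
    a = (t1 * s22 - t2 * s12) / det
    b = (s11 * t2 - s12 * t1) / det
    return a, b

def surrogate_energy(r, a, b):
    """Cheap prediction from the fitted model."""
    f1, f2 = features(r)
    return a * f1 + b * f2

# "Training set": reference energies at 22 separations from 0.95 to 2.0.
training = [(0.95 + 0.05 * k, reference_energy(0.95 + 0.05 * k)) for k in range(22)]
a, b = fit_linear_model(training)

# Evaluate at a separation that was not in the training set.
r_new = 1.53
print(surrogate_energy(r_new, a, b), reference_energy(r_new))
```

Once fitted, the surrogate is evaluated without calling the reference calculation again, which is the source of the speedup in this kind of approach.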
The achievements of DPMD will allow scientists to explore a conspicuously wider range of systems, such as those involved in complex chemical reactions, phase transitions, drug discovery, battery materials, and materials design.
“We have used DPMD to describe the potential energy and other properties of a system of atoms that depend on the coordinates of all the atoms,” said Car. “The potential energy is a function that lives in a high-dimensional space. If you have a million atoms in a system, that is really a huge space. And there are limitations. But still, we can describe it.”
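Car's description of the potential energy as a function of all atomic coordinates can be illustrated with a small sketch. The energy below is a toy Lennard-Jones sum over pairs, an assumption standing in for the neural network model that DPMD actually uses. The point is only that the input is one point in a space with three dimensions per atom, and that the force on each atom is the negative gradient of the energy with respect to that atom's coordinates. Function names are hypothetical.

```python
import math

def potential_energy(coords, epsilon=1.0, sigma=1.0):
    """Toy Lennard-Jones energy; coords is a list of N (x, y, z) tuples."""
    energy = 0.0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            r = math.dist(coords[i], coords[j])
            sr6 = (sigma / r) ** 6
            energy += 4.0 * epsilon * (sr6 * sr6 - sr6)
    return energy

def displaced(coords, i, axis, delta):
    """Copy of coords with atom i shifted by delta along one axis."""
    atom = list(coords[i])
    atom[axis] += delta
    return coords[:i] + [tuple(atom)] + coords[i + 1:]

def forces(coords, h=1e-5):
    """Force on each atom: negative energy gradient, by central finite differences."""
    result = []
    for i in range(len(coords)):
        f_atom = []
        for axis in range(3):
            e_plus = potential_energy(displaced(coords, i, axis, +h))
            e_minus = potential_energy(displaced(coords, i, axis, -h))
            f_atom.append(-(e_plus - e_minus) / (2.0 * h))
        result.append(tuple(f_atom))
    return result

def degrees_of_freedom(n_atoms):
    """Dimension of the coordinate space the energy function lives in."""
    return 3 * n_atoms

trimer = [(0.0, 0.0, 0.0), (1.2, 0.0, 0.0), (0.3, 1.1, 0.2)]
print(potential_energy(trimer))
print(forces(trimer))
print(degrees_of_freedom(1_000_000))  # a million atoms means 3,000,000 coordinates
```

The double loop over pairs scales quadratically with the number of atoms, so this sketch is only practical for a handful of atoms.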
Car, Zhang, E, and collaborators have cultivated an open-source community on GitHub that offers free access to a software package called DeePMD-kit, written in Python/C++, and invites collective contributions. It is available at https://github.com/deepmodeling/deepmd-kit.
“All the techniques used for optimizing the code are on the open-source repo, which is transferable to applications running on smaller systems,” said Zhang. “As a matter of fact, the most important reward from the Gordon Bell experience was that we formed a great team.”
That team, he added, has already made the DPMD simulation more than 10 times faster than the original version run on the Summit Supercomputer for the Gordon Bell Prize.
“And,” added Zhang, “we will keep pushing the limit.”
The awardees of the 2020 Gordon Bell Prize include Roberto Car, the team at CSI, and collaborators at the University of California, Berkeley; the Institute of Applied Physics and Computational Mathematics (Beijing, China); Peking University; and Lawrence Berkeley National Laboratory.
The Chemistry in Solution and at Interfaces Computational Chemical Science Center (CSI), of which Roberto Car is director and PI, is supported by funding from the U.S. Department of Energy, Office of Science, Office of Basic Energy Sciences.