Recent advances in artificial intelligence–driven biomolecular dynamics simulations based on machine learning force fields

Molecular dynamics (MD) simulation has become an indispensable computational technique in life sciences research, enabling scientists to trace the time-dependent evolution of molecular behavior and probe the intricate mechanisms of biomolecules at atomic resolution [1,2]. By modeling the interactions and motions of biological systems, such simulations bridge the gap between theoretical predictions and experimental observations. In MD simulations, the atomic forces are calculated by the force field, and numerical integrations are performed to update atomic positions and velocities iteratively, thereby simulating dynamic processes like protein folding and ligand binding. Therefore, the fidelity of MD simulations critically depends on the force field, which defines the interatomic forces driving the system’s evolution and ultimately determines the physical realism of the observations [3].
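The force-then-integrate loop described above can be sketched in a few lines. This is a minimal illustration, not the scheme of any particular MD package: it assumes a single particle in one dimension, a toy harmonic "force field", and the widely used velocity Verlet integrator. The names `harmonic_force`, `velocity_verlet`, and `total_energy` are hypothetical and chosen only for this example.

```python
def harmonic_force(x, k=1.0):
    """Toy stand-in for a force field: 1D harmonic spring, F = -k * x."""
    return -k * x


def velocity_verlet(x, v, force, mass=1.0, dt=0.01, n_steps=1000):
    """Propagate one particle with velocity Verlet and return the final (x, v)."""
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # forces at the new position
        v = v_half + 0.5 * dt * f / mass   # second half-step velocity update
    return x, v


def total_energy(x, v, mass=1.0, k=1.0):
    """Kinetic plus potential energy of the toy harmonic system."""
    return 0.5 * mass * v * v + 0.5 * k * x * x


x0, v0 = 1.0, 0.0
x1, v1 = velocity_verlet(x0, v0, harmonic_force)
energy_drift = abs(total_energy(x1, v1) - total_energy(x0, v0))
```

In a real simulation the call to `force` is where the force field enters: replacing the toy spring with an MM, QM, or ML model changes the physics while the integration loop stays the same, which is why the accuracy of the force field dominates the realism of the trajectory.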

For decades, molecular mechanics (MM) force fields, as adopted by classical MD simulations, have dominated biomolecular simulation [4, 5, 6]. These empirical methods decompose the potential energy into bonded and nonbonded terms, trading physical detail for computational efficiency. However, their reliance on pairwise additive interactions and fixed atom types limits the accuracy of energy and force calculations, particularly in capturing polarization, charge transfer, and nonadditive quantum effects [3]. Quantum mechanical (QM) methods, such as density functional theory (DFT), offer ab initio accuracy, but their exorbitant computational cost restricts them to small molecular systems and short timescales, rendering them impractical for simulating biomolecular dynamics at large system sizes or over long timescales [7].
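For orientation, the bonded/nonbonded decomposition typically takes a form like the one below. This is a generic sketch of a fixed-charge, pairwise-additive functional form; the exact terms, prefactors, and symbols differ between force-field families, so the notation here should be read as an assumption rather than the definition used by any specific force field.

```latex
U(\mathbf{r}) =
  \underbrace{\sum_{\text{bonds}} k_b \,(r - r_0)^2
  + \sum_{\text{angles}} k_\theta \,(\theta - \theta_0)^2
  + \sum_{\text{dihedrals}} \frac{V_n}{2}\,\bigl[1 + \cos(n\phi - \gamma)\bigr]}_{\text{bonded}}
  + \underbrace{\sum_{i<j} \left\{ 4\varepsilon_{ij}\left[
      \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{12}
      - \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{6}\right]
  + \frac{q_i q_j}{4\pi\varepsilon_0\, r_{ij}} \right\}}_{\text{nonbonded}}
```

The nonbonded sum runs over atom pairs only, and the charges $q_i$ and Lennard-Jones parameters $\varepsilon_{ij}$, $\sigma_{ij}$ are fixed by atom type, which is where the limitations in describing polarization, charge transfer, and nonadditive effects originate.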

Recent advances in artificial intelligence (AI) are revolutionizing computational structural biology. Landmark achievements such as AlphaFold’s Nobel Prize-winning breakthrough in static protein structure prediction underscore AI’s transformative potential [8, 9, 10]. Yet static snapshots alone cannot fully unravel the dynamic behavior of biomolecules, such as allosteric regulation and enzymatic catalysis, which is critical for understanding biological mechanisms and designing therapeutics. Dynamic simulations demand force fields that not only predict equilibrium structures but also faithfully reproduce time-dependent interactions. This shift from static to dynamic modeling unlocks deeper insights into signaling pathways and drug-target binding processes, positioning AI-driven methods at the forefront of computational biology.

Emerging machine learning force fields (MLFFs), such as ByteFF by Zheng et al. [11] and ViSNet by Wang et al. [12], represent a significant advance by integrating ML techniques with fundamental physical principles. ByteFF is trained on extensive datasets derived from ab initio calculations to parameterize classical molecular force fields with high precision, while ViSNet incorporates equivariant graph neural networks (EGNNs) to maintain geometric symmetries, thereby improving both accuracy and transferability. These models outperform traditional molecular mechanics in capturing complex many-body interactions, such as the coupling between bonded potential terms and polarization effects, while preserving computational efficiency. Although such MLFFs open a new way to reconcile the trade-off between accuracy and efficiency that separates MM from QM methods, generalizability across different kinds of biomolecules remains a key challenge. Given the limited availability of high-quality training datasets, MLFFs rarely capture the entire conformational space of a biomolecule or the differences among various biomolecules [13∗∗, 14∗, 15∗]. When an MLFF trained on one kind of molecule is used to simulate another, the accuracy of energy and force calculations usually degrades substantially, which can cause the simulation to collapse [16,17].
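The geometric symmetry mentioned above can be made concrete with a small numerical check. The sketch below is not ViSNet or any published model: it assumes a toy potential that depends only on interatomic distances and verifies the property that equivariant architectures are built to respect, namely that rotating the molecule leaves the energy unchanged and rotates the forces in the same way. All function names and the three-atom geometry are hypothetical.

```python
import math


def pair_energy_and_forces(coords, k=1.0, r0=1.0):
    """Toy distance-based potential: E = sum_{i<j} 0.5 * k * (r_ij - r0)^2.

    Returns the energy and the per-atom forces F_i = -dE/dx_i.
    """
    n = len(coords)
    energy = 0.0
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = [coords[i][a] - coords[j][a] for a in range(3)]
            r = math.sqrt(sum(c * c for c in d))
            energy += 0.5 * k * (r - r0) ** 2
            g = k * (r - r0) / r  # (dE/dr) / r, scales the displacement vector
            for a in range(3):
                forces[i][a] -= g * d[a]
                forces[j][a] += g * d[a]
    return energy, forces


def rotate_z(vec, angle):
    """Rotate a 3D vector about the z axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = vec
    return [c * x - s * y, s * x + c * y, z]


coords = [[0.0, 0.0, 0.0], [1.2, 0.1, 0.0], [0.3, 0.9, 0.5]]
angle = 0.7

e0, f0 = pair_energy_and_forces(coords)
e1, f1 = pair_energy_and_forces([rotate_z(c, angle) for c in coords])

# Invariance: the energy is the same before and after rotation.
energy_gap = abs(e1 - e0)
# Equivariance: forces of the rotated system equal the rotated original forces.
force_gap = max(
    abs(a - b)
    for f_rot, f_orig in zip(f1, f0)
    for a, b in zip(f_rot, rotate_z(f_orig, angle))
)
```

A model that lacks this property has to learn it from data, for example through rotated copies of each training structure, whereas an equivariant network satisfies it by construction.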

In this context, another class of MLFFs, universal models such as AI2BMD [13] and GEMS [18], unifies quantum-level precision with high efficiency, dynamically adapting to diverse biomolecular systems. By employing fragmentation strategies, they extend the high accuracy of fragment-level calculations to large-scale macromolecular simulations. However, these frameworks still struggle to accurately model long-range interactions, such as long-range electrostatics or solvent-mediated polarization effects [19], and face data scarcity for complex systems.

This review synthesizes recent developments to outline the key factors in designing MLFFs for AI-driven MD simulations, summarize advances in three major MLFF categories, analyze their limitations and trade-offs, and showcase their applications. We envision these AI techniques becoming indispensable tools for empowering atomic-resolution studies of cellular-scale phenomena, thereby bridging computational predictions with experimental validations and accelerating the elucidation of biological mechanisms and therapeutic discovery.
