Machine learning interatomic potentials with accurate long-range interactions for molecular dynamics collision simulations of atmospherically-relevant molecules
Abstract. Molecular collisions and subsequent clustering events are fundamental to atmospheric cluster formation. Accurately modeling these processes requires interatomic potentials that capture long-range forces governing collision kinetics and short-range quantum effects driving reactivity. In this work, we evaluate the AIMNet2 and PaiNN machine learning architectures trained on GFN1-xTB and ωB97X-3c data for molecular collisions involving sulfuric acid.
The models exhibit low mean absolute errors in energies and forces and accurately reproduce potentials of mean force relative to GFN1-xTB. Comparing models trained on GFN1-xTB and ωB97X-3c data reveals that while increasing the electronic structure theory level significantly alters the potential energy surface in the binding region, it has negligible impact on the long-range shoulder and collision rate coefficients. Notably, PaiNN demonstrates superior performance in reproducing binding and repulsive regions, making it highly effective for sampling stable cluster configurations.
However, discrepancies are observed in collision dynamics. While AIMNet2 accurately reproduces reference collision rates across all systems, PaiNN underestimates the rate for the charged sulfuric acid–bisulfate system by ~50 %. This error originates from the model's local atomic environment approximation, which neglects long-range attractive forces at large intermolecular distances. Comparisons with the OPLS-AA force field demonstrate that simple fixed partial charges are sufficient to describe these interactions.
Our results highlight that while local equivariant models like PaiNN offer exceptional accuracy for thermodynamics, correctly simulating collision kinetics in systems with strong long-range interactions requires models that explicitly account for forces beyond the local environment, such as AIMNet2.
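The role of the local-environment approximation can be illustrated with a toy calculation. The sketch below is not the authors' workflow: it places a hypothetical anion (charge -1 e) and a hypothetical neutral molecule represented by two fixed partial charges (±0.5 e, 1 Å apart) on a line, and sums pairwise Coulomb energies with and without a hard distance cutoff. The charges, geometry, the 5 Å cutoff, and the function names are all assumptions chosen for illustration; they are not OPLS-AA parameters or settings of any model in this work. A hard cutoff is only a crude stand-in for the finite receptive field of a local model.

```python
import numpy as np

# Coulomb constant in kcal mol^-1 Å e^-2 (value commonly used in force fields)
COULOMB_KCAL = 332.06371


def coulomb_energy(q_a, r_a, q_b, r_b, cutoff=None):
    """Sum pairwise Coulomb energies between two sets of fixed point charges.

    If `cutoff` is given, pairs farther apart than `cutoff` (in Å) are dropped,
    mimicking a model that cannot see beyond a local environment.
    """
    diff = r_a[:, None, :] - r_b[None, :, :]      # (n_a, n_b, 3) displacement vectors
    dist = np.linalg.norm(diff, axis=-1)          # (n_a, n_b) pair distances
    pair = COULOMB_KCAL * q_a[:, None] * q_b[None, :] / dist
    if cutoff is not None:
        pair = np.where(dist <= cutoff, pair, 0.0)
    return float(pair.sum())


def ion_dipole_energy(separation, cutoff=None):
    """Energy of a hypothetical ion-dipole pair at a given centre separation (Å)."""
    # Hypothetical anion at the origin.
    q_ion = np.array([-1.0])
    r_ion = np.zeros((1, 3))
    # Hypothetical neutral molecule: +0.5 e pointing toward the ion, -0.5 e away.
    q_mol = np.array([0.5, -0.5])
    r_mol = np.array([[separation - 0.5, 0.0, 0.0],
                      [separation + 0.5, 0.0, 0.0]])
    return coulomb_energy(q_ion, r_ion, q_mol, r_mol, cutoff=cutoff)


ASSUMED_CUTOFF = 5.0  # Å, illustrative only

for r in (4.0, 10.0, 20.0):
    full = ion_dipole_energy(r)
    local = ion_dipole_energy(r, cutoff=ASSUMED_CUTOFF)
    print(f"R = {r:5.1f} Å  full = {full:8.3f}  truncated = {local:8.3f} kcal/mol")
```

In this toy setup the truncated energy matches the full sum while every charge pair lies inside the cutoff and drops to exactly zero once the molecules are farther apart, whereas the full fixed-charge sum remains attractive. That residual attraction is the kind of long-range contribution a collision simulation needs in order to capture approaching trajectories.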