https://doi.org/10.5194/egusphere-2025-1191
20 Mar 2025
Status: this preprint is open for discussion and under review for Geoscientific Model Development (GMD).

Improved vapor pressure predictions using group contribution-assisted graph convolutional neural networks (GC2NN)

Matteo Krüger, Tommaso Galeazzo, Ivan Eremets, Bertil Schmidt, Ulrich Pöschl, Manabu Shiraiwa, and Thomas Berkemeier

Abstract. The vapor pressures (p_vap) of organic molecules play a crucial role in the partitioning of secondary organic aerosol (SOA). Given the vast diversity of atmospheric organic compounds, experimentally determining p_vap of each compound is infeasible. Machine learning (ML) algorithms allow the prediction of physicochemical properties based on complex representations of molecular structure, but their performance crucially depends on the availability of sufficient training data. We propose a novel approach to predict p_vap using group contribution-assisted graph convolutional neural networks (GC2NN). The models use molecular descriptors such as molar mass alongside molecular graphs containing atom and bond features as representations of molecular structure. Molecular graphs allow the ML model to infer molecular connectivity better than methods using other, non-structural embeddings. We achieve the best results with an adaptive-depth GC2NN, in which the number of evaluated graph layers depends on molecular size. We present two vapor pressure estimation models that achieve strong agreement between predicted and experimentally determined p_vap. The first is a general model with broad scope that is suitable for both organic and inorganic molecules and achieves a mean absolute error (MAE) of 0.67 log-units (R² = 0.86). The second model is specialized for organic compounds with functional groups often encountered in atmospheric SOA and achieves an even stronger correlation with the test data (MAE = 0.36 log-units, R² = 0.97). The adaptive-depth GC2NN models clearly outperform existing methods, including parameterizations and group-contribution methods, demonstrating that graph-based ML techniques are powerful tools for the estimation of physicochemical properties, even when experimental data are scarce.
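
The errors quoted in the abstract are given in log-units, i.e. on the logarithm of the vapor pressure rather than on the pressure itself. The following is a minimal sketch of how such scores could be computed, not the authors' evaluation code. It assumes a base-10 logarithm and takes R² to be the coefficient of determination; the preprint may define either differently (for example, R² as a squared correlation coefficient). The function name `log10_errors` and the sample values are hypothetical.

```python
import math


def log10_errors(p_pred, p_obs):
    """Return (MAE, R^2) computed on log10 of the vapor pressures.

    Assumption: R^2 is the coefficient of determination, 1 - SS_res / SS_tot.
    All pressures must be positive and share the same unit.
    """
    if len(p_pred) != len(p_obs) or not p_obs:
        raise ValueError("inputs must be non-empty and of equal length")

    # Work in log space so that an error of 1.0 means one order of magnitude.
    y_pred = [math.log10(p) for p in p_pred]
    y_obs = [math.log10(p) for p in p_obs]
    n = len(y_obs)

    # Mean absolute error in log-units.
    mae = sum(abs(yp - yo) for yp, yo in zip(y_pred, y_obs)) / n

    # Coefficient of determination relative to the mean of the observations.
    mean_obs = sum(y_obs) / n
    ss_res = sum((yo - yp) ** 2 for yp, yo in zip(y_pred, y_obs))
    ss_tot = sum((yo - mean_obs) ** 2 for yo in y_obs)
    r2 = 1.0 - ss_res / ss_tot
    return mae, r2


# Hypothetical pressures (arbitrary but consistent unit), not data from the preprint.
observed = [1e-2, 1e-4, 1e-6, 1e-8]
predicted = [1e-2, 1e-5, 1e-6, 1e-7]
mae, r2 = log10_errors(predicted, observed)
print(f"MAE = {mae:.2f} log-units, R2 = {r2:.2f}")
```

In this made-up example, two of the four predictions are off by one order of magnitude, which gives an MAE of 0.5 log-units. Read the same way, the reported MAE of 0.36 log-units corresponds to a typical deviation of roughly a factor of 2.3 in p_vap.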

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Status: open (until 01 Jul 2025)


Latest update: 22 May 2025
Short summary
This work uses machine learning to predict saturation vapor pressures of atmospherically relevant organic compounds, a property that is crucial for the partitioning of secondary organic aerosol (SOA). We introduce a new method using graph convolutional neural networks, in which molecular graphs enable the model to capture molecular connectivity better than non-structural embeddings do. The method shows strong agreement with experimentally determined vapor pressures and outperforms existing estimation methods.