Preprints
https://doi.org/10.5194/egusphere-2025-1191
20 Mar 2025

Improved vapor pressure predictions using group contribution-assisted graph convolutional neural networks (GC2NN)

Matteo Krüger, Tommaso Galeazzo, Ivan Eremets, Bertil Schmidt, Ulrich Pöschl, Manabu Shiraiwa, and Thomas Berkemeier

Abstract. The vapor pressures (pvap) of organic molecules play a crucial role in the partitioning of secondary organic aerosol (SOA). Given the vast diversity of atmospheric organic compounds, experimentally determining pvap for each compound is infeasible. Machine learning (ML) algorithms allow the prediction of physicochemical properties based on complex representations of molecular structure, but their performance depends crucially on the availability of sufficient training data. We propose a novel approach to predict pvap using group contribution-assisted graph convolutional neural networks (GC2NN). The models use molecular descriptors such as molar mass alongside molecular graphs containing atom and bond features as representations of molecular structure. Molecular graphs allow the ML model to infer molecular connectivity better than methods that use other, non-structural embeddings. We achieve the best results with an adaptive-depth GC2NN, in which the number of evaluated graph layers depends on molecular size. We present two vapor pressure estimation models that achieve strong agreement between predicted and experimentally determined pvap. The first is a general model with broad scope that is suitable for both organic and inorganic molecules and achieves a mean absolute error (MAE) of 0.67 log-units (R² = 0.86). The second model is specialized for organic compounds with functional groups often encountered in atmospheric SOA and achieves an even stronger correlation with the test data (MAE = 0.36 log-units, R² = 0.97). The adaptive-depth GC2NN models clearly outperform existing methods, including parameterizations and group-contribution methods, demonstrating that graph-based ML techniques are powerful tools for the estimation of physicochemical properties, even when experimental data are scarce.
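To make the idea of an adaptive-depth graph model more concrete, the following is a minimal sketch and not the authors' implementation. It assumes an unweighted neighbor-averaging step in place of learned graph convolutions, a hypothetical rule (`n_layers_for`) that ties the number of layers to the atom count, and sum pooling followed by concatenation with molecular descriptors. All function names, the `atoms_per_layer` and `max_layers` parameters, and the toy features are illustrative assumptions that do not come from the paper.

```python
import math


def n_layers_for(n_atoms, atoms_per_layer=4, max_layers=6):
    """Hypothetical depth rule: larger molecules get more graph layers."""
    return max(1, min(max_layers, math.ceil(n_atoms / atoms_per_layer)))


def propagate(features, neighbors):
    """One unweighted message-passing step (no learned weights, for
    illustration only): average each atom's features with its neighbors'."""
    out = []
    for i, own in enumerate(features):
        group = [own] + [features[j] for j in neighbors[i]]
        out.append([sum(col) / len(group) for col in zip(*group)])
    return out


def embed(features, neighbors, descriptors):
    """Run a size-dependent number of steps, sum-pool over atoms, and
    append molecular descriptors (for example molar mass)."""
    h = features
    for _ in range(n_layers_for(len(features))):
        h = propagate(h, neighbors)
    pooled = [sum(col) for col in zip(*h)]
    return pooled + list(descriptors)


# Toy example: heavy atoms of ethanol (C, C, O) with one-hot [is_C, is_O]
# atom features; bonds are C0-C1 and C1-O2. The descriptor is molar mass.
atom_features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
bonded_to = [[1], [0, 2], [1]]
vector = embed(atom_features, bonded_to, descriptors=[46.07])
```

In a trained model the averaging step would be replaced by learned layers and the final vector would feed a regression head that outputs log pvap; the sketch only shows how graph depth can depend on molecular size and how descriptors can sit next to the pooled graph representation.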


Status: closed

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • RC1: 'Comment on egusphere-2025-1191', Anonymous Referee #1, 02 Jun 2025
    • EC1: 'Reply on RC1', Jason Williams, 02 Jun 2025
  • RC2: 'Comment on egusphere-2025-1191', Patrick Rinke, 10 Jun 2025
  • AC1: 'Response to reviewers of egusphere-2025-1191', Matteo Krüger, 15 Jul 2025


Viewed

Total article views: 1,118 (including HTML, PDF, and XML)
  • HTML: 952
  • PDF: 148
  • XML: 18
  • Total: 1,118
  • Supplement: 45
  • BibTeX: 29
  • EndNote: 40
Views and downloads (calculated since 20 Mar 2025)

Viewed (geographical distribution)

Total article views: 1,131 (including HTML, PDF, and XML) Thereof 1,131 with geography defined and 0 with unknown origin.

Cited

Latest update: 10 Sep 2025
Short summary
This work uses machine learning to predict saturation vapor pressures of atmospherically relevant organic compounds, which are crucial for the partitioning of secondary organic aerosol (SOA). We introduce a new method using graph convolutional neural networks, in which molecular graphs enable the model to capture molecular connectivity better than non-structural embeddings do. The method shows strong agreement with experimentally determined vapor pressures and outperforms existing estimation methods.
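Agreement with experiment is reported in the abstract as a mean absolute error in log-units together with R². As a rough illustration of how such numbers can be computed, the sketch below assumes that "log-units" means base-10 logarithms of vapor pressure and that R² is the coefficient of determination on the log values; the paper may define R² differently (for example as a squared correlation coefficient). The function names and the three pressure values are hypothetical and are not data from the study.

```python
import math


def mae_log_units(p_pred, p_exp):
    """Mean absolute error between log10 of predicted and measured pressures."""
    errors = [math.log10(p) - math.log10(q) for p, q in zip(p_pred, p_exp)]
    return sum(abs(e) for e in errors) / len(errors)


def r_squared(y_pred, y_true):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for p, t in zip(y_pred, y_true))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical vapor pressures in arbitrary but consistent units.
p_measured = [1e-3, 1e-5, 1e-7]
p_predicted = [1e-2, 1e-5, 1e-8]

mae = mae_log_units(p_predicted, p_measured)
r2 = r_squared([math.log10(p) for p in p_predicted],
               [math.log10(p) for p in p_measured])
```

Because the error is taken on logarithms, an MAE of 1 log-unit corresponds to predictions that are off by a factor of ten on average, which is why sub-unit MAE values are meaningful for quantities spanning many orders of magnitude.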