Preprints
https://doi.org/10.5194/egusphere-2024-1846
18 Jul 2024

Technical note: Towards atmospheric compound identification in chemical ionization mass spectrometry with machine learning

Federica Bortolussi, Hilda Sandström, Fariba Partovi, Joona Mikkilä, Patrick Rinke, and Matti Rissanen

Abstract. Chemical ionization mass spectrometry (CIMS) is widely used in atmospheric chemistry studies. However, due to the complex interactions between reagent ions and target compounds, chemical understanding remains limited and compound identification difficult. In this study, we apply machine learning to a reference dataset of pesticides in two standard solutions to build a model that can provide insights from CIMS analyses in atmospheric science. The CIMS measurements were performed with an Orbitrap mass spectrometer coupled to a thermal desorption multi-scheme chemical ionization inlet unit (TD-MION-MS) in both negative and positive ionization modes, using Br⁻, O₂⁻, H₃O⁺ and (CH₃)₂COH⁺ (AceH⁺) as reagent ions. We then trained two machine learning methods on these data: 1) random forest (RF) for classifying whether a pesticide can be detected with CIMS, and 2) kernel ridge regression (KRR) for predicting the expected CIMS signals. We compared their performance on five different representations of the molecular structure: the topological fingerprint (TopFP), the molecular access system keys (MACCS), a custom descriptor based on standard molecular properties (RDKitPROP), the Coulomb matrix (CM) and the many-body tensor representation (MBTR). The results indicate that MACCS outperforms the other descriptors. Our best classification model reaches a prediction accuracy of 0.85 ± 0.02 and an area under the receiver operating characteristic curve of 0.91 ± 0.01. Our best regression model reaches an accuracy of 0.44 ± 0.03 logarithmic units of the signal intensity. Subsequent feature importance analysis of the classifiers reveals that the most important structural fragments are NH and OH for the negative ionization schemes and nitrogen-containing groups for the positive ionization schemes.
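The two-model setup described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: it assumes scikit-learn, uses randomly generated 166-bit vectors as a stand-in for MACCS keys, and invents a toy relationship between a few bits and the signal. The hyperparameters, the RBF kernel, the train/test split and the use of mean absolute error are all illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.kernel_ridge import KernelRidge
from sklearn.metrics import accuracy_score, mean_absolute_error, roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for a binary fingerprint matrix (one row per molecule).
# Real MACCS keys would be computed from molecular structures instead.
n_molecules, n_bits = 400, 166
X = rng.integers(0, 2, size=(n_molecules, n_bits)).astype(float)

# Hypothetical ground truth: a handful of bits drive both detectability
# and the logarithm of the signal intensity.
weights = np.zeros(n_bits)
weights[:5] = [2.0, 1.5, -1.0, 1.0, -2.0]
score = X @ weights + rng.normal(scale=0.5, size=n_molecules)
detected = (score > np.median(score)).astype(int)  # 1 = detected, 0 = not
log_signal = 5.0 + 0.5 * score                     # toy log10 intensity

X_tr, X_te, d_tr, d_te, s_tr, s_te = train_test_split(
    X, detected, log_signal, test_size=0.25, random_state=0
)

# Task 1: random forest classifier for "detected or not".
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, d_tr)
proba = clf.predict_proba(X_te)[:, 1]
accuracy = accuracy_score(d_te, clf.predict(X_te))
roc_auc = roc_auc_score(d_te, proba)

# Task 2: kernel ridge regression for the log signal intensity.
# KernelRidge fits no intercept, so the target is centred by hand.
s_mean = s_tr.mean()
krr = KernelRidge(kernel="rbf", alpha=1.0, gamma=1.0 / n_bits)
krr.fit(X_tr, s_tr - s_mean)
mae = mean_absolute_error(s_te, krr.predict(X_te) + s_mean)

# Feature importance: indices of the fingerprint bits the forest relies on most.
top_bits = np.argsort(clf.feature_importances_)[::-1][:5]

print(f"accuracy={accuracy:.2f}  ROC AUC={roc_auc:.2f}  MAE={mae:.2f} log units")
print("most important bits:", top_bits)
```

With a real fingerprint, each bit corresponds to a structural fragment, so ranking bits by importance is one way to arrive at the kind of fragment-level interpretation the abstract reports.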

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Journal article(s) based on this preprint

17 Jan 2025
Technical note: Towards atmospheric compound identification in chemical ionization mass spectrometry with pesticide standards and machine learning
Federica Bortolussi, Hilda Sandström, Fariba Partovi, Joona Mikkilä, Patrick Rinke, and Matti Rissanen
Atmos. Chem. Phys., 25, 685–704, https://doi.org/10.5194/acp-25-685-2025, 2025

Interactive discussion

Status: closed

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • RC1: 'Comment on egusphere-2024-1846', Anonymous Referee #1, 15 Sep 2024
  • RC2: 'Comment on egusphere-2024-1846', Anonymous Referee #2, 27 Sep 2024
  • AC1: 'Comment on egusphere-2024-1846', Federica Bortolussi, 16 Nov 2024

Peer review completion

AR: Author's response | RR: Referee report | ED: Editor decision | EF: Editorial file upload
AR by Federica Bortolussi on behalf of the Authors (16 Nov 2024)
ED: Publish as is (20 Nov 2024) by Eva Y. Pfannerstill
AR by Federica Bortolussi on behalf of the Authors (21 Nov 2024)

Viewed

Total article views: 628 (including HTML, PDF, and XML)
  • HTML: 378
  • PDF: 181
  • XML: 69
  • Total: 628
  • Supplement: 40
  • BibTeX: 14
  • EndNote: 16
Views and downloads (calculated since 18 Jul 2024)

Viewed (geographical distribution)

Total article views: 635 (including HTML, PDF, and XML), thereof 635 with geography defined and 0 with unknown origin.
Latest update: 17 Jan 2025

The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Short summary
Chemical ionization mass spectrometry (CIMS) is widely used in atmospheric chemistry studies, but our understanding of the instrument's complex functioning is still limited. We therefore applied machine learning to extract insights from CIMS analyses. We were able to predict both detection and signal intensity with a fair error, and we identified the most important structural fragments for the negative ionization schemes (NH and OH) and for the positive ones (nitrogen-containing groups).