https://doi.org/10.5194/egusphere-2024-3047
14 Oct 2024
Status: this preprint is open for discussion and under review for Atmospheric Measurement Techniques (AMT).

An Algorithm for Automatic Fitting and Formula Assignment in Atmospheric Mass Spectra

Valter Mickwitz, Otso Peräkylä, Frans Graeffe, Douglas Worsnop, and Mikael Ehn

Abstract. Mass spectrometry is an established method for studying the chemical composition of gases and particles in the atmosphere. Using this technique, signals corresponding to thousands or even tens of thousands of compounds may be detected in ambient air. Identifying all the peaks in the mass spectra is often arduous and time-consuming, particularly when multiple overlapping peaks are present. Manual peak fitting and identification may take even experienced analysts anywhere from weeks to months, depending on the desired accuracy and completeness.

In this work, we attempted to automate the fitting and formula assignment workflow and to evaluate how far the process can get using a "one-button" algorithm. The algorithm takes in commonly known parameters specific to the instrument type and, at the press of one button, runs and ultimately provides a list of likely peaks for the mass spectrum. It uses weighted least squares fitting and a modified version of the Bayesian information criterion, along with an iterative formula assignment process. We applied it to synthetic mass spectra, a gas-phase chemical ionization mass spectrometer (CIMS) dataset, and an aerosol mass spectrometer (AMS) dataset. The results were largely comparable with previous manual peak fitting and identification but were achieved in a fraction of the time. Erroneous assignments mainly appeared for low-intensity signals with interference from nearby higher-intensity signals, a case that is also challenging for manual peak fitting. The algorithm provides an excellent starting point for a peak list, which can be manually revised if needed.

The main result of this study is the algorithm itself. While further improvements and tweaks are possible, the algorithm presented here is currently being implemented in the commonly used Tofware analysis software package to allow easy use by the broader community. We hope this lets researchers spend their time on data interpretation rather than on data processing and curation.
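As an illustration of the two ingredients named in the abstract, weighted least squares fitting and an information criterion for choosing the number of peaks, the following is a minimal sketch and not the authors' implementation. It rests on several assumptions that do not come from the preprint: a Gaussian peak shape with a known width `sigma`, fixed candidate peak centers, Poisson-like variances, noise-free synthetic data, and the standard Bayesian information criterion rather than the modified version the paper describes. All function and variable names are hypothetical.

```python
import math

def gaussian(x, center, sigma):
    """Unit-height Gaussian peak shape (assumed; real instruments differ)."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)

def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_amplitudes(mz, signal, variance, centers, sigma):
    """Weighted least squares for peak heights at fixed centers.

    Weights are 1/variance. Returns (amplitudes, weighted chi-square).
    """
    n, k = len(mz), len(centers)
    basis = [[gaussian(x, c, sigma) for c in centers] for x in mz]
    w = [1.0 / v for v in variance]
    # Normal equations: (B^T W B) a = B^T W y
    btwb = [[sum(w[i] * basis[i][p] * basis[i][q] for i in range(n))
             for q in range(k)] for p in range(k)]
    btwy = [sum(w[i] * basis[i][p] * signal[i] for i in range(n))
            for p in range(k)]
    amps = solve_linear(btwb, btwy)
    chi2 = sum(
        w[i] * (signal[i] - sum(a * basis[i][p] for p, a in enumerate(amps))) ** 2
        for i in range(n)
    )
    return amps, chi2

def bic(chi2, n_params, n_points):
    """Standard BIC for Gaussian errors with known variances (up to a constant)."""
    return chi2 + n_params * math.log(n_points)

# Synthetic, noise-free spectrum: a large peak overlapping a smaller one.
sigma = 0.01
true_peaks = [(100.00, 1000.0), (100.02, 200.0)]
mz = [99.95 + 0.0015 * i for i in range(101)]
signal = [sum(a * gaussian(x, c, sigma) for c, a in true_peaks) for x in mz]
variance = [max(s, 1.0) for s in signal]  # Poisson-like, floored at 1

# Candidate models with one, two, and three peaks.
candidates = {
    1: [100.00],
    2: [100.00, 100.02],
    3: [100.00, 100.02, 100.05],
}
fits, scores = {}, {}
for k, centers in candidates.items():
    amps, chi2 = fit_amplitudes(mz, signal, variance, centers, sigma)
    fits[k] = amps
    scores[k] = bic(chi2, k, len(mz))

best = min(scores, key=scores.get)
print(best, [round(a, 1) for a in fits[best]])
```

In this toy case the one-peak model leaves a large weighted residual, while the three-peak model fits no better than the two-peak model and is penalized for its extra parameter, so the criterion favors two peaks. A real spectrum would also require fitting peak positions, handling noise, and assigning formulas, none of which this sketch attempts.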

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Status: open (until 05 Dec 2024)

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • RC1: 'Comment on egusphere-2024-3047', Anonymous Referee #1, 07 Nov 2024

Latest update: 14 Nov 2024
Short summary
This work presents and evaluates an algorithm that automatically fits peaks and identifies formulas, which are necessary but time-consuming steps in most applications of mass spectrometry within atmospheric science. The aim of the algorithm is to save researchers working on these tasks significant amounts of time and allow them to proceed with their analysis. The work demonstrates that the algorithm can speed up analysis and provide accurate formulas.