20 Dec 2022
Status: this preprint is open for discussion.

Positive Matrix Factorization of Large Aerosol Mass Spectrometry Datasets Using Error-Weighted Randomized Hierarchical Alternating Least Squares

Benjamin Sapper1, Daven Henze1, Manjula Canagaratna2, and Harald Stark3,4
  • 1University of Colorado Boulder, 11 Engineering Dr, Boulder, CO 80309, United States
  • 2Aerodyne Research, 45 Manning Road, Billerica, MA 01821, United States
  • 3Center for Aerosol and Cloud Chemistry, Aerodyne Research Inc., 45 Manning Road, Billerica, MA 01821
  • 4Department of Chemistry and Cooperative Institute for Research in Environmental Sciences (CIRES), University of Colorado, Boulder, Colorado 80309, United States

Abstract. Weighted positive matrix factorization (PMF) is used by scientists to find small sets of underlying factors in environmental data. However, as datasets have grown, rising computational costs have made traditional methods for this factorization impractical. In this paper, we present a new weighting method that dramatically decreases the computational cost of these traditional algorithms. We then apply this weighting method together with the Randomized Hierarchical Alternating Least Squares (RHALS) algorithm to a large environmental dataset and show that interpretable factors can be reproduced. The resulting algorithm achieves computational speedups by factors of 38, 67, and 634 compared to the Multiplicative Update (MU), deterministic Hierarchical Alternating Least Squares (HALS), and non-negative Alternating Least Squares (ALS) algorithms, respectively. We also investigate rotational ambiguity in the solution and present a simple “pulling” method to rotate a set of factors. This method is shown to find alternative solutions and, in some cases, to lower the weighted residual error of the algorithm.
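For readers unfamiliar with the setup, the following is a minimal sketch of what "weighted" factorization and "weighted residual error" usually mean in this context. It assumes the conventional PMF objective Q = sum(((X - GF) / sigma)^2) with per-element uncertainties sigma, and it uses the standard weighted Multiplicative Update rule, one of the baseline algorithms named in the abstract. It is not the RHALS algorithm or the new weighting method of this paper, neither of which is described on this page. The function names (weighted_q, weighted_mu), the parameters, and the factor names G and F are illustrative assumptions.

```python
import numpy as np


def weighted_q(X, sigma, G, F):
    """Weighted residual: sum of squared residuals scaled by uncertainties."""
    R = (X - G @ F) / sigma
    return float(np.sum(R ** 2))


def weighted_mu(X, sigma, k, n_iter=200, seed=0, eps=1e-12):
    """Weighted non-negative factorization X ~ G @ F via multiplicative updates.

    X     : (m, n) non-negative data matrix
    sigma : (m, n) strictly positive uncertainties
    k     : number of factors (assumed known here)
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = 1.0 / sigma ** 2                 # element-wise weights
    G = rng.random((m, k)) + eps         # non-negative random start
    F = rng.random((k, n)) + eps
    WX = W * X
    history = []
    for _ in range(n_iter):
        # Multiplicative updates keep G and F non-negative by construction.
        G *= (WX @ F.T) / ((W * (G @ F)) @ F.T + eps)
        F *= (G.T @ WX) / (G.T @ (W * (G @ F)) + eps)
        history.append(weighted_q(X, sigma, G, F))
    return G, F, history


if __name__ == "__main__":
    # Synthetic example; sizes and the noise model are arbitrary assumptions.
    rng = np.random.default_rng(1)
    X = rng.random((60, 3)) @ rng.random((3, 40))
    sigma = 0.1 + 0.05 * X
    G, F, history = weighted_mu(X, sigma, k=3)
    print(f"Q after first iteration: {history[0]:.2f}, after last: {history[-1]:.2f}")
```

Each iteration of this baseline forms the full weighted product W * (G @ F), so its cost scales with the size of the data matrix. That per-iteration cost is the kind of expense the abstract refers to when it says traditional methods become impractical for large datasets.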


Status: open (until 12 Mar 2023)



Latest update: 03 Feb 2023
Short summary
Positive Matrix Factorization (PMF) has been used by atmospheric scientists to extract underlying factors present in large datasets. This paper presents a new technique for weighted PMF that drastically reduces the computational costs of previously developed algorithms. We use this technique to deliver interpretable factors and solution diagnostics from an atmospheric chemistry dataset.