https://doi.org/10.5194/egusphere-2024-4080
22 Jan 2025
Status: this preprint is open for discussion and under review for Atmospheric Measurement Techniques (AMT).

Improving the quantification of peak concentrations for air quality sensors via data weighting

Caroline Frischmon, Jon Silberstein, Annamarie Guth, Erick Mattson, Jack Porter, and Michael Hannigan

Abstract. Traditional calibration models for low-cost air quality sensors tend to under-predict peak concentrations. We assessed whether adding data weights to low-cost sensor colocation data improves the quantification of those peaks. Specifically, we explored the effects of data weighting on three pollutant colocation datasets: total volatile organic compounds, carbon monoxide, and methane. Using two weighting functions, a sigmoidal and a piecewise regime, we examined the impact of the base model choice (multilinear regression vs. random forest), the sensitivity of the weighting functions to their input parameters, and the ability of data weighting to improve high-concentration pollution measurements. Compared with unweighted colocation data, data weighting produced significant reductions in both error (root mean square error, RMSE) and bias (mean bias error, MBE) for pollutant peaks across all three datasets. For the top percentile of data, optimal weights reduced RMSE by an average of 23 % and MBE by an average of 35 %. Reductions were larger in the 95th–99th percentile of data, where MBE fell by an average of 70 % and RMSE by an average of 26 %. However, data weighting can also generate larger errors at baseline pollutant concentrations. Because the weighting regimes were sensitive to their input parameters, the weighting functions may be tuned to better predict peak concentrations without significantly reducing the fidelity of baseline predictions.
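The weighting idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the functional forms or parameters of the sigmoidal and piecewise weights, so the functions below (`sigmoid_weight`, `piecewise_weight`) and their parameters (`midpoint`, `steepness`, `w_max`, `threshold`, `w_high`) are hypothetical stand-ins. The data are synthetic, with a sensor signal that saturates at high concentration, and a single-predictor weighted least-squares line stands in for the multilinear regression base model.

```python
import math

def sigmoid_weight(c, midpoint, steepness, w_max):
    """Hypothetical sigmoidal weight: about 1 at low c, about w_max at high c."""
    return 1.0 + (w_max - 1.0) / (1.0 + math.exp(-steepness * (c - midpoint)))

def piecewise_weight(c, threshold, w_high):
    """Hypothetical piecewise weight: 1 below the threshold, w_high at or above it."""
    return w_high if c >= threshold else 1.0

def weighted_linear_fit(x, y, w):
    """Closed-form weighted least squares for y ~ a + b * x."""
    sw = sum(w)
    xm = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ym = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xm) * (yi - ym) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xm) ** 2 for wi, xi in zip(w, x))
    b = sxy / sxx
    return ym - b * xm, b

def peak_errors(y_true, y_pred, top_fraction):
    """RMSE and MBE over the highest top_fraction of reference values."""
    order = sorted(range(len(y_true)), key=lambda i: y_true[i])
    n_top = max(1, int(round(len(y_true) * top_fraction)))
    resid = [y_pred[i] - y_true[i] for i in order[-n_top:]]
    rmse = math.sqrt(sum(r * r for r in resid) / len(resid))
    mbe = sum(resid) / len(resid)
    return rmse, mbe

# Synthetic colocation data (illustrative only): 90 baseline points, 10 peaks.
reference = [1.0 + i / 90 for i in range(90)] + [5.0 + i for i in range(10)]
# Assumed sensor response that saturates as concentration rises.
signal = [10.0 * c / (10.0 + c) for c in reference]

weights = {
    "unweighted": [1.0] * len(reference),
    "piecewise": [piecewise_weight(c, threshold=5.0, w_high=20.0) for c in reference],
    "sigmoid": [sigmoid_weight(c, midpoint=5.0, steepness=1.0, w_max=20.0)
                for c in reference],
}

results = {}
for label, w in weights.items():
    a, b = weighted_linear_fit(signal, reference, w)
    predicted = [a + b * x for x in signal]
    results[label] = peak_errors(reference, predicted, top_fraction=0.10)
    rmse, mbe = results[label]
    print(f"{label:>10}: top-10% RMSE = {rmse:.3f}, MBE = {mbe:.3f}")
```

In this toy setup the piecewise weights up-weight exactly the points that are scored as peaks, so the weighted fit cannot have a higher peak RMSE than the unweighted one. The cost is a worse fit at baseline concentrations, which mirrors the trade-off the abstract describes.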

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Status: open (until 27 Feb 2025)


Latest update: 24 Jan 2025
Short summary
Air quality sensors often under-predict peak concentrations, which is a major issue in applications such as emissions event detection. This manuscript details a novel approach involving data weighting to improve quantification of these peak concentrations. To demonstrate its effectiveness, we applied data weighting to carbon monoxide, methane, and VOC sensor data. This work broadens our ability to use air sensors in contexts where accurate quantification of peak concentrations is essential.