https://doi.org/10.5194/egusphere-2026-1103
27 Apr 2026
Status: this preprint is open for discussion and under review for Atmospheric Measurement Techniques (AMT).

A Hybrid STL-Ensemble Framework for Multivariate Time-Series Forecasting of Source-Specific PM2.5 Emissions

Jintu Borah, Deepali Kushwaha, Sachchida Nand Tripathi, and Rajesh M. Hegde

Abstract. Forecasting the evolution of source-specific particulate emissions is central to modern air quality management. Existing source identification methods can yield non-unique and unstable source profiles, which introduces uncertainty in source identification and quantification. In this work, we present an approach that integrates receptor modeling with supervised machine learning to overcome this limitation. The hybrid model combines statistical decomposition, feature-engineered multivariate learning, and ensemble regression to predict the temporal trajectory of PM2.5 source contributions. Concentrations of elemental and organic species from high-resolution measurement systems were processed through source apportionment to identify the target sources. A time-series pipeline was then developed, comprising temporal imputation, autocorrelation-guided feature engineering, Seasonal-Trend Decomposition using LOESS (STL), and multi-output ensemble regression. The proposed method improved predictive performance across diverse emission categories, highlighting the importance of decomposition for interpretability and providing a foundation for operational forecasting of air quality dynamics. Relative to source-specific PM2.5 emission forecasting without STL, the proposed method raises the aggregate R² score from 0.22 to 0.95. The modeling framework is robust and can be adapted to various multi-source environmental datasets.
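
The pipeline stages named in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the function names, the period of 4, and the synthetic series are all hypothetical, and a classical additive decomposition (centred moving average plus per-phase means) stands in for STL, which uses LOESS smoothing instead. The ensemble regression stage is omitted here.

```python
from statistics import mean


def interpolate_gaps(series):
    """Temporal imputation: linearly fill None gaps between observed values.

    Assumes the first and last entries are observed; edge gaps stay None.
    """
    filled = list(series)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled


def additive_decompose(series, period):
    """Classical additive decomposition (a simple stand-in for STL).

    Returns (trend, seasonal, resid); trend and resid are None at the
    edges where the centred moving average is undefined.
    Assumes len(series) is at least 2 * period.
    """
    n = len(series)
    half = period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = series[t - half : t + half + 1]
        total = sum(window)
        if period % 2 == 0:
            # Even period: give the two end points half weight.
            total -= 0.5 * (window[0] + window[-1])
        trend[t] = total / period

    # Seasonal component: mean detrended value at each phase, centred on zero.
    phase_means = []
    for p in range(period):
        vals = [series[t] - trend[t]
                for t in range(p, n, period) if trend[t] is not None]
        phase_means.append(mean(vals))
    offset = mean(phase_means)
    seasonal = [phase_means[t % period] - offset for t in range(n)]

    resid = [None if trend[t] is None else series[t] - trend[t] - seasonal[t]
             for t in range(n)]
    return trend, seasonal, resid


def autocorr(series, lag):
    """Sample autocorrelation at a given lag, used to pick lag features."""
    m = mean(series)
    denom = sum((x - m) ** 2 for x in series)
    num = sum((series[t] - m) * (series[t - lag] - m)
              for t in range(lag, len(series)))
    return num / denom


def select_lags(series, max_lag, threshold=0.5):
    """Keep lags whose absolute autocorrelation exceeds the threshold."""
    return [k for k in range(1, max_lag + 1)
            if abs(autocorr(series, k)) > threshold]


if __name__ == "__main__":
    # Synthetic source contribution: linear trend plus a period-4 cycle.
    pattern = [2.0, -1.0, 0.0, -1.0]
    raw = [0.5 * t + pattern[t % 4] for t in range(24)]
    trend, seasonal, resid = additive_decompose(interpolate_gaps(raw), period=4)
    print("candidate lags:", select_lags(seasonal, max_lag=8))
```

In a fuller pipeline, the decomposed components and the selected lags would become input features for a multi-output regressor. A LOESS-based decomposition such as `statsmodels.tsa.seasonal.STL` would replace `additive_decompose`.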

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: open (until 02 Jun 2026)

Latest update: 27 Apr 2026
Short summary
Accurate sectoral emission forecasting is challenging because air quality data are multi-component. This work addresses that challenge by combining time-series decomposition with ensemble learning to improve predictive performance. It proposes a sector-specific emission forecasting method that uses source-apportioned, speciated PM2.5 concentration data from three locations. The framework enables robust quantification of sectoral emissions for new incoming speciated datasets.