A Hybrid STL-Ensemble Framework for Multivariate Time-Series Forecasting of Source-Specific PM2.5 Emissions

Borah, Jintu; Kushwaha, Deepali; Tripathi, Sachchida Nand; Hegde, Rajesh M.

doi:10.5194/egusphere-2026-1103

https://doi.org/10.5194/egusphere-2026-1103

27 Apr 2026

Status: this preprint is open for discussion and under review for Atmospheric Measurement Techniques (AMT).

Abstract. Forecasting the evolution of source-specific particulate emissions is central to modern air quality management strategies. Existing source identification methods suffer from non-uniqueness and instability in source profiles, which introduces uncertainty into source identification and quantification. In this work, we present an approach that integrates receptor modeling with supervised machine learning to overcome this limitation. The hybrid model combines statistical decomposition, feature-engineered multivariate learning, and ensemble regression to predict the temporal trajectory of PM2.5 source contributions. Concentrations of elemental and organic species from high-resolution measurement systems were processed through source apportionment to identify the target sources. A time-series pipeline was then developed, comprising temporal imputation, autocorrelation-guided feature engineering, Seasonal-Trend Decomposition using LOESS (STL), and multi-output ensemble regression. The proposed method improved predictive performance across diverse emission categories, highlighting the importance of decomposition for interpretability and providing a robust foundation for the operational forecasting of air quality dynamics. Compared with source-specific PM2.5 emission forecasting without STL, the proposed method improves the aggregate R² score from 0.22 to 0.95. The framework can be adapted to other multi-source environmental datasets.
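The four pipeline stages named in the abstract can be illustrated with a minimal NumPy-only sketch. It is not the authors' implementation, and several parts are simplified stand-ins chosen to keep the example self-contained: linear interpolation replaces the unspecified imputation method, a moving-average additive decomposition replaces STL (which fits trend and seasonality with LOESS), and bagged least-squares models replace the unspecified ensemble regressor. The two source names, the synthetic hourly series, the 24-step period, and all function names are hypothetical. The decomposition here is computed over the whole series, so future values leak into the features; an operational forecast would need a causal or training-only decomposition.

```python
import numpy as np

def fill_gaps(x):
    """Linear interpolation over NaN gaps (simple stand-in for temporal imputation)."""
    x = np.asarray(x, dtype=float).copy()
    t = np.arange(x.size)
    ok = ~np.isnan(x)
    x[~ok] = np.interp(t[~ok], t[ok], x[ok])
    return x

def decompose(x, period):
    """Additive split x = trend + seasonal + resid (moving average, not LOESS)."""
    pad = period // 2
    padded = np.pad(x, (pad, period - 1 - pad), mode="edge")
    trend = np.convolve(padded, np.ones(period) / period, mode="valid")
    detrended = x - trend
    phase = np.arange(x.size) % period
    means = np.array([detrended[phase == p].mean() for p in range(period)])
    seasonal = (means - means.mean())[phase]
    return trend, seasonal, x - trend - seasonal

def top_lags(x, max_lag, k):
    """The k lags in 1..max_lag with the largest absolute autocorrelation."""
    xc = x - x.mean()
    acf = np.array([np.dot(xc[:-lag], xc[lag:]) for lag in range(1, max_lag + 1)])
    acf = acf / np.dot(xc, xc)
    return sorted((np.argsort(-np.abs(acf))[:k] + 1).tolist())

def lag_matrix(x, lags):
    """Row i holds x[t - lag] for each lag, with t = max(lags) + i."""
    start = max(lags)
    return np.column_stack([x[start - lag: x.size - lag] for lag in lags])

def fit_bagged_linear(X, Y, n_models=25, seed=0):
    """Average of least-squares fits on bootstrap resamples; Y may have many columns."""
    rng = np.random.default_rng(seed)
    A = np.column_stack([np.ones(len(X)), X])
    coefs = []
    for _ in range(n_models):
        idx = rng.integers(0, len(A), len(A))
        W, *_ = np.linalg.lstsq(A[idx], Y[idx], rcond=None)
        coefs.append(W)
    return np.mean(coefs, axis=0)

def r2_score(y, y_hat):
    """Aggregate R^2 over all outputs: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean(axis=0)) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic hourly contributions for two hypothetical sources (30 days).
rng = np.random.default_rng(42)
t = np.arange(24 * 30)
traffic = 20 + 0.01 * t + 8 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 1, t.size)
biomass = 15 + 5 * np.cos(2 * np.pi * t / 24) + rng.normal(0, 1, t.size)
traffic[100:104] = np.nan  # simulated instrument outage

period = 24
sources = np.column_stack([fill_gaps(traffic), fill_gaps(biomass)])
lags = top_lags(sources[:, 0], max_lag=48, k=3)

# Lagged trend, seasonal, and residual components of every source as features.
features = [lag_matrix(comp, lags)
            for j in range(sources.shape[1])
            for comp in decompose(sources[:, j], period)]
X = np.hstack(features)
Y = sources[max(lags):]  # targets: all source contributions at time t

split = int(0.8 * len(X))  # chronological train/test split
W = fit_bagged_linear(X[:split], Y[:split])
pred = np.column_stack([np.ones(len(X) - split), X[split:]]) @ W
score = r2_score(Y[split:], pred)
print(f"aggregate R^2 on held-out tail: {score:.3f}")
```

Swapping `decompose` for a true STL implementation and `fit_bagged_linear` for a tree-based multi-output regressor would bring the sketch closer to the kind of pipeline the abstract describes, while the feature construction and chronological split could stay unchanged.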

Received: 26 Feb 2026 – Discussion started: 27 Apr 2026

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: open (until 11 Aug 2026)

Viewed

Total article views: 348 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
256	71	21	348	22	21

Views and downloads (calculated since 27 Apr 2026)

Month	HTML	PDF	XML	Total
Apr 2026	80	24	4	108
May 2026	164	28	14	206
Jun 2026	5	9	2	16
Jul 2026	7	10	1	18

Viewed (geographical distribution)

Total article views: 337 (including HTML, PDF, and XML) Thereof 337 with geography defined and 0 with unknown origin.

Latest update: 20 Jul 2026

Short summary

Accurate sectoral emission forecasting is challenging due to the multi-component nature of air quality data. This work addresses that challenge by combining time-series decomposition with ensemble learning to enhance predictive performance. It proposes a sector-specific emission forecasting method that uses source-apportioned, speciated PM2.5 concentration data from three locations. The framework enables robust quantification of sectoral emissions for new incoming speciated datasets.
