https://doi.org/10.5194/egusphere-2025-5677
09 Dec 2025
Status: this preprint is open for discussion and under review for Atmospheric Measurement Techniques (AMT).

A Framework for Dynamic Hyper-local Source Apportionment using Low-cost Sensors for Real-time Policy Action

Shoubhik Chakraborty, Sachchida Nand Tripathi, Davender Sethi, Akanksha Lakra, Ambasht Kumar, Pranjal Kumar Srivastava, Nihal Thukarama Rao, Avnish Tripathi, and Purushottam Kar

Abstract. Particulate matter, toxic gases and other airborne pollutants pose a significant risk to human health and the environment. Identifying the sources of air pollution, termed Source Apportionment (SA), needs to be done in real time, both to understand the dynamics of the contributing sources and to enable policymakers to frame effective regulatory measures to curb air pollution. The unit deployed to implement the SA framework at a given location must also be cost-effective, so that a dense network of such units can feasibly cover a wide geographical area. Low-cost air quality monitoring sensors have become popular in this regard. Our proposed framework combines low-cost air quality sensor units with machine learning models to provide a low-cost, real-time solution for SA. Multi-output regression models, which are supervised machine learning models, are used for this purpose. Reference-grade instruments are used to learn both the calibration models for the low-cost sensors and the multi-output regression models for SA. Once the calibration and multi-output regression models have been learnt during training, the framework allows the low-cost sensor units to be deployed in the field as standalone devices that collect on-field data and store it on a remote server over a wireless network. This data can be pulled at the user end, calibrated and then fed to the trained model to obtain the SA results in terms of the relative abundance of the different sources in ambient air.
Mean Absolute Error (MAE) is used to measure the accuracy of the predicted relative abundance of the different sources, while Spearman's Rank Order Correlation Coefficient (SROCC) and Normalized Discounted Cumulative Gain (NDCG) are used to estimate how well the proposed approach ranks the sources in the correct order of relative abundance. Extensive experiments using data gathered from two different environments in the city of Lucknow, India, show the robustness of the proposed approach for real-time SA. An MAE of less than 5 % is obtained in predicting the relative abundance of most of the organic as well as elemental sources, while SROCC values greater than 0.75 and NDCG values greater than 0.85, obtained for all the sources, show that the proposed framework also performs very well in predicting most of the sources in the correct order of their actual contribution to air pollution.
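The three metrics named in the abstract can be illustrated with a small standard-library sketch. The abstract does not give the exact definitions or the axis along which the scores are computed, so the choices here are assumptions: SROCC is taken as the Pearson correlation of average ranks, NDCG uses a linear gain with a log2 discount, and all three are computed across sources for a single time step. The source shares in the example are made up for illustration and are not results from the paper.

```python
import math

def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted shares."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def average_ranks(values):
    """1-based ascending ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman(y_true, y_pred):
    """SROCC as the Pearson correlation of the two rank vectors.
    Undefined (division by zero) if either vector is constant."""
    rt, rp = average_ranks(y_true), average_ranks(y_pred)
    n = len(rt)
    mt, mp = sum(rt) / n, sum(rp) / n
    cov = sum((a - mt) * (b - mp) for a, b in zip(rt, rp))
    st = math.sqrt(sum((a - mt) ** 2 for a in rt))
    sp = math.sqrt(sum((b - mp) ** 2 for b in rp))
    return cov / (st * sp)

def ndcg(y_true, y_pred):
    """NDCG with linear gain: true shares act as relevance scores,
    and the predicted shares determine the ranking being scored."""
    def dcg(order):
        return sum(y_true[i] / math.log2(pos + 2) for pos, i in enumerate(order))
    predicted_order = sorted(range(len(y_pred)), key=lambda i: -y_pred[i])
    ideal_order = sorted(range(len(y_true)), key=lambda i: -y_true[i])
    return dcg(predicted_order) / dcg(ideal_order)

# Hypothetical relative abundances (%) of four sources at one time step.
true_share = [40, 30, 20, 10]
pred_share = [35, 32, 15, 18]

mae = mean_absolute_error(true_share, pred_share)   # 5.0 percentage points
srocc = spearman(true_share, pred_share)            # approximately 0.8
score = ndcg(true_share, pred_share)                # close to, but below, 1
```

In this toy case the two smallest sources are swapped in the predicted order, which lowers SROCC noticeably but barely moves NDCG, because NDCG discounts mistakes near the bottom of the ranking.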

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: open (until 14 Jan 2026)

Latest update: 09 Dec 2025
Short summary
This paper proposes a novel source apportionment paradigm that predicts the relative contributions of different air-pollution sources using a machine-learning framework applied to data obtained from low-cost sensor units. A key strength of this approach is that it can support a dense network of low-cost sensor units that spans wide geographical areas and provides source apportionment results in real time, helping policymakers take timely regulatory action to curb air pollution.
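The calibrate-then-predict flow described in the abstract can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the abstract does not name the calibration or regression models, so a per-channel least-squares line stands in for calibration and a k-nearest-neighbour average stands in for the multi-output regressor. The two sensor channels, the two source categories and every number are hypothetical.

```python
import math

def fit_linear_calibration(raw, reference):
    """Fit reference ~ slope * raw + intercept for one sensor channel."""
    n = len(raw)
    mean_x, mean_y = sum(raw) / n, sum(reference) / n
    sxx = sum((x - mean_x) ** 2 for x in raw)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(raw, reference))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def calibrate(sample, calibrations):
    """Apply each channel's (slope, intercept) to one raw sample."""
    return [s * x + b for x, (s, b) in zip(sample, calibrations)]

def knn_predict(train_features, train_targets, query, k=2):
    """Multi-output kNN regression: average the target vectors of the
    k training rows closest to the query (Euclidean distance, unscaled)."""
    nearest = sorted(range(len(train_features)),
                     key=lambda i: math.dist(train_features[i], query))[:k]
    n_outputs = len(train_targets[0])
    return [sum(train_targets[i][j] for i in nearest) / k
            for j in range(n_outputs)]

# Training phase: low-cost unit co-located with reference instruments.
raw_train = [[10, 4], [20, 8], [30, 2], [40, 6]]        # hypothetical raw readings
ref_train = [[21, 2], [41, 4], [61, 1], [81, 3]]        # hypothetical reference readings
source_fractions = [[0.7, 0.3], [0.6, 0.4],             # hypothetical shares of two
                    [0.3, 0.7], [0.2, 0.8]]             # sources per training row

calibrations = [
    fit_linear_calibration([row[c] for row in raw_train],
                           [row[c] for row in ref_train])
    for c in range(2)
]
train_features = [calibrate(row, calibrations) for row in raw_train]

# Deployment phase: a raw field sample is calibrated, then apportioned.
field_sample = [35, 5]
predicted = knn_predict(train_features, source_fractions,
                        calibrate(field_sample, calibrations))
```

Averaging neighbours has the convenient property that the predicted shares still sum to one whenever the training shares do; other multi-output regressors may need an explicit normalisation step to guarantee that.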