A Framework for Dynamic Hyper-local Source Apportionment using Low-cost Sensors for Real-time Policy Action
Abstract. The presence of particulate matter, toxic gases and other pollutants in the air poses a significant risk to human health and the environment. Identifying the different sources of air pollution, termed Source Apportionment (SA), needs to be done in real time in order to understand the dynamics of the contributing sources and to enable policy makers to frame effective regulatory measures to curb air pollution. The unit deployed for implementing the SA framework at a particular location must also be cost-effective, so that it becomes feasible to create a dense network of such units and thus cover a wide geographical area. The use of low-cost air quality monitoring sensors has become popular in this regard. In our proposed framework we use low-cost air quality sensor units in conjunction with machine learning models to develop a low-cost real-time solution for SA. Multi-output regression models, which are supervised machine learning models, are used for this purpose. Reference Grade Instruments are used for learning calibration models for the low-cost sensors as well as the multi-output regression models for SA. Once the calibration and multi-output regression models are learnt during training, the proposed framework allows the low-cost sensors to be deployed in the field as standalone devices, where they collect on-field data and store it in a remote server through a wireless network. This data can be pulled at the user end, calibrated, and then fed to the trained model to obtain the SA results in terms of the relative abundance of the different sources in ambient air.
Mean Absolute Error (MAE) has been used as the metric to measure accuracy in predicting the relative abundance of different sources, while Spearman's Rank Order Correlation Coefficient (SROCC) and Normalized Discounted Cumulative Gain (NDCG) have been used to estimate how well the proposed approach ranks the different sources in the correct order of abundance. Extensive experimentation using data gathered from two different environments in the city of Lucknow, India shows the robustness of the proposed approach in performing real-time SA. An MAE of less than 5 % has been obtained in predicting the relative abundance of most of the organic as well as elemental sources, while SROCC values greater than 0.75 and NDCG values greater than 0.85 obtained for all the sources show that the proposed framework also performs very well in predicting most of the sources in the correct order of their actual contribution to air pollution.
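For reference when checking the reported numbers, the three evaluation metrics named in the abstract can be sketched in plain Python. The sample source fractions below are hypothetical, not taken from the manuscript:

```python
import math

def mae(actual, predicted):
    """Mean Absolute Error between actual and predicted source fractions."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def spearman_rho(actual, predicted):
    """Spearman rank correlation, using the no-ties formula 1 - 6*sum(d^2)/(n(n^2-1))."""
    def ranks(xs):
        order = sorted(range(len(xs)), key=lambda i: xs[i])
        r = [0] * len(xs)
        for rank, i in enumerate(order, 1):
            r[i] = rank
        return r
    ra, rp = ranks(actual), ranks(predicted)
    n = len(actual)
    d2 = sum((a - b) ** 2 for a, b in zip(ra, rp))
    return 1 - 6 * d2 / (n * (n * n - 1))

def ndcg(actual, predicted):
    """NDCG with the actual abundances as relevance, ranked by predicted abundance."""
    order = sorted(range(len(actual)), key=lambda i: predicted[i], reverse=True)
    dcg = sum(actual[i] / math.log2(pos + 2) for pos, i in enumerate(order))
    idcg = sum(r / math.log2(pos + 2) for pos, r in enumerate(sorted(actual, reverse=True)))
    return dcg / idcg

# Hypothetical relative abundances for five sources (fractions summing to ~1)
actual = [0.40, 0.25, 0.20, 0.10, 0.05]
predicted = [0.35, 0.30, 0.15, 0.12, 0.08]
```

With these inputs the MAE is 0.04 (4 %), and because the predicted ranking matches the actual ranking exactly, both SROCC and NDCG equal 1.0.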
Recommendation to the editor:
The manuscript presents a potentially interesting proof-of-concept for real-time, low-cost SA, but it is not ready for publication in its current form. The fundamental issues of scope (single city, effectively single week of training and 2 days of test data), lack of instrument calibration, the absence of a proper validation set, the use of sub-detection-limit and potentially degraded sensor data in predictive models, and the overstated generalization claims represent major methodological and scientific integrity concerns that cannot be resolved easily. A substantially expanded underlying dataset — spanning multiple seasons, cities, or years — combined with the methodological corrections described below would be required before the work could be reconsidered for publication.
Review Summary
This manuscript proposes a machine learning framework for real-time, hyper-local air pollution source apportionment (SA) using low-cost air quality (LCAQ) sensors co-located with Reference Grade Instruments (RGI). Multi-output linear regression models are trained on calibrated LCAQ data, with SA outputs from Positive Matrix Factorization (PMF) applied to HR-ToF-AMS and Xact-625i data serving as ground truth. The framework is evaluated at two sites in Lucknow, India, during a single month (October 2023). While the research topic is timely and relevant to the scope of Atmospheric Measurement Techniques, the manuscript has fundamental deficiencies in experimental scope, methodological rigor, and scientific integrity of its claims that preclude publication in its current form.
Major Concerns
1. Overstated Claims of Robustness and Generalization
The introduction asserts that the objective of this study is “enhancing model robustness, calibration fidelity, and generalizability across diverse sensing environments” (p. 4, l. 115) and that "extensive experimentation has been carried out to validate the robustness of the proposed framework" (p. 5, l. 133). However, the entire study consists of two deployments within a single city during a single month. Each site yields approximately 350 observations at 30-minute resolution (roughly one week of data) (p. 15, l. 332-333). With an 80/20 train-test split, the test set contains only ~70 observations per site, i.e. less than two days of data. These sample sizes are wholly insufficient to support claims of robustness or generalization. To substantiate such claims, the authors would need to collect data across multiple seasons at the same sites, across multiple cities, or across multiple years. The current field monitoring scope supports only a proof-of-concept demonstration; it does not constitute the full investigation needed to support a manuscript proposing a new framework.
2. Micro-Aethalometer Calibration Not Described
The Micro-Aethalometer (AethLabs AE-51) is deployed alongside the LCAQ sensor unit and its BC measurements are used as a key predictor. The paper states only that the device "comes lab calibrated" (p. 2, l. 176) and cites a field intercomparison study, but provides no site-specific calibration or cross-validation against the EBAM or other RGI at either deployment site. Notably, the abstract of that very field intercomparison study reads, “Real-world quality assurance of these instruments should be performed through field IC against reference instruments with longer durations in areas of slowly changing eBC concentration” (Alas et al., 2020). For a manuscript focused on calibration methodology, the absence of any field calibration assessment of this instrument is a significant omission that must be rectified.
3. Potential NO₂ Sensor Degradation Mid-Deployment
Figure 3 displays the NO₂ calibration time series at Site-B and visually suggests a step-change in sensor behavior around 16–17 October 2023, consistent with sensor performance degradation. The authors must formally test this by computing the calibration correlation coefficient separately for data before and after this break-point. If a statistically significant change is identified, NO₂ data collected after that date should be excluded from training and inference. Given the already limited dataset size (~350 observations), such an exclusion could substantially constrain the usable training data and, given the acknowledged below-detection-limit issues with SO₂, reduce the available inputs to three reliably calibrated pollutants (CO, O₃, and PM₂.₅). If the break-point is confirmed, the NO₂ data in its current form would be unusable, and the implications for model validity must be fully addressed.
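The suggested break-point check can be sketched as follows: compute the calibration correlation in each segment, then compare the two coefficients with a Fisher r-to-z test, a standard procedure for testing whether two correlations differ. Segment sizes and values in the usage note are illustrative, not taken from the manuscript:

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def fisher_z_diff(r1, n1, r2, n2):
    """Two-sample z statistic comparing correlations r1 (n1 points) and r2 (n2 points)."""
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (z1 - z2) / se
```

For example, if the pre-break segment gives r = 0.9 over 50 points and the post-break segment gives r = 0.3 over 50 points, `fisher_z_diff(0.9, 50, 0.3, 50)` exceeds 1.96, i.e. the change is significant at the 5 % level and the post-break data should be excluded.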
4. Use of Below-Detection-Limit SO₂ Data in Predictive Models
The authors acknowledge that all ambient SO₂ concentrations at both sites fall below the minimum detection limit (MDL) of the Alphasense B4 sensor (~5 ppb), resulting in R² values near zero compared to reference grade instrumentation (p. 10, l. 261-263; 0.03 at Site-B, −0.17 at Site-C). Yet SO₂ appears to be retained as a predictor in the SA regression models, and the authors do not explicitly state that all SO₂ measurements are excluded. Using a pollutant whose measurements all fall below the MDL as a model predictor introduces noise rather than signal and violates fundamental principles of analytical measurement.
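A minimal screening step of the kind the authors should apply before feature selection is sketched below; the 5 ppb MDL is the value cited in the manuscript, while the helper name and masking convention are ours:

```python
MDL_SO2_PPB = 5.0  # minimum detection limit cited for the Alphasense B4 SO2 sensor

def screen_mdl(values, mdl):
    """Return (fraction of readings below the MDL, readings with sub-MDL values masked)."""
    below = sum(1 for v in values if v < mdl)
    masked = [v if v >= mdl else None for v in values]
    return below / len(values), masked
```

A channel whose sub-MDL fraction is at or near 1.0 (as reported here for SO₂ at both sites) carries no usable signal and should be dropped from the predictor set entirely rather than masked point-wise.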
5. Cross-Sensitivity of Alphasense B4 Sensors Not Validated
The authors note (p. 34, l. 607–608) that the Alphasense B4 auxiliary electrode compensates for cross-sensitivities from interfering gases; however, the calibration model (Equation 1) does not include any terms to explicitly test for or remove residual cross-sensitivities among pollutant channels. The authors must either demonstrate empirically that cross-sensitivities are negligible in their deployment context (e.g., by showing that adding cross-sensitivity terms explains negligible additional variance), or account for these effects within the calibration framework. This is particularly important given the poor NO₂ and SO₂ calibration results. Until such an analysis is provided, the currently developed LCAQ calibration model cannot be considered fit for use.
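One way to perform the suggested empirical check is a nested-model comparison: fit the calibration with and without candidate interferent channels and compare the gain in R². The sketch below uses synthetic data and a plain normal-equations OLS fit, not the authors' Equation 1:

```python
def ols_r2(rows, y):
    """R^2 of an OLS fit y ~ 1 + predictors, via normal equations and Gaussian elimination."""
    X = [[1.0] + list(r) for r in rows]  # prepend intercept column
    n, k = len(X), len(X[0])
    A = [[sum(X[i][p] * X[i][q] for i in range(n)) for q in range(k)] for p in range(k)]
    b = [sum(X[i][p] * y[i] for i in range(n)) for p in range(k)]
    for col in range(k):                 # forward elimination with partial pivoting
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for c in range(col, k):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    beta = [0.0] * k                     # back substitution
    for r in range(k - 1, -1, -1):
        beta[r] = (b[r] - sum(A[r][c] * beta[c] for c in range(r + 1, k))) / A[r][r]
    yhat = [sum(be * xi for be, xi in zip(beta, row)) for row in X]
    ybar = sum(y) / n
    ss_res = sum((a - p) ** 2 for a, p in zip(y, yhat))
    ss_tot = sum((a - ybar) ** 2 for a in y)
    return 1 - ss_res / ss_tot

# Synthetic example: the reference value depends on both the target channel (x1)
# and an interferent channel (x2); omitting x2 leaves unexplained variance.
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [1, 0, 2, 1, 0, 2, 1, 0]
y = [1 + 2 * a + 3 * b for a, b in zip(x1, x2)]
r2_with_cross = ols_r2([[a, b] for a, b in zip(x1, x2)], y)
r2_without = ols_r2([[a] for a in x1], y)
```

If `r2_with_cross - r2_without` is negligible on the real co-location data, cross-sensitivity can be dismissed empirically; if not, the interferent terms belong in the calibration model.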
6. Absence of a Validation Set — Risk of Overfitting and Contaminated Evaluation
The paper describes only a training/test split (80/20). No separate validation set is used for hyperparameter tuning or model selection (Appendix C evaluates multiple regression models including gradient boosting and random forests with tuned hyperparameters). Without a held-out validation set, the reported test-set performance risks being optimistic: if any model selection decisions were informed — even informally — by test-set behavior, the test set no longer represents a true independent evaluation. Standard practice in supervised machine learning requires a three-way split (e.g., 60/20/20 or 70/15/15) when multiple models or hyperparameters are compared. The authors must clarify their model selection procedure and, if the test set was used in any capacity for model comparison, must provide corrected evaluation on a genuinely held-out partition.
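The conventional remedy can be sketched as a chronological three-way split, which both separates model selection from final evaluation and avoids temporal leakage in autocorrelated time-series data such as these 30-minute observations. The fractions below are illustrative:

```python
def chrono_split(records, train_frac=0.6, val_frac=0.2):
    """Chronological train/validation/test split; records are assumed time-ordered.

    Keeping the split chronological (rather than random) prevents temporally
    adjacent, highly correlated observations from leaking across partitions.
    """
    n = len(records)
    i = int(n * train_frac)
    j = int(n * (train_frac + val_frac))
    return records[:i], records[i:j], records[j:]
```

Model and hyperparameter selection would then use only the validation partition, with the test partition touched exactly once for the final reported numbers.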
7. Use of R² for Calibration Assessment is Inappropriate; Spearman Correlation Required
The paper uses Pearson R² to evaluate sensor calibration (Figures 3 and 4) and the feature correlation heat map (Figure 6). The R² values reported for NO₂ (0.22 at Site-B) and SO₂ (−0.17 at Site-C) are strikingly poor and should disqualify these sensors from use, yet the manuscript characterizes the calibration as "reasonably good" (p. 10, l. 259-260) and proceeds to include these pollutants as model inputs. Pearson R² is sensitive to outliers and is poorly suited to noisy electrochemical sensor data. The authors should replace Figures 3, 4, and 6 with Spearman rank-order correlation coefficients, which are more appropriate for this type of data and provide an honest characterization of calibration quality.
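The sensitivity difference is easy to demonstrate: for monotone data containing a single extreme outlier, the Spearman coefficient remains 1.0 while the Pearson coefficient collapses. The values below are synthetic:

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman rank correlation (no-ties formula)."""
    def ranks(xs):
        order = sorted(range(len(xs)), key=lambda i: xs[i])
        r = [0] * len(xs)
        for rank, i in enumerate(order, 1):
            r[i] = rank
        return r
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

x = list(range(1, 11))
y = [2.0 * i for i in x[:-1]] + [200.0]  # perfectly monotone, one extreme outlier
```

Here `spearman(x, y)` is exactly 1.0 because the ordering is preserved, while `pearson(x, y)` drops well below 0.7 because a single outlier dominates the covariance, which is precisely why rank-based statistics are the more honest choice for noisy electrochemical sensor data.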
References: