the Creative Commons Attribution 4.0 License.
Enhancing Low-Cost PM2.5 Sensor Reliability Through Multi-Model Calibration Against a Beta Attenuation Monitor
Abstract. Accurate particulate matter (PM2.5) monitoring using low-cost sensors requires careful consideration of meteorological influences and calibration against reference instruments. This study evaluates the performance of a low-cost optical sensor through an outdoor co-location experiment with a Beta Attenuation Monitor (BAM 1022). Raw measurements showed strong temporal agreement but substantial overestimation, particularly under high relative humidity, which induced hygroscopic particle growth and amplified light-scattering responses. Correlation and regression analyses confirmed humidity as the dominant environmental factor affecting low-cost sensor bias, while temperature exhibited only minor influence. To address these limitations, multiple calibration models (including Linear Regression, Random Forest, Gradient Boosting, Support Vector Regression, and an Adaptive-blend ensemble) were developed and assessed. Nonlinear and ensemble-based models significantly improved accuracy, reducing MAE from 17.40 μg/m³ (uncalibrated) to 5.85 μg/m³ after calibration. These findings demonstrate the necessity of environmental compensation and model-based correction for reliable low-cost PM2.5 monitoring and support their integration into high-resolution air quality networks.
Status: final response (author comments only)
- RC1: 'Comment on egusphere-2025-6203', Anonymous Referee #1, 14 Feb 2026
- RC2: 'Comment on egusphere-2025-6203', Anonymous Referee #2, 24 Feb 2026
Review of “Enhancing Low-Cost PM₂.₅ Sensor Reliability Through Multi-Model Calibration Against a Beta Attenuation Monitor” by Adira Prayoga et al.
Adira Prayoga et al. present an evaluation of a low-cost sensor against the BAM 1020 as a reference instrument. The authors investigate potential interferences and biases in the low-cost sensor PM2.5 measurements and apply numerous calibration models to correct the low-cost sensor data relative to the BAM 1020. The authors conclude that machine learning models performed the best, and that there is a need to correct for environmental (namely RH) conditions to improve the low-cost sensor measurements.
While it is clear that the authors have put together a comprehensive study and done a lot of work, the novelty of the work is not clear. It is well established that low-cost optical particle sensors need to be corrected for the effects of relative humidity, and that non-linear calibration methods perform better than a linear correction. The authors claim that a ‘comprehensive calibration framework’ is presented; however, it is not clear to me what the actual calibration framework is. For example, which of the presented models was found to perform the best and therefore be included in the framework? Random Forest? Support Vector Regression? While it may be useful to present and compare different calibration model approaches, the study needs to go a step further and articulate which approach is recommended and how it improves on previous work.
It is because of this lack of novelty that I am recommending that this work be rejected in its current format. The paper needs substantial work to reframe the research and address some of the structural and analytical issues outlined below.
Further comments
Section 2.1: why are the GP2Y and SDS011 sensors described when only the ZH03 sensor is seemingly tested in Section 3?
The BAM 1020 is utilized as a reference instrument for this study. While the BAM 1020 is a suitable choice, there are a couple of limitations that the authors have not mentioned or considered in their analysis. Firstly, while the BAM 1020 does have FEM designation, this applies only to 24-hr average measurements. Using it for 1-hr measurements comes with greater measurement uncertainty, and the authors need to consider this in their analysis. Secondly, the BAM 1020 can also suffer from interference from moisture. While this is typically mitigated with the smart heater, it is not infallible; especially in locations with high ambient RH, moisture on the tape can result in a negative interference. As there is no information on the sampling location, it is hard to judge whether this may have been important.
There is too much technical detail in the methods section, especially Sections 2.2 and 2.3.
In Section 2.3, there is too much irrelevant detail on the data collection and communication procedures. Key information that is missing includes the actual sampling location, the length of the sampling period, and the dates (time of year).
Section 2.4: the removal of outliers based solely on statistics is, in my opinion, problematic. High-concentration peaks can be real, and capturing them is important for a monitoring network from a public health perspective. Therefore, these should be left in unless there is a good operational reason (e.g., a flow error) to remove them, as the correction models need to be able to handle them.
Sections 3.1 and 3.2 could be significantly shortened and focused on the key findings. A short description of the meteorology during the study would also be useful, to give context.
There is little reference to previous work in Section 3, notably in Section 3.3, where the authors may wish to review previous studies' approaches to addressing hygroscopic growth using κ-Köhler theory.
Section 3.4: it is difficult to assess the performance of the models based on the presented data, as it is not clear whether separate training and validation datasets were used. That is, was the model performance tested on a separate (validation) dataset not used to develop the model? For example, was the (admittedly very short) time series in Fig. 13 used to develop the calibration model, or a different dataset? That a machine learning model performs well on the dataset used to build it is not surprising; the real test is how it performs on a different dataset.
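The held-out evaluation asked for here can be illustrated with a minimal sketch on synthetic data. Everything in this sketch is an assumption for illustration (the RH-dependent bias model, variable names, and the simple interaction-term regression stand in for the manuscript's actual data and models): the point is only that the calibration is fitted on an earlier portion of the record and scored solely on the later, unseen portion.

```python
import numpy as np

# Synthetic co-location record (illustrative, not the authors' data):
# a skewed "true" PM2.5 series and a raw sensor inflated at high RH.
rng = np.random.default_rng(0)
n = 1000
rh = rng.uniform(30, 95, n)                                   # relative humidity (%)
bam = rng.gamma(2.0, 10.0, n)                                 # reference PM2.5 (ug/m3)
sensor = bam * (1 + 0.02 * (rh - 50)) + rng.normal(0, 2, n)   # RH-inflated raw sensor

# Simple calibration model: intercept, sensor, RH, and a sensor*RH interaction.
X = np.column_stack([np.ones(n), sensor, rh, sensor * rh])

split = int(0.7 * n)                    # chronological split: fit on the first 70%
coef, *_ = np.linalg.lstsq(X[:split], bam[:split], rcond=None)

pred = X[split:] @ coef                 # score ONLY on the held-out later 30%
mae_raw = np.mean(np.abs(bam[split:] - sensor[split:]))
mae_cal = np.mean(np.abs(bam[split:] - pred))
print(f"raw MAE: {mae_raw:.2f}  calibrated MAE (held-out): {mae_cal:.2f}")
```

Reporting the calibrated MAE from the held-out period, rather than from the fitting period, is what would make the comparison in the manuscript convincing.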
Citation: https://doi.org/10.5194/egusphere-2025-6203-RC2
Title: Enhancing Low-Cost PM2.5 Sensor Reliability Through Multi-Model Calibration Against a Beta Attenuation Monitor
I appreciate the opportunity to review this manuscript. The topic is important and relevant for the community: improving low-cost PM2.5 sensing performance is valuable for broader air-quality monitoring applications. However, after assessment, I recommend rejection in its current form because the manuscript does not yet establish a clear novel contribution and requires substantial reworking throughout.
The manuscript reports calibration of a low-cost PM2.5 sensor against BAM data using multiple statistical/ML models. While performance improvements are reported, the current study has foundational issues in novelty framing, contextual reporting, analytical rigor, and presentation. In my view, the required changes are not incremental; they involve redesign of the framing and substantial redevelopment of the analysis.
Major Concerns
Section 1 Introduction
The manuscript frames a key gap around humidity/temperature effects on low-cost optical PM sensor response. This relationship is already well documented in existing literature, and the authors summarise some literature in Table 1. The current manuscript does not clearly demonstrate what is new beyond known behaviour and commonly used calibration workflows. A publishable revision would need a substantially stronger, explicit novelty claim (e.g., transferability, understudied geographical region, or clearly differentiated methodological advance) and direct evidence supporting that claim.
Section 2 Methods
The Methods section contains excessive low-level electronics/build detail in the main text, which disrupts the scientific narrative and reads more like thesis documentation. The manuscript should be restructured so core measurement/calibration methods are the focus, with implementation-level hardware details moved to Supplementary Material or a repository.
Key experimental information is missing from the methods.
Sensors used
In the text (line 118), three sensors (GP2Y, ZH03 and SDS011) are mentioned as being used in the study, and all are included in Table 2. However, only data from the ZH03 are included in the analysis and reported on.
Outlier removal
The authors report removing outliers using the Z-score method and state in the text (line 260) that they removed isolated concentration spikes that were not erroneous data. High-concentration events are arguably more important in air pollution contexts because of the relationship between high concentrations and negative health impacts. No information was provided on the amount of data removed, and no sensitivity analysis was completed to assess the impact on model performance. It would also be important to know whether the removed events were in the training or test data.
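The concern above can be made concrete with a small synthetic sketch (the |z| < 3 threshold is a common convention, not necessarily the one the authors used, and the injected spikes are illustrative): on a right-skewed PM2.5-like series, a global Z-score filter removes only the highest values, so genuine pollution events are exactly what gets discarded, which is why the fraction removed should be reported and a sensitivity analysis performed.

```python
import numpy as np

# Right-skewed synthetic PM2.5 series with a few "genuine" pollution spikes.
rng = np.random.default_rng(1)
pm = rng.gamma(2.0, 10.0, 2000)          # baseline concentrations (ug/m3)
pm[::200] += 150                          # inject 10 real high-concentration events

# Global Z-score filter with the common |z| < 3 rule.
z = (pm - pm.mean()) / pm.std()
keep = np.abs(z) < 3

removed = pm[~keep]
print(f"removed {100 * (~keep).mean():.1f}% of points")
print(f"smallest removed value: {removed.min():.0f} ug/m3")
```

Because the series is non-negative and skewed, the lower threshold is never reached, so every removed point comes from the upper tail, including all the injected events a health-focused network would most want to capture.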
Site information/Experimental Approach
There is no information on the site of the study: where did the co-location take place? What site type classification would be appropriate (e.g., roadside, kerbside, background)? No start/end dates for the co-location period are reported in the methods, so seasonal and geographical context is lacking.
Lack of train/test splitting information and hyperparameter searching approach.
There is insufficient clarity in the modelling descriptions, as no dependent variables are explicitly mentioned. There is a reference to a physics-based linear correction for the blended model (line 317), but this is not described beforehand and appears to be the previous linear scaling.
Section 3 Results & Discussion
The analysis lacks focus on the reported novelty and could be improved for clarity of interpretation. There appears to be no discussion of the wider literature and limited discussion of the reported results; it is mostly a results section.
Section 3.1
There is extensive reporting of raw values and summaries of the raw values relative to temperature and humidity, and the manuscript also highlights some example periods in Figure 7. This takes up significant space within the section that would be better spent assessing the calibration models and providing deeper analysis of the effects of the environmental variables using modelling approaches.
The assessment of the raw data presented in Table 3 is difficult to interpret. Reporting the difference between the two instruments would help, and including the number of values in each stratification bin would also be informative.
By eye, there may be some kind of temporal misalignment, as the sensor appears to respond an hour before the BAM in Figure 7.
Section 3.3
This section looks at correlations between environmental variables (temperature and humidity) and the raw readings of the sensor and the BAM. I am unsure of the relevance of this approach. A more meaningful assessment would be to plot the bias on the y-axis against the environmental variable on the x-axis.
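The suggested diagnostic can be sketched on synthetic data as follows (the linear RH-growth bias model and variable names are assumptions for illustration): regressing the bias (sensor minus reference) against RH isolates the humidity dependence of the error, whereas correlating raw readings with RH mixes the error with the true ambient signal.

```python
import numpy as np

# Synthetic co-location data with an RH-dependent sensor bias (illustrative).
rng = np.random.default_rng(2)
n = 500
rh = rng.uniform(30, 95, n)                                   # relative humidity (%)
bam = rng.gamma(2.0, 10.0, n)                                 # reference PM2.5 (ug/m3)
sensor = bam * (1 + 0.02 * (rh - 50)) + rng.normal(0, 2, n)   # RH-inflated raw sensor

# Diagnostic: bias on the y-axis, environmental variable on the x-axis.
bias = sensor - bam
slope, intercept = np.polyfit(rh, bias, 1)
r = np.corrcoef(rh, bias)[0, 1]
print(f"bias vs RH: slope = {slope:.2f} ug/m3 per %RH, r = {r:.2f}")
```

A positive slope here directly quantifies the hygroscopic-growth artefact per %RH, which is the quantity a calibration model must remove.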
Section 3.4
I would like to see some time series of the calibrated data during these example periods, or the data coloured by humidity/temperature, for example. Overall, I would like to see a more rigorous assessment of the calibrated models with respect to temperature and humidity effects, which is the novelty claim of the manuscript, for example during the highlighted extreme periods in Figure 7, if the train/test split allows. There is the reported overall MAE in Figure 12. By eye, there is no meaningful difference between the Random Forest, Gradient Boosting, SVR and Adaptive-blend models. A difference of 2 µg m⁻³ in MAE between the linear model and the more computationally intensive ML approaches is of note. Claims of substantial improvement are not demonstrated in this manuscript. In addition, the differences in the reported metrics between the ML models are very small. There is no discussion of these points in the manuscript.
Section 4 Conclusion and Future works
I am not convinced that the conclusions drawn are compatible with the reported analysis/experimental approach.
Line 588 says “a comprehensive calibration framework was developed”. No definition of a framework was outlined in this manuscript and, from my understanding of a framework, one was not demonstrated. What has been presented here is a calibration report.
The post-calibration time series presented in Figure 13 covers a two-day period. This is not long enough to describe the data as showing measurements across varied pollution episodes and meteorological conditions; that was not shown in the data.
General Comments
There are formatting inconsistencies throughout the manuscript regarding symbols/subscripts.
Table and figure captions lack the information needed to stand alone.
An assessment of English and grammar is needed throughout.
There appears to be generative AI used in the text production and for generating plots. This needs to be acknowledged as per journal guidelines.
There is a lack of discussion, and therefore relevant literature is not included.
Closing remarks
Although the topic is relevant, the current manuscript does not provide a sufficiently clear or well-supported novel contribution, and the analysis requires substantial redevelopment rather than routine revision. I therefore recommend rejection at this stage. I encourage the authors to consider preparing a substantially redesigned manuscript with a clearly defined novelty, fuller contextual reporting, and deeper scientific discussion.