the Creative Commons Attribution 4.0 License.
Quality Control of Historical Temperature Data for Pure Rotational Raman Lidar Using Density-Based Clustering
Abstract. This paper is the first to use two density-based clustering algorithms, Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and Ordering Points To Identify the Clustering Structure (OPTICS), to screen the historical detection data of pure rotational Raman (PRR) temperature measurement lidar. To address the issues of threshold radius in DBSCAN and output value processing in OPTICS, three automated processing methods suitable for PRR temperature lidar detection data characteristics are proposed. These methods are the k-distance Fast Change Region (k-FCR) Method based on the DBSCAN, the Reachability Distance (RD) Method based on the OPTICS, and the Predecessor Divergence (PD) Method based on the OPTICS. Using these three methods, quality control was conducted on the historical data detected by a PRR temperature lidar from March 2021 to May 2024, demonstrating the effectiveness of these methods in automated quality control of historical data and the complementary nature of their quality control effects. Under the reliable threshold set in this paper, compared with the traditional Signal-to-Noise Ratio (SNR) method, the RD method increased the True Positive Rate (TPR) by 23.7 %, the PD method increased the True Negative Rate (TNR) by 6.0 %, and the k-FCR method increased the TPR by 72.1 % at the cost of some TNR loss. The influence of the SNR of data points and the number of continuous observation profiles on the quality control results is also explored, providing further references for the selection and application of different quality control methods. The methods provided in this paper will allow relevant researchers to filter PRR lidar data of atmospheric temperature according to their own needs, and these methods can also be applied to the automated processing of future atmospheric temperature data from detection networks.
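The TPR and TNR figures quoted in the abstract can be read as confusion-matrix rates for a keep/reject mask. The following is a minimal sketch of that bookkeeping, not the authors' implementation. It assumes that a point counts as "reliable" when its deviation from a reference temperature is within a threshold (the referee comment below cites 5 K and 10 K against ERA5), that TPR is the fraction of reliable points kept, and that TNR is the fraction of unreliable points rejected. The function name `confusion_rates`, its parameters, and the sample values are all hypothetical.

```python
def confusion_rates(lidar_k, reference_k, kept, threshold_k=5.0):
    """Return (TPR, TNR) for a quality-control keep/reject mask.

    Assumed definitions (not taken from the paper):
      reliable point   : |lidar - reference| <= threshold_k
      true positive    : reliable point that the method kept
      true negative    : unreliable point that the method rejected
    """
    tp = fn = tn = fp = 0
    for t_lidar, t_ref, is_kept in zip(lidar_k, reference_k, kept):
        reliable = abs(t_lidar - t_ref) <= threshold_k
        if reliable and is_kept:
            tp += 1
        elif reliable:
            fn += 1  # reliable point wrongly rejected
        elif is_kept:
            fp += 1  # unreliable point wrongly kept
        else:
            tn += 1
    # Guard against empty classes so the rates stay well defined.
    tpr = tp / (tp + fn) if (tp + fn) else float("nan")
    tnr = tn / (tn + fp) if (tn + fp) else float("nan")
    return tpr, tnr


# Made-up temperatures in kelvin, for illustration only.
lidar = [250.0, 252.0, 260.0, 240.0, 231.0, 229.0]
reference = [251.0, 250.0, 248.0, 241.0, 230.0, 241.0]
kept = [True, True, True, False, True, False]

tpr, tnr = confusion_rates(lidar, reference, kept, threshold_k=5.0)
print(f"TPR = {tpr:.2f}, TNR = {tnr:.2f}")
```

Under these definitions a method that keeps more points tends to raise TPR while lowering TNR, which is one way to read the trade-off reported for the k-FCR method.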
Status: open (extended)
RC1: 'Comment on egusphere-2024-2650', Anonymous Referee #1, 13 Jan 2025
The paper provides no physical foundation for the applicability of the considered algorithms or for the physical consistency of the results. In its present form, it does not state which specific scientific questions the proposed algorithms are meant to address (why a quality-control check based on a black-box algorithm should be preferable to traditional Cal-Val efforts based on comparison with independent measurements), nor does it give a comprehensive physical motivation for applying these algorithms to the considered data set. I cannot believe that the only motivation for applying these approaches to long-term series of temperature profile measurements is that “… Atmospheric temperature, similar to wind fields, also exhibits temporal and spatial continuity, making density-based clustering methods potential for screening PRR lidar temperature detection data”, which is the only motivation the authors put forward. The authors state that: “Density-based clustering classifies data based solely on its features without the aid of external data sources, and it is a form of unsupervised learning.” This is a very strong statement that, in my opinion, can be endorsed only if substantial physical evidence is provided, which I do not see in the paper. The paper is primarily dedicated to illustrating and applying two density-based clustering methods for quality control of temperature lidar data, with the only argument being that this approach has been used in the literature for wind lidar data. Most of the paper is dedicated to describing the algorithms. To validate the algorithms and assess the quality-control effects of the different methods, the authors define reliable data as data deviating from ERA5 by at most 5 K or at most 10 K.
The authors correctly identify that “… detailed and high-resolution temperature structure observations are urgently needed for studying atmospheric energy balance, dynamics, and chemistry …, The troposphere … requires precise temperature detection for studying the atmospheric transport of pollutants … and for short to medium term weather forecasting”. However, the chosen thresholds are far from adequate for validating temperature measurements against these scientific objectives. The paper should undergo substantial modifications along the lines specified above, with a substantial expansion of the text to carefully explain the physical motivations behind the application of the present algorithms and to substantiate the assumptions made in validating the results. I will be happy to reconsider the paper after these fundamental revisions.
Citation: https://doi.org/10.5194/egusphere-2024-2650-RC1