Improved estimation of diurnal variations in near-global PBLH through a hybrid WCT and transfer learning approach
Abstract. Diurnal variations in planetary boundary layer height (PBLH) are closely linked to weather, climate, and environmental processes. However, estimating its diurnal behavior at large scale remains challenging because of insufficient observations and the limitations of operational retrieval algorithms. This study proposed a deep learning framework based on an attention-augmented residual neural network to estimate diurnal variations in near-global PBLH, incorporating profiles from a non-sun-synchronous lidar (Cloud-Aerosol Transport System: CATS) and meteorological fields. The framework can largely address the issue of multi-layer structures in space-borne lidar signals, significantly improving the accuracy of PBLH retrieval during morning and evening (with an accuracy improvement approaching 40 % compared to traditional algorithms). Because observations aligned with CATS orbits are scarce, a model was first pre-trained using pseudo-labels from reanalysis and then transferred to observation-based target labels. The transfer model demonstrated superior performance in most regions and periods, outperforming conventional algorithms in capturing PBLH magnitude and its diurnal variations, though under-performing over complex terrain. Further assessments over different land covers showed that the PBLH and diurnal patterns estimated by the transfer-trained model were highly consistent with those from radiosondes, surpassing reanalysis outputs. In terms of model capability, the potential PBLH derived from the wavelet covariance transformation and the temperature profiles emerged as dominant factors, with contributions exhibiting diurnal patterns. Overall, this work proposes a novel framework for large-scale PBLH estimation and provides insights for improving conventional algorithms, particularly through integrating remote sensing and machine learning.
This study reports a new hybrid approach that combines the wavelet covariance transform (WCT) with a transfer-learning deep residual network to estimate the diurnal evolution of planetary boundary layer height (PBLH) at near-global scale from the non-sun-synchronous CATS spaceborne lidar. The proposed transfer-learning strategy is both novel and practical, effectively leveraging the large-sample coverage of reanalysis products and the high accuracy of radiosonde measurements. The methodology is sound, the experiments are comprehensive, and the results clearly demonstrate substantial performance gains over conventional algorithms. The findings are valuable for improving boundary-layer parameterizations and advancing our understanding of global PBL diurnal variability, and they fall well within AMT’s scope.
However, several aspects require deeper discussion and additional evidence to further strengthen the reliability of some results and enhance the paper’s scientific contribution and technical impact. I therefore recommend acceptance after minor revision.
Specific comments and suggestions
Pretraining with MERRA-2-constrained pseudo-labels inherently injects reanalysis systematic biases into the learned representation. Even after fine-tuning with 4,662 radiosonde-matched samples, residual biases may persist (as also suggested by the closer agreement of the pretrained model with MERRA-2). Please discuss and, if possible, quantify this effect and its impact on the final estimates.
The manuscript states that 2016 data are used for pretraining and that the transfer stage uses a 4,000/662 split, but it does not clarify whether the test set is strictly separated by station and time window. To avoid information leakage from adjacent or same-station samples, please clarify the split strategy and consider a station- and season-stratified (or leave-one-site-out) evaluation.
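To make the request concrete, a leave-one-site-out split could look roughly like the sketch below. It is only an illustration: the station labels and the function name `leave_one_site_out` are hypothetical and not taken from the manuscript, and the authors' data structures will differ.

```python
from collections import defaultdict


def leave_one_site_out(station_ids):
    """Yield (held_out_station, train_idx, test_idx) for each station.

    All samples from the held-out station go to the test set, so no
    station ever appears in both training and test data.
    """
    by_station = defaultdict(list)
    for i, sid in enumerate(station_ids):
        by_station[sid].append(i)
    for sid in sorted(by_station):
        test_idx = by_station[sid]
        train_idx = sorted(
            i for s, idx in by_station.items() if s != sid for i in idx
        )
        yield sid, train_idx, test_idx


# Hypothetical station labels for six matched lidar-radiosonde samples.
stations = ["A", "A", "B", "C", "B", "C"]
folds = list(leave_one_site_out(stations))
for held_out, train_idx, test_idx in folds:
    print(held_out, train_idx, test_idx)
```

A season-stratified variant would group on a (station, season) key instead of the station alone; the same grouping idea applies to separating adjacent time windows.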
You conclude that the model exhibits a weaker afternoon decay and better agreement with radiosondes, while morning correlations are slightly lower but accuracy is higher (smaller bias). I recommend a joint assessment of statistical consistency (e.g., R, MAE) and physical consistency (e.g., decay rate after the diurnal peak and the timing of the peak). This would help reconcile performance metrics with expected PBL diurnal physics.
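One possible way to compute the two physical diagnostics mentioned above is sketched here. The definitions are my own assumptions rather than the authors': the peak is taken as the maximum of the mean diurnal cycle, and the decay rate as the least-squares slope of PBLH against local time from the peak onward. The sample values are synthetic.

```python
def diurnal_diagnostics(hours, pblh):
    """Return (peak_hour, peak_height, decay_rate) for one mean diurnal cycle.

    decay_rate is the least-squares slope (metres per hour) fitted to the
    samples from the peak onward; a negative value means the layer decays.
    """
    i_peak = max(range(len(pblh)), key=lambda i: pblh[i])
    h = hours[i_peak:]
    z = pblh[i_peak:]
    if len(h) < 2:
        # Peak is the last sample, so no decay slope can be fitted.
        return hours[i_peak], pblh[i_peak], float("nan")
    h_mean = sum(h) / len(h)
    z_mean = sum(z) / len(z)
    num = sum((a - h_mean) * (b - z_mean) for a, b in zip(h, z))
    den = sum((a - h_mean) ** 2 for a in h)
    return hours[i_peak], pblh[i_peak], num / den


# Synthetic 3-hourly mean cycle (local time in hours, PBLH in metres).
hours = [6, 9, 12, 15, 18, 21]
pblh = [300.0, 800.0, 1400.0, 1700.0, 1100.0, 500.0]
peak_hour, peak_height, decay_rate = diurnal_diagnostics(hours, pblh)
print(peak_hour, peak_height, decay_rate)
```

Applying the same function to the model, radiosonde, and reanalysis cycles would give directly comparable peak times and decay rates alongside R and MAE.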
The diagnosis for reduced performance over high-elevation and desert regions is reasonable, but quantitative uncertainty information is missing. Please provide uncertainty maps and/or tables, for example seasonal and hourly MAE/bias boxplots specifically for these regions.
Since “candidate PBLH” and temperature together contribute >50% of the importance, with LST/elevation next and TAB/WCT shape metrics relatively low, the conclusions should more explicitly articulate the implication for classical algorithms: rather than further refining profile-shape heuristics, incorporating thermodynamic and terrain-related diagnostics appears more beneficial.
In Figure 10, please annotate each land-cover curve with the peak time and amplitude to aid interpretation. For Table 1 (hourly R/MAE/RMSE), consider adding 95% confidence intervals or bootstrap-based uncertainty bands.
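For the confidence intervals, a percentile bootstrap over matched pairs would be sufficient. The sketch below is a minimal illustration with synthetic data; the function name, the number of resamples, and the error magnitudes are assumptions chosen for the example, and in practice the resampling would be done separately within each hourly bin.

```python
import numpy as np


def bootstrap_mae_ci(pred, obs, n_boot=2000, alpha=0.05, seed=0):
    """Return (mae, lower, upper) using a percentile bootstrap.

    Matched (pred, obs) pairs are resampled with replacement, so each
    replicate keeps every prediction tied to its own observation.
    """
    pred = np.asarray(pred, dtype=float)
    obs = np.asarray(obs, dtype=float)
    abs_err = np.abs(pred - obs)
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, abs_err.size, size=(n_boot, abs_err.size))
    boot_mae = abs_err[idx].mean(axis=1)
    lower, upper = np.quantile(boot_mae, [alpha / 2, 1 - alpha / 2])
    return float(abs_err.mean()), float(lower), float(upper)


# Synthetic example: 200 matched samples with roughly 300 m random error.
demo_rng = np.random.default_rng(1)
obs = demo_rng.uniform(500.0, 2000.0, size=200)
pred = obs + demo_rng.normal(0.0, 300.0, size=200)
mae, lower, upper = bootstrap_mae_ci(pred, obs)
print(f"MAE = {mae:.0f} m, 95% CI = [{lower:.0f}, {upper:.0f}] m")
```

The same resampled indices can be reused to obtain intervals for R and RMSE, which keeps the three metrics in Table 1 consistent with one another.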
Although 480 m is identified as the optimal dilation, it would be helpful to include in the Supplement a systematic comparison table showing (i) five-peak hit rates under different dilation factors and (ii) “largest-peak only” vs. “multi-peak candidate” performance.
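To illustrate the comparison I have in mind, the sketch below applies a Haar-wavelet covariance transform to a synthetic backscatter profile and contrasts the "largest-peak only" and "multi-peak candidate" choices. Everything here is an assumption made for illustration (the 60 m grid, the idealized profile, the 200 m tolerance, and the function names); it is not the authors' implementation.

```python
import numpy as np


def haar_wct(z, signal, dilation):
    """Haar wavelet covariance transform on a uniform height grid.

    Positive values mark heights where the signal decreases with altitude.
    """
    z = np.asarray(z, dtype=float)
    signal = np.asarray(signal, dtype=float)
    dz = z[1] - z[0]
    half = int(round(dilation / (2 * dz)))  # samples in each half-window
    w = np.full(z.size, np.nan)
    for i in range(half, z.size - half):
        below = signal[i - half:i].sum()
        above = signal[i:i + half].sum()
        w[i] = (below - above) * dz / dilation
    return w


def candidate_heights(z, w, k=5):
    """Heights of the k strongest positive local maxima of w."""
    idx = [i for i in range(1, len(w) - 1)
           if w[i] > 0 and w[i] > w[i - 1] and w[i] > w[i + 1]]
    idx.sort(key=lambda i: w[i], reverse=True)
    return [float(z[i]) for i in idx[:k]]


def hit(candidates, reference, tol=200.0):
    """True if any candidate lies within tol metres of the reference."""
    return any(abs(c - reference) <= tol for c in candidates)


# Synthetic profile: PBL top at 1500 m plus a brighter elevated layer
# between 3000 and 3600 m that produces the strongest gradient.
z = np.arange(0, 6000, 60)
signal = np.where(z < 1500, 0.6, 0.2)
signal[(z >= 3000) & (z < 3600)] = 1.0

w = haar_wct(z, signal, dilation=480.0)
cands = candidate_heights(z, w, k=5)
print(cands)
print(hit(cands[:1], 1500.0), hit(cands, 1500.0))
```

In this idealized case the largest peak locks onto the top of the elevated layer, while the multi-peak candidate list still contains the true PBL top. Tabulating such hit rates against radiosonde PBLH for several dilation values would give the systematic comparison requested above.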