the Creative Commons Attribution 4.0 License.
Improved estimation of diurnal variations in near-global PBLH through a hybrid WCT and transfer learning approach
Abstract. Diurnal variations in planetary boundary layer height (PBLH) are closely linked to weather, climate, and environmental processes. However, estimating the diurnal behavior of PBLH at large scales remains challenging because observations are insufficient and operational retrieval algorithms have limitations. This study proposes a deep learning framework based on an attention-augmented residual neural network to estimate diurnal variations in near-global PBLH, incorporating profiles from a non-sun-synchronous lidar (Cloud-Aerosol Transport System, CATS) and meteorological fields. The framework largely addresses the issue of multi-layer structures in space-borne lidar signals, significantly improving the accuracy of PBLH retrieval during morning and evening (with accuracy improvements approaching 40 % compared to traditional algorithms). Because observations aligned with CATS orbits are insufficient, a model was first pre-trained using pseudo-labels from reanalysis and then transferred to observation-based target labels. The transfer model demonstrated superior performance in most regions and periods, outperforming conventional algorithms in capturing PBLH magnitude and its diurnal variations, though it underperformed over complex terrain. Further assessments over different land covers showed that the PBLH and diurnal patterns estimated by the transfer-trained model were highly consistent with those from radiosondes, surpassing reanalysis outputs. Regarding model capability, the potential PBLH derived from the wavelet covariance transform (WCT) and temperature profiles emerged as the dominant factors, with contributions exhibiting diurnal patterns. Overall, this work proposes a novel framework for large-scale PBLH estimation and provides insights for improving conventional algorithms, particularly through integrating remote sensing and machine learning.
Status: final response (author comments only)
- RC1: 'Comment on egusphere-2025-4918', Anonymous Referee #1, 07 Nov 2025
- AC1: 'Reply on RC1', Yarong Li, 11 Jan 2026
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2025/egusphere-2025-4918/egusphere-2025-4918-AC1-supplement.pdf
- RC2: 'Comment on egusphere-2025-4918', Anonymous Referee #2, 23 Nov 2025
The manuscript presents a deep learning framework using an attention-augmented ResNet with transfer learning to estimate diurnal variations of near-global planetary boundary layer height (PBLH) from CATS lidar, explicitly addressing multi-layer structures in spaceborne backscatter profiles. The topic is timely, and the approach is interesting and potentially impactful. Please see the detailed comments below.
Specific comments:
Line 129: Please spell out the date as “January 10” rather than using an abbreviation, for consistency with the rest of the manuscript.
Lines 205–210: The accuracy metric is defined as the fraction of predictions within 500 m of radiosonde PBLH. While 500 m is a reasonable tolerance for some regimes, it can be relatively large for diurnal PBLH over land. To demonstrate robustness, please justify the choice of the 500 m threshold or provide a sensitivity analysis showing how key conclusions change with the tolerance chosen.
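The sensitivity analysis requested here could be run by recomputing the hit-rate metric over a range of tolerances. The following is a minimal sketch, not the authors' code: the function name `accuracy_within`, the array names, and the sample values are hypothetical, and it assumes that accuracy means the fraction of estimates whose absolute difference from the radiosonde PBLH does not exceed the tolerance.

```python
import numpy as np

def accuracy_within(pred_m, obs_m, tol_m):
    """Fraction of estimates within tol_m metres of the reference PBLH."""
    pred_m = np.asarray(pred_m, dtype=float)
    obs_m = np.asarray(obs_m, dtype=float)
    return float(np.mean(np.abs(pred_m - obs_m) <= tol_m))

# Hypothetical values in metres, for illustration only.
pred = np.array([800.0, 1500.0, 400.0, 2200.0])
obs = np.array([1000.0, 1450.0, 900.0, 1600.0])

# Recompute the metric for several tolerances to see how it responds.
sensitivity = {tol: accuracy_within(pred, obs, tol) for tol in (100, 200, 300, 500)}
```

If the ranking of the methods being compared stays the same across tolerances, the conclusions are unlikely to depend on the 500 m choice.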
Line 269: Please specify the interpolation method used to map MERRA2 meteorological profiles onto the 84 CATS bins.
Lines 293–296: In the transfer-learning stage, the transfer-training set comprises 4,000 samples and the remaining 662 samples serve as a common test set. Please describe how you minimized spatial and temporal leakage between training and test sets. For example, indicate whether you used station-wise or region-wise splits, any temporal separation, and provide summaries/maps of the train/test distributions to verify independence.
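A station-wise split of the kind mentioned in this comment can be sketched as follows. This is an illustration under stated assumptions, not the manuscript's procedure: the function `station_wise_split`, the station labels, and the test fraction are all hypothetical. The idea is to draw whole stations into the test set so that no station contributes samples to both sets.

```python
import numpy as np

def station_wise_split(station_ids, test_fraction=0.15, seed=0):
    """Return boolean train/test masks such that no station is in both sets."""
    station_ids = np.asarray(station_ids)
    rng = np.random.default_rng(seed)
    stations = np.unique(station_ids)
    rng.shuffle(stations)  # random order of stations, reproducible via seed
    n_test = max(1, int(round(test_fraction * stations.size)))
    test_mask = np.isin(station_ids, stations[:n_test])
    return ~test_mask, test_mask

# Hypothetical station label for each matched sample.
ids = np.array(["A", "A", "B", "C", "C", "C", "D", "E", "E", "F"])
train_mask, test_mask = station_wise_split(ids, test_fraction=0.3)
```

The split is by station count, so the resulting sample fraction can differ from `test_fraction` when stations have unequal numbers of matchups. Temporal separation would need an additional constraint on the sample dates.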
Line 348: You indicate ocean profiles were removed during pre-training due to limited radiosonde matchups, yet Fig. 5 shows results over oceanic areas. Please clarify whether the model was trained only on land but applied over oceans at inference.
Lines 372–376: Please elaborate on why the model performs more poorly from April to September. If available, add supporting analyses or references.
Lines 380–381: The permutation importance approach is appropriate, but shuffling individual features across samples in a sequence task can yield unrealistic feature combinations when predictors are correlated (e.g., temperature and local time).
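One common way to reduce the problem described in this comment is to permute correlated predictors jointly, as a group, so that their mutual relationship is preserved within each shuffled sample. The sketch below is a generic illustration and not the authors' implementation: the function `grouped_permutation_importance`, the toy model, and the grouping are hypothetical, and importance is measured as the increase in mean absolute error.

```python
import numpy as np

def grouped_permutation_importance(predict, X, y, groups, n_repeats=10, seed=0):
    """Mean increase in MAE when each group of columns is permuted jointly."""
    rng = np.random.default_rng(seed)
    base_mae = np.mean(np.abs(predict(X) - y))
    scores = {}
    for name, cols in groups.items():
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            order = rng.permutation(X.shape[0])
            # The same row order is applied to every column in the group.
            X_perm[:, cols] = X[order][:, cols]
            increases.append(np.mean(np.abs(predict(X_perm) - y)) - base_mae)
        scores[name] = float(np.mean(increases))
    return scores

# Toy setup (hypothetical): the target depends on columns 0 and 1 only.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = X[:, 0] + X[:, 1]
predict = lambda A: A[:, 0] + A[:, 1]
scores = grouped_permutation_importance(
    predict, X, y, {"temperature_and_local_time": [0, 1], "unused": [2]}
)
```

Grouping gives one score per group rather than per feature, so it trades resolution for realism of the permuted inputs.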
Lines 455–457: Given that absolute PBLH magnitudes vary substantially across regions and seasons, please report relative bias metrics in addition to absolute errors. This will better reflect performance where PBLH is small or large.
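The relative metrics requested here can be computed alongside the absolute ones. The following minimal sketch uses hypothetical names and values; it assumes the relative bias is defined as the summed difference divided by the summed reference PBLH, which is one of several possible definitions, and it requires strictly positive reference values.

```python
import numpy as np

def relative_metrics(pred_m, obs_m):
    """Absolute and relative bias; obs_m must be strictly positive."""
    pred_m = np.asarray(pred_m, dtype=float)
    obs_m = np.asarray(obs_m, dtype=float)
    diff = pred_m - obs_m
    return {
        "bias_m": float(diff.mean()),
        "relative_bias_pct": float(100.0 * diff.sum() / obs_m.sum()),
        "mean_abs_relative_error_pct": float(100.0 * np.mean(np.abs(diff) / obs_m)),
    }

# Hypothetical shallow and deep boundary layers with the same absolute bias.
shallow = relative_metrics([300.0, 500.0], [200.0, 400.0])
deep = relative_metrics([2100.0, 2300.0], [2000.0, 2200.0])
```

In this toy case both regimes have a 100 m bias, but the relative bias is about 33 % for the shallow case and under 5 % for the deep case, which is the distinction the comment asks the authors to report.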
Line 584: Please specify the source of the land surface type categories used.
Please review the manuscript for tense consistency.
Citation: https://doi.org/10.5194/egusphere-2025-4918-RC2
- AC2: 'Reply on RC2', Yarong Li, 11 Jan 2026
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2025/egusphere-2025-4918/egusphere-2025-4918-AC2-supplement.pdf
Viewed
| HTML | PDF | XML | Total | Supplement | BibTeX | EndNote |
|---|---|---|---|---|---|---|
| 201 | 65 | 24 | 290 | 30 | 18 | 18 |
This study reports a new hybrid approach that combines the wavelet covariance transform (WCT) with a transfer-learning deep residual network to estimate the diurnal evolution of planetary boundary layer height (PBLH) at near-global scale from the non-sun-synchronous CATS spaceborne lidar. The proposed transfer-learning strategy is both novel and practical, effectively leveraging the large-sample coverage of reanalysis products and the high accuracy of radiosonde measurements. The methodology is sound, the experiments are comprehensive, and the results clearly demonstrate substantial performance gains over conventional algorithms. The findings are valuable for improving boundary-layer parameterizations and advancing our understanding of global PBL diurnal variability, and they fall well within AMT’s scope.
However, several aspects require deeper discussion and additional evidence to further strengthen the reliability of some results and enhance the paper’s scientific contribution and technical impact. I therefore recommend acceptance after minor revision.
Specific comments and suggestions
Pretraining with MERRA-2-constrained pseudo-labels inherently injects reanalysis systematic biases into the learned representation. Even after fine-tuning with 4,662 radiosonde-matched samples, residual biases may persist (as also suggested by the closer agreement of the pretrained model with MERRA-2). Please discuss and, if possible, quantify this effect and its impact on the final estimates.
The manuscript states that 2016 data are used for pretraining and that the transfer stage uses a 4,000/662 split, but it does not clarify whether the test set is strictly separated by station and time window. To avoid information leakage from adjacent or same-station samples, please clarify the split strategy and consider a station- and season-stratified (or leave-one-site-out) evaluation.
You conclude that the model exhibits a weaker afternoon decay and better agreement with radiosondes, while morning correlations are slightly lower but accuracy is higher (smaller bias). I recommend a joint assessment of statistical consistency (e.g., R, MAE) and physical consistency (e.g., decay rate after the diurnal peak and the timing of the peak). This would help reconcile performance metrics with expected PBL diurnal physics.
The diagnosis for reduced performance over high-elevation and desert regions is reasonable, but quantitative uncertainty information is missing. Please provide uncertainty maps and/or tables, for example seasonal and hourly MAE/bias boxplots specifically for these regions.
Since “candidate PBLH” and temperature together contribute >50% of the importance, with LST/elevation next and TAB/WCT shape metrics relatively low, the conclusions should more explicitly articulate the implication for classical algorithms: rather than further refining profile-shape heuristics, incorporating thermodynamic and terrain-related diagnostics appears more beneficial.
In Figure 10, please annotate each land-cover curve with the peak time and amplitude to aid interpretation. For Table 1 (hourly R/MAE/RMSE), consider adding 95% confidence intervals or bootstrap-based uncertainty bands.
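The bootstrap-based uncertainty suggested for Table 1 could be obtained with a percentile bootstrap over the matched samples in each hour. This is a minimal sketch with hypothetical names, not the authors' procedure; it assumes the samples can be treated as independent, which may not hold for matchups from the same station or day.

```python
import numpy as np

def bootstrap_mae_ci(pred_m, obs_m, n_boot=2000, alpha=0.05, seed=0):
    """Sample MAE with a percentile-bootstrap (1 - alpha) confidence interval."""
    abs_err = np.abs(np.asarray(pred_m, dtype=float) - np.asarray(obs_m, dtype=float))
    rng = np.random.default_rng(seed)
    # Each row of idx is one resample (with replacement) of the sample indices.
    idx = rng.integers(0, abs_err.size, size=(n_boot, abs_err.size))
    boot_mae = abs_err[idx].mean(axis=1)
    lower, upper = np.quantile(boot_mae, [alpha / 2, 1 - alpha / 2])
    return float(abs_err.mean()), float(lower), float(upper)

# Hypothetical values in metres for a single hour bin.
mae, lower, upper = bootstrap_mae_ci([900.0, 1200.0, 1500.0, 700.0],
                                     [1000.0, 1100.0, 1800.0, 750.0])
```

The same resampled indices can be reused to obtain intervals for R and RMSE, so that the three metrics in a table row share one set of resamples.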
Although 480 m is identified as the optimal dilation, it would be helpful to include in the Supplement a systematic comparison table showing (i) five-peak hit rates under different dilation factors and (ii) “largest-peak only” vs. “multi-peak candidate” performance.
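For readers unfamiliar with the dilation parameter discussed in this comment, the sketch below shows a Haar wavelet covariance transform in its standard textbook form, applied to a synthetic profile. It is not the authors' implementation: the function `haar_wct`, the 60 m uniform grid, and the step-shaped signal are assumptions made for illustration, and it does not include the multi-peak candidate selection. With this sign convention the transform is largest where the signal decreases most sharply with height.

```python
import numpy as np

def haar_wct(signal, z_m, dilation_m):
    """Haar wavelet covariance transform on a uniform height grid (metres)."""
    signal = np.asarray(signal, dtype=float)
    z_m = np.asarray(z_m, dtype=float)
    dz = z_m[1] - z_m[0]
    half = int(round(dilation_m / (2.0 * dz)))  # bins in each half-window
    wct = np.full(signal.size, np.nan)          # edges stay undefined
    for i in range(half, signal.size - half):
        below = signal[i - half:i].sum()
        above = signal[i:i + half].sum()
        wct[i] = (below - above) * dz / dilation_m
    return wct

# Synthetic profile: strong signal below 1500 m, weak signal above (assumed values).
z = np.arange(0, 84 * 60, 60)
profile = np.where(z < 1500, 1.0, 0.2)
wct = haar_wct(profile, z, dilation_m=480.0)
top_height_m = z[np.nanargmax(wct)]
```

Repeating the call with different `dilation_m` values on real profiles, and recording whether the reference height falls among the strongest peaks, would yield the kind of comparison table the comment asks for.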