Improved estimation of diurnal variations in near-global PBLH through a hybrid WCT and transfer learning approach

Li, Yarong; Liu, Zeyang; He, Jianjun

doi:10.5194/egusphere-2025-4918

Preprints

https://doi.org/10.5194/egusphere-2025-4918

15 Oct 2025

Abstract. Diurnal variations in planetary boundary layer height (PBLH) are closely linked to weather, climate, and environmental processes. However, estimating this diurnal behavior at large scales remains challenging because of insufficient observations and the limitations of operational retrieval algorithms. This study proposes a deep learning framework based on an attention-augmented residual neural network to estimate diurnal variations in near-global PBLH, incorporating profiles from a non-sun-synchronous lidar (the Cloud-Aerosol Transport System, CATS) and meteorological fields. The framework largely addresses the issue of multi-layer structures in space-borne lidar signals and significantly improves the accuracy of PBLH retrieval during morning and evening (with accuracy improvements approaching 40 % compared to traditional algorithms). Because observations aligned with CATS orbits are scarce, a model was first pre-trained using pseudo-labels from reanalysis and then transferred to observation-based target labels. The transfer model demonstrated superior performance in most regions and periods, outperforming conventional algorithms in capturing PBLH magnitude and its diurnal variations, though it underperformed over complex terrain. Further assessments over different land covers showed that the PBLH and diurnal patterns estimated by the transfer-trained model were highly consistent with those from radiosondes, surpassing reanalysis outputs. In terms of model capability, the potential PBLH derived from the wavelet covariance transformation (WCT) and the temperature profiles emerged as the dominant factors, with contributions exhibiting diurnal patterns. Overall, this work proposes a novel framework for large-scale PBLH estimation and provides insights for improving conventional algorithms, particularly through integrating remote sensing and machine learning.

Received: 04 Oct 2025 – Discussion started: 15 Oct 2025

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Download & links

Preprint (PDF, 2163 KB)

Supplement (5724 KB)

Status: final response (author comments only)

RC1: 'Comment on egusphere-2025-4918', Anonymous Referee #1, 07 Nov 2025
This study reports a new hybrid approach that combines the wavelet covariance transform (WCT) with a transfer-learning deep residual network to estimate the diurnal evolution of planetary boundary layer height (PBLH) at near-global scale from the non-sun-synchronous CATS spaceborne lidar. The proposed transfer-learning strategy is both novel and practical, effectively leveraging the large-sample coverage of reanalysis products and the high accuracy of radiosonde measurements. The methodology is sound, the experiments are comprehensive, and the results clearly demonstrate substantial performance gains over conventional algorithms. The findings are valuable for improving boundary-layer parameterizations and advancing our understanding of global PBL diurnal variability, and they fall well within AMT’s scope.

However, several aspects require deeper discussion and additional evidence to further strengthen the reliability of some results and enhance the paper’s scientific contribution and technical impact. I therefore recommend acceptance after minor revision.
Specific comments and suggestions
Pseudo-label bias from MERRA-2.

Pretraining with MERRA-2-constrained pseudo-labels inherently injects reanalysis systematic biases into the learned representation. Even after fine-tuning with 4,662 radiosonde-matched samples, residual biases may persist (as also suggested by the closer agreement of the pretrained model with MERRA-2). Please discuss and, if possible, quantify this effect and its impact on the final estimates.

Train/validation/test independence.

The manuscript states that 2016 data are used for pretraining and that the transfer stage uses a 4,000/662 split, but it does not clarify whether the test set is strictly separated by station and time window. To avoid information leakage from adjacent or same-station samples, please clarify the split strategy and consider a station- and season-stratified (or leave-one-site-out) evaluation.

Statistical vs. physical consistency of the afternoon decay.

You conclude that the model exhibits a weaker afternoon decay and better agreement with radiosondes, while morning correlations are slightly lower but accuracy is higher (smaller bias). I recommend a joint assessment of statistical consistency (e.g., R, MAE) and physical consistency (e.g., decay rate after the diurnal peak and the timing of the peak). This would help reconcile performance metrics with expected PBL diurnal physics.

Quantifying uncertainty over complex terrain and deserts.

The diagnosis for reduced performance over high-elevation and desert regions is reasonable, but quantitative uncertainty information is missing. Please provide uncertainty maps and/or tables, for example seasonal and hourly MAE/bias boxplots specifically for these regions.

Implications from feature importance.

Since “candidate PBLH” and temperature together contribute >50% of the importance, with LST/elevation next and TAB/WCT shape metrics relatively low, the conclusions should more explicitly articulate the implication for classical algorithms: rather than further refining profile-shape heuristics, incorporating thermodynamic and terrain-related diagnostics appears more beneficial.

Figure and table clarity.

In Figure 10, please annotate each land-cover curve with the peak time and amplitude to aid interpretation. For Table 1 (hourly R/MAE/RMSE), consider adding 95% confidence intervals or bootstrap-based uncertainty bands.
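The bootstrap-based uncertainty suggested here could be computed along the following lines. This is a minimal sketch of a percentile bootstrap for MAE, not the authors' procedure; the function name `bootstrap_mae_ci`, the number of resamples, and the synthetic `pred`/`obs` values are all assumptions made for illustration.

```python
import numpy as np

def bootstrap_mae_ci(pred, obs, n_boot=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the mean absolute error."""
    err = np.abs(np.asarray(pred, dtype=float) - np.asarray(obs, dtype=float))
    rng = np.random.default_rng(seed)
    # Resample the paired errors with replacement, n_boot times.
    idx = rng.integers(0, len(err), size=(n_boot, len(err)))
    maes = err[idx].mean(axis=1)
    lo, hi = np.quantile(maes, [alpha / 2, 1 - alpha / 2])
    return float(err.mean()), float(lo), float(hi)

# Synthetic stand-in for one hourly bin of model vs. radiosonde PBLH (metres).
rng = np.random.default_rng(1)
obs = rng.uniform(300.0, 2500.0, size=200)
pred = obs + rng.normal(0.0, 300.0, size=200)

mae, ci_lo, ci_hi = bootstrap_mae_ci(pred, obs)
print(f"MAE = {mae:.0f} m, 95% CI = [{ci_lo:.0f}, {ci_hi:.0f}] m")
```

Resampling individual matchups treats them as independent; if samples from the same station or overpass are correlated, resampling whole stations would be the more conservative choice.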

WCT dilation sensitivity.

Although 480 m is identified as the optimal dilation, it would be helpful to include in the Supplement a systematic comparison table showing (i) five-peak hit rates under different dilation factors and (ii) “largest-peak only” vs. “multi-peak candidate” performance.
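For readers unfamiliar with the technique, the dependence of WCT peaks on dilation can be illustrated with a Haar-wavelet transform applied to a synthetic two-layer profile. This is a minimal sketch under stated assumptions, not the manuscript's implementation: the 60 m bin spacing, the idealized profile, and the helper names `haar_wct` and `top_candidates` are hypothetical.

```python
import numpy as np

def haar_wct(z, signal, dilation):
    """Haar wavelet covariance transform.

    W(b) = (1/a) * [integral of f over (b - a/2, b) - integral over (b, b + a/2)],
    so W is large where the signal drops sharply with height.
    """
    dz = z[1] - z[0]
    half = int(round(dilation / (2 * dz)))  # bins in each half of the wavelet
    w = np.zeros(len(z))                    # bins too close to the edges stay 0
    for i in range(half, len(z) - half + 1):
        below = signal[i - half:i].sum()
        above = signal[i:i + half].sum()
        w[i] = (below - above) * dz / dilation
    return w

def top_candidates(z, w, n=5):
    """Heights of the n strongest local maxima of W, strongest first."""
    i = np.arange(1, len(w) - 1)
    peaks = i[(w[i] > w[i - 1]) & (w[i] >= w[i + 1])]
    strongest = peaks[np.argsort(w[peaks])[::-1][:n]]
    return z[strongest]

# Idealized profile: boundary layer up to 1500 m, residual/aerosol layer up to 3000 m.
z = np.arange(0, 5040, 60)  # 84 bins, assumed 60 m spacing
signal = np.where(z < 1500, 1.0, np.where(z < 3000, 0.4, 0.1))

w = haar_wct(z, signal, dilation=480)
candidates = top_candidates(z, w)
print(candidates)  # strongest gradient first
```

A "largest-peak only" retrieval would keep just the first candidate, whereas a multi-peak approach passes several candidates on to a later selection step; repeating the call with other `dilation` values gives the kind of comparison requested above.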
Citation: https://doi.org/10.5194/egusphere-2025-4918-RC1
RC2: 'Comment on egusphere-2025-4918', Anonymous Referee #2, 23 Nov 2025

The manuscript presents a deep learning framework using an attention-augmented ResNet with transfer learning to estimate diurnal variations of near-global planetary boundary layer height (PBLH) from CATS lidar, explicitly addressing multi-layer structures in spaceborne backscatter profiles. The topic is timely, and the approach is interesting and potentially impactful. Please see the detailed comments below.
Specific comments:

Line 129: Please spell out the date as “January 10” rather than using an abbreviation, for consistency with the rest of the manuscript.

Lines 205–210: The accuracy metric is defined as the fraction of predictions within 500 m of radiosonde PBLH. While 500 m is a reasonable tolerance for some regimes, it can be relatively large for diurnal PBLH over land. To demonstrate robustness, please justify the choice of the 500 m threshold or provide a sensitivity analysis showing how key conclusions change with the tolerance chosen.
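The requested sensitivity analysis amounts to recomputing the same hit-rate metric for several tolerances. The sketch below assumes the metric is the fraction of samples with an absolute error no larger than the tolerance; the function name `accuracy_within` and the five sample values are invented for illustration.

```python
import numpy as np

def accuracy_within(pred, obs, tol):
    """Fraction of predictions within `tol` metres of the reference PBLH."""
    diff = np.abs(np.asarray(pred, dtype=float) - np.asarray(obs, dtype=float))
    return float(np.mean(diff <= tol))

# Synthetic example values (metres), not taken from the manuscript.
pred = np.array([900.0, 1450.0, 2100.0, 600.0, 1800.0])
obs = np.array([1000.0, 1200.0, 1500.0, 650.0, 1400.0])

sensitivity = {tol: accuracy_within(pred, obs, tol) for tol in (200, 300, 500)}
for tol, acc in sensitivity.items():
    print(f"tolerance {tol} m: accuracy {acc:.2f}")
```

If the ranking of methods is unchanged across tolerances, the 500 m threshold is unlikely to be driving the conclusions.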

Line 269: Please specify the interpolation method used to map MERRA-2 meteorological profiles onto the 84 CATS bins.

Lines 293–296: In the transfer-learning stage, the transfer-training set comprises 4,000 samples and the remaining 662 samples serve as a common test set. Please describe how you minimized spatial and temporal leakage between training and test sets. For example, indicate whether you used station-wise or region-wise splits, any temporal separation, and provide summaries/maps of the train/test distributions to verify independence.
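One way to rule out same-station leakage is to assign whole stations, rather than individual samples, to either the training or the test set. The following is a minimal sketch of such a station-wise split; the function name `split_by_station`, the test fraction, and the toy station IDs are assumptions and do not describe how the manuscript's 4,000/662 split was made.

```python
import numpy as np

def split_by_station(station_ids, test_fraction=0.15, seed=0):
    """Return (train_idx, test_idx) such that no station appears in both sets."""
    station_ids = np.asarray(station_ids)
    stations = np.unique(station_ids)
    rng = np.random.default_rng(seed)
    rng.shuffle(stations)  # random order of stations, not of samples
    n_test = max(1, int(round(test_fraction * len(stations))))
    is_test = np.isin(station_ids, stations[:n_test])
    return np.flatnonzero(~is_test), np.flatnonzero(is_test)

# Toy data: 20 hypothetical stations with 5 matched samples each.
station_ids = np.repeat(np.arange(20), 5)
train_idx, test_idx = split_by_station(station_ids)
print(len(train_idx), len(test_idx))
```

The same idea extends to temporal separation by grouping on month or season instead of station ID.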

Line 348: You indicate ocean profiles were removed during pre-training due to limited radiosonde matchups, yet Fig. 5 shows results over oceanic areas. Please clarify whether the model was trained only on land but applied over oceans at inference.

Lines 372–376: Please elaborate on why the model performs more poorly from April to September. If available, add supporting analyses or references.

Lines 380–381: The permutation importance approach is appropriate, but shuffling individual features across samples in a sequence task can yield unrealistic feature combinations when predictors are correlated (e.g., temperature and local time).
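A common mitigation is to permute correlated predictors jointly, so that each shuffled group keeps its internal consistency. The sketch below illustrates this on a flat feature matrix with a toy model; it is an assumption-laden illustration rather than the manuscript's method, and the function name, the group names, and the stand-in `predict` function are hypothetical.

```python
import numpy as np

def grouped_permutation_importance(predict, X, y, groups, n_repeats=10, seed=0):
    """Mean increase in MAE when all columns of a group are shuffled together."""
    rng = np.random.default_rng(seed)
    base = np.mean(np.abs(predict(X) - y))
    scores = {}
    for name, cols in groups.items():
        increases = []
        for _ in range(n_repeats):
            Xp = X.copy()
            perm = rng.permutation(len(X))
            # Same row permutation for every column in the group.
            Xp[:, cols] = X[perm][:, cols]
            increases.append(np.mean(np.abs(predict(Xp) - y)) - base)
        scores[name] = float(np.mean(increases))
    return scores

# Toy data: columns 0 and 1 are strongly correlated, column 2 is unused.
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 3))
X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=500)

def predict(X):
    # Stand-in for a trained model.
    return 2.0 * X[:, 0] + X[:, 1]

y = predict(X)
scores = grouped_permutation_importance(
    predict, X, y, groups={"thermo": [0, 1], "other": [2]}
)
print(scores)
```

For a sequence model, the same joint shuffling would be applied to whole profiles or to groups of physically linked inputs.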

Lines 455–457: Given that absolute PBLH magnitudes vary substantially across regions and seasons, please report relative bias metrics in addition to absolute errors. This will better reflect performance where PBLH is small or large.
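The point can be made concrete with a small example: the same absolute bias is a much larger relative error when the boundary layer is shallow. The sketch below uses a relative mean bias defined as the mean error divided by the mean observed value; this definition, the function name, and the sample values are assumptions chosen for illustration.

```python
import numpy as np

def relative_mean_bias(pred, obs):
    """Mean bias as a percentage of the mean observed value."""
    pred = np.asarray(pred, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return float(100.0 * np.mean(pred - obs) / np.mean(obs))

# Synthetic shallow and deep regimes (metres), both with a +100 m absolute bias.
obs_shallow = np.array([300.0, 400.0, 500.0])
obs_deep = np.array([2000.0, 2400.0, 2600.0])

rel_shallow = relative_mean_bias(obs_shallow + 100.0, obs_shallow)
rel_deep = relative_mean_bias(obs_deep + 100.0, obs_deep)
print(f"shallow: {rel_shallow:.1f} %, deep: {rel_deep:.1f} %")
```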

Line 584: Please specify the source of the land surface type categories used.

Please review the manuscript for tense consistency.

Citation: https://doi.org/10.5194/egusphere-2025-4918-RC2

Supplement

https://doi.org/10.5194/egusphere-2025-4918-supplement

Viewed

Total article views: 238 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	Supplement	BibTeX	EndNote
160	56	22	238	26	17	17

Views and downloads (calculated since 15 Oct 2025)

Month	HTML	PDF	XML	Total
Oct 2025	85	12	8	105
Nov 2025	60	30	10	100
Dec 2025	15	14	4	33

Viewed (geographical distribution)

Total article views: 227 (including HTML, PDF, and XML), thereof 227 with geography defined and 0 with unknown origin.

Latest update: 19 Dec 2025

Short summary

An attention-augmented ResNet and transfer training are implemented to derive diurnal variations in near-global planetary boundary layer height (PBLH). The transfer-trained model shows superior performance compared to conventional algorithms and the non-transfer-trained model. The model predicts more reliable diurnal PBLH behavior, with daily amplitude and peak timing approaching radiosonde results.