Daily maps of Boundary Layer Height combining radiosonde, satellite, and reanalysis over Europe
Abstract. The height of the planetary boundary layer directly influences local and regional climatic phenomena, making its study and estimation of vital importance for environmental sciences. The main objective of this work was to create a gridded map of planetary boundary layer height across the European continent, with a spatial resolution of 25 km and monthly mean values at two synoptic hours (12:00 and 00:00 UTC). We implemented the regression kriging method by combining various data sources, including observations, climatic and topographic variables, and reanalysis data (ERA5), and different regression methods (linear, random forest, and gradient boosting) for the 2010–2020 period. At both UTC hours, combining reanalysis and topographic covariates with random forest regression provided the best performance. Then, we compared our seasonal predictions with reanalysis data and found a consistently higher spatio-temporal accuracy than that of the ERA5 reanalysis. For example, at 12:00 UTC, spatial variability in winter showed RMSE values ≤ 100 m, compared with ≥ 200 m for ERA5, while temporal variability in summer reached RMSE values ≤ 250 m, versus ≥ 300 m for ERA5. At 00:00 UTC, spatial variability in autumn achieved RMSE values ≤ 36 m, whereas ERA5 exhibited RMSE values ≥ 130 m.
The methodology was applied to a case study over Germany at a daily resolution. We obtained an accurate representation of boundary layer height, which was consistent with the variations in weather conditions. Results were notably better at 12:00 than at 00:00 UTC, mainly due to the limited number of available stations and the associated difficulty in resolving the stable boundary layer at night. Overall, this study represents a promising first step towards the incorporation of this type of data in atmospheric models with the aim of reducing the bias in boundary layer height simulation.
Competing interests: At least one of the (co-)authors is a member of the editorial board of Atmospheric Measurement Techniques.
Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.
This manuscript derives continuous boundary layer height (BLH) over Europe by combining radiosonde data with models fitted to spatiotemporally collocated meteorological and topographic covariates. The main product presents BLH at 00:00 and 12:00 UTC at monthly resolution. A regression kriging (RK) method is implemented to model BLH, and different regression models are evaluated using cross-validation and spatiotemporal testing against independent soundings. The tests show that the model outperforms ERA5 BLH. Daily BLH retrievals are attempted over Germany in three 15-day periods. The results suit the scope of AMT, and the following comments are intended to help improve the manuscript before publication.
One main issue is that “daily” in the title does not represent the core work well, which is monthly, RK-interpolated BLH over Europe. The daily component over Germany appears to be a weaker attachment to the European study: it seemingly applies the same modeling framework, merely changing the data to three short periods in Germany, but without the independent testing. As acknowledged towards the end of the Discussion section, it is not possible to assert whether these daily model-predicted BLHs are better or worse than plain ERA5. Since ERA5 BLH is already available hourly and globally for many decades, the significance of the results and analysis (a qualitative association between modeled BLH and weather conditions) in the mesoscale case study is questionable. The clarity and overall quality of the corresponding sections are also lower than those of the European work. The authors are encouraged either to enhance the rigor of this case study or to remove it, as the manuscript is already on the long side.
The independent assessments using spatial and temporal radiosonde sites not included in training are considered a strength of this work. It is suggested to follow more standard machine learning terminology: use “validation”, as defined in section 2.3, for selecting models and tuning hyperparameters, and “testing” for the independent assessment of model performance. Then only section 2.5.1 is validation, as it is part of the training process, while sections 2.5.2 and 2.5.3 are spatial and temporal testing. It reads as though the spatial sites in 2.5.2 are independent of the training process, but this should be confirmed explicitly. In addition, hyperparameter tuning appears to be missing from this work, yet random forest (RF) and gradient boosting (GB) are large families of models. Key hyperparameters determine under- vs. over-fitting and should be optimized in cross-validation. It would not be fair to compare untuned RF or GB with linear regression.
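For concreteness, a minimal sketch of the kind of cross-validated tuning meant here, using scikit-learn; the covariates, target, and parameter grid below are purely illustrative stand-ins, not taken from the manuscript:

```python
# Sketch: tuning key random-forest hyperparameters inside cross-validation,
# so that RF/GB are compared with linear regression on an equal footing.
# All data and grid values here are synthetic/illustrative.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, KFold

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))                        # stand-in covariates
y = X[:, 0] * 300 + rng.normal(scale=50, size=200)   # synthetic BLH-like target (m)

param_grid = {
    "n_estimators": [50, 200],
    "max_depth": [5, None],        # depth controls under- vs over-fitting
    "min_samples_leaf": [1, 5],
}
cv = KFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(RandomForestRegressor(random_state=0),
                      param_grid, cv=cv,
                      scoring="neg_root_mean_squared_error")
search.fit(X, y)
print(search.best_params_)
print(-search.best_score_)         # cross-validated RMSE of the tuned model
```

The cross-validated RMSE of the tuned model is then the fair quantity to compare against linear regression, rather than the score of an untuned default configuration.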
This work may be further strengthened and advanced beyond prior works if the advantages of the interpolation (i.e., kriging) part can be quantified. More detail in the methods and results would be warranted if kriging makes significant contributions beyond the optimized regression models.
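One straightforward way to quantify this is to compare the regression-only prediction with regression plus interpolated residuals on held-out sites. In the sketch below, a Gaussian process on the residuals stands in for ordinary kriging, and the coordinates, covariates, and spatial signal are entirely synthetic:

```python
# Sketch: isolating the kriging contribution in regression kriging (RK) by
# comparing regression-only predictions with regression + interpolated
# residuals on a held-out set. A GP on residuals stands in for kriging here;
# everything is illustrative, nothing is taken from the manuscript.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(1)
coords = rng.uniform(0, 10, size=(300, 2))                   # lon/lat stand-ins
covariates = rng.normal(size=(300, 3))
signal = covariates[:, 0] * 200 + 50 * np.sin(coords[:, 0])  # spatially correlated part
y = signal + rng.normal(scale=20, size=300)

train, test = np.arange(250), np.arange(250, 300)

reg = RandomForestRegressor(random_state=0).fit(covariates[train], y[train])
resid = y[train] - reg.predict(covariates[train])

gp = GaussianProcessRegressor(kernel=RBF(1.0) + WhiteKernel(), normalize_y=True)
gp.fit(coords[train], resid)                 # interpolate residuals in space

pred_reg = reg.predict(covariates[test])
pred_rk = pred_reg + gp.predict(coords[test])

rmse_reg = mean_squared_error(y[test], pred_reg) ** 0.5
rmse_rk = mean_squared_error(y[test], pred_rk) ** 0.5
print(rmse_reg, rmse_rk)   # the gap between the two is the kriging contribution
```

Reporting such an RMSE decomposition (regression alone vs. full RK) for the actual testing sites would directly document how much the kriging step adds.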
Specific comments
Last sentence of the abstract: It remains unclear from the main text how the BLH product can be incorporated into atmospheric model simulations.
Section 2.2.3: The importance of topography is highlighted in the manuscript, but only altitude is considered. Are higher-order topographic terms worth considering? The same altitude may lie on a flat plateau or within a mountain range.
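As an illustration, two simple higher-order covariates, slope and a topographic position index (TPI), can be derived directly from the elevation grid; the DEM arrays and the 25 km grid spacing below are placeholders:

```python
# Sketch: higher-order topographic covariates computed from a DEM grid,
# distinguishing a flat plateau from rugged terrain at the same altitude.
# The toy DEMs and grid spacing are illustrative placeholders.
import numpy as np

def slope_and_tpi(dem, spacing_m=25_000.0):
    """Return slope magnitude (m per m) and a Topographic Position Index (m)."""
    dzdy, dzdx = np.gradient(dem, spacing_m)
    slope = np.hypot(dzdx, dzdy)
    # TPI: cell elevation minus the mean of its 3x3 neighbourhood
    padded = np.pad(dem, 1, mode="edge")
    neigh = sum(padded[i:i + dem.shape[0], j:j + dem.shape[1]]
                for i in range(3) for j in range(3)) / 9.0
    return slope, dem - neigh

plateau = np.full((5, 5), 1500.0)        # flat terrain, 1500 m everywhere
ridge = 1500.0 + 400.0 * np.eye(5)       # similar mean altitude, but rugged
print(slope_and_tpi(plateau)[0].max())   # 0.0 — altitude alone cannot tell these apart
print(slope_and_tpi(ridge)[0].max() > 0) # True
```

Either of these (or standard ruggedness indices) would let the regression distinguish the plateau from the ridge, which altitude alone cannot.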
Tables: Tables B1 and B2 are referred to as “1B” and “2B” in the text. The tables would be better placed within the corresponding sections. It is unclear what “a” and “b” mean in Table 1.
Section 2.3: Consider streamlining and clarifying the definition of K-fold cross-validation, which is mentioned before it is formally defined.
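The definition amounts to the following: the sites are split into K folds, and each fold serves once as the validation set while the model is fitted on the remaining K−1 folds. A minimal sketch (station indices are illustrative):

```python
# Sketch of K-fold cross-validation: each of the K folds is held out once
# for validation while the model is fitted on the other K-1 folds.
# The station indices are purely illustrative.
import numpy as np
from sklearn.model_selection import KFold

stations = np.arange(10)   # 10 illustrative station indices, K = 5
for k, (train_idx, val_idx) in enumerate(
        KFold(n_splits=5, shuffle=True, random_state=0).split(stations)):
    print(f"fold {k}: fit on {train_idx.size} stations, "
          f"validate on {val_idx.size}")
```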
Line 289: Please provide more information about how the loss function is constructed via GLS, especially the construction of the observation error covariance matrix (if that is what GLS implies) and the underlying rationale/assumptions.
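For instance, if GLS is meant in the standard sense, the key missing detail is the covariance matrix Σ in β = (XᵀΣ⁻¹X)⁻¹XᵀΣ⁻¹y. The sketch below assumes an exponential spatial covariance plus a nugget; the kernel, length scale, and data are all illustrative guesses, not the authors' construction:

```python
# Sketch: a GLS fit with an explicit observation error covariance matrix,
# the kind of detail requested for line 289. The exponential covariance,
# its parameters, and the synthetic data are assumptions for illustration.
import numpy as np

rng = np.random.default_rng(2)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # intercept + one covariate
coords = rng.uniform(0.0, 500.0, size=n)               # station positions (km)
dist = np.abs(coords[:, None] - coords[None, :])

# Assumed error covariance: exponentially decaying spatial correlation + nugget
sigma = 100.0**2 * np.exp(-dist / 50.0) + np.eye(n) * 10.0**2

# Simulate observations with exactly this correlated error structure
L = np.linalg.cholesky(sigma)
y = X @ np.array([300.0, 50.0]) + L @ rng.normal(size=n)

# GLS estimator: beta = (X^T S^-1 X)^-1 X^T S^-1 y
Sinv = np.linalg.inv(sigma)
beta = np.linalg.solve(X.T @ Sinv @ X, X.T @ Sinv @ y)
print(beta)   # estimates of the true coefficients [300, 50]
```

Stating which covariance structure (and which parameters) the manuscript actually uses, and why, would make the loss function reproducible.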
Lines 332 and 347: It is unclear what a, b, and c mean.
Line 336: ERA5 BLH is referred to in both upper and lower case, BLH_ERA5 and blh_ERA5. Is the lower case reserved for testing and the upper case for the covariate in regression? Please clarify.
Lines 343 and 441: It is unclear how the GLMM is used here, what it accomplishes, and how it relates to or differs from the regression models (RF, GB, LR).
Section 3.1: This appears to be a comparison of training loss, which is less important for model selection than the validation loss in section 3.2.1. Reconsider allocating two long tables to it.
Line 615: 12:00 UTC does not necessarily mean a convective boundary layer.
Line 642: The regression part of RK still consists of machine learning models. The kriging contribution would be better justified, as mentioned above.
Lines 686-689: It is not clear that altitude alone can represent flatness. Consider more specific topographic indices.