the Creative Commons Attribution 4.0 License.
Daily maps of Boundary Layer Height combining radiosonde, satellite, and reanalysis over Europe
Abstract. The height of the planetary boundary layer directly influences local and regional climatic phenomena, making its study and estimation of vital importance for environmental sciences. The main objective of this work was to create a gridded map of planetary boundary layer height across the European continent, with a spatial resolution of 25 km and monthly mean values at two synoptic hours (12:00 and 00:00 UTC). We implemented the regression kriging method by combining various data sources, including observations, climatic and topographic variables, and reanalysis data (ERA5), and different regression methods (linear, random forest, and gradient boosting) for the 2010–2020 period. In both UTC hours, combining reanalysis and topographic covariates with random forest regression provided the best performance. Then, we compared our seasonal predictions with reanalysis data and found a consistently higher spatio-temporal accuracy than that of the ERA5 reanalysis. For example, at 12:00 UTC, spatial variability in winter showed RMSE values ≤ 100 m, compared with ≥ 200 m for ERA5, while temporal variability in summer reached RMSE values ≤ 250 m, versus ≥ 300 m for ERA5. At 00:00 UTC, spatial variability in autumn achieved RMSE values ≤ 36 m, whereas ERA5 exhibited RMSE values ≥ 130 m.
The methodology was applied to a case study over Germany at a daily resolution. We obtained an accurate representation of boundary layer height, which was consistent with the variations in weather conditions. Results were notably better at 12:00 than at 00:00 UTC, mainly due to the limited number of available stations and the associated difficulty in resolving the stable boundary layer at night. Overall, this study represents a promising first step towards the incorporation of this type of data in atmospheric models with the aim of reducing the bias in boundary layer height simulation.
Competing interests: At least one of the (co-)authors is a member of the editorial board of Atmospheric Measurement Techniques.
Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.
Status: final response (author comments only)
- RC1: 'Comment on egusphere-2026-702', Anonymous Referee #1, 31 Mar 2026
- RC2: 'Comment on egusphere-2026-702', Anonymous Referee #2, 08 Jun 2026
Report on egusphere-2026-702 "Daily maps of Boundary Layer Height combining radiosonde, satellite, and reanalysis over Europe"
General Comments
This manuscript presents a methodology for generating gridded boundary layer height (BLH) maps over Europe by combining radiosonde observations, ERA5 reanalysis, topographic information, and satellite-derived variables through regression kriging and machine learning approaches. The study addresses an important topic, as accurate BLH estimates are relevant for weather forecasting, air quality applications, and climate modeling. The manuscript is generally well organized and presents a comprehensive methodological framework. The comparison between different regression approaches and the independent validation exercises are valuable aspects of the study. The results indicate that combining observational and reanalysis datasets can improve BLH estimates compared to ERA5 alone, particularly during daytime conditions. However, several aspects require clarification and further discussion before publication. In particular, I believe that the manuscript would benefit from a deeper physical interpretation of the results, a more critical discussion of ERA5 limitations (especially during stable nighttime conditions), and a clearer justification of some methodological choices.
Overall, I recommend Major Revisions.
Major Comments
1. Limitations of ERA5 during stable boundary layer conditions: Throughout the manuscript, ERA5 is presented as a generally reliable predictor of BLH. While this statement is supported by several studies, the discussion remains largely focused on overall performance metrics. A substantial body of literature has shown that ERA5 exhibits significant difficulties in reproducing stable boundary layers, particularly during nighttime conditions. This issue is especially relevant because the manuscript consistently reports poorer performance at 00:00 UTC. The discussion should explicitly address the known limitations of ERA5 under stable conditions and explain how these limitations may propagate into the proposed interpolation framework.
2. Representativeness of ERA5 grid points: The manuscript states that ERA5 values were extracted from the nearest grid point to each radiosonde station. However, it is unclear how situations were handled when multiple radiosonde stations fall within the same ERA5 grid cell. Given the relatively coarse spatial resolution (~25 km), this issue could potentially affect the independence of the predictor dataset. The authors should clarify: i) how frequently this situation occurred; ii) whether multiple stations shared the same ERA5 pixel; and iii) whether this could introduce artificial correlations between predictors and observations.
3. Justification for using MODIS land surface temperature: The manuscript employs MODIS land surface temperature (LST) as an additional predictor. However, ERA5 already includes surface temperature information and a complete surface energy balance representation. Therefore, it is not immediately obvious why satellite-derived LST should provide independent information beyond what is already contained within ERA5. The authors should better justify why MODIS LST was selected instead of ERA5 surface temperature and what additional information MODIS contributes.
4. Necessity of auxiliary datasets when radiosonde observations are available: The manuscript introduces several auxiliary variables to improve BLH estimation. However, from the reader's perspective, it is not entirely clear why some of these variables are necessary when direct radiosonde-derived BLH observations are already available at the training locations. A more explicit conceptual explanation of the interpolation strategy would help the reader understand the role of each predictor and its physical contribution to the final product.
5. Physical interpretation of predictor importance: While the manuscript provides a thorough statistical evaluation of model performance, the physical interpretation of the selected predictors remains limited. Beyond identifying which variables improve the interpolation skill, it would be valuable to discuss the underlying atmospheric processes that may explain these relationships.
For example:
- Why does land surface temperature (LST) emerge as one of the most important predictors for daytime boundary layer height?
- Why does elevation (DEM) appear to contribute more strongly to nighttime BLH estimates?
- Why do wind-related variables provide relatively limited improvements despite their known influence on boundary layer development and turbulent mixing?
A deeper discussion of the physical mechanisms linking these predictors to BLH variability would considerably strengthen the manuscript and help readers understand whether the reported relationships are physically meaningful or primarily statistical. Such an analysis would also improve confidence in the applicability of the proposed framework beyond the specific training dataset.
6. Applicability for atmospheric modeling: One of the motivations presented by the authors is the potential incorporation of these BLH maps into atmospheric models. However, the proposed product is based on only two synoptic times (00:00 and 12:00 UTC) and monthly climatological averages. This raises important questions regarding its direct applicability to mesoscale or operational atmospheric models, which require much higher temporal resolution. The authors should better discuss the limitations of the dataset for operational and modeling applications and clarify the intended use of the resulting product.
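As an illustration of the check requested in Major Comment 2, the following minimal sketch counts stations that map to the same nearest grid point. It assumes a regular 0.25° latitude/longitude grid anchored at 0°, 0°; the station names and coordinates are hypothetical and are not taken from the manuscript.

```python
from collections import defaultdict

def nearest_cell(lat, lon, res=0.25):
    """Index of the nearest grid point on a regular grid anchored at (0, 0)."""
    return (round(lat / res), round(lon / res))

def shared_cells(stations, res=0.25):
    """Return only the grid cells that are the nearest cell for 2+ stations."""
    cells = defaultdict(list)
    for name, (lat, lon) in stations.items():
        cells[nearest_cell(lat, lon, res)].append(name)
    return {cell: names for cell, names in cells.items() if len(names) > 1}

# Hypothetical stations (name -> (latitude, longitude) in degrees).
stations = {
    "A": (52.21, 14.12),
    "B": (52.30, 14.05),
    "C": (48.25, 11.55),
}

dupes = shared_cells(stations)
print(dupes)
```

Reporting the number of such shared cells, and the share of training samples they represent, would answer points i) and ii) directly.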
Minor Comments
1) Keywords - The keyword list could be improved by including: Boundary Layer Height; Machine Learning
2) Section 2.3 (Cross-validation): The choice of k = 5 and k = 3 appears reasonable, but additional justification would be useful. It would also be helpful to report the sensitivity of the results to the selected cross-validation configuration.
3) Discussion Section - The manuscript would benefit from a broader discussion comparing the proposed methodology with recent machine-learning-based BLH products developed for other regions.
4) The objectives could be rephrased to explicitly highlight the novelty: "...to generate observation-constrained gridded BLH maps for Europe using a hybrid regression-kriging framework."
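The sensitivity check suggested in Minor Comment 2 could be sketched as follows. This is only an illustration under stated assumptions: the data are synthetic, the covariates and coefficients are invented, and an ordinary least-squares fit stands in for the regression models used in the manuscript.

```python
import numpy as np

def kfold_rmse(X, y, k, seed=0):
    """Mean and standard deviation of the per-fold RMSE for a linear fit."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    scores = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        # Least-squares fit with an intercept column on the training folds.
        A_train = np.column_stack([np.ones(len(train)), X[train]])
        coef, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
        A_test = np.column_stack([np.ones(len(test)), X[test]])
        scores.append(np.sqrt(np.mean((A_test @ coef - y[test]) ** 2)))
    return float(np.mean(scores)), float(np.std(scores))

# Synthetic stand-in for station samples: 3 covariates, noisy "BLH" in metres.
rng = np.random.default_rng(42)
X = rng.normal(size=(120, 3))
y = 1000 + X @ np.array([200.0, -50.0, 30.0]) + rng.normal(scale=100.0, size=120)

# Compare the two configurations mentioned in the comment.
cv_summary = {k: kfold_rmse(X, y, k) for k in (3, 5)}
for k, (mean_rmse, std_rmse) in cv_summary.items():
    print(f"k={k}: RMSE = {mean_rmse:.1f} +/- {std_rmse:.1f} m")
```

If the mean RMSE and its spread are similar across values of k, the chosen configuration is unlikely to drive the model ranking.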
Citation: https://doi.org/10.5194/egusphere-2026-702-RC2
Viewed
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 948 | 401 | 75 | 1,424 | 144 | 136 |
This manuscript derives continuous boundary layer height (BLH) over Europe by combining radiosonde data with models fitted to spatiotemporally collocated meteorological and topographic covariates. The main product presents BLH at 00 and 12 UTC at monthly resolution. A regression kriging (RK) method is implemented to model BLH, where different regression models are evaluated using cross-validation and spatiotemporal testing against independent soundings. The tests show that the model outperforms ERA5 BLH. Daily BLHs are attempted over Germany in three 15-day periods. The results suit the scope of AMT, and the following comments are intended to help improve the manuscript before publication.
One main issue is that “daily” in the title does not represent the core work well, which is monthly, RK-interpolated BLH over Europe. The daily component over Germany appears to be a weaker attachment to the European study. It seemingly applies the same modeling framework as the European study, only changing the data to the three short periods in Germany, but without the independent testing. As acknowledged towards the end of the Discussion section, it is not possible to assert whether these daily model-predicted BLHs are better or worse than plain ERA5. Since ERA5 BLH is already available hourly and globally for many decades, the significance of the results and analysis (qualitative association between modeled BLH and weather conditions) in the mesoscale case study is questionable. The clarity and overall quality of the corresponding sections are also lower than those of the European work. The authors are advised either to strengthen the rigor of this case study or to remove it, as the manuscript is already on the long side.
The independent assessments using spatial and temporal radiosonde sites not used in training are a strength of this work. It is suggested to follow more standard machine learning terminology, using “validation” as defined in section 2.3 to select models and tune hyperparameters, and “testing” for the independent assessment of model performance. Under this convention, only section 2.5.1 is validation, as it is part of the training process, and sections 2.5.2 and 2.5.3 are spatial and temporal testing. It reads as if the spatial sites in 2.5.2 are independent of the training process, but this should be confirmed explicitly. In addition, hyperparameter tuning appears to be missing from this work, although random forest (RF) and gradient boosting (GB) are large families of models. Key hyperparameters determine under- vs. over-fitting and should be optimized in cross-validation. It would not be fair to simply compare untuned RF or GB with linear regression.
This work could be further strengthened and distinguished from prior work if the advantages of the interpolation (i.e., kriging) step were quantified. More detail in the methods and results may be warranted if kriging makes a significant contribution beyond the optimized regression models.
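One way to quantify the kriging contribution is to score held-out sites twice, once with the regression trend alone and once with the kriged residuals added. The sketch below is a hedged illustration only: the data are synthetic, a linear fit stands in for the RF regression, and the exponential covariance with fixed `corr_len` and `nugget` is an assumption rather than a fitted variogram.

```python
import numpy as np

def exp_cov(d, sill, corr_len):
    """Exponential covariance model (assumed here, not fitted to data)."""
    return sill * np.exp(-d / corr_len)

def krige_residuals(xy_obs, r_obs, xy_new, sill, corr_len, nugget):
    """Ordinary kriging of regression residuals at new locations."""
    n = len(r_obs)
    d_oo = np.linalg.norm(xy_obs[:, None] - xy_obs[None, :], axis=-1)
    d_on = np.linalg.norm(xy_obs[:, None] - xy_new[None, :], axis=-1)
    # Kriging system with a Lagrange row/column enforcing weights that sum to 1.
    lhs = np.ones((n + 1, n + 1))
    lhs[:n, :n] = exp_cov(d_oo, sill, corr_len) + nugget * np.eye(n)
    lhs[n, n] = 0.0
    rhs = np.ones((n + 1, len(xy_new)))
    rhs[:n] = exp_cov(d_on, sill, corr_len)
    weights = np.linalg.solve(lhs, rhs)[:n]
    return weights.T @ r_obs

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

# Synthetic sites: one covariate plus a smooth spatial field the trend cannot see.
rng = np.random.default_rng(0)
xy = rng.uniform(0, 10, size=(120, 2))
cov = rng.normal(size=120)
field = 150 * np.sin(xy[:, 0] / 2) * np.cos(xy[:, 1] / 2)
blh = 1000 + 200 * cov + field + rng.normal(scale=20, size=120)
train, test = np.arange(80), np.arange(80, 120)

# Regression step (linear stand-in), fitted on training sites only.
A = np.column_stack([np.ones(120), cov])
coef, *_ = np.linalg.lstsq(A[train], blh[train], rcond=None)
trend = A @ coef

# Kriging step: interpolate training residuals to the held-out sites.
resid = blh[train] - trend[train]
kriged = krige_residuals(xy[train], resid, xy[test],
                         sill=resid.var(), corr_len=3.0, nugget=400.0)

rmse_reg = rmse(trend[test], blh[test])          # regression only
rmse_rk = rmse(trend[test] + kriged, blh[test])  # regression kriging
print(f"regression only: {rmse_reg:.1f} m, regression kriging: {rmse_rk:.1f} m")
```

The difference between the two scores on independent sites is a direct measure of what the kriging step adds beyond the regression model.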
Specific comments
Last sentence of the abstract: It is unclear from the main text how the BLH product can be incorporated in atmospheric model simulation.
Section 2.2.3: The importance of topography is highlighted in the manuscript, but only the altitude is considered. Are higher-order topography terms worth consideration? The same altitude may be found on a flat plateau or in a mountain range.
Tables: Tables B1 and B2 are referred to as “1B” and “2B” in the text. The tables are better placed within the corresponding sections. It is unclear what “a” and “b” mean in Table 1.
Section 2.3: Consider streamlining and clarifying the definition of K-fold cross-validation. It is mentioned before its formal definition.
Line 289: Please provide more information about how the loss function is constructed via GLS, especially the construction of observation error covariance matrix (if that is what GLS implies) and the rationales/assumptions.
Lines 332 and 347: It is unclear what a, b, and c mean.
Line 336: ERA5 BLH is referred to as both upper and lower cases, BLH_ERA5 and blh_ERA5. Is the lower case reserved for testing and the upper case as a covariate in regression? Please clarify.
Line 343 and 441: It is unclear how GLMM is used here, what it accomplishes, and how it relates to/differs from the regression models (RF, GB, LR).
Section 3.1: This appears to be a comparison of training loss, which is not as important for model selection as the validation loss in 3.2.1. Reconsider allocating two long tables to this.
Line 615: 12 UTC does not necessarily mean a convective BL.
Line 642: The regression part in RK still consists of machine learning models. The kriging contribution should be better justified, as mentioned above.
Lines 686-689: It is not clear that altitude alone can represent flatness. Consider more specific topographical indices.
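To illustrate the point about topographic indices, the sketch below computes slope from a gridded DEM and shows that two surfaces with the same mean altitude can differ in flatness. The two 5 x 5 grids and the 25 km cell size are hypothetical, and slope is only one of several possible terrain indices.

```python
import numpy as np

def slope_degrees(dem, cell_size):
    """Slope in degrees from finite differences of a gridded DEM (metres)."""
    dz_dy, dz_dx = np.gradient(dem, cell_size)
    return np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))

cell_size = 25_000.0  # metres, matching the 25 km product grid

# Hypothetical terrain: a flat plateau and a ridge-and-valley pattern,
# both with a mean elevation of about 1500 m.
plateau = np.full((5, 5), 1500.0)
ridge = 1500.0 + 400.0 * np.tile(np.sin(np.linspace(0, 2 * np.pi, 5)), (5, 1))

slope_plateau = slope_degrees(plateau, cell_size)
slope_ridge = slope_degrees(ridge, cell_size)

print(f"mean elevation: {plateau.mean():.0f} m vs {ridge.mean():.0f} m")
print(f"max slope: {slope_plateau.max():.2f} deg vs {slope_ridge.max():.2f} deg")
```

Altitude alone cannot separate these two cases, whereas a slope or roughness covariate can.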