the Creative Commons Attribution 4.0 License.
Improvement of near-surface wind speed modeling through refined aerodynamic roughness length in built-up regions: implementation and validation in the Weather Research and Forecasting (WRF) model version 4.0
Abstract. Aerodynamic roughness length (z0) is a key parameter determining near-surface wind profiles, significantly influencing wind-related studies and applications. In built-up areas, surface roughness has been substantially altered by land use changes such as urbanization. However, many numerical models assign z0 values based on vegetation cover types, neglecting urban effects. This has resulted in a lack of reliable z0 data in built-up regions. To address this issue, this study proposed a cost-effective method to estimate z0 values at weather stations by adjusting z0 values to minimize the wind speed differences between ERA5 reanalysis data and weather station observation data. Using this approach, z0 values were derived for 1,805 stations in the built-up areas across China. Based on these estimates, a high-resolution monthly gridded z0 dataset was then developed for built-up areas in China using a Random Forest Regression algorithm. Simulations with the Weather Research and Forecasting (WRF) model show that implementation of the new z0 dataset significantly improves the accuracy of 10-m wind speed over built-up areas, reducing mean wind speed errors by 89.9 % and 88.9 % compared to the default z0 in WRF and a recent gridded z0 dataset, respectively. Independent validations of 100-m wind speed against anemometer tower data further confirm the dataset’s reliability. Therefore, this approach is valuable for wind-dependent studies and applications, such as urban planning, air quality management, and wind energy utilization, by enabling more accurate simulations of wind speed in built-up areas.
Status: final response (author comments only)
-
CC1: 'Comment on egusphere-2025-1513', Cheng Shen, 04 May 2025
-
AC1: 'Reply on CC1', Kun Yang, 05 May 2025
Dear Dr. Shen,
We sincerely appreciate your time and effort for your comments on our manuscript, which may help us improve our work. Nevertheless, we would like to point out that three (Q1, Q2, and Q4) out of your four comments have already been extensively addressed in our original manuscript. Please see our point-by-point responses below.
Q1. The critical assumption that ERA5 100-m wind speed data closely aligns with observational data has not been sufficiently validated, especially for areas characterized by complex terrains or significant local environmental variations. The authors need to provide robust evidence supporting the applicability and limitations of this assumption.
Response: In our manuscript, we have addressed this concern by presenting a two-fold justification for our assumption. First, we evaluated ERA5 100-m winds with measurements from 589 wind towers across China, each providing months to years of data spanning different periods between 2004 and 2022. The results show that ERA5 exhibits a smaller mean bias percentage in eastern regions compared to western areas, supporting its higher reliability in eastern regions. This finding led us to focus primarily on weather stations in eastern China and to derive z0 for 1,805 stations in these built-up regions. Please see this evaluation in Section 3.1. Second, we further validated our assumption through model experiments. The gridded z0 dataset was tested in WRF simulations and independently evaluated against both unseen station data (10-m winds) and additional tower measurements (100-m winds). Both validation tests confirmed significant improvements in wind speed simulations. Please see the model improvement in Section 3.3.
Regarding the applicability of our assumption in complex terrain areas with significant local environmental variations, we have explicitly addressed this limitation in our study (Lines 281-285). Our analysis shows that the gridded z0 produced based on station estimates provides only limited improvement for wind speed simulations in topographically complex regions. This suggests two possible explanations: (1) our fundamental assumption may not hold well in such areas, or more likely, (2) z0 is not the sole determinant of wind speed in these regions. As discussed in our manuscript, wind patterns in complex terrain are governed by multi-scale physical processes including microscale terrain features, turbulent orographic form drag, thermally-driven mountain-valley circulations, and mountain wave dynamics. In such regions, these processes may invalidate the simple z0-wind speed relationship that holds in flat terrain.
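A minimal sketch of the station-level idea described in this response (adjusting z0 until the ERA5-derived wind matches the observed 10-m wind) is given below. It assumes a neutral logarithmic wind profile and a simple grid search; the function names, the candidate z0 range, and the neutral-stability assumption are illustrative choices and are not taken from the manuscript.

```python
import math

def wind_at_height(u_ref, z_ref, z_target, z0):
    """Neutral log-law extrapolation of wind speed from z_ref to z_target (m)."""
    return u_ref * math.log(z_target / z0) / math.log(z_ref / z0)

def estimate_z0(u100_era5, u10_obs, candidates=None):
    """Grid-search the z0 that minimises the mean absolute difference between
    ERA5 100-m wind brought down to 10 m and the observed 10-m wind."""
    if candidates is None:
        # Hypothetical search range: 0.001 m to about 5 m, log-spaced.
        candidates = [10 ** (-3 + 0.01 * i) for i in range(371)]
    best_z0, best_err = None, float("inf")
    for z0 in candidates:
        err = sum(
            abs(wind_at_height(u, 100.0, 10.0, z0) - o)
            for u, o in zip(u100_era5, u10_obs)
        ) / len(u10_obs)
        if err < best_err:
            best_z0, best_err = z0, err
    return best_z0, best_err

# Synthetic check: with z0 = 1 m the log law gives U10 = 0.5 * U100.
z0_hat, err_hat = estimate_z0([4.0, 6.0, 8.0], [2.0, 3.0, 4.0])
```

In this synthetic case the search should recover a z0 close to 1 m. A real application would also need to decide how to treat non-neutral stability and calm conditions, which this sketch ignores.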
Q2. The observation dataset without homogenization from CMA has shown large bias in https://journals.ametsoc.org/view/journals/clim/36/11/JCLI-D-22-0445.1.xml. This may significantly affect the generalizability and accuracy of z0 estimations across broader geographic contexts. Direct usage of the CMA wind data would absolutely reduce the robustness of the study. Thus, the homogenization on near-surface wind data is necessary.
Response: We appreciate your reference to Zhang and Wang's study regarding wind speed inhomogeneity in CMA stations. Their work identified significant inhomogeneities with breakpoints concentrated in the late 1970s, mid-1990s, and early 2000s, but they do not affect our results. Our study exclusively uses CMA station data from 2015-2019, when the CMA network had already completed its transition to automated observations with standardized instruments. In addition, we conducted quality control procedures before use, including missing value screening, physical range validation, and temporal consistency checks.
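The three quality-control steps named above (missing value screening, physical range validation, and temporal consistency checks) could look roughly like the following sketch. The thresholds `max_speed` and `max_step` are hypothetical placeholders, since the reply does not state the values actually used.

```python
import math

def quality_control(speeds, max_speed=75.0, max_step=20.0):
    """Return a keep/reject flag (True = keep) for each wind speed in m/s.

    max_speed and max_step are illustrative thresholds, not the authors' values.
    """
    keep = []
    prev_valid = None
    for v in speeds:
        # 1) Missing-value screening: None or NaN is rejected.
        ok = v is not None and not math.isnan(v)
        # 2) Physical range validation.
        if ok and not (0.0 <= v <= max_speed):
            ok = False
        # 3) Temporal consistency: reject jumps relative to the last kept value.
        if ok and prev_valid is not None and abs(v - prev_valid) > max_step:
            ok = False
        keep.append(ok)
        if ok:
            prev_valid = v
    return keep

flags = quality_control([3.0, None, 4.0, 90.0, -1.0, 30.0, 5.0])
```

Here the missing value, the out-of-range values (90.0 and -1.0), and the 26 m/s jump to 30.0 are rejected, while the remaining values are kept.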
Q3. Although a Random Forest Regression model is employed, the sensitivity analysis of different feature variables lacks depth and clarity. The authors are encouraged to conduct comprehensive sensitivity analyses to clearly illustrate the theoretical rationale and practical implications of feature selection on model accuracy.
Response: In our study, we have conducted comprehensive sensitivity tests at every step of the random forest (RF) methodology to ensure the robustness of our results (Section 2.3 and Figure 3). Specifically, for data partitioning, we evaluated the impact of random seed selection when splitting the dataset into training and test subsets (Figure 3a); for parameter tuning, we systematically adjusted multiple key parameters (e.g., max_depth, n_estimators, min_samples_split, and min_samples_leaf) and provided a detailed sensitivity analysis of the most influential parameter, the number of decision trees (Figure 3b); for model validation, we used a five-fold cross-validation approach to further verify the stability of our model (Figure 3c); and for feature importance, we conducted a thorough feature importance analysis to identify the dominant predictors (Figure 3e). These rigorous sensitivity tests confirm the reliability of our RF model. Please refer to Section 2.3 for a complete description of the methodology.
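The four kinds of checks listed in this response (split seed, number of trees, five-fold cross-validation, feature importance) can be sketched with scikit-learn as follows. The data are synthetic stand-ins with four hypothetical predictors, and the helper `holdout_r2` is introduced only for illustration; none of this reproduces the authors' features or settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score, train_test_split

rng = np.random.default_rng(0)
# Synthetic stand-in: 300 "stations", 4 hypothetical predictors.
X = rng.random((300, 4))
y = 0.5 + 2.0 * X[:, 0] + 0.5 * X[:, 1] + 0.05 * rng.standard_normal(300)

def holdout_r2(seed, n_trees):
    """Fit on an 80/20 split and return (test R^2, fitted model)."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    model = RandomForestRegressor(n_estimators=n_trees, random_state=seed)
    model.fit(X_tr, y_tr)
    return model.score(X_te, y_te), model

# Sensitivity to the random seed used for the train/test split.
seed_scores = [holdout_r2(seed, 50)[0] for seed in range(5)]
# Sensitivity to the number of decision trees.
tree_scores = {n: holdout_r2(0, n)[0] for n in (5, 20, 50)}
# Five-fold cross-validation.
cv_scores = cross_val_score(
    RandomForestRegressor(n_estimators=50, random_state=0),
    X, y, cv=5, scoring="r2",
)
# Impurity-based feature importance (sums to 1 across predictors).
importances = holdout_r2(0, 50)[1].feature_importances_
```

Comparing the spread of `seed_scores` and `cv_scores` against the change across `tree_scores` gives a quick sense of which choice the skill is most sensitive to.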
Q4. The validation of the model's performance is restricted to simulations for only one month, limiting the assessment of its robustness across different seasons or under varying long-term climatic conditions. The authors should include additional simulations covering multiple seasons or a full year to demonstrate the general applicability and reliability of their approach.
Given these substantial issues, I recommend rejecting this manuscript in its current form.
Response: We appreciate the reviewer's suggestion regarding the simulation period selection. Our choice to focus on April was motivated by both physical and practical considerations. As shown in Figure S3, April consistently exhibits the highest mean wind speeds across our study domain, making simulated wind speeds particularly sensitive to z0 effects and thus ideal for evaluating our parameterization. To ensure robust results while managing computational constraints, we employed a carefully designed re-initialization approach where each 36-hour simulation (initialized daily at 12:00 LT (LT=UTC+8)) included a 12-hour spin-up period followed by 24 hours of analysis. This strategy produced 30 independent realizations, capturing diverse meteorological conditions throughout April. The consistent improvement in wind speed simulations across all cases (Section 3.3) strongly supports the reliability of our findings. While the current results are statistically robust, we may extend simulations to other months to further validate the general applicability of our z0 dataset under varying climatic conditions.
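The re-initialization design described above (36-hour runs started daily at 12:00 LT, with a 12-hour spin-up and 24 hours of analysis, 30 runs in total) can be laid out as a simple schedule. The year and the choice of starting the first run on the day before the analysis month are assumptions made here for illustration; the reply states neither.

```python
from datetime import datetime, timedelta, timezone

LT = timezone(timedelta(hours=8))  # local time as defined in the reply: UTC+8

def reinit_schedule(first_init, n_runs=30, spinup_h=12, analysis_h=24):
    """Return (init, analysis_start, analysis_end) for daily re-initialized runs."""
    runs = []
    for k in range(n_runs):
        init = first_init + timedelta(days=k)
        analysis_start = init + timedelta(hours=spinup_h)
        analysis_end = analysis_start + timedelta(hours=analysis_h)
        runs.append((init, analysis_start, analysis_end))
    return runs

# Hypothetical year; the first run starts on 31 March so that the
# analysis windows tile 1-30 April in local time.
schedule = reinit_schedule(datetime(2019, 3, 31, 12, 0, tzinfo=LT))
first_init_utc = schedule[0][0].astimezone(timezone.utc)
```

With these settings each run spans 36 hours, the analysis windows are contiguous and non-overlapping, and 12:00 LT corresponds to 04:00 UTC.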
We hope that we have addressed your concerns. We remain open to further feedback and are committed to improving the quality of our work.
Thank you very much!
Sincerely,
Jiamin Wang and Kun Yang
On behalf of all co-authors
CC2: 'Reply on AC1', Cheng Shen, 06 May 2025
Thank you very much for your detailed and comprehensive responses to my comments. Your clarifications significantly improve my understanding of your methodology and your results. I suggest explicitly mentioning this broader implication in your discussion to further strengthen the reliability and applicability of your findings.
Again, thank you for addressing my concerns thoroughly.
Citation: https://doi.org/10.5194/egusphere-2025-1513-CC2
-
RC1: 'Comment on egusphere-2025-1513', Anonymous Referee #1, 27 May 2025
This study estimated the aerodynamic roughness length (Z0) values using ERA5 analyses and weather station observations to improve the near-surface wind speed modeling. Technically, the Random Forest Regression algorithm is suitable for the estimation of Z0, and the results are encouraging, significantly improving the wind speed simulation in the WRF model. However, the evaluation of the improved Z0 on the WRF near-surface wind simulation was only for one month, and a longer time evaluation is needed. Therefore, I recommend Major Revision in this round.
Major comments:
1. The new estimated Z0 values were only evaluated for 1 month. A longer time evaluation should be conducted for a thorough evaluation.
2. The grid-based Z0 statistics are only available in the inner domain. This indicates that the Z0 could only be improved where there are surface weather station observations. How can the Z0 estimation be improved in areas where there is no good coverage of surface weather station observations? More discussion should be included.
Minor comments:
Line 39-41: It is a little bit confusing here. Please revise it to be clearer.
Line 47: ERA5 is the analysis from a DA system. In my opinion, it is the blend of observations and model forecasts. Therefore, it is not proper to use it as an example.
Line 54: What does it mean here 'low-type' and 'high-type'?
Line 87: Better to add surface weather station observations before CMA.
Line 192: This could be because of the altitude differences between observation sites and the model terrain.
Line 227: What is the temporal coverage of this monthly Z0 dataset?
Figure 5: Better to add a reference line of y = 0 in panel (c), indicating which has a smaller bias.
Line 317: The values are significantly large when verified against the Mean values. However, if you take a deep look at Fig. 7d, the improvements are not that large from the perspective of MAB and RMSE.
Figure 7: Better to add statistics of mean/rms/r in the panels of (a). For (d), the unit of MAB is not m/s, likely %.
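The point raised for Line 317 (a large reduction in the mean error alongside a modest reduction in MAB and RMSE) can arise when positive and negative errors cancel in the mean. The sketch below illustrates this with made-up numbers that are not from the manuscript; MAB is read here as the mean absolute bias, which is an assumption about the abbreviation.

```python
import math

def wind_metrics(sim, obs):
    """Mean bias, mean absolute bias (MAB) and RMSE of simulated vs observed wind."""
    n = len(obs)
    diff = [s - o for s, o in zip(sim, obs)]
    bias = sum(diff) / n
    mab = sum(abs(d) for d in diff) / n
    rmse = math.sqrt(sum(d * d for d in diff) / n)
    return {"bias": bias, "mab": mab, "rmse": rmse}

def percent_reduction(old, new):
    """Relative reduction in error magnitude, in percent."""
    return 100.0 * (abs(old) - abs(new)) / abs(old)

# Illustrative values only (m/s).
obs = [2.0, 3.0, 4.0, 5.0]
default = [3.5, 4.0, 5.5, 6.0]  # systematic overestimate
refined = [1.0, 4.0, 3.0, 6.0]  # errors of +/-1 m/s cancel in the mean

m_def = wind_metrics(default, obs)
m_ref = wind_metrics(refined, obs)
reductions = {k: percent_reduction(m_def[k], m_ref[k]) for k in m_def}
```

In this toy case the mean bias is removed entirely while MAB and RMSE fall by only about a fifth, which is why reporting all three metrics side by side is informative.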
Citation: https://doi.org/10.5194/egusphere-2025-1513-RC1
-
AC2: 'Reply on RC1', Kun Yang, 06 Jun 2025
-
RC2: 'Comment on egusphere-2025-1513', Anonymous Referee #2, 27 May 2025
General Comment:
This manuscript presents a novel and practical approach to improving the simulation of near-surface wind speed over built-up areas by refining the aerodynamic roughness length (𝑧₀) using a combination of ERA5 reanalysis and ground-based observations from the China Meteorological Administration (CMA). The authors developed a high-resolution monthly gridded 𝑧₀ dataset by applying a Random Forest Regression algorithm, and demonstrated its effectiveness through WRF simulations. The study is timely and potentially impactful for urban climate modeling and wind-related applications.
While the manuscript introduces a potentially useful methodology, the current version does not provide sufficient critical evaluation or methodological transparency. To be suitable for publication, the manuscript requires revision, including clarification of the observational setup, deeper theoretical consideration of the methodology's assumptions, and further analyses related to model resolution and 𝑧₀ scale dependency.
Major comments:
(1) Uncertainty about CMA Wind Observation Heights: The manuscript assumes that CMA stations provide 10-m wind speed observations. However, there is no clear documentation or justification of this assumption in the text. Are all CMA anemometers calibrated and installed precisely at 10 m above ground level? Given that the accuracy of 𝑧₀ estimation strongly depends on the reference height of the wind speed, this should be clarified and supported by official metadata or references. Otherwise, the credibility of the derived 𝑧₀ values may be significantly undermined.
(2) Circular Logic in Using ERA5 to Derive 𝑧₀ and Then Evaluating WRF Performance: The method uses ERA5 as the basis to derive optimal 𝑧₀ values, and then uses these 𝑧₀ values in WRF to simulate wind fields, which are subsequently compared to CMA observations. However, since the 𝑧₀ is essentially tuned to ERA5 wind characteristics, and WRF is driven by ERA5 data, it is not surprising that the WRF simulations become closer to observations. This circular logic reduces the strength of the validation. A deeper discussion is needed in the Discussion section to acknowledge this methodological dependency and to better clarify to what extent the improvements stem from 𝑧₀ refinement as opposed to alignment with the reanalysis base.
(3) Lack of Resolution-Dependent 𝑧₀ Consideration: The aerodynamic roughness length is known to be resolution-dependent due to varying representations of land cover and orography. However, the manuscript does not address why a single 𝑧₀ value (derived from coarser ERA5 resolution) is applied across finer-resolution WRF simulations. A justification is needed as to why scale-dependent roughness parameters were not considered, especially when moving from ERA5 (∼30 km) to WRF (3 km). Moreover, higher-resolution simulations are expected to better resolve local features influencing 𝑧₀. Has the relationship between horizontal resolution and 𝑧₀ been explored in this study? Such an analysis would greatly strengthen the work, and I recommend adding or expanding this aspect if possible.
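The concern in major comment (1) about the anemometer height can be made concrete with a short derivation. It assumes a neutral logarithmic wind profile with the same friction velocity and z0 at both levels, which is a simplification introduced here and not a statement of the authors' procedure; h denotes the true measurement height in metres and r the ratio of the observed wind to the 100-m wind.

```latex
% Assumption: neutral log law, heights in metres, z_0 < h < 100 m.
\begin{align}
  r &\equiv \frac{U(h)}{U(100)}
     = \frac{\ln(h/z_0)}{\ln(100/z_0)}, \\
  \ln z_0 &= \frac{\ln h - r\,\ln 100}{1 - r}, \\
  \left.\frac{\partial \ln z_0}{\partial \ln h}\right|_{r}
    &= \frac{1}{1 - r}.
\end{align}
```

Under these assumptions, for a wind ratio of r = 0.5 a 10 % error in the assumed measurement height translates into roughly a 20 % error in the derived z0, so the sensitivity grows as the near-surface wind approaches the 100-m wind.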
Citation: https://doi.org/10.5194/egusphere-2025-1513-RC2
-
AC3: 'Reply on RC2', Kun Yang, 06 Jun 2025
Model code and software
all codes Jiamin Wang and Kun Yang https://doi.org/10.5281/zenodo.15108200
Viewed
- HTML: 1,112
- PDF: 66
- XML: 27
- Total: 1,205
- Supplement: 32
- BibTeX: 23
- EndNote: 37
The manuscript proposes a method for refined estimation of aerodynamic roughness length (z0) in urban built-up areas and applies the results to improve near-surface wind speed simulations in the WRF model. The authors utilized ERA5 reanalysis data and China Meteorological Administration (CMA) station observations to optimize z0 values, subsequently employing a Random Forest Regression algorithm to generate a high-resolution gridded z0 dataset. The simulations indicate significant improvements in the accuracy of 10 m and 100 m wind speeds in Chinese urban areas. However, there are significant limitations in the study. My comments below:
1: The critical assumption that ERA5 100-m wind speed data closely aligns with observational data has not been sufficiently validated, especially for areas characterized by complex terrains or significant local environmental variations. The authors need to provide robust evidence supporting the applicability and limitations of this assumption.
2: The observation dataset without homogenization from CMA has shown large bias in https://journals.ametsoc.org/view/journals/clim/36/11/JCLI-D-22-0445.1.xml. This may significantly affect the generalizability and accuracy of z0 estimations across broader geographic contexts. Direct usage of the CMA wind data would absolutely reduce the robustness of the study. Thus, the homogenization on near-surface wind data is necessary.
3: Although a Random Forest Regression model is employed, the sensitivity analysis of different feature variables lacks depth and clarity. The authors are encouraged to conduct comprehensive sensitivity analyses to clearly illustrate the theoretical rationale and practical implications of feature selection on model accuracy.
4: The validation of the model's performance is restricted to simulations for only one month, limiting the assessment of its robustness across different seasons or under varying long-term climatic conditions. The authors should include additional simulations covering multiple seasons or a full year to demonstrate the general applicability and reliability of their approach.
Given these substantial issues, I recommend rejecting this manuscript in its current form.