A new Earth Observation–based WRF configuration for urban regional climate simulations over Paris
Abstract. In this study, Earth Observation (EO) data describing urban morphology and thermal properties are used to configure the urban representation of Paris in the Weather Research and Forecasting (WRF) model. Model performance is assessed in convective permitting simulations (3 km) in three configurations covering summer 2020: (i) a baseline simulation employing a bulk parameterization of urban areas (i.e. BULK); (ii) a simulation using the BEP–BEM urban canopy model with default urban parameter values (CTRL); and (iii) a simulation in which the BEP–BEM urban canopy parameters are specifically adapted to Paris using EO–derived information (PAR). Comparison with observations from the Météo–France RADOME meteorological network indicates that the BULK simulation produces a systematic warm bias, with a persistent overestimation of nighttime temperatures. Coupling of the urban canopy model leads to an improved overall model performance relative to the BULK configuration for summer–mean conditions. Both BEP–BEM simulations are comparable, with the best performance during summer daytime achieved for the CTRL (0.93 °C lower bias compared to BULK) and during summer nighttime for the PAR simulation (0.22 °C lower bias compared to BULK) over urban grid cells. Over non–urban grid cells, the best performance is exhibited for the PAR simulation with a summer daytime bias of -0.26 °C and a summer nighttime bias of 1.62 °C. The simulated urban–rural temperature contrast for both BEP–BEM simulations is improved compared to BULK, resulting in a more realistic representation of the Urban Heat Island (UHI). Applying the Local Climate Zones (LCZ) classification, which accounts for intraurban differences in urban form and land cover, an analysis was conducted for each urban–type LCZ class present within the Paris urban area, linking each urban station to its nearest LCZ grid cell, enabling station–based comparisons by urban–type LCZ category. 
This analysis indicates that the PAR configuration better captures spatial urban temperature variability, reflecting the differences in urban forcing introduced by the EO data. During the heatwave of August 2020, the model – regardless of the configuration – becomes warmer and particularly over the urbanized LCZs. The PAR simulation exhibits pronounced temperature overestimations in the city center, due to the methodology adopted for EO data implementation and the selection of the area from which the EO data were derived, leading to urban characteristics which intensified heat storage and trapping. Non–urban areas are better simulated in both BEP–BEM simulations compared to the BULK. Our results demonstrate the added value of a) the coupling of an urban canopy model to WRF and b) the city–tailored configuration of urban canopy morphological parameters in convection–permitting regional climate simulations. The PAR experiment further illustrates the potential of EO–derived datasets to inform urban canopy parameter configurations, enabling a more detailed representation of the urban form and improved simulation of UHI characteristics.
General comments
The manuscript titled “A new Earth Observation–based WRF configuration for urban regional climate simulations over Paris” aims to evaluate the added value of improved Urban Canopy Parameters (UCP) in the Weather Research and Forecasting (WRF) model when using an Urban Canopy Model (UCM). The study focuses on the city of Paris (France) for a short time period in the summer of 2020, when a significant heatwave occurred.
The study compares 3 different WRF experiments. In the first, the default urban parametrization of WRF is selected, where artificial surfaces are simulated by the land surface model following a “bulk” approach. In the other two experiments, the BEP-BEM multi-layer UCM is used with two different UCP datasets: one based on an LCZ map and default parameters, and another with updated UCP adapted for the Paris region using Earth Observation data.
This study is very interesting, but could benefit from some improvements regarding the manuscript’s structure, as well as a more in-depth analysis of the differences observed between the experiments. I have provided various comments to this effect, which I hope will improve the manuscript’s readability, and the strength of its conclusions.
More generally, I felt that the study would have benefited from a more in-depth analysis or more detailed hypotheses when examining the results regarding the difference between the WRF experiments and the potential added value of the improved UCP, as it seemed a bit too descriptive as it stands. For example, one could combine an element similar to Figure 6 with a specific examination of how UCP parameters evolve at the precise location associated with the station, and see if it is possible to make sense of this evolution (for example, if a station sees its urban fraction increase between PAR and CTRL, what is observed in terms of the temperature response).
Specific comments
Structure of the manuscript
Regarding the structure of the manuscript and its various sections, the distinction between urban cover parameters derived from Earth observation data, presented in subsection 2.2.1, and the more general Section 2.3 devoted to land-use data did not seem clear to me.
I understand that you are moving from WRF “general physics” to urban parameterizations, and then to the required urban parameters, but Section 2.3 then seems inconsistent, as you have just introduced a high-resolution dataset specific to your study area and then return to more general datasets on land cover. Furthermore, in the current structure, you have only a single sub-sub-section 2.2.1.
One suggestion might be to clearly separate, on the one hand, the physical aspects and urban parameterizations, and on the other, the land surface datasets and parameters. It might look something like this:
2 Data & Methodology
2.1 The Weather Research and Forecasting Model
2.1.1. Model configuration and physics
2.1.2. Urban parametrizations
2.2 Land Use Forcing
2.2.1. Standard land use data in WRF
2.2.2. Improved urban canopy parameters derived from EO data
Another general point, which I return to several times in my subsequent comments, concerns the difference between point-based analyses and masks. I feel that it is always difficult to combine these two approaches, because it is extremely unlikely that the masks we construct will match the characteristics of the stations used for the evaluation. One way to get around this problem could be to split the analysis into (1) a purely point-based evaluation (vs. station), which focuses more specifically on the description of the station and the UCP used (even if only for a few stations), and then (2) a more general comparison of the experiments (without the stations) that can take masks into account: for example, the urban and rural masks you have defined, but it might also be interesting to group the model points based on their LCZ.
Section 2 Data & Methodology
I think it would be helpful to add a few sentences introducing Noah-MP (beyond simply mentioning it in Table 1) in subsection 2.1. That way, when you mention the various land surface models used with WRF later (line 92), readers will already be familiar with them. Furthermore, I think it might be interesting to introduce the dominant coverage approach used by WRF. In this regard, I was wondering if you had considered using the “mosaic” approach introduced by Li et al. (2013), now that you are moving to increasingly precise land surface datasets? [After rereading the whole section, I'm wondering if you're referring to this in line 98? See my comment below].
Li, D., Bou‐Zeid, E., Barlage, M., Chen, F., & Smith, J. A. (2013). Development and evaluation of a mosaic approach in the WRF‐Noah framework. Journal of Geophysical Research: Atmospheres, 118(21), 11-918.
Line 78. “Convection is explicitly resolved in the inner domain (d02)”
I was wondering if you used the same parent domain simulation (d01) to run the three different urban experiments presented in Section 2.4, and, if so, which urban parameterization is enabled at the 12 km resolution? I assume the BULK approach is being used? I’m not certain, but it might be worth mentioning somewhere – either here or in Section 2.4 – in a brief sentence when you present the experiments.
Line 98. “This coupling involves activating UCMs for the urban fraction within urban grid cells only, replacing the land surface model for this fraction when computing fluxes (e.g. heat and momentum). The LSMs compute these fluxes for the remaining non–urban fraction. The final fluxes are area–weighted and averaged for the entire cell.”
If I understand this sentence correctly, you are indeed using the mosaic approach implemented by Li et al. (2013)? If so, this should be clearly stated, as it is not always the case in WRF studies (for example, the simulations conducted for the FPS-Convection). Following up on that, my question is whether this approach is used consistently across your various simulations; for example, what happens in the BULK experiment? Finally, it might be interesting to present maps of the urban fraction, for example in additional panels in Figure 3.
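To make clear what I understand by the area-weighted averaging described in the quoted sentence, here is a minimal Python sketch. It is only my reading of the text, not the WRF source code: the function name and the flux values are hypothetical, and a single urban tile plus a single natural tile per grid cell is an assumption on my part.

```python
def grid_cell_flux(urban_fraction, flux_urban, flux_natural):
    """Area-weighted flux for one grid cell (assumed two-tile set-up).

    urban_fraction: fraction of the cell handled by the UCM, in [0, 1]
    flux_urban:     flux computed by the UCM for the urban fraction
    flux_natural:   flux computed by the LSM for the remaining fraction
    """
    if not 0.0 <= urban_fraction <= 1.0:
        raise ValueError("urban_fraction must lie in [0, 1]")
    return urban_fraction * flux_urban + (1.0 - urban_fraction) * flux_natural


# Hypothetical sensible heat fluxes in W m-2, for illustration only
print(grid_cell_flux(0.7, 200.0, 80.0))
```

If the BULK experiment instead treats the whole cell as urban, this would correspond to urban_fraction = 1 in the sketch, which is why I ask whether the approach is consistent across experiments.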
Line 109. “In this way, the vertical distribution of sources and sinks of heat, moisture and momentum induced by buildings is explicitly represented, which shapes the thermodynamic profile of the urban roughness sub–layer and, consequently, the urban boundary layer, in more detail than SLUCM”
Another point that could be discussed is the difference in how the two UCMs are coupled to the atmospheric model. Kusaka’s SLUCM is coupled to the atmosphere at only one level, located above the canyon rooftops (at the top of the urban roughness sublayer), while BEP can be coupled at multiple levels, depending on the number and height of the atmospheric model levels closest to the ground.
Line 120. “2) thermal and 3) human related parameters.”
I was wondering if it might be helpful to briefly mention that the LCZ classification was designed to group different urban areas based on their morphology and is therefore closely linked to building height, building density, etc., but is less closely linked to physical parameters that depend more on building materials. For example, even though a neighbourhood of a similar LCZ in Paris (France) and Shanghai (China) may have buildings of equivalent height, the materials used are likely to differ, which affects the physical properties of the walls, roofs, etc. This could potentially be raised later to justify the need for improved UCP using EO data.
Line 130. “Starting from WRF v4.6.0, an updated lookup table with more representative values is introduced as default in WRF”.
At first, I thought these updated values came from EO data for Paris. After rereading the text, if I understand correctly, these are the new “default” values to be used with BEP and the LCZs? Is there a study we can refer to regarding this update, or could this work serve as a reference in the future? If so, would it be advisable to also include the old values in Tables S1 and S2 for future reference? That might make the tables too complicated or too cluttered, so I understand why you might disagree.
Furthermore, you note that the old default values may not be very realistic, particularly for European cities. While the new values may be better suited to Europe, could they be less suitable for other regions of the world as a consequence?
Line 140. “Additionally, UCPs related to anthropogenic urban heat emissions remained unchanged from the default values WRF provides and were assumed to be constant over 24 hours.”
I suppose this refers to an additional flux added directly in the model? If so, do the values differ between the LCZs? If not, maybe you could provide the value for the BULK class?
Line 146. “2.2.1 Urban canopy parameters derived from EO data”
Would it be possible to include a table listing all the Earth observation data used, their sources, their original horizontal resolution, and the urban canopy parameters they help improve? This seems to be an important part of the article.
Line 163. “Land Use Data”.
As suggested earlier, this section could be divided into two subsections, one for the “standard” WRF configuration and one for the updated data based on EO.
After rereading this subsection several times, I’m no longer sure I fully understand what’s being done. You mention the LANDMATE PFT dataset as well as the IGBP-MODIS classification, then a modified IGBP-MODIS map, and finally the WRF-LCZ map.
Here is what I understand so far:
(1) You start with the LANDMATE PFT dataset to follow the EURO-CORDEX simulation protocol
(2) You “modify” its urban class to replace it with the default IGBP-MODIS urban class in order to obtain the appropriate parameters for the BULK simulation; but do you retain the LANDMATE “location” of the urban points?
(3) For the BEP simulation, do you use a modified IGBP MODIS that replaces the urban class with the LCZ? But this is also the case for the BULK simulation outside urban areas: the non-urban areas are identical in both experiments in Figure 3. So, where was LANDMATE ultimately used?
I would suggest adding an introductory sentence specifying which default land cover map/classification WRF uses, followed by a sentence explaining what you modify in your “default” BULK simulation, and then how you improve it using the LCZ map.
Line 180. “it was found that urban grid cells in the BEP–BEM simulations (LCZ classes) cover a larger portion of Île-de-France (23.4%) compared to the urban area represented in the BULK simulations (16.6%)”
Following up on the previous comment, I would emphasize that this is due to the land cover map associated with each experiment, and not the experiments themselves (which is what you’re doing by mentioning the LCZ classes for the BEP simulation). You could very well have chosen to define the LCZ map as a common “default” and simply replace each urban LCZ class with the IGBP-MODIS urban class for the BULK experiment.
After reading this section, I’m once again wondering about the “fraction/tile” approach mentioned earlier (see my previous comment regarding line 98). How does WRF handle the remaining non-urban fraction of urban points? I assume it involves using NOAH, but where does it obtain the information on the required natural land cover and the associated parameters?
Line 189. “2.4 WRF simulations”
I understand why you’re presenting this section next to the “Observations” section, but for now, it focuses mainly on the urban and rural masks definition, and very little on the experiments. This is tricky since you’ve already presented the WRF configuration in Section 2.1 and the Earth observation data in Section 2.2.1. I would suggest adding a sentence explaining the choice to simulate this specific period, perhaps by mentioning the design of the FPS-URB-RCC experiment, which aimed to simulate the 2020 heatwave in Paris.
Line 205. “Observations”
I would suggest, if possible, mapping all the stations used (33?) and not just those located in the urban and rural masks. I just noticed that they appear in Figure 3; you could probably remove them from there, since they are neither presented nor discussed.
One idea might be to include two maps in Figure 4: one showing all stations and their corresponding LCZ colors, and a second map – the one you already have – highlighting the stations used for the urban heat island analysis.
Perhaps a table containing some information about the stations would be helpful (for example, their names, coordinates, elevation, whether they are located in a park, etc.). Not for all of them, but perhaps for the urban ones, or those located in the masks?
More generally, in this section, what is the rationale for using different stations for the temperature analysis and for the UHI analysis, and how can the two be reconciled?
Line 208. “To minimize the impact of data gaps, only stations with less than 1% missing observations relative to the total number of expected records were retained”
What do you mean by the total number of expected records? Wouldn’t the period under analysis be sufficient? I would also suggest mentioning it again: from May to August 2020.
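To illustrate the definition I have in mind, here is a minimal Python sketch in which the expected number of records follows directly from the analysis period. The hourly reporting interval, the period bounds and the function names are assumptions on my part; the 1% threshold is the one quoted above.

```python
from datetime import datetime, timedelta


def expected_records(start, end, step=timedelta(hours=1)):
    """Number of time stamps in [start, end) at the given reporting step."""
    return int((end - start) / step)


def keep_station(n_valid, n_expected, max_missing=0.01):
    """Retain a station if its missing fraction is below max_missing."""
    missing_fraction = 1.0 - n_valid / n_expected
    return missing_fraction < max_missing


# Assumed analysis period: 1 May to 31 August 2020, hourly observations
n_expected = expected_records(datetime(2020, 5, 1), datetime(2020, 9, 1))
print(n_expected)                      # 123 days x 24 h = 2952
print(keep_station(2930, n_expected))  # 22 missing records, about 0.7 %
```

Stating the period and reporting interval in the text would make the "expected" count unambiguous.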
Line 209. “station classifications”.
What do you mean by that? Is it simply the urban/rural distinction based on your urban and rural masks? But then what happens to the stations outside of the rural mask?
Line 212. “Consequently, a total of 33 stations”.
You could mention in the previous sentences how many stations were excluded due to missing data and how many because of a mismatch in the land cover.
After reading this section and the previous one (2.4), I understand that you use all the points from the urban and rural masks for your UHI analysis (Figure 10), using the averages of these masks as references. However, no explanation is provided regarding the method used for your analysis of urban and rural temperatures. It is only in the results section that you explain that, for the LCZ sub-analysis, you extract the model point closest to each station. My question, therefore, is why you rely on these two masks for your UHI analysis, especially since you are comparing it to stations.
I understand that the method by Diez-Sierra et al. (2025) was designed to create urban and rural masks in a standardized/automated manner, but I am not convinced that it is appropriate for evaluating a model when using stations; there is very little chance that the model masks are representative of the stations (or vice versa).
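For reference, the point-based alternative I am suggesting amounts to something like the following Python sketch: each station is paired with its nearest model grid cell, and the distance and LCZ class of that cell are kept so that any mismatch can be reported. The coordinates, cell identifiers and LCZ labels below are hypothetical, and the great-circle (haversine) distance is my choice, not necessarily what the authors used.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km on a sphere of radius 6371 km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def nearest_cell(station_lat, station_lon, cells):
    """Return (cell, distance_km) for the grid cell closest to the station.

    cells: iterable of (cell_id, lat, lon, lcz_class) tuples
    """
    best = min(cells, key=lambda c: haversine_km(station_lat, station_lon, c[1], c[2]))
    return best, haversine_km(station_lat, station_lon, best[1], best[2])


# Hypothetical grid cells and station, for illustration only
cells = [("a", 48.86, 2.35, "LCZ2"), ("b", 48.90, 2.40, "LCZ6")]
cell, dist = nearest_cell(48.85, 2.35, cells)
print(cell[0], cell[3], round(dist, 2))
```

Reporting the distance and the LCZ class of the matched cell for each station would help the reader judge how representative the model point is.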
Section 3 Results
Line 223. “nighttime (18:00–07:00 UTC) than during daytime (06:00–17:00 UTC)”.
I think the distinction between day and night is fine as it is, but for additional information, I would like to point out that in Paris in June (as an example, when the days are longest) the sun rises around 06:00 UTC+2 and sets around 22:00 UTC+2. Another way to distinguish between day and night could also be to use the downwelling shortwave radiation from the model.
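As a rough illustration of the radiation-based alternative, the Python sketch below labels each time step as day or night from the model's downwelling shortwave radiation (SWDOWN in WRF output). The 5 W m-2 threshold and the sample values are hypothetical choices of mine.

```python
def split_day_night(swdown, threshold=5.0):
    """Split time-step indices into day and night using shortwave radiation.

    swdown:    sequence of downwelling shortwave values in W m-2
    threshold: value above which a time step counts as daytime (assumed)
    """
    day = [i for i, s in enumerate(swdown) if s > threshold]
    night = [i for i, s in enumerate(swdown) if s <= threshold]
    return day, night


# Hypothetical hourly values around sunrise
day, night = split_day_night([0.0, 0.0, 12.0, 350.0, 3.0])
print(day, night)  # [2, 3] [0, 1, 4]
```

The advantage over fixed UTC windows is that the split follows the changing day length between June and August.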
Line 230. “In non–urban areas (Fig. 5b)”.
The fact that the results of the BULK and CTRL experiments are extremely similar makes sense, since nothing was changed in the model at those specific points. These slight differences could be explained by the effect of changes made at other points (urban), which have a limited impact on neighboring points through advection in the atmospheric model. But then, how can we explain these significant differences in the PAR experiment? According to Figure 3, it appears that the modified IGBP-MODIS map shows a different urban extent; are we certain that the model points classified as rural are represented in the same way in the PAR simulation as in the others?
Before delving into the LCZ analysis, I noticed something in Figure S1: in the observations, all temperatures (urban, rural, daytime, and nighttime) increase as the summer progresses – with August > July > June – but this is not always the case in WRF for urban temperatures. The CTRL and BULK experiments appear to show higher daytime peaks in July than in August. This likely points to the heat wave that occurs in August and WRF’s ability to simulate it.
Line 245. “Although the BULK simulation does not use the LCZ land–use classification to urban grid cells (unlike the two BEP–BEM simulations), we extracted BULK values from the corresponding grid cells so that diurnal cycles could be compared across LCZ classes (Fig. 6).”
This sentence gives the impression that you are only extracting the points from the model that are closest to the stations for this LCZ analysis, and that this had not already been done for the temperature analysis in Figure 5. But it was already done for Figure 5, right?
Line 258. “Furthermore, BULK shows warm biases of similar magnitude across all LCZs, which reflects the limited urban variability captured when urban areas are represented using a single land–use category.”
I understand that you mean that, since all urban points have the same parameters in the BULK experiment, we would expect to observe similar urban temperatures. But does that necessarily imply similar biases? Indeed, if the stations are truly representative of the LCZ in which they are located, we would expect their temperatures to differ; consequently, if the BULK experiment does not reflect this variability, the biases should not be identical.
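A small numerical example, with entirely hypothetical temperatures, shows the point: if BULK produces the same temperature everywhere while the observations differ between LCZs, the biases cannot be the same.

```python
def bias(model, obs):
    """Model-minus-observation bias per LCZ class."""
    return {lcz: model[lcz] - obs[lcz] for lcz in obs}


# Hypothetical mean temperatures in degrees Celsius
obs = {"LCZ2": 24.0, "LCZ6": 22.5}
bulk = {"LCZ2": 25.0, "LCZ6": 25.0}  # identical urban parameters everywhere

print(bias(bulk, obs))  # {'LCZ2': 1.0, 'LCZ6': 2.5}
```

Similar biases across LCZs would therefore suggest either that the observed temperatures themselves differ little between the classes, or that the stations are not very representative of their LCZ.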
Line 259. “The analysis separated into daytime and nighttime period (Table S5) indicate the same pattern found in the urban stations group, with substantially warmer biases during the night than midday for both BEP–BEM simulations across all LCZs.”
Do you have any idea why this is the case? At first glance, one might expect the UCM to improve results at night, since it should account for urban heat storage better than the simpler BULK approach.
Line 266. Table 3.
One suggestion I would make is to also add the BULK indicators to the table. Another idea that comes to mind is this: would it make sense to also include the observation values so we can get an idea of the temperatures observed across the different LCZs? But that would probably require reorganizing the table first by LCZ, then Observation | BULK | CTRL | PAR, to avoid repeating the observation row… But that might no longer match the text. It’s just an idea; I understand if you don’t go with it.
Line 270. “We investigate the model’s performance...”
That first sentence mentioning the precise date of the heatwave with a link to the Copernicus website could be moved to the “WRF simulations” subsection.
Line 310. “Figure 9 presents the averaged 2 m air temperature differences”
I was wondering if it might be helpful to include the PAR temperature maps as well, in addition to the delta you’ve shown. If you want to keep the maps the same size, you could reorganize the figure by placing the nighttime maps in a left column with (1) PAR temperature, (2) the PAR-CTRL difference, and (3) the PAR-BULK difference; and do the same in a right column for the daytime maps.
Line 316. “The colder daytime climatology of PAR at the non–urban areas is also apparent spatially when compared with the CTRL and BULK simulations (Fig. 9c,d).”
I touched on this briefly earlier when discussing the differences observed in the experiment regarding rural temperatures, but since this is even more striking here, I find myself asking the same question: do we have any idea what might explain such marked discrepancies outside urban areas? Could it be that the variations observed in urban areas located within the small domain, but outside the Paris region, have an impact on regional-scale flows? Because an average drop of 2 °C (figure 9.d) over the duration of the heat wave seems quite significant.
Line 330. “The urban mask contains 109 grid cells.”
This part made me wonder whether you used the same masks for all 3 experiments and, if so, what dataset is it based on, because, if I understand correctly, it uses certain urban characteristics, such as the urban fraction?
Line 358. “Figure 10”.
I could see at least two other things that could be discussed from Figure 10. The first is that the spread of the UHII (shown in pink) provides insight into the added value of the UCM compared to the BULK approach, as the latter exhibits consistently lower spread because all urban points share identical surface characteristics. Similarly, the differences in spread between UHII in CTRL and PAR are most likely related to differences in UCP, but it is difficult to determine which ones, and which experiment is the best. The second point concerns the weak temporal correlation between the three experiments and the observations. In this regard, I was wondering if you had directly looked at a similar plot for urban and rural temperatures compared to the observations. It might be interesting to see if WRF is capable of capturing the slow development of the heat wave outside the city and how urban areas subsequently react.
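To be explicit about what I understand by the UHII and its spread, here is a minimal Python sketch for a single time step: the rural reference is the mean over the rural mask, and the spread is the range of the per-cell urban differences. This is my reading of Figure 10 rather than the authors' code, and all values are hypothetical.

```python
from statistics import mean


def uhii_stats(urban_temps, rural_temps):
    """UHII per urban cell relative to the mean rural temperature.

    Returns the mean UHII and its range (min, max) over the urban mask.
    """
    rural_ref = mean(rural_temps)
    per_cell = [t - rural_ref for t in urban_temps]
    return {"mean": mean(per_cell), "min": min(per_cell), "max": max(per_cell)}


# Hypothetical 2 m temperatures (degrees Celsius) at one time step
print(uhii_stats([22.0, 23.0, 24.0], [20.0, 20.0]))
```

With identical surface characteristics for all urban cells, as in BULK, the per-cell values would be expected to lie closer together, which is the reduced spread I refer to above.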
Section 4 Summary and Conclusions
Since this section summarizes the manuscript and because of all the comments I have made previously, I will wait for the authors’ response before commenting here.
Technical corrections
Line 52. “Currently, the Coupled Model Intercomparison Project (CMIP) protocols represent urban environments in a homogeneous manner, with limited differentiation among cities.”
I even wonder if the CMIP global climate models actually represent cities at all, given their coarse horizontal resolution.
Line 52. “the the”
Typing error.
Line 64. “EO data”.
The acronym could be introduced here for the first time (outside the abstract).
Line 66. Add (France) next to Paris.
Line 81. “All simulations are driven by the ECMWF ERA5 reanalysis dataset (Hersbach et al., 2020) at a horizontal resolution of 0.25° × 0.25° and 6–hourly temporal intervals.”
You could mention the use or absence of spectral nudging, following the CORDEX and FPS protocols.
Line 92. “i.e. urban parameterization switched off”.
This is a somewhat semantic question, but the BULK approach in WRF could still be considered a form of urban parameterization, since some climate models do not model urban areas at all: some replace them with vegetation cover, while others replace them with “rocks” without any adjustment to surface parameters.
Line 92. “which is integrated inside WRF’s any given land surface model (LSM)”.
I would suggest rephrasing the sentence as follows: “which is incorporated into all the various land surface models (LSMs) available in WRF.”
Line 92. “(e.g. roughness length, surface albedo, heat capacity, thermal conductivity)”.
It might be helpful to include the values for each of these parameters; that way, readers can use them for comparison when you later present the old and new UCM parameters in Table S2.
Line 99. “replacing the land surface model for this fraction”.
You can replace “land surface model” with LSM.
Line 107. “BEP (sf_urban_physics=2), on the other hand, parameterizes”.
I would suggest reversing the order of “BEP” and “on the other hand” to make the text easier to read: “On the other hand, BEP (sf_urban_physics=2) parameterizes…”
Line 130. “Starting from WRF v4.6.0, an updated lookup table with more representative values is introduced as default in WRF.”
I was wondering if there are any references regarding these changes, and perhaps a study analyzing the sensitivity of the new values?
Line 189. “WRF simulations”.
I think the title works, but I was trying to come up with another one to distinguish it more clearly from the title of subsection 2.1, “WRF model configuration and physics.” Something that would better highlight the fact that this section describes various sensitivity experiments involving urban canopy parameters.
Line 214. “(“Urban and Built–Up” for BULK, LCZ2, LCZ5, LCZ6 and LCZ8 for BEP–BEM runs)”.
You could replace “,” with “;” between BULK and LCZ2 to separate the lists.
Line 232. “1.62°C”
Typing error: a space is missing before °C. Please check whether the same typo occurs elsewhere in the manuscript.
Line 264. Figure 6.
You could add the LCZ name in the panels title as you did in the text: LCZ2 (compact high–rise).
Line 270. “during a heatwave event occurred”
Should add “that” or “which” between “event” and “occurred”.
Line 332. “indicating the range of UHII variability”
Maybe add: “indicating the range of UHII variability relative to the average rural temperature”.