Barriers to operational flood forecasting in complex terrain: from precipitation forecasts to probabilistic flood forecast mapping at short lead times
Abstract. As flood alert systems move towards higher spatial resolutions, there is a continued need for approaches that provide robust predictions of flood extent and adequately account for uncertainties in meteorological forcing, hydrologic and hydraulic model structure, and model parameters. In flood forecasting, two primary sources of uncertainty are the quantitative precipitation forecasts (QPF) and the representation of channel and floodplain geometry. This is especially relevant as simple approaches (e.g., Height Above Nearest Drainage, HAND) are being used to map floods operationally at field scales (< 10 m). This article investigates the benefits of using a computationally efficient probabilistic precipitation forecast (PPF) approach to generate multiple flood extent scenarios over a region of complex terrain prone to flash floods. First, we assess the limitations of a calibrated, gridded configuration of the WRF-Hydro model in predicting an extreme flash flood in the Greenbrier River Basin (West Virginia, USA) on 24 June 2016. We then investigate an ensemble methodology that combines operational High-Resolution Rapid Refresh (HRRR) QPF with radar-based quantitative precipitation estimates, specifically MRMS QPE products. This approach was most effective at increasing headwater streamflow accuracy at the first-hour lead time, which is still insufficient to issue actionable flood warnings in operational applications. At longer lead times, success was elusive due to epistemic uncertainties in MRMS rainfall intensity and HRRR rainfall spatial patterns. Furthermore, the QPF ensemble was used to generate an ensemble of flood heights using the HAND flood mapping methodology at different spatial resolutions. Results revealed a scale dependency, with increasing dispersion among the predicted flooded areas as spatial resolution increases down to 1 m. We hypothesize that the overprediction of flooded areas at higher spatial resolutions reflects the increasing number of river reaches and the need for a scale-aware representation of the river hydraulics that governs flood propagation through the river network.
Status: final response (author comments only)
RC1: 'Comment on egusphere-2023-2088', Anonymous Referee #1, 22 Dec 2023
The manuscript addresses the important topic of flood forecasting and illustrates the associated uncertainties over a catchment in the US. The paper includes meaningful results. Nevertheless, the paper lacks significant information about the applied methodologies and the dataset used, and several figures and results are poorly presented (e.g. maps too small). Moreover, some conclusions related to the model performances are not supported by the presented results. Finally, the title is misleading since it suggests a review related to flood forecasting while the paper is a case study over (only) one catchment. Thus, I recommend reconsidering this paper after major revisions.
Specific comments:
- The title is misleading. As it stands, the title makes us believe that the paper is a review, while the paper is a case study on probabilistic flood forecasting. The title needs to be modified to be coherent with the paper content.
- L24: what are flood waves in this context?
- L61: please clearly state before this sentence (i) what is the HAND method and (ii) that HAND is a “non-physically based methodology”.
- L61: what are “epistemic uncertainties” in this context?
- L63: what is “two-way coupling” in this context?
- L90 to 93: what about Machine-Learning based methods and mixed approaches?
- L109: the NWM needs to be described before.
- L111: idem: the WRF-HYDRO and HAND methods need to be described before.
- L112: please explain why this region is prone to flash flood by describing its general topographic and climatic context.
- L113: what is “extreme” in this context? This term needs to be quantitatively defined.
- General comment on the introduction section: rich in terms of relevant references, but (i) lacking in concrete examples and (ii) too rich in terms of subjects covered, we're a bit lost in the sources of uncertainty, wondering what we're going to read in the article...
- Section 2.1 and Figure 1: it is difficult to evaluate the independence of these floods, which seem to be common ones rather than extreme as announced in the introduction. Why not other, older floods? How independent are these floods? How extreme are these floods?
- L131: the “flash flood events” need to be defined in this context in terms of precipitation intensity and reaction time of the catchment.
- L133: where is the “White Sulphur Springs neighborhood” in the Figure 1?
- L134: “constrained by QPE data availability”: what is the period of data availability?
- L136: “The approximate time to peak was calculated based on the hours between the initial time of the rainfall event (gauge measurement in mm/h) over the basin and the water level at the Alderson gauge”: what rainfall data has been used for this calculation? Rainfall gauges? Where are they? Has a catchment rainfall (i.e. interpolation of rain gauge data over the catchment boundaries) been used for this? This methodology is unclear.
- Table 1:
- What is the length of the event considered here?
- Caption: “The precipitation values represent the maximum observed by the daily rain gauge and the maximum MRMS pixel over the basin domain”: several rain gauges or only one rain gauge? Where are they? “MRMS” is not defined at this stage of the paper.
- Please change the peak values to meters. Why not add peak flow values (in m³/s)?
- L140 to 143: these sentences are a repetition of elements already presented in the introduction: delete them?
- L144: what means “instantaneous” in this context?
- L145: where are the closest radars?
- Section 2.2: significant details are missing in this section (timesteps, spatial resolution, temporal coverage...)
- L149: are Quantitative Precipitation Forecasts (QPF) necessarily deterministic?
- L152: what are the temporal resolution and the temporal coverage of this dataset?
- L152: “The HRRRv3 dataset leverages multiple data assimilation techniques and observation sources, including the assimilation of ground weather radar data”: what are the “observation sources” in this context? Rain gauges? The same as the one used in this study? Also, weather radar data? If yes, the same weather radar than the one used in MRMS?
- L173: “HRRR” has not been defined yet.
- L174: here and throughout the article: use QPE instead of MRMS?
- L175: here and throughout the article: use QPF instead of HRRR or HRRR QPF?
- L177: what is a “multiplicative bias factor” in this context?
- L192: where are located the rain gauges?
- L192 to 195: I do not understand these three sentences: what is the message here?
- Section 2.5: it is needed to describe first the models before describing the calibration of their parameters.
- L198: what is a “community model” in this context?
- L202: please clearly state that this LSM is used within the WRF-HYDRO model.
- L213: how many parameters characterize the LSM in total? What is this selection? Is it 16, 20 or 21 parameters?
- L214: what is the “National Water Model”?
- L215: what is the “existing calibration conducted for operational purposes” in this context?
- L218: “utilizing hourly streamflow observations”: this is unclear: are several stations being considered? Over different flood events?
- L224: please state that none upstream stations have been used for the calibration.
- L225: please consider presenting the “changes in the 20 parameters” in the paper.
- L227: please clarify if the entire year 2016 has been used for the calibration or only a flood observed during 2016.
- L237 to 240: please present these datasets in detail in the “data” section.
- L249: this is confusing: the calibration is performed over only one station, but upstream stations are used for evaluation? Please clarify this point.
- L259: at this stage, it is unclear how flood maps are generated.
- L259: “field-collected flood benchmarks provided by the US Geological Survey (USGS)”: how many flood marks are available? Where are they? This dataset needs to be described in a devoted section.
- L261: this is confusing: we move from flood benchmarks to flood maps. Are these data the same? Have you both flood marks and flood maps? These datasets need to be clearly described in a devoted section.
- L263: “The Probabilistic Streamflow Forecast (PSF) was aimed at assessing the potential of using a higher spatial resolution topographic dataset for generating probabilistic flood maps in the study area.” Please temper this sentence and objective since this assessment is done analyzing only one flood event on one catchment…
- L288: how a pixel is considered to be flooded?
- L304: please add the rain gauge location in the Figure 1 and present these datasets in a devoted section.
- Figure 4: What is 4.e and 4.f? Any analysis of this subplot in the text? Delete them and add some of the supplementary analysis instead?
- L309: please define these seasons.
- L310: What about snow in this context? Any hypothesis related to these over and underestimations?
- L320: how much underestimation?
- General comment on the section 3.1: this section lacks a specific analysis of QPE performances related to heavy rainfall events and especially the ones that generated the six floods studied.
- L325: please describe this dataset in the devoted section.
- L325: please do not describe the Figure 6 before the Figure 5.
- L333: what is the duration of the event?
- L333: “The additional overlapping values of total rainfall accumulation from daily in situ gauges facilitate the interpretation of overestimation or underestimation across the domain.” I do not agree with this point: why not adding differences between accumulation maps?
- L338: what is “average rainfall” in this context? Catchment accumulated rainfall?
- L342 to 348: this methodological point needs to be presented in the “methodology section”. Moreover, the general presentation of this section needs to be corrected: first Figure 6 is described, then it is compared with results presented in Figure 5, and then it is presented again with related methodological points…
- Figure 5: is the total duration the same for all maps? Usefulness of the grey distributions? Why not having maps of the same sizes (and bigger)?
- Figure 6:
- Please consider splitting this figure in two
- Correct MRMS in the figure
- Which pixels are considered here? The ones over the catchment? This point needs to be clearly stated in the paper.
- Please consider having a “short name” for “USC00463669”
- Table 3: please consider changing it into a figure.
- Line 355: please state that these scores are ensemble mean of the 6 studied events.
- Title of the section 3.3: please rephrase it since this is a rhetoric question (it has already been showed before).
- L361: the fact that you only have 16 parameters at this stage is unclear.
- L367: please change “The hydrograph” into “A part of the hydrograph”
- General comment on the 3.3 section: the model performance appears to be correct (with an underestimation of the floods!) at the downstream station (used for calibration) but bad for the upstream stations. These points need to be clearly stated in the section.
- L394: please consider changing “overestimated” into “strongly overestimated”.
- L395: please add an explanation of this overestimation for this flood while the other floods were underestimated.
- L402: please consider changing “overestimation” into “strong overestimation”.
- L404: why two flood peaks are observed for HRRR and 20 SIM? Please consider adding precipitation in the graph (#8).
- L405 to 407: this is unsatisfactory since this result is obtained by two different errors that are compensating each other, by luck! Please rephrase these sentences and discuss this point.
- L419 to 420: idem! What about other events?
- L433: this combination needs to be more deeply presented in the method section.
- Figure 9: hard to read and analyze because the maps are too small.
- L463 to 465: irrelevant analysis, since floods are of particular interest in this paper? What are the performances of the QPE on heavy rainfall and flood events?
- L471: please clarify the duration of the flood.
- L545: “We set up a physically based fully distributed hydrological-hydraulic model”: I do not agree with this point, since HAND is not a physically based model.
- L564: “The WRF-Hydro parameter calibration was satisfactory to simulate the hourly streamflow during extreme flood events.” I do not agree with this conclusion: the model was good for the downstream station, with an underestimation of the flood events studied, but was bad for upstream stations. Please rephrase this conclusion in agreement with the presented results.
- L566: “The hydrological model also benefited from the second and third hours PPF forecast delays in predicting the most intense rainfall.” Again, the presented results are not in agreement with this conclusion. Please temper this conclusion.
Additional comments:
- L2: a space needs to be added before “Li”
- L27: unclear/ grammar to be checked: sentence to be rephrased?
- L37: unclear/ grammar to be checked: sentence to be rephrased?
- L42: a space needs to be added before “Di”
- L74: an “et al.” is missing.
- L83: an extra-space needs to be deleted.
- L95: a space needs to be added before “FWS”
- L112: please state that West Virginia is in the USA.
- L210: a reference is missing.
- L282: please add a proper citation of the two R packages.
- L286: please correct “brenchmark”
- L335: please correct “previosuly"
- L369 (throughout the article): correct m3/s
- L381: please correct “hydropgrahs”
- L512: please correct “limb o fteh flood”
- L564: please correct “The WRF-Hydro parameter calibration was satisfactory to simulate the hourly streamflow during The WRF-Hydro parameter calibration was satisfactory to simulate the hourly streamflow during extreme flood events.”
Citation: https://doi.org/10.5194/egusphere-2023-2088-RC1
AC1: 'Reply on RC1', Luiz Bacelar, 06 Mar 2024
We thank the reviewers for their time and helpful comments. In this response, we first provide a series of overview responses to major questions and concerns that were raised by the reviewers. After that initial overview, we provide a response to each reviewer's comment and discuss how we will address each one in the revised manuscript. The reviewers' comments are shown in blue italics, while the author responses are shown in unformatted text.
- Title clarity and study relevance to the state-of-the-art:
We agree that the current title can be misleading: it could better convey that this is a case study rather than a broader review. We will refine it in the revised manuscript. That being said, we would argue that catchment studies are still relevant to improve the predictability of natural hazards, especially within frameworks similar to operational flood forecasting systems (i.e., WRF-Hydro and the National Water Model, NWM). The NWM framework was taken as an example of the state-of-the-art in operational flood forecasting systems for its robust operational capability to provide high-resolution streamflow forecasting (2.7 million river reaches) across a continental extent in a computationally efficient manner in real time. Our independent experiment mirrors key operational components of the NWM framework, including QPE, hydrological model calibration, and the flood mapping methodology. Conducting localized studies is essential to uncover potential errors in large-scale evaluations of such frameworks. For instance, recent studies on NWM forecast accuracy (Johnson et al., 2023a, b) did not explore the impact of ensemble predictions on flood mapping or the influence of short-term ensemble precipitation or streamflow forecasts, given their non-operational status. We were motivated to anticipate how increasing spatial resolution (1 m HAND) may affect short-term flood mapping forecasts, contrasting it with the current state-of-the-art in near-real-time, high-resolution, computationally efficient flood maps (10 m HAND in the NWM) applicable at continental extents. We will include more information about the NWM in the introduction section, and about the points at which our forecast chain resembles the operational version. Our analysis goes beyond assessing the uncertainty in flash flood mapping itself; it delves into the forecast chain, covering streamflow prediction, calibration, short-term rainfall prediction (HRRR), and the use of observed gridded rainfall (MRMS).
The operational HRRR QPF ensemble only became available at the end of 2020 (https://rapidrefresh.noaa.gov/hrrr/). The deterministic HRRR has been used by the NWM to provide its deterministic flood mapping prediction. For this reason, we chose to evaluate how a simple geostatistical methodology for generating rainfall ensembles from the deterministic short-term HRRR could impact flood mapping uncertainty during the 2016 flash flood event in West Virginia. It is important to clarify that our paper does not solely present a methodology for bias-correcting HRRR predictions through a straightforward geostatistical approach; rather, it aims to evaluate its consequential impacts on streamflow and flood mapping predictions within short lead times.
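To make this class of methods concrete for readers, the snippet below is a minimal sketch (not our exact implementation) of generating rainfall ensemble members from a deterministic QPF grid with spatially correlated, mean-preserving multiplicative noise; the function name, correlation length, and noise amplitude are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def qpf_ensemble(qpf, n_members=20, corr_len_px=10, sigma=0.3, seed=0):
    """Perturb a deterministic QPF grid (mm/h) with spatially correlated,
    mean-preserving lognormal multiplicative noise (illustrative parameters)."""
    rng = np.random.default_rng(seed)
    members = []
    for _ in range(n_members):
        noise = gaussian_filter(rng.standard_normal(qpf.shape), corr_len_px)
        noise /= noise.std() + 1e-12                      # unit-variance correlated field
        members.append(qpf * np.exp(sigma * noise - 0.5 * sigma**2))
    return np.stack(members)                              # (n_members, ny, nx)
```

A geostatistical implementation would instead fit the correlation length and noise amplitude to QPE-QPF error statistics rather than fixing them a priori.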
We acknowledge that more sophisticated QPF ensemble methodologies are available in the literature and could potentially increase our accuracy for the second- and third-hour HRRR ensemble forecasts, as mentioned in the introduction (lines 99-103) and conclusions (line 560). We understand that different rainfall ensemble methodologies could change the statistical properties of the forecasted rainfall ensemble and influence, for example, the spread of streamflow and flooded areas. But we contend that our central results, for example that the uncertainty of flood mapping increases with a higher-resolution DEM, will remain mostly unaffected by changes in the QPF ensemble methodology. The paper brings the insight that the HAND flood mapping methodology (adopted in operational flood forecast systems) at higher resolution would be more sensitive to a spread of streamflow predictions, and it reveals a potential gap in the transferability of hydraulic properties in a complex-terrain floodplain. We acknowledge that validating this hypothesis requires testing the same forecast chain in various mountainous regions where a 1 m lidar DEM is available.
Furthermore, we assert that a comprehensive analysis of multiple components within a flash flood forecast chain in a single study, ranging from hydrological model calibration and precipitation forecasting to flood mapping over the floodplain, remains infrequent for short lead times and very high spatial resolutions due to data availability constraints. Such an approach proves invaluable for retro-analyzing deadly natural hazards and refining the operational capabilities of flash flood prediction. Unlike numerous flash flood events, the observed flood height for the 2016 event in West Virginia was collected at field scale by the USGS and is publicly accessible. Leveraging this dataset enabled us to demonstrate that our total forecasted ensemble spread (Figure 9, denoted by the black range over the ensemble-mean bar plots) reduced the False Alarm Ratio (FAR) in flooded areas compared to the map simulated with MRMS QPE (no prediction) and the deterministic HRRR simulation.
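For clarity on how such flood-map scores are obtained, the sketch below shows the standard grid-cell contingency metrics (POD, FAR, and the Critical Success Index, CSI); the Boolean flood-mask inputs are hypothetical placeholders rather than our data.

```python
import numpy as np

def contingency_scores(forecast, observed):
    """POD, FAR, and CSI from Boolean flooded/not-flooded masks."""
    hits = np.sum(forecast & observed)
    false_alarms = np.sum(forecast & ~observed)
    misses = np.sum(~forecast & observed)
    pod = hits / (hits + misses)                  # probability of detection
    far = false_alarms / (hits + false_alarms)    # false alarm ratio
    csi = hits / (hits + misses + false_alarms)   # critical success index
    return pod, far, csi
```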
Finally, we would like to emphasize how our study contributes to the flash flood literature by examining two components not yet implemented in the NWM, with potential inclusion in future versions: ensemble streamflow forecasts at short lead times through a QPF ensemble, and higher-resolution (1 m HAND) flood mapping over floodplain areas. We demonstrate that operational ensembles would be necessary to reveal the uncertainties not only in WRF-Hydro streamflow prediction but also in the final HAND flood mapping methodology. Those results have not yet been demonstrated using the HAND methodology in a forecast chain similar to the NWM (i.e., QPE, hydrological model, calibration methodology, and deterministic QPF). We will clarify the intention of our contributions more explicitly in the introduction section of the revised manuscript.
- The use of metrics for ensemble evaluation
In Section 2.8, we highlighted the importance of ensemble metrics to showcase the accuracy, reliability, sharpness, and skill of the probabilistic forecasts. We acknowledge that our initially presented metrics solely focused on evaluating forecast accuracy through the ensemble mean. We commit to addressing this limitation by incorporating additional metrics, including the CRPSS and others, to comprehensively analyze the reliability and skill of both rainfall and streamflow ensembles. Furthermore, we recognize the value of calculating a new metric for the evaluation of flood mapping in Section 3.5, as suggested by Reviewer 2: the Fractions Skill Score (FSS). We fully endorse that incorporating these additional ensemble metrics will enhance the overall interpretation of our results.
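As an indication of what we intend to compute, a minimal sketch of the FSS over square neighbourhoods follows; the window size is an illustrative assumption, and in the revision we would evaluate several neighbourhood scales.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def fss(forecast_mask, observed_mask, window_px=11):
    """Fractions Skill Score: compares flooded-area fractions within square
    neighbourhoods instead of requiring exact grid-cell overlap."""
    pf = uniform_filter(forecast_mask.astype(float), size=window_px)
    po = uniform_filter(observed_mask.astype(float), size=window_px)
    mse = np.mean((pf - po) ** 2)
    mse_ref = np.mean(pf ** 2) + np.mean(po ** 2)   # no-skill reference
    return 1.0 - mse / mse_ref if mse_ref > 0 else np.nan
```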
- Calibration methodology and number of events.
As part of the forecast chain methodology, we performed the WRF-Hydro calibration and validation over 2015-2020 to keep consistency with the availability of the MRMS QPE, which became operational in 2015. We will clarify that the statistics for the calibration (05/2015-10/2018) and validation (11/2018-2020) periods are presented together in Figure 7, and we will present them separately in the next round of revisions. We also used the NLDAS precipitation dataset to spin up the model (2013-2015) before our calibration and validation periods. This approach aimed to maintain consistency and account for potential biases in the QPE, as discussed in Section 3.1, throughout our parameter search process. To avoid overlapping with the NWM calibration, we sought collaboration with the NWM team, obtaining their Noah-MP LSM parameters as an initial guess for our experiments (figures in the supplementary material). This collaboration was motivated by the shared use of similar Noah-MP land surface parametrizations between our study and the NWM. It is worth mentioning that the NWM was calibrated with a different configuration and period of input data than our study.
Keeping in mind the whole forecast chain, our calibration methodology primarily focused on ensuring the accurate simulation of runoff and streamflow for the flood mapping process. Incorporating extreme events into hydrologic-hydraulic model calibration is a common practice in flood mapping studies, particularly to optimize parameters such as channel roughness in the routing parametrization, which significantly influences flood wave and water height estimations. A similar approach was taken in a European flash flood mapping study by Hocini et al. (2021), which emphasizes event-specific calibration against discharge observations to minimize errors in the peak discharges used for hydrologic-hydraulic simulations.
In our case, as outlined in Section 2.5, the calibration methodology included targeting the diffusive wave river routing parameters within WRF-Hydro. We demonstrated the accuracy of the calibration to generate ensembles for the other events, not only for the flash flood in 2016, and to validate our calibration we used different streamflow gauges along the Greenbrier River. Although our calibration used only one streamflow gauge downstream on the Greenbrier River (Hilldale), we demonstrated that during the 2016 flash flood event this validation presented an NSE of 0.82 and a Pearson correlation (MRMS QPE) of 0.93 at the Buckeye station (Table 4), the station upstream of Howard Creek. As noted in line 540, we acknowledge the inability to validate our calibration for simulating the outlet of Howard Creek due to the absence of a USGS gauge during the 2016 flash flood event. In response to a reviewer's suggestion, we commit to incorporating the Howard Creek location into Figure 1, as currently it is only presented in Figure 9.
We will enhance the clarity of the manuscript by providing a more detailed description of the methodology. Additionally, we recognize the need to incorporate concrete examples in the introduction section and offer a more detailed description of the conclusions, as suggested by Reviewer 1. Addressing the minor technical and grammatical points highlighted by Reviewer 2 is also a priority for us. Furthermore, we acknowledge the importance of including more European studies in the references, as recommended by Reviewer 3. As mentioned previously, we are committed to calculating additional ensemble metrics to further enrich the analysis.
Point-by-point response to reviewers' comments
The rest of the response addresses each reviewer's comments; the reviewer comment is shown in blue italics, while the author responses are in unformatted black text.
Reviewer 1
The title is misleading. As it stands, the title makes us believe that the paper is a review, while the paper is a case study on probabilistic flood forecasting. The title needs to be modified to be coherent with the paper content.
Thank you for this feedback. We agree that the title needs to be changed and will do so for the revised manuscript.
L24: what are flood waves in this context?
The flood wave is the rise of flow in the hydrograph due to precipitation over the basin. In the case of flash floods, the flood waves tend to have a faster rise due to the steep topography of the basin. We will add more context in the introduction of the revised manuscript.
L61: please clearly state before this sentence (i) what is the HAND method and (ii) that HAND is a “non-physically based methodology”.
We will provide an improved explanation of HAND in the revised manuscript.
L61: what are “epistemic uncertainties” in this context?
In this context, epistemic uncertainties refer to the uncertainties in the velocity and water height calculated from the hydraulic river routing formulation in the hydrological model, and in their final overlay onto the floodplain through the use of HAND.
L63: what is “two-way coupling” in this context?
In this case, two-way coupling in the HAND methodology would mean that the flood height generated through HAND is used to calculate, or update, the state of the river routing formulation in the hydrological model for the next time step of the model. We will add this information to the paragraph in the revised manuscript.
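To make the distinction concrete, below is a schematic sketch of the one-way chain used in this study, with hypothetical array names; the commented lines indicate where a two-way feedback would enter.

```python
import numpy as np

def hand_flood_mask(hand, stage):
    """One-way HAND mapping: a cell is flooded when the routed water height
    above the nearest drainage (stage, m) reaches its HAND value (m)."""
    return hand <= stage

# One-way chain (this study): routing -> stage -> flood map, no feedback.
# A two-way coupling would also update the river routing state from the
# mapped flood volume before the next model time step, e.g. (hypothetical):
#   channel_storage += volume_of(hand_flood_mask(hand, stage))
```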
L90 to 93: what about Machine-Learning based methods and mixed approaches?
In the revised manuscript, we will add more information about the use of machine learning for QPFs. However, one of the assumptions of our forecast chain experiment (and of the current version of the NWM) is that the HRRR QPF is sufficient to identify cloud/no-cloud conditions in the short-term forecast, and this cloud/no-cloud mapping is highly influenced by the data assimilation techniques using weather radar as initial conditions for the HRRR.
L109: the NWM needs to be described before.
Although it is mentioned before in line 58, we will add more information about the NWM in the introduction for further clarity.
L111: idem: the WRF-HYDRO and HAND methods need to be described before.
We will add to the introduction more information about how WRF-Hydro is connected to the NWM framework.
L112: please explain why this region is prone to flash flood by describing its general topographic and climatic context.
We currently describe the importance of this region in terms of flash flooding in Section 2.1; however, we agree that it should be described earlier and will move it to the introduction in the revised manuscript.
L113: what is “extreme” in this context? This term needs to be quantitatively defined.
As mentioned in Section 2.1, the extreme events were selected based on streamflow data at the Alderson station exceeding the action stage level defined by the USGS. Those events were chosen between 2015-2020 due to MRMS QPE availability.
General comment on the introduction section: rich in terms of relevant references, but (i) lacking in concrete examples and (ii) too rich in terms of subjects covered, we're a bit lost in the sources of uncertainty, wondering what we're going to read in the article...
We mentioned in line 81 that we evaluated the uncertainties in the main components of a forecast chain. We mentioned the sources of uncertainties in each component for the forecast chain throughout the introduction. For example, the impact of QPE in hydrological model (line 81), QPF (line 93), QPF ensemble (line 95), and flood mapping methodologies (paragraph starting in line 61). We will bring them together to more clearly state our intention described in line 106.
Section 2.1 and Figure 1: it is difficult to evaluate the independence of these floods, which seem to be common ones rather than extreme as announced in the introduction. Why not other, older floods? How independent are these floods? How extreme are these floods?
As described in Table 1, those events happened weeks apart from each other. They were chosen as events above the action stage threshold at Alderson after 2015 due to MRMS QPE availability. We gave more details about this choice in our overview response number 3.
L131: the “flash flood events” need to be defined in this context in terms of precipitation intensity and reaction time of the catchment.
We will include this information in the introduction and better describe there why this region is prone to flash flood events.
L133: where is the “White Sulphur Springs neighborhood” in the Figure 1?
We will include the Howard Creek location into Figure 1. As it currently stands, it is only presented in Figure 9.
L134: “constrained by QPE data availability”: what is the period of data availability?
In the revised manuscript, we will mention that the MRMS QPE became available in 2015.
L136: “The approximate time to peak was calculated based on the hours between the initial time of the rainfall event (gauge measurement in mm/h) over the basin and the water level at the Alderson gauge”: what rainfall data has been used for this calculation? Rainfall gauges? Where are they? Has a catchment rainfall (i.e. interpolation of rain gauge data over the catchment boundaries) been used for this? This methodology is unclear.
The source of the rainfall data for this statement is included in Table 1, where we show the differences between the rain gauges and the MRMS QPE for each one of the events. The maximum accumulated gauge rainfall was calculated through the interpolation of all daily rain gauges. We will add this information to Section 2.1.
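For illustration, a minimal inverse-distance-weighting sketch of this kind of gauge interpolation is given below; the exact scheme and power parameter used in the paper may differ, so treat these as assumptions.

```python
import numpy as np

def idw(gauge_xy, gauge_vals, grid_x, grid_y, power=2.0):
    """Interpolate daily gauge accumulations (mm) onto a regular grid by
    inverse-distance weighting (power=2 is a common but arbitrary choice)."""
    gx, gy = np.meshgrid(grid_x, grid_y)
    pts = np.stack([gx.ravel(), gy.ravel()], axis=1)           # (N, 2) grid points
    d = np.linalg.norm(pts[:, None, :] - gauge_xy[None, :, :], axis=2)
    d = np.maximum(d, 1e-9)                                    # guard gauge locations
    w = d ** -power
    vals = (w * gauge_vals).sum(axis=1) / w.sum(axis=1)
    return vals.reshape(gx.shape)
```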
Table 1:
What is the length of the event considered here?
We will add the total length of each event to the Table.
Caption: “The precipitation values represent the maximum observed by the daily rain gauge and the maximum MRMS pixel over the basin domain”: several rain gauges or only one rain gauge? Where are they? “MRMS” is not defined at this stage of the paper.
We will add additional context that the MRMS is described in section 2.2.
Please change the peak values to meters. Why not add peak flow values (in m³/s)?
We intended to preserve the original data from the USGS, as they define the flood thresholds. We can convert this information to m³/s.
L140 to 143: these sentences are a repetition of elements already presented in the introduction: delete them?
We will include this paragraph in the introduction to add more concrete examples of the significance of QPE for hydrological modelling.
L144: what means “instantaneous” in this context?
We will add (mm/h) after the word instantaneous.
L145: where are the closest radars?
We will add a reference with an image of the weather radar in the US.
Section 2.2: significant details are missing in this section (timesteps, spatial resolution, temporal coverage...)
We will add more information about the MRMS data.
L149: are Quantitative Precipitation Forecasts (QPF) necessarily deterministic?
We will add the information that the HRRR ensemble (version 4) only became available after the end of 2020; therefore, we used the deterministic version 3 to create our own HRRR ensemble for the 2016 flash flood event.
L152: what are the temporal resolution and the temporal coverage of this dataset?
It has a 3 km spatial resolution and covers 2015-2020. We will add this information in the section.
L152: “The HRRRv3 dataset leverages multiple data assimilation techniques and observation sources, including the assimilation of ground weather radar data”: what are the “observation sources” in this context? Rain gauges? The same as the one used in this study? Also, weather radar data? If yes, the same weather radar than the one used in MRMS?
We mentioned in our results section (line 303) that the MRMS algorithm considers hourly rain gauges as part of its retrieval algorithm. We can add this information in this section. However, we did not find evidence in our results that real-time hourly correction happened over the domain or over the Greenbrier basin; only one hourly rain gauge lies inside the basin, which limited our QPE evaluation at an hourly basis (mentioned in line 303).
L173: “HRRR” has not been defined yet.
HRRR was defined in section 2.3 (QPF), line 149.
L174: here and throughout the article: use QPE instead of MRMS?
We will make more use of the term QPE for clarity in the revised manuscript.
L175: here and throughout the article: use QPF instead of HRRR or HRRR QPF?
We will consider replacing HRRR and HRRR QPF with QPF.
L177: what is a “multiplicative bias factor” in this context?
The multiplicative bias factor (MBIAS) is defined in line 274.
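Assuming the conventional definition (the exact form is the one given at line 274 of the manuscript), the multiplicative bias over an accumulation period of T time steps is

```latex
\mathrm{MBIAS} = \frac{\sum_{t=1}^{T} P_{t}^{\mathrm{est}}}{\sum_{t=1}^{T} P_{t}^{\mathrm{obs}}}
```

so that MBIAS > 1 (< 1) indicates overestimation (underestimation) of the accumulated rainfall, and MBIAS = 1 an unbiased accumulation.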
L192: where are located the rain gauges?
The locations are shown in Figure 4.
L192 to 195: I do not understand these three sentences: what is the message here?
This is part of the methodology describing how we evaluated the results presented in Figure 4 and Tables, for instance. We will clarify this section in the revised manuscript.
Section 2.5: it is needed to describe first the models before describing the calibration of their parameters.
Thank you for pointing this out. We agree that we should provide context to the parameter by describing the model before proceeding with describing the methodology for parameter calibration. We will switch the order of these subsections to have the model description go first.
L198: what is a “community model” in this context?
This is defined as a model whose source code is open to community contributions.
L202: please clearly state that this LSM is used within the WRF-HYDRO model.
We will add this information in the revised manuscript.
L213: how many parameters characterize the LSM in total? What is this selection? Is it 16, 20 or 21 parameters?
We will clarify in the revised manuscript that 21 parameters were used, as shown in Figure 3. We will add the information that only the LSM (Noah-MP) from the operational NWM were used as initial search for our calibration. We will further explain our choice of calibration in the revised manuscript.
L214: what is the “National Water Model”?
The National Water Model (NWM) was mentioned in the introduction (line 58 and 109). We agree that a description of the NWM is necessary in the introduction; we will add it in the revised manuscript.
L215: what is the “existing calibration conducted for operational purposes” in this context?
We will provide further information on existing calibration practices that are used in operational settings in the revised manuscript.
L218: “utilizing hourly streamflow observations”: this is unclear: are several stations being considered? Over different flood events?
We will add further explanation about our choice of calibration in the revised manuscript.
L224: please state that none upstream stations have been used for the calibration.
We will mention it in the updated section.
L225: please consider presenting the “changes in the 20 parameters” in the paper.
Thank you for the suggestion. We will consider including it in the manuscript instead of the supplementary material.
L227: please clarify if the entire year 2016 has been used for the calibration or only a flood observed during 2016.
Yes, the entire year 2016 was used as part of the calibration. As part of the forecast chain methodology, we performed the WRF-Hydro calibration and validation over 2015-2020 to keep consistency with the availability of the MRMS QPE, which became operational in 2015. We will clarify that the statistics for the calibration and validation periods are presented together in Figure 7, and we will present them separately in the next round of revisions. This approach aimed to maintain consistency and account for potential biases in the QPE, as discussed in Section 3.1, throughout our parameter search process. To avoid overlapping with the NWM calibration, we sought collaboration with the NWM team, obtaining their Noah-MP LSM parameters as an initial guess for our experiments (figures in the supplementary material). This collaboration was motivated by the shared use of similar Noah-MP land surface parametrizations between our study and the NWM.
L237 to 240: please present these datasets in detail in the “data” section.
We will consider rearranging the methodology and data sections to ensure datasets are contained within the data section.
L249: this is confusing: the calibration is performed over only one station, but upstream stations are used for evaluation? Please clarify this point.
In our case, as outlined in Section 2.5, the calibration methodology included targeting the diffusive wave river routing parameters within WRF-Hydro. We demonstrated the accuracy of the calibration to generate ensembles for the other events, not only for the flash flood in 2016, and to validate our calibration we used different streamflow gauges along the Greenbrier River. Although our calibration used only one streamflow gauge downstream on the Greenbrier River (Hilldale), we demonstrated that during the 2016 flash flood event this validation presented an NSE of 0.82 and a Pearson correlation (MRMS QPE) of 0.93 at the Buckeye station (Table 4), the station upstream of Howard Creek. As noted in line 540, we acknowledge the inability to validate our calibration for simulating the outlet of Howard Creek due to the absence of a USGS gauge during the 2016 flash flood event. In response to the reviewer's suggestion, we commit to incorporating the Howard Creek location into Figure 1, as currently it is only presented in Figure 9.
L259: at this stage, it is unclear how flood maps are generated.
The flood maps were generated using the water height estimation provided by the diffusive wave routing component in WRF-Hydro. We will improve the description of how we generated the flood maps in the methodology section.
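Schematically, and with hypothetical array names, that overlay step can be sketched as follows: each floodplain cell receives the simulated water height of the reach it drains to (via a catchment-index raster) and is flooded where its HAND value lies below that height.

```python
import numpy as np

def hand_map_from_routing(hand, reach_id, reach_stage):
    """hand:        2-D HAND raster (m)
    reach_id:    2-D raster of reach indices (reach each cell drains to)
    reach_stage: 1-D simulated water heights (m), indexed by reach id
    Returns a Boolean flood mask."""
    stage_grid = reach_stage[reach_id]   # broadcast each reach's stage over its catchment
    return hand <= stage_grid
```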
L259: “field-collected flood benchmarks provided by the US Geological Survey (USGS)”: how many flood marks are available? Where are they? This dataset needs to be described in a devoted section.
The dataset originally contains 51 flood benchmark points. The reference and download link for this dataset are given in the paragraph as Watson and Cauller (2017). The dataset also provides a vectorized estimated observed flood map derived from the flood benchmarks, which we used for our evaluation. We will mention this in the revised section.
L261: this is confusing: we move from flood benchmarks to flood maps. Are these data the same? Have you both flood marks and flood maps? These datasets need to be clearly described in a devoted section.
The dataset provided a vectorized estimated observed flood map through the flood benchmark. We used the vectorized estimated observed flood map from the dataset for our evaluation. We will mention it in the revised section.
L263: “The Probabilistic Streamflow Forecast (PSF) was aimed at assessing the potential of using a higher spatial resolution topographic dataset for generating probabilistic flood maps in the study area.” Please temper this sentence and objective since this assessment is done analyzing only one flood event on one catchment…
Thank you for this suggestion. We agree that the language of the original manuscript was too general. We will add ‘for the 2016 flash flood event’ at the end of the sentence.
L288: how a pixel is considered to be flooded?
The vectorized estimated observed flood map from the dataset was rasterized as a Boolean mask at 10 m and 1 m spatial resolution, according to the resolution of the DEM, where a value of 1 denotes a flooded pixel.
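A minimal sketch of that rasterization step, using geopandas and rasterio with placeholder file names (not the actual paths of the Watson and Cauller (2017) dataset), is:

```python
import geopandas as gpd
import rasterio
from rasterio.features import rasterize

obs = gpd.read_file("observed_flood_extent.shp")   # hypothetical path to the USGS vector map

with rasterio.open("hand_10m.tif") as src:         # grid defining target resolution/extent
    flood_mask = rasterize(
        [(geom, 1) for geom in obs.geometry],      # burn value 1 = flooded
        out_shape=(src.height, src.width),
        transform=src.transform,
        fill=0,                                    # 0 = not flooded
        dtype="uint8",
    )
```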
L304: please add the rain gauge location in the Figure 1 and present these datasets in a devoted section.
The rain gauge locations are shown in Figure 4 in the results section. We will add this information to Figure 1.
Figure 4: What is 4.e and 4.f? Any analysis of this subplot in the text? Delete them and add some of the supplementary analysis instead?
Those plots were added to support the evaluation of the overestimation or underestimation of the QPE across the topography, as discussed in line 307. In the revised manuscript, we will mention that we were not able to clearly observe this in the 4(e) and 4(f) scatterplots, but the underestimation can be visualized through the spatial interpolation in 4(a-d).
L309: please define these seasons.
The seasons are defined in the Figure 4 caption as (a) Summer (June, July, and August), (b) Fall (September, October, and November), (c) Winter (December, January, and February), and (d) Spring (March, April, and May).
L310: What about snow in this context? Any hypothesis related to these over and underestimations?
We will add more information about the percentage of snow in the precipitation during the winter period.
L320: how much underestimation?
We will bring the Figure 3S from the supplementary material to this section in the paper to support our analysis for each event separately.
General comment on the section 3.1: this section lacks a specific analysis of QPE performances related to heavy rainfall events and especially the ones that generated the six floods studied.
Thank you for this feedback. We will bring the Figure 3S from the supplementary material to this section in the paper to support our analysis for each event separately.
L325: please describe this dataset in the devoted section.
We will add the information about the hourly gauges.
L325: please do not describe the Figure 6 before the Figure 5.
We will address this concern by mentioning the Figure 5 earlier, which is related to the previous sentence.
L333: what is the duration of the event?
We considered the 2016 flash flood event from 06/21/2016 to 07/02/2016 for analysis as shown in the hydrographs of Figure 8. We will include the duration of the events in Table 1.
L333: “The additional overlapping values of total rainfall accumulation from daily in situ gauges facilitate the interpretation of overestimation or underestimation across the domain.” I do not agree with this point: why not adding differences between accumulation maps?
We preferred to preserve the original spatial distribution of the original data; overlaying the rain gauges provides the connection between the maps. We can add the difference plots in the supplementary material to complement our analysis.
L338: what is “average rainfall” in this context? Catchment accumulated rainfall?
The term average will be deleted in the revised manuscript since the graph shows the intensity (mm/h) between the products.
L342 to 348: this methodological point needs to be presented in the “methodology section”.
We mention in section 2.4 how the 20 rainfall members were generated.
Moreover, the general presentation of this section needs to be corrected: first Figure 6 is described, then it is compared with results presented in Figure 5, and then it is presented again with related methodological points…
Thank you for this feedback. We will overhaul this section to ensure that the sequence between Figure 5 and 6 is much more logical and clearer.
Figure 5: is the total duration the same for all maps? Usefulness of the grey distributions? Why not having maps of the same sizes (and bigger)?
Yes, the total duration is the same for all maps; we will state that in the caption. The grey distributions are histograms. They help to show the placement of the most intense rainfall from a latitudinal perspective (grey, right) and a longitudinal perspective (grey, top).
Figure 6:
Please consider splitting this figure in two
Correct MRMS in the figure
Which pixels are considered here? The ones over the catchment? This point needs to be clearly stated in the paper.
Please consider having a “short name” for “USC00463669”
We will split Figure 6 to increase readability. For Figure 6(a), all pixels over the same domain as Figure 5 were considered; we will include this information in the revised manuscript.
Table 3: please consider changing it into a figure.
We will consider this suggestion, thank you.
Line 355: please state that these scores are ensemble mean of the 6 studied events.
We will clarify this language in the revised manuscript.
Title of the section 3.3: please rephrase it since this is a rhetoric question (it has already been showed before).
We will substitute it with “How does the hydrological calibration affect the streamflow simulations along the main river?”
L361: the fact that you only have 16 parameters at this stage is unclear.
We will fix this inconsistency, since at this stage we are not only referring to the LSM parameters from the NWM.
L367: please change “The hydrograph” into “A part of the hydrograph”
We will change that.
General comment on the 3.3 section: the model performance appears to be correct (with an underestimation of the floods!) at the downstream station (used for calibration) but bad for the upstream stations. These points need to be clearly stated in the section.
We agree that the calibration shows an overall underestimation of the peak discharge, but we also noticed a slight improvement compared to the default parameters from the NWM (shown in Figure 6Sb in the supplementary material). We can add this to the paragraph.
L394: please consider changing “overestimated” into “strongly overestimated”.
We will consider making that change in the revised manuscript.
L395: please add an explanation of this overestimation for this flood while the other floods were underestimated.
We will add this explanation, as it helps to describe the uncertainty in the calibration. Even though flood event 1 was within the calibration period, it was still overestimated by WRF-Hydro. It is a common practice in flood mapping studies to calibrate the hydraulic-hydrological model for each event separately. We understand that the severity of event 1 (very high accumulated rainfall) did not agree with the precipitation bias observed in the other events (Figure 3S in the supplementary material), which can impact the overall calibration and validation from 2015 to 2020. Given the overestimation of the peak but the quick recession in the hydrograph, we assume the calibrated LSM runoff production was satisfactory, and the overestimation of the peak can be attributed to the overland flow or routing parameters that control the timing of the recession.
L402: please consider changing “overestimation” into “strong overestimation”.
We will consider this change.
L404: why two flood peaks are observed for HRRR and 20 SIM? Please consider adding precipitation in the graph (#8).
The green lines in Figure 8 refer to the deterministic HRRR and its corrections at different lead times: (a) 1 h, (b) 2 h, and (c) 3 h. We will add this information to the caption.
L405 to 407: this is unsatisfactory since this result is obtained by two different errors that are compensating each other, by luck! Please rephrase these sentences and discuss this point.
We will rephrase this sentence to make it clear. In addition, we can include more information about the timing of the QPF in Section 3.2 to connect to this point.
L419 to 420: idem! What about other events?
We presented the hydrographs for the other events in the supplementary material. We can bring them into the main text and discuss them.
L433: this combination needs to be more deeply presented in the method section.
We will add more information about it in the methodology Section.
Figure 9: hard to read and analyze because the maps are too small.
We will increase the size of the legend in the maps to improve readability.
L463 to 465: irrelevant analysis, since floods are of particular interest in this paper? What are the performances of the QPE on heavy rainfall and flood events?
We find this reference relevant, since it is the only study in the literature that describes the 2016 flash flood event in West Virginia.
L471: please clarify the duration of the flood.
We will add the duration of the events in Table 1. We will also add more information in this paragraph related to the time of peak of this event already described in Table 1.
L545: “We set up a physically based fully distributed hydrological-hydraulic model”: I do not agree with this point, since HAND is not a physically based model.
We agree that HAND is not a physically based model (or even a model at all). However, in this phrase we were referring to WRF-Hydro, whose diffusive wave routing component is an approximation of the Saint-Venant equations to calculate streamflow.
L564: “The WRF-Hydro parameter calibration was satisfactory to simulate the hourly streamflow during extreme flood events.” I do not agree with this conclusion: the model was good for the downstream station, with an underestimation of the flood events studied, but was bad for upstream stations. Please rephrase this conclusion in agreement with the presented results.
We made this statement based on the statistics presented in Table 4. We considered NSE > 0.5 satisfactory; we can state that in the paragraph. In fact, the two stations upstream of Hilldale (Alderson and Buckeye) showed better statistics for the event. The hydrograph for Buckeye is in Figure 8.
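For reference, we used the NSE in its traditional form, sketched below; we also note Reviewer 2's related point that the long-term observed mean is a weak benchmark for extreme events.

```python
import numpy as np

def nse(sim, obs):
    """Traditional Nash-Sutcliffe efficiency: 1 minus the model error variance
    normalized by the variance of observations about their mean."""
    return 1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - np.mean(obs)) ** 2)
```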
L566: “The hydrological model also benefited from the second and third hours PPF forecast delays in predicting the most intense rainfall.” Again, the presented results are not in agreement with this conclusion. Please temper this conclusion.
This conclusion relates to the point the reviewer raised about lines 405 to 407. We will provide an improved description in the revised manuscript.
We thank Reviewer 1 for all the feedback to improve our paper.
REFERENCES
Hocini, N., Payrastre, O., Bourgin, F., Gaume, E., Davy, P., Lague, D., Poinsignon, L., and Pons, F.: Performance of automated methods for flash flood inundation mapping: a comparison of a digital terrain model (DTM) filling and two hydrodynamic methods, Hydrol. Earth Syst. Sci., 25, 2979–2995, https://doi.org/10.5194/hess-25-2979-2021, 2021.
Johnson, J. M., Blodgett, D. L., Clarke, K. C., et al.: Restructuring and serving web-accessible streamflow data from the NOAA National Water Model historic simulations, Sci. Data, 10, 725, https://doi.org/10.1038/s41597-023-02316-7, 2023a.
Johnson, J. M., Fang, S., Sankarasubramanian, A., Rad, A. M., Kindl da Cunha, L., Jennings, K. S., et al.: Comprehensive analysis of the NOAA National Water Model: A call for heterogeneous formulations and diagnostic model selection, J. Geophys. Res.-Atmos., 128, e2023JD038534, https://doi.org/10.1029/2023JD038534, 2023b.
Citation: https://doi.org/10.5194/egusphere-2023-2088-AC1
RC2: 'Comment on egusphere-2023-2088', Anonymous Referee #2, 21 Jan 2024
This paper provides a potentially interesting case study of the challenges of flash flood forecasting at local scales. The authors use the WRF-Hydro model, which uses the Noah-MP land-surface model to generate grid-scale water and energy fluxes, along with lateral water transport for surface and subsurface (within the soil column) flow. The spatial resolution and lateral flow techniques within WRF-Hydro are broadly appropriate for this application. The authors present the model, the case, and then discuss an approach to improve their modeling efforts and provide an in-depth discussion of limitations and some possible paths forward.
The paper has a few areas where further editing for grammar and clarity will be helpful. I found some sentences to be either confusing or in contradiction to subsequent statements. The figures are of reasonable quality and generally support their analysis. However, there are three critical issues that prevent publication of this paper in my view.
Critical points:
1) The thread that this paper develops a computationally efficient probabilistic forecast approach that improves the forecast is very problematic. It is not clear that the ensemble/probabilistic approach is truly used for forecast verification. Ensemble traces are generated, but that appears to be the extent of it. The paper even states on line 276 that only the ensemble mean was used for verification.
There is continual mention that this probabilistic method improves the forecasts, but it appears to me to be solely from the bias correction portion of the method, and nothing to do with improving a true probabilistic skill, such as reliability, discrimination, CRPSS, or any other metric(s) using ensemble output. There should be much more analysis to this point to show if the method is useful for ensemble forecasting outside of a simple bias correction.
There is also the incorrect usage of term reliability. The presented verification does not show the forecasts are more reliable as there is no probabilistic verification. An improvement in correlation, RMSE, or other metrics does not speak to the reliability of an ensemble forecast.
2) The experimental design and focus on one event severely limit generalizability and insight. The experimental design limits the ability to understand contributions of the issues raised. Is the precipitation input the largest issue, model structure, the inundation mapping method? The identification of specific HRRR forecast hours 2-3 is likely a unique circumstance for this basin, model, and bias correction method. There is a large literature evaluating lead-time biases in forecast models that could benefit the discussion here. For example, there are known spin-up issues in NWP models, particularly for convective permitting models like the HRRR.
Are the listed barriers to improved inundation forecasts unique to this basin and what are their partitions? There is no way to understand uniqueness and partitioning issues in this study. For example, the paragraph starting on line 564 ends suggesting we need to deal with spatial heterogeneity and scale dependence, but it is not clear to me that is the biggest issue of all the issues highlighted.
Another example: The conclusion starting on line 570 – ‘Therefore, to go beyond the point’ may be valid, but as you show in this work, there are myriad other issues. The optimization methodology and scale-awareness issues may not even be the largest issues. How could you more clearly demonstrate the issues of QPE/QPF, initial conditions, or observational uncertainty? Further, there are methodologies already in existence that use gauge networks within basins to improve performance across the basin. It is unclear to me what exactly was done here for calibration, but it appears (lines 224-225) that only one gauge, Hilldale, was used? It would be good to explore subbasin calibration (e.g., doi:10.1029/2018EF001047). This could improve WRF-Hydro performance across the basin and would provide another level of complexity to contrast against a simple calibration.
3) The flood mapping verification is problematic. Using traditional grid point based contingency table methods can result in higher-resolution forecasts being rated as worse due to spatial displacement errors. If you use spatial verification, even simpler metrics like Fractions Skill Score (FSS) (and keep in mind caveats there too), you may end up with a different result and thus an entirely different conclusion in the paper. See https://journals.ametsoc.org/view/journals/mwre/149/10/MWR-D-18-0106.1.xml for some nice discussion and references to spatial metrics like FSS.
Minor technical and grammatical points:
- Line 29 – for the environment?
- Line 30 – skill instead of skills?
- The beginning of the sentence on line 48 is confusing to me. ‘An alternative’ seems to be somewhat in opposition, but I am not sure how that is the case with this sentence and the preceding one.
- Sentence starting on line 94 feels incomplete.
- Line 97 – I suppose the statement that ensemble forecasts increase reliability could be true most of the time relative to deterministic forecasts. However, if ensemble forecasts are generated incorrectly, they may not actually provide useful information and could still be 100% unreliable.
- Regarding point 2) starting on line 108. How can you even assess this point? You are comparing to a deterministic forecast? You have no reference ensemble forecast to understand if your ensemble approach would change uncertainty?
- Line 210 – What is the bold question mark supposed to be?
- Line 222 – NSE has been shown to be suboptimal for high-flow calibration in several papers. Can you justify its usage in this paper?
- Line 233 in instead of at?
- Line 248 - I may have missed the discussion, but how did you arrive at your initial conditions (both Noah-MP and channel states) for your forecasts? That is a critical component of flash flood forecasting.
- Line 286 – Do you have any understanding of your observational uncertainty for your flood mapping? Observational uncertainty is very important to understand for your forecast verification.
- I do not think CSI is the Characteristic Stability Index in this context. I believe it should be Critical Success Index, which is a contingency table-based score.
- Line 301 – interference instead of inferences?
- Line 301 – Sentence starting with ‘This assumes that the rainfall’ is confusing and likely a run-on sentence.
- Line 304 – Are these really USGS rain gauges?
- Line 314 – What does higher accuracy error mean?
- Line 320-321 – This discussion implies MRMS has significant bias, but from Table 2 MBIAS for MRMS appears to be near 1?
- Figure 6 and elsewhere – It is clear your ensemble is very underdispersive and would not have high reliability. Following major point 1, it seems to me critical to evaluate and discuss this.
- Line 370 – multiplicative bias should not have units.
- Line 394-395 – I am confused by the statement that the QPE simulation overestimated the flood peak. The bias-corrected QPE would have removed precipitation volume biases; which QPE was used in this simulation?
- Line 397-398 - Sentence starting with ‘This reveals a certain level of uncertainty to simulation extreme flood events’ You don’t really know how to attribute any uncertainty to any specific modeling component or process. Your QPE and QPF have substantial biases, the calibrated model has errors, etc.
- Table 4 – How was NSE computed? In the traditional way using the long-term mean flow? For an extreme event, that is not very informative.
- Line 449-450 – Sentence starting with ‘However, the PFF members’ This improvement appears to be solely from bias correction, not any ensemble/probabilistic technique.
- Line 454-455 – Again, your conclusion is solely due to precipitation bias correction, not the ensemble forecast.
- Line 504- What is the SIN (2012) reference?
- Line 559-560. Again, you did not demonstrate anything about reliability, as you only verified the ensemble mean.
- Line 573 – What exactly are the initial conditions used in this study? How could data assimilation improve them?
- Caption for Figure 2 – Do you mean PPF instead of PFF?
- Equations 1-7 may be unnecessary in the main text; they would be better in a supplement or not shown at all as these are standard metrics/calculations.
Citation: https://doi.org/10.5194/egusphere-2023-2088-RC2
AC2: 'Reply on RC2', Luiz Bacelar, 06 Mar 2024
We thank the reviewers for their time and helpful comments. In this response, we first provide a series of overview responses to major questions and concerns raised by the reviewers. After that initial overview, we provide a response to each reviewer's comment and discuss how we will address each one in the revised manuscript. The reviewers' comments are shown in blue italics, while the author responses are shown in unformatted text.
- Title clarity and study relevance to the state-of-the-art:
We agree that the current title can be misleading. The title could better convey that this is a case study rather than a broader review, and we will refine it in the revised manuscript. That being said, we would argue that catchment studies are still relevant for improving the predictability of natural hazards, especially within frameworks similar to operational flood forecasting systems (i.e., WRF-Hydro and the National Water Model, NWM). The NWM framework was taken as an example of the state of the art in operational flood forecasting systems for its robust operational capability to provide high-resolution streamflow forecasting (2.7 million river reaches) across a continental extent in a computationally efficient manner in real time. Our independent experiment mirrors key operational components of the NWM framework, including QPE, hydrological model calibration, and flood mapping methodology. Conducting localized studies is essential to uncover potential errors in large-scale evaluations of such frameworks. For instance, recent studies on NWM forecast accuracy (Johnson et al., 2023a, b) did not explore the impact of ensemble predictions on flood mapping or the influence of short-term ensemble precipitation or streamflow forecasts, given their non-operational status. Our motivation was to anticipate how increasing spatial resolution (1 m HAND) may affect short-term flood mapping forecasts, contrasting it with the current state of the art in near-real-time, high-resolution, computationally efficient flood maps applicable at continental extents (10 m HAND in the NWM). We will include more information about the NWM in the introduction section, as well as the points at which our forecast chain is similar to the operational version. Our analysis goes beyond assessing the uncertainty in flash flood mapping itself; it delves into the forecast chain, covering streamflow prediction, calibration, short-term rainfall prediction (HRRR), and the use of observed gridded rainfall (MRMS).
The operational HRRR QPF ensemble only became available at the end of 2020 (https://rapidrefresh.noaa.gov/hrrr/). The deterministic HRRR has been used by the NWM to provide its deterministic flood mapping prediction. For this reason, we chose to evaluate how a simple geostatistical methodology for generating rainfall ensembles from the deterministic short-term HRRR could impact flood mapping uncertainty during the 2016 flash flood event in West Virginia. It is important to clarify that our paper does not solely present a methodology for bias-correcting HRRR predictions through a straightforward geostatistical approach; rather, it aims to evaluate its consequential impacts on streamflow and flood mapping predictions within short lead times.
We acknowledge that more sophisticated QPF ensemble methodologies are available in the literature and could potentially increase our accuracy for the second- and third-hour HRRR ensemble forecasts, as we mentioned in the introduction (lines 99-103) and conclusions (line 560). We understand that different rainfall ensemble methodologies could impact the statistical properties of the forecasted rainfall ensemble and influence, for example, the spread of streamflow and flooded areas. But we contend that our key results, for example that the uncertainty of flood mapping increases with a higher-resolution DEM, will remain mostly unaffected by changes in the QPF ensemble methodology. The paper brings the insight that the HAND flood mapping methodology (adopted in operational flood forecast systems) would, at higher resolution, be more sensitive to the spread of streamflow predictions, and it reveals a potential gap in the transferability of hydraulic properties in a complex-terrain floodplain. We acknowledge that validating this hypothesis requires testing the same forecast chain in various mountainous regions where a 1 m lidar DEM is available.
Furthermore, we assert that a comprehensive analysis of multiple components within a flash flood forecast chain in a single study, ranging from hydrological model calibration and precipitation forecasting to flood mapping over the floodplain, remains infrequent for short lead times and very high spatial resolutions due to data availability constraints. Such an approach proves invaluable for retro-analyzing deadly natural hazards and refining the operational capabilities of flash flood prediction. Unlike numerous flash flood events, the observed flood height for the 2016 event in West Virginia was collected at field scale by the USGS and is publicly accessible. Leveraging this dataset enabled us to demonstrate that our total forecasted ensemble spread (Figure 9, denoted by the black range over the bar plots of the ensemble mean) reduced the False Alarm Ratio (FAR) in flooded areas compared to the map simulated with MRMS QPE (no prediction) and the deterministic HRRR simulation.
Finally, we would like to emphasize how our study contributes to the flash flood literature by examining two components not yet implemented in the NWM, with potential inclusion in future versions: ensemble streamflow forecasts at short lead times through a QPF ensemble, and higher-resolution (1 m HAND) flood mapping over floodplain areas. We demonstrated that operational ensembles would be necessary to reveal the uncertainties not only in WRF-Hydro streamflow prediction but also in the final HAND flood mapping methodology. Those results have not yet been demonstrated using the HAND methodology in a forecast chain (i.e., QPE, hydrological model, calibration methodology, and deterministic QPF) similar to the NWM. We will clarify the intention of our contributions more explicitly in the introduction section of the revised manuscript.
- The use of metrics for ensemble evaluation
In Section 2.8, we highlighted the importance of ensemble metrics to showcase the accuracy, reliability, sharpness, and skill of probabilistic forecasts. We acknowledge that the metrics we initially presented solely focused on evaluating forecast accuracy through the ensemble mean. We commit to addressing this limitation by incorporating additional metrics, including the CRPSS and others, to comprehensively analyze the reliability and skill of both the rainfall and streamflow ensembles. Furthermore, we recognize the value of calculating a new metric for the evaluation of flood mapping in Section 3.5, the Fractions Skill Score (FSS), as suggested by reviewer 2. We fully agree that incorporating these additional ensemble metrics will enhance the overall interpretation of our results.
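For reference, the conventional definitions we plan to adopt (a sketch in standard notation; the choice of reference forecast is still open): for a forecast distribution with CDF $F$ and an observation $y$,

$$\mathrm{CRPS}(F,y)=\int_{-\infty}^{\infty}\big(F(x)-\mathbf{1}\{x\ge y\}\big)^{2}\,dx,\qquad \mathrm{CRPSS}=1-\frac{\overline{\mathrm{CRPS}}}{\overline{\mathrm{CRPS}}_{\mathrm{ref}}},$$

where the overbar denotes an average over verification times and the reference could be, for example, climatology or the deterministic HRRR forecast.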
- Calibration methodology and number of events.
As part of the forecast chain methodology, we performed the WRF-Hydro calibration and validation over 2015-2020 to keep consistency with MRMS QPE availability, which entered operation in 2015. We will clarify that Figure 7 presents the statistics for the calibration (05/2015-10/2018) and validation (11/2018-2020) periods together, and we will present them separately in the next round of revisions. We also used the NLDAS precipitation dataset to spin up the model (2013-2015), before our calibration and validation periods. This approach aimed to maintain consistency and account for potential biases in the QPE, as discussed in Section 3.1, throughout our parameter search process. To avoid overlapping with the NWM calibration, we sought collaboration with the NWM team, obtaining their Noah-MP LSM parameters as an initial guess for our experiments (figures in the supplementary material). This collaboration was motivated by the shared use of similar Noah-MP land surface parametrizations between our study and the NWM. It is worth mentioning that the NWM was calibrated with a different configuration and period of input data than our study.
Keeping in mind the whole forecast chain, our calibration methodology primarily focused on ensuring the accurate simulation of runoff and streamflow for the flood mapping process. Incorporating extreme events into the hydrologic-hydraulic model calibration is a common practice in flood mapping studies, particularly to optimize parameters such as channel roughness in the routing parametrization, which significantly influences flood wave and water height estimations. A similar approach was noted in a European flash flood mapping study by Hocini et al. 2021, emphasizing event-specific calibration against discharge observations to minimize errors in peak discharges used for hydrologic-hydraulic simulations.
In our case, as outlined in Section 2.5, the calibration methodology included targeting the diffusive wave river routing parameters within WRF-Hydro. We demonstrated the accuracy of the calibration in generating ensembles for the other events, not only for the 2016 flash flood, and to validate our calibration we used different streamflow gauges along the Greenbrier River. Although our calibration methodology used only one streamflow gauge downstream on the Greenbrier River (Hilldale), we demonstrated that during the 2016 flash flood event this validation presented an NSE of 0.82 and a Pearson correlation (MRMS QPE) of 0.93 at the Buckeye station (Table 4), the station upstream of Howard Creek. As noted at line 540, we acknowledge the inability to validate our calibration for simulating the outlet of Howard Creek due to the absence of a USGS gauge during the 2016 flash flood event. In response to a reviewer's suggestion, we commit to incorporating the Howard Creek location into Figure 1, as it is currently presented only in Figure 9.
We will enhance the clarity of the manuscript by providing a more detailed description of the methodology. Additionally, we recognize the need to incorporate concrete examples in the introduction section and offer a more detailed description of the conclusions, as suggested by reviewer 1. Addressing the minor technical and grammatical points highlighted by reviewer 2 is also a priority for us. Furthermore, we acknowledge the importance of including more European studies in the references, as recommended by reviewer 3. As mentioned previously, we are committed to calculating additional ensemble metrics to further enrich the analysis.
Point-by-point response to reviewers' comments
The rest of the response addresses each reviewer's comment; the reviewer comment is shown in blue italics, while the author responses are in unformatted black text.
Reviewer 2
Critical points:
- The thread that this paper develops a computationally efficient probabilistic forecast approach that improves the forecast is very problematic. It is not clear that any use of the ensemble/probabilistic approach is truly used for forecast verification. Ensemble traces are generated, but that appears to be the extent of it. The paper even states on line 276 that only the ensemble mean was used for verification.
There is continual mention that this probabilistic method improves the forecasts, but it appears to me to be solely from the bias correction portion of the method, and nothing to do with improving a true probabilistic skill, such as reliability, discrimination, CRPSS, or any other metric(s) using ensemble output. There should be much more analysis to this point to show if the method is useful for ensemble forecasting outside of a simple bias correction.
There is also the incorrect usage of the term reliability. The presented verification does not show the forecasts are more reliable, as there is no probabilistic verification. An improvement in correlation, RMSE, or other metrics does not speak to the reliability of an ensemble forecast.
We agree with the reviewer; we addressed this issue in our general key point 2), since it was also raised by reviewer 3. In Section 2.8, we highlighted the importance of ensemble metrics to showcase the accuracy, reliability, sharpness, and skill of probabilistic forecasts. We acknowledge that the metrics we initially presented solely focused on evaluating forecast accuracy through the ensemble mean. We commit to addressing this limitation by incorporating additional metrics, including the CRPSS and others, to comprehensively analyze the reliability and skill of both the rainfall and streamflow ensembles. Furthermore, we recognize the value of calculating a new metric for the evaluation of flood mapping in Section 3.5, the Fractions Skill Score (FSS), as suggested by reviewer 2. We fully agree that incorporating these additional ensemble metrics will enhance the overall interpretation of our results.
- The experimental design and focus on one event severely limit generalizability and insight. The experimental design limits the ability to understand the contributions of the issues raised. Is the precipitation input the largest issue, the model structure, or the inundation mapping method? The identification of specific HRRR forecast hours 2-3 is likely a unique circumstance for this basin, model, and bias correction method. There is a large literature evaluating lead-time biases in forecast models that could benefit the discussion here. For example, there are known spin-up issues in NWP models, particularly for convection-permitting models like the HRRR.
We addressed this issue in our general key points 1) and 3), since it was also raised by reviewer 3. We understand that localized studies are important to discuss those unique circumstances in HRRR prediction for a specific extreme event. We will add more references on lead-time biases of short-term NWP in our discussion.
The operational HRRR QPF ensemble only became available at the end of 2020 (https://rapidrefresh.noaa.gov/hrrr/). The deterministic HRRR has been used by the NWM to provide its deterministic flood mapping prediction. For this reason, we chose to evaluate how a simple geostatistical methodology for generating rainfall ensembles from the deterministic short-term HRRR could impact flood mapping uncertainty during the 2016 flash flood event in West Virginia. It is important to clarify that our paper does not solely present a methodology for bias-correcting HRRR predictions through a straightforward geostatistical approach; rather, it aims to evaluate its consequential impacts on streamflow and flood mapping predictions within short lead times. We acknowledge that more sophisticated QPF ensemble methodologies are available in the literature and could potentially increase our accuracy for the second- and third-hour HRRR ensemble forecasts, as we mentioned in the introduction (lines 99-103) and conclusions (line 560). We understand that different rainfall ensemble methodologies could impact the statistical properties of the forecasted rainfall ensemble and influence, for example, the spread of streamflow and flooded areas. But we contend that our key results, for example that the uncertainty of flood mapping increases with a higher-resolution DEM, will remain mostly unaffected by changes in the QPF ensemble methodology. The paper brings the insight that the HAND flood mapping methodology (adopted in operational flood forecast systems) would, at higher resolution, be more sensitive to the spread of streamflow predictions, and it reveals a potential gap in the transferability of hydraulic properties in a complex-terrain floodplain. We acknowledge that validating this hypothesis requires testing the same forecast chain in various mountainous regions where a 1 m lidar DEM is available.
Are the listed barriers to improved inundation forecasts unique to this basin and what are their partitions? There is no way to understand uniqueness and partitioning issues in this study. For example, the paragraph starting on line 564 ends suggesting we need to deal with spatial heterogeneity and scale dependence, but it is not clear to me that is the biggest issue of all the issues highlighted.
We showed the QPE uncertainties and the hydrological model calibration and validation for the 2015-2020 period. Our ensemble methodology for QPF and streamflow was evaluated for six flood events, and the final step in our forecast chain, the flood mapping methodology, was evaluated for one event. We agree that expressing more of the uncertainties in the flood mapping would require applying the methodology at different sites, but the data we used for this section, a 1 m DEM and an observed flash flood map collected at field scale in a mountainous region, were hard to find in the US for 2015-2020. As noted in our key answer point 1), we will make clear that our study is a case study. Additionally, we chose to perform our forecast chain methodology using the most operational datasets for short-term forecasting in the US (HRRR QPF and MRMS QPE) and within a framework equivalent to the NWM, to test some hypotheses of such a state-of-the-art operational flood forecasting system during this deadly natural disaster in West Virginia. We agree that some of the validation needs to be improved, as we discussed in our general key answer 2), and it can be helpful to highlight the issues. As we evaluated a forecast chain, it is hard to single out the biggest uncertainty overall; instead, we point to the different components of the forecast chain. We observed the MRMS uncertainty in quantifying precipitation in this mountainous area in different seasons and on an event basis, as well as the HRRR QPF uncertainty in predicting rainfall spatial patterns at short lead times for extreme rainfall.
Another example: The conclusion starting on line 570 – ‘Therefore, to go beyond the point’ may be valid, but as you show in this work, there are myriad other issues. The optimization methodology and scale-awareness issues may not even be the largest issues. How could you more clearly demonstrate the issues of QPE/QPF, initial conditions, or observational uncertainty? Further, there are methodologies already in existence that use gauge networks within basins to improve performance across the basin. It is unclear to me what exactly was done here for calibration, but it appears (lines 224-225) that only one gauge, Hilldale, was used? It would be good to explore subbasin calibration (e.g., doi:10.1029/2018EF001047). This could improve WRF-Hydro performance across the basin and would provide another level of complexity to contrast against a simple calibration.
We agree that different choices of calibration methodology could impact the hydrological model performance, and subsequently the streamflow and water height predictions; we discussed our choice in key answer 3). In fact, the two stations upstream of Hilldale (Alderson and Buckeye) showed better statistics for the event. The hydrograph for Buckeye is in Figure 8. We agree to calculate and include more metrics in the revised manuscript, as mentioned in key answer 2), to improve the demonstration for the ensemble. We also used the NLDAS precipitation data to spin up the model (2013-2015), before our calibration and validation periods, and we will include this information. We also agree to include more information about the initial conditions of the hydrological model (i.e., the state of soil saturation) prior to the events. For the forecasted events, we initialized the model with MRMS QPE forcing from 2015, so the events were initialized at every forecast issue time throughout the full test period, and we evaluated each lead time at each model time step. We will include this information in the manuscript.
3) The flood mapping verification is problematic. Using traditional grid point based contingency table methods can result in higher-resolution forecasts being rated as worse due to spatial displacement errors. If you use spatial verification, even simpler metrics like Fractions Skill Score (FSS) (and keep in mind caveats there too), you may end up with a different result and thus an entirely different conclusion in the paper. See https://journals.ametsoc.org/view/journals/mwre/149/10/MWR-D-18-0106.1.xml for some nice discussion and references to spatial metrics like FSS.
We agree to calculate the FSS for the next round of revisions, as mentioned in our key answer 2); a sketch of the computation we have in mind is given below. We will also include more information about the flood mapping methodology in the methods section, as suggested by reviewer 1.
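For transparency, below is a minimal sketch of the FSS computation we intend to apply to the binary flood maps. The array names and the neighborhood width are hypothetical placeholders, not quantities from the manuscript.

```python
# Minimal FSS sketch for binary flood maps (assumption: two same-shape
# boolean grids `forecast_map` and `observed_map`; `window` is a
# hypothetical neighborhood width in grid cells).
import numpy as np
from scipy.ndimage import uniform_filter

def fss(forecast, observed, window):
    """Fractions Skill Score over square neighborhoods of `window` cells."""
    # Fraction of flooded cells within each neighborhood
    pf = uniform_filter(forecast.astype(float), size=window)
    po = uniform_filter(observed.astype(float), size=window)
    mse = np.mean((pf - po) ** 2)
    mse_ref = np.mean(pf ** 2) + np.mean(po ** 2)  # worst-case (no-skill) MSE
    return 1.0 - mse / mse_ref if mse_ref > 0 else np.nan

# Usage (hypothetical grids):
# forecast_map, observed_map = ...  # binary inundation grids at 1 m or 10 m
# print(fss(forecast_map, observed_map, window=10))
```

Evaluating the FSS across several window sizes would show the spatial scale at which the higher-resolution maps become skillful, which addresses the displacement-error concern raised above.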
Minor technical and grammatical points:
Line 29 – for the environment?
We will change that, thank you.
Line 30 – skill instead of skills?
We will change that, thank you.
The beginning of the sentence on line 48 is confusing to me. ‘An alternative’ seems to be somewhat in opposition, but I am not sure how that is the case with this sentence and the preceding one.
We meant an alternative to the static flood maps mentioned in the preceding paragraph. We agree that we need to include more information in the revised manuscript.
Sentence starting on line 94 feels incomplete.
We will change that, thank you.
Line 97 – I suppose the statement that ensemble forecasts increase reliability could be true most of the time relative to deterministic forecasts. However, if ensemble forecasts are generated incorrectly, they may not actually provide useful information and could still be 100% unreliable.
Regarding point 2) starting on line 108. How can you even assess this point? You are comparing to a deterministic forecast? You have no reference ensemble forecast to understand if your ensemble approach would change uncertainty?
We agree to calculate more ensemble metrics as discussed in the key answers 2).
Line 210 – What is the bold question mark supposed to be?
It is a reference that became disconnected in Overleaf. We will fix it, thank you.
Line 222 – NSE has been shown to be suboptimal for high-flow calibration in several papers. Can you justify its usage in this paper?
We acknowledge this feedback, and we will address this caveat in the revised discussion section. However, going beyond NSE is beyond the scope of this study, given the timing and computational constraints of completely redoing the calibration. We computed those statistics for the 2016 flash flood event only; the table caption says “during the extreme flood event in 06-2016”. We provided information about the long-term NSE in Figure 7.
Line 233 in instead of at?
We will correct that, thank you.
Line 248 - I may have missed the discussion, but how did you arrive at your initial conditions (both Noah-MP and channel states) for your forecasts? That is a critical component of flash flood forecasting.
As mentioned in an earlier response, for the forecasted events we initialized the model with MRMS QPE forcing from 2015 until the start of each prediction, so the events were initialized at every forecast issue time throughout the full test period. We also used the NLDAS precipitation data to spin up the model (2013-2015), before our calibration and validation periods. We evaluated each lead time at each model time step. We will include this information in the revised manuscript.
Line 286 – Do you have any understanding of your observational uncertainty for your flood mapping? Observational uncertainty is very important to understand for your forecast verification.
We will include this information in the revised manuscript. The benchmark dataset provided a vectorized estimate of the observed flood map, which we used for our evaluation. We will mention this in the methodology section.
I do not think CSI is the Characteristic Stability Index in this context. I believe it should be Critical Success Index, which is a contingency table-based score.
We will correct this, thank you.
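For the record, the Critical Success Index is computed from the contingency-table counts of hits ($H$), misses ($M$), and false alarms ($F$):

$$\mathrm{CSI}=\frac{H}{H+M+F}.$$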
Line 301 – interference instead of inferences?
We will correct that, thank you.
Line 301 – Sentence starting with ‘This assumes that the rainfall’ is confusing and likely a run-on sentence.
We will split the sentence in two parts.
Line 304 – Are these really USGS rain gauges?
They were all from the USGS website; we will add this information in the methodology section.
Line 314 – What does higher accuracy error mean?
We assumed that a higher RMSE would be an indicator of high accuracy error. We will change the word "higher" to "high".
Line 320-321 – This discussion implies MRMS has significant bias, but from Table 2 MBIAS for MRMS appears to be near 1?
We can point out the differences between the evaluation of the QPE over 2015-2020 (Figure 4) and its performance during events in Table 2. However, Figure 4 shows a persistent underestimation, with MBIAS < 0.8 in many regions of the basin.
Figure 6 and elsewhere – It is clear your ensemble is very underdispersive and would not have high reliability. Following major point 1, it seems to me critical to evaluate and discuss this.
Figure 6 (a) shows mean values over the domain for each member, separated into different rainfall intensities. We agree that for lower rainfall intensities the members show less spread, which is associated with how the field was corrected; this correction increases the dispersion of the members as the rainfall rate increases. In Figure 6 (d), (e) and (f), the accumulated rainfall of the members shows higher dispersion when only one pixel is considered for evaluation. As we mentioned before, we will calculate the CRPSS to investigate the reliability; a sketch of a complementary dispersion diagnostic is given below.
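As a complementary dispersion diagnostic, a minimal rank-histogram sketch follows; the array shapes are assumptions for illustration, not the manuscript's data layout.

```python
# Rank histogram (Talagrand diagram) sketch to diagnose ensemble dispersion.
# Assumptions: `ens` is an (n_members, n_times) array of forecast rainfall
# and `obs` is an (n_times,) array of observed values at the same pixel.
import numpy as np

def rank_histogram(ens, obs):
    """Count how many members fall below each observation."""
    n_members = ens.shape[0]
    ranks = np.sum(ens < obs[None, :], axis=0)  # rank of obs among members
    return np.bincount(ranks, minlength=n_members + 1)

# A roughly flat histogram indicates adequate dispersion; a U-shape means the
# observation often falls outside the ensemble envelope (under-dispersion).
```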
Line 370 – multiplicative bias should not have units.
We will correct this, thank you.
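Indeed, under the conventional definition the multiplicative bias is a ratio of accumulations and therefore dimensionless:

$$\mathrm{MBIAS}=\frac{\sum_{i}P_{i}^{\mathrm{est}}}{\sum_{i}P_{i}^{\mathrm{obs}}}.$$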
Line 394-395 – I am confused by the statement that the QPE simulation overestimated the flood peak. The bias-corrected QPE would have removed precipitation volume biases; which QPE was used in this simulation?
This can be related to the calibration of the model. Even though the QPE underestimated the total event rainfall, the hydrological model simulated a higher peak than observed. As Figure 8 and Table 4 show, this overestimation of the flood peak happened mostly in the downstream areas of the basin, so we attribute it to the calibration of the model.
Line 397-398 - Sentence starting with ‘This reveals a certain level of uncertainty to simulation extreme flood events’ You don’t really know how to attribute any uncertainty to any specific modeling component or process. Your QPE and QPF have substantial biases, the calibrated model has errors, etc.
Thank you for this feedback. We will address the language in the revised manuscript to avoid overarching statements.
Table 4 – How was NSE computed? In the traditional way using the long-term mean flow? For an extreme event, that is not very informative.
We computed those statistics for the 2016 flash flood event only; the table caption says “during the extreme flood event in 06-2016”. We provided information about the long-term NSE in Figure 7.
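For clarity, and assuming the benchmark mean $\bar{Q}^{\mathrm{obs}}$ is computed over the same event window (as the caption indicates),

$$\mathrm{NSE}=1-\frac{\sum_{t}\big(Q_{t}^{\mathrm{sim}}-Q_{t}^{\mathrm{obs}}\big)^{2}}{\sum_{t}\big(Q_{t}^{\mathrm{obs}}-\bar{Q}^{\mathrm{obs}}\big)^{2}},$$

so the score measures skill relative to the event-mean flow rather than the long-term mean flow.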
Line 449-450 – Sentence starting with ‘However, the PFF members’ This improvement appears to be solely from bias correction, not any ensemble/probabilistic technique.
Line 454-455 – Again, your conclusion is solely due to precipitation bias correction, not the ensemble forecast.
We will calculate more statistics such as CRPSS and ROC to verify this information.
Line 504- What is the SIN (2012) reference?
This was an error from Overleaf; we will correct it in the revised manuscript.
Line 559-560. Again, you did not demonstrate anything about reliability, as you only verified the ensemble mean.
As we mentioned, we will calculate more statistics such as CRPSS and ROC to verify this information.
Line 573 – What exactly are the initial conditions used in this study? How could data assimilation improve them?
We described the initial conditions in our response to your question on line 248. We could add more information about DA and how it improves real-time forecasting.
Caption for Figure 2 – Do you mean PPF instead of PFF?
We will correct that, thank you.
Equations 1-7 may be unnecessary in the main text; they would be better in a supplement or not shown at all as these are standard metrics/calculations.
We will consider that for the next round of revisions.
REFERENCES
Hocini, N., Payrastre, O., Bourgin, F., Gaume, E., Davy, P., Lague, D., Poinsignon, L., and Pons, F.: Performance of automated methods for flash flood inundation mapping: a comparison of a digital terrain model (DTM) filling and two hydrodynamic methods, Hydrol. Earth Syst. Sci., 25, 2979–2995, https://doi.org/10.5194/hess-25-2979-2021, 2021.
Johnson, J.M., Blodgett, D.L., Clarke, K.C. et al. Restructuring and serving web-accessible streamflow data from the NOAA National Water Model historic simulations. Sci Data 10, 725 (2023). https://doi.org/10.1038/s41597-023-02316-7 (a)
Johnson, J. M., Fang, S., Sankarasubramanian, A., Rad, A. M., Kindl da Cunha, L., Jennings, K. S., et al. (2023). Comprehensive analysis of the NOAA National Water Model: A call for heterogeneous formulations and diagnostic model selection. Journal of Geophysical Research: Atmospheres, 128, e2023JD038534. https://doi.org/10.1029/2023JD038534 (b)
Citation: https://doi.org/10.5194/egusphere-2023-2088-AC2
RC3: 'Comment on egusphere-2023-2088', Anonymous Referee #3, 26 Jan 2024
General remarks:
This might be an interesting study reviving the large efforts that were made on the topic in Europe in the 2010s. The provision of useful forecasts for flood management at short lead times is a challenge in complex topographic areas, and examples of how to deal with it are indeed of interest for the community. While single-catchment studies might have been state-of-the-art some years ago, we are now in the era of large-sample and multi-model experiments. From this point of view, this study surely belongs to the category of local case study.
As a reader and contributor to the topic in the past, I am honestly disappointed that almost none of the past European studies on the topic were found deserving to be listed either in the introduction or in the discussion.
Another major issue is the claim of being probabilistic while ending up evaluating an ensemble mean. It has been a long time since ensemble experiments were evaluated in that way.
If you want to contribute to the research in this community, you should first better analyse what the state of the art is. Even if I really like the final step to the inundation mapping, in this shape the contribution is an interesting experiment but no advance in the field.
Major issues:
Introduction:
20 – 118: I urge you to browse the research conducted in the European mountains around the 2010s. You will find several applications of the methods you use. Staying in HESS, you might find a good start in:
Liechti, K., Panziera, L., Germann, U., and Zappa, M.: The potential of radar-based ensemble forecasts for flash-flood early warning in the southern Swiss Alps, Hydrol. Earth Syst. Sci., 17, 3853–3869, https://doi.org/10.5194/hess-17-3853-2013, 2013.
And
Poletti, M. L., Silvestro, F., Davolio, S., Pignone, F., and Rebora, N.: Using nowcasting technique and data assimilation in a meteorological model to improve very short range hydrological forecasts, Hydrol. Earth Syst. Sci., 23, 3823–3841, https://doi.org/10.5194/hess-23-3823-2019, 2019.
And
Silvestro, F., N. Rebora, and L. Ferraris, 2011: Quantitative Flood Forecasting on Small- and Medium-Sized Basins: A Probabilistic Approach for Operational Purposes. J. Hydrometeor., 12, 1432–1446, https://doi.org/10.1175/JHM-D-10-05022.1.
Or in the US:
Gourley, J. J., and Coauthors, 2017: The FLASH Project: Improving the Tools for Flash Flood Monitoring and Prediction across the United States. Bull. Amer. Meteor. Soc., 98, 361–372, https://doi.org/10.1175/BAMS-D-15-00247.1.
A comprehensive review can be found in
Rossa, A., Liechti, K., Zappa, M., Bruen, M., Germann, U., Haase, G., … Krahe, P. (2011). The COST 731 Action: a review on uncertainty propagation in advanced hydro-meteorological forecast systems. Atmospheric Research, 100(2-3), 150-167. https://doi.org/10.1016/j.atmosres.2010.11.016
There is a whole special Issue that might deserve your interest
https://www.sciencedirect.com/journal/atmospheric-research/vol/100/issue/2
A summary of an operational demonstration can be found in
Rotach, M. W., Arpagaus, M., Dorninger, M., Hegg, C., Montani, A., and Ranzi, R.: Uncertainty propagation for flood forecasting in the Alps: different views and impacts from MAP D-PHASE, Nat. Hazards Earth Syst. Sci., 12, 2439–2448, https://doi.org/10.5194/nhess-12-2439-2012, 2012.
For the blending of QPE and QPF there is an operational product described here
Sideris IV, Foresti L, Nerini D, Germann U. NowPrecip: localized precipitation nowcasting in the complex terrain of Switzerland. QJR Meteorol Soc. 2020; 146: 1768–1800. https://doi.org/10.1002/qj.3766
Nerini, D., L. Foresti, D. Leuenberger, S. Robert, and U. Germann, 2019: A Reduced-Space Ensemble Kalman Filter Approach for Flow-Dependent Integration of Radar Extrapolation Nowcasts and NWP Precipitation Ensembles. Mon. Wea. Rev., 147, 987–1006, https://doi.org/10.1175/MWR-D-18-0258.1.
138: Table 1
Six events in a catchment are rather ambitious to assess “BARRIERS TO OPERATIONAL FLOOD FORECASTING IN COMPLEX TERRAIN”. The title should reflect more realistically the contribution.
156 – 160:
So, QPF are available with three hours lead time and you need them 6 hours in advance?
169 – 176: This seems to me a “brute force” blending of QPF and PFF.
A very advanced system on this is INCA
Haiden, T., A. Kann, C. Wittmann, G. Pistotnik, B. Bica, and C. Gruber, 2011: The Integrated Nowcasting through Comprehensive Analysis (INCA) System and Its Validation over the Eastern Alpine Region. Wea. Forecasting, 26, 166–183, https://doi.org/10.1175/2010WAF2222451.1.
You can also find some efforts to extrapolate consistent weather radar fields here:
Germann, U., Berenguer, M., Sempere-Torres, D., & Zappa, M. (2009). REAL - ensemble radar precipitation estimation for hydrology in a mountainous region. Quarterly Journal of the Royal Meteorological Society, 135(639), 445-456. https://doi.org/10.1002/qj.375
196 – Section 2.5
I do not understand why you willingly decide to invalidate your calibration by omitting to define an independent validation period. On lines 378-379 you even declare “We also did not perform a separate streamflow validation against observed gauges for any other period than in calibration (2015-2020)”. All your events are in the calibration period. This weakens every finding you obtain from your data.
266 – Section 2.8
The used metrics are focussed on deterministic outcomes. There is no added value in using an ensemble and in the end evaluating the ensemble mean. Any of the members might have a different timing of the peak, and you just average it out with the ensemble mean. A good starting source of metrics for probabilistic forecasts is found here:
Brown, J. D., Demargne, J., Seo, D. J., & Liu, Y. (2010). The Ensemble Verification System (EVS): A software tool for verifying ensemble forecasts of hydrometeorological and hydrologic variables at discrete locations. Environmental Modelling & Software, 25(7), 854-872.
And here
https://link.springer.com/referencework/10.1007/978-3-642-40457-3
300-301: From 2006:
Germann, U., Galli, G., Boscacci, M. and Bolliger, M. (2006), Radar precipitation measurement in a mountainous region. Q.J.R. Meteorol. Soc., 132: 1669-1692. https://doi.org/10.1256/qj.05.190
318-319: In my experience, “overshooting” is linked to hail cells.
355-360: Why choose to operate an ensemble and ignore it in the assessment of forecast quality?
360: Section 3.3 is interesting, but it would be more valuable to learn about validation results than to be guided through the outcomes of calibration.
382: Figure 7 is purely focussed on the low-flow tail of the flow duration curve. As you are presenting a flood-related study, a focus on the flood tail would be more interesting.
398: You write: “This reveals a certain level of uncertainty to simulate extreme flood events even when they were part of the calibration process that could be attributed to hydrologic heterogeneity.” It is basically a flaw to have evaluations for events lying within the calibration period. Quantifying this “certain” level of uncertainty should be the goal of such a study.
412: Figure 8. Instead of putting 20 grey lines in the legend you could put one and call it “ensemble members” (as generally done in this kind of work).
413: I fail to find Tables 6 and 7. They should have shown the evaluation of selected members. This should actually be avoided and replaced by an evaluation of the full ensemble, as is state-of-the-art in probabilistic forecasting.
421: Section 3.5 is highly interesting. I like the step from flood forecast to inundation maps and the efforts in assessing the spatial skill. I also like the use of the HAND approach for such a task. As before, a pity that you just use the ensemble mean.
Final considerations:
As noted several times in my previous comments, this paper would have been a major contribution to the community about 15 years ago. As it stands now, it fails to make a state-of-the-art study out of the available data. The major problems are the missing validation of the hydrological model, the lack of any effort to provide a probabilistic assessment of the results, and the too small number of events and catchments to come up with robust findings on the barriers to flood forecasting in mountainous areas.
Citation: https://doi.org/10.5194/egusphere-2023-2088-RC3
AC3: 'Reply on RC3', Luiz Bacelar, 06 Mar 2024
We thank the reviewers for their time and helpful comments. In this response, we first provide a series of overview responses to major questions and concerns raised by the reviewers. After that initial overview, we provide a response to each reviewer's comment and discuss how we will address each one in the revised manuscript. The reviewers' comments are shown in blue italics, while the author responses are shown in unformatted text.
- Title clarity and study relevance to the state-of-the-art:
We agree that the current title can be misleading. The title could better convey that this is a case study rather than a broader review, and we will refine it in the revised manuscript. That being said, we would argue that catchment studies are still relevant for improving the predictability of natural hazards, especially within frameworks similar to operational flood forecasting systems (i.e., WRF-Hydro and the National Water Model, NWM). The NWM framework was taken as an example of the state of the art in operational flood forecasting systems for its robust operational capability to provide high-resolution streamflow forecasting (2.7 million river reaches) across a continental extent in a computationally efficient manner in real time. Our independent experiment mirrors key operational components of the NWM framework, including QPE, hydrological model calibration, and flood mapping methodology. Conducting localized studies is essential to uncover potential errors in large-scale evaluations of such frameworks. For instance, recent studies on NWM forecast accuracy (Johnson et al., 2023a, b) did not explore the impact of ensemble predictions on flood mapping or the influence of short-term ensemble precipitation or streamflow forecasts, given their non-operational status. Our motivation was to anticipate how increasing spatial resolution (1 m HAND) may affect short-term flood mapping forecasts, contrasting it with the current state of the art in near-real-time, high-resolution, computationally efficient flood maps applicable at continental extents (10 m HAND in the NWM). We will include more information about the NWM in the introduction section, as well as the points at which our forecast chain is similar to the operational version. Our analysis goes beyond assessing the uncertainty in flash flood mapping itself; it delves into the forecast chain, covering streamflow prediction, calibration, short-term rainfall prediction (HRRR), and the use of observed gridded rainfall (MRMS).
The operational HRRR QPF ensemble only became available at the end of 2020 (https://rapidrefresh.noaa.gov/hrrr/). The deterministic HRRR has been used by the NWM to provide its deterministic flood mapping prediction. For this reason, we chose to evaluate how a simple geostatistical methodology for generating rainfall ensembles from the deterministic short-term HRRR could impact flood mapping uncertainty during the 2016 flash flood event in West Virginia. It is important to clarify that our paper does not solely present a methodology for bias-correcting HRRR predictions through a straightforward geostatistical approach; rather, it aims to evaluate its consequential impacts on streamflow and flood mapping predictions within short lead times.
We acknowledge that more sophisticated QPF ensemble methodologies are available in the literature and could potentially increase our accuracy for the second- and third-hour HRRR ensemble forecasts, as we mentioned in the introduction (lines 99-103) and conclusions (line 560). We understand that different rainfall ensemble methodologies could impact the statistical properties of the forecasted rainfall ensemble and influence, for example, the spread of streamflow and flooded areas. But we contend that our key results, for example that the uncertainty of flood mapping increases with a higher-resolution DEM, will remain mostly unaffected by changes in the QPF ensemble methodology. The paper brings the insight that the HAND flood mapping methodology (adopted in operational flood forecast systems) would, at higher resolution, be more sensitive to the spread of streamflow predictions, and it reveals a potential gap in the transferability of hydraulic properties in a complex-terrain floodplain. We acknowledge that validating this hypothesis requires testing the same forecast chain in various mountainous regions where a 1 m lidar DEM is available.
Furthermore, we assert that a comprehensive analysis of multiple components within a flash flood forecast chain in a single study, ranging from hydrological model calibration and precipitation forecasting to flood mapping over the floodplain, remains infrequent for short lead times and very high spatial resolutions due to data availability constraints. Such an approach proves invaluable for retro-analyzing deadly natural hazards and refining the operational capabilities of flash flood prediction. Unlike numerous flash flood events, the observed flood height for the 2016 event in West Virginia was collected at field scale by the USGS and is publicly accessible. Leveraging this dataset enabled us to demonstrate that our total forecasted ensemble spread (Figure 9, denoted by the black range over the bar plots of the ensemble mean) reduced the False Alarm Ratio (FAR) in flooded areas compared to the map simulated with MRMS QPE (no prediction) and the deterministic HRRR simulation.
Finally, we would like to emphasize how our study contributes to the flash flood literature by examining two components not yet implemented in the NWM, with potential inclusion in future versions: ensemble streamflow forecasts at short lead times through a QPF ensemble, and higher-resolution (1 m HAND) flood mapping over floodplain areas. We demonstrated that operational ensembles would be necessary to reveal the uncertainties not only in WRF-Hydro streamflow prediction but also in the final HAND flood mapping methodology. Those results have not yet been demonstrated using the HAND methodology in a forecast chain (i.e., QPE, hydrological model, calibration methodology, and deterministic QPF) similar to the NWM. We will clarify the intention of our contributions more explicitly in the introduction section of the revised manuscript.
- The use of metrics for ensemble evaluation
In Section 2.8, we highlighted the importance of ensemble metrics to showcase the accuracy, reliability, sharpness, and skill of probabilistic forecasts. We acknowledge that the metrics we initially presented solely focused on evaluating forecast accuracy through the ensemble mean. We commit to addressing this limitation by incorporating additional metrics, including the CRPSS and others, to comprehensively analyze the reliability and skill of both the rainfall and streamflow ensembles. Furthermore, we recognize the value of calculating a new metric for the evaluation of flood mapping in Section 3.5, the Fractions Skill Score (FSS), as suggested by reviewer 2. We fully agree that incorporating these additional ensemble metrics will enhance the overall interpretation of our results.
- Calibration methodology and number of events.
As part of the forecast chain methodology, we performed the WRF-Hydro calibration and validation over 2015-2020 to keep consistency with MRMS QPE availability, which entered operation in 2015. We will clarify that Figure 7 presents the statistics for the calibration (05/2015-10/2018) and validation (11/2018-2020) periods together, and we will present them separately in the next round of revisions. We also used the NLDAS precipitation dataset to spin up the model (2013-2015), before our calibration and validation periods. This approach aimed to maintain consistency and account for potential biases in the QPE, as discussed in Section 3.1, throughout our parameter search process. To avoid overlapping with the NWM calibration, we sought collaboration with the NWM team, obtaining their Noah-MP LSM parameters as an initial guess for our experiments (figures in the supplementary material). This collaboration was motivated by the shared use of similar Noah-MP land surface parametrizations between our study and the NWM. It is worth mentioning that the NWM was calibrated with a different configuration and period of input data than our study.
Keeping in mind the whole forecast chain, our calibration methodology primarily focused on ensuring the accurate simulation of runoff and streamflow for the flood mapping process. Incorporating extreme events into the hydrologic-hydraulic model calibration is a common practice in flood mapping studies, particularly to optimize parameters such as channel roughness in the routing parametrization, which significantly influences flood wave and water height estimations. A similar approach was noted in a European flash flood mapping study by Hocini et al. 2021, emphasizing event-specific calibration against discharge observations to minimize errors in peak discharges used for hydrologic-hydraulic simulations.
In our case, as outlined in Section 2.5, the calibration methodology included targeting the diffusive wave river routing parameters within WRF-Hydro. We demonstrated the accuracy of the calibration in generating ensembles for the other events, not only for the 2016 flash flood, and to validate our calibration we used different streamflow gauges along the Greenbrier River. Although our calibration methodology used only one streamflow gauge downstream on the Greenbrier River (Hilldale), we demonstrated that during the 2016 flash flood event this validation presented an NSE of 0.82 and a Pearson correlation (MRMS QPE) of 0.93 at the Buckeye station (Table 4), the station upstream of Howard Creek. As noted at line 540, we acknowledge the inability to validate our calibration for simulating the outlet of Howard Creek due to the absence of a USGS gauge during the 2016 flash flood event. In response to a reviewer's suggestion, we commit to incorporating the Howard Creek location into Figure 1, as it is currently presented only in Figure 9.
We will enhance the clarity of the manuscript by providing a more detailed description of the methodology. Additionally, we recognize the need to incorporate concrete examples in the introduction section and offer a more detailed description of the conclusions, as suggested by reviewer 1. Addressing the minor technical and grammatical points highlighted by reviewer 2 is also a priority for us. Furthermore, we acknowledge the importance of including more European studies in the references, as recommended by reviewer 3. As mentioned previously, we are committed to calculating additional ensemble metrics to further enrich the analysis.
Point-by-point response to reviewers' comments
The rest of the response addresses each reviewer's comment; the reviewer comment is shown in blue italics, while the author responses are in unformatted black text.
Reviewer 3
General remarks:
This might be an interesting study reviving the large efforts that were made on the topic in Europe in the 2010s. The provision of useful forecasts for flood management at short lead times is a challenge in complex topographic areas, and examples of how to deal with it are indeed of interest for the community. While single-catchment studies might have been state-of-the-art some years ago, we are now in the era of large-sample and multi-model experiments. From this point of view, this study surely belongs to the category of local case study.
Thank you for this feedback; we will change the title to make it clear this is more of a case study. As we discussed in our key answer point 1), we still believe catchment studies are relevant, especially when following a framework similar to the NWM, which we consider the state of the art in operational flood forecasting for its spatial and temporal resolution in short-term forecasts.
As a reader and contributor to the topic in the past, I am honestly disappointed, that almost none of the past European studies on the topic was found deserving to be listed either in the introduction or in the discussion.
Thank you for this honest feedback. This was our mistake, and it will be addressed in the revised manuscript. Both the introduction and the discussion will provide a much broader overview of the literature and directly address how this study relates to those results.
Another major issue is the claim of being probabilistic while ending up evaluating an ensemble mean. It has been a long time since ensemble experiments were evaluated in that way.
Thank you. We agree that this was an unnecessary mistake on our part, as we have the data from the ensemble and thus can provide a much better and thorough evaluation. We will address this by more comprehensively evaluating the full ensemble as discussed in our key answer points 2) at the beginning of this document.
If you want to contribute to the research in this community, you should first better analyse what the state of the art is. Even if I really like the final step to the inundation mapping, in this shape the contribution is an interesting experiment but no advance in the field.
We discussed our view of the study's relevance for the field in our key answer 1). We hope that including more statistics for the ensemble evaluation will bring a greater contribution to the field, especially in the final step to the inundation mapping. Furthermore, a more exhaustive literature review will be used to provide more context for this study as well as the larger issues that we can address.
Major issues:
Introduction:
20 – 118: I urge you to browse the research conducted in the European mountains around the 2010s. You will find several applications of the methods you use. Staying in HESS, you might find a good start in:
Liechti, K., Panziera, L., Germann, U., and Zappa, M.: The potential of radar-based ensemble forecasts for flash-flood early warning in the southern Swiss Alps, Hydrol. Earth Syst. Sci., 17, 3853–3869, https://doi.org/10.5194/hess-17-3853-2013, 2013.
And
Poletti, M. L., Silvestro, F., Davolio, S., Pignone, F., and Rebora, N.: Using nowcasting technique and data assimilation in a meteorological model to improve very short range hydrological forecasts, Hydrol. Earth Syst. Sci., 23, 3823–3841, https://doi.org/10.5194/hess-23-3823-2019, 2019.
And
Silvestro, F., N. Rebora, and L. Ferraris, 2011: Quantitative Flood Forecasting on Small- and Medium-Sized Basins: A Probabilistic Approach for Operational Purposes. J. Hydrometeor., 12, 1432–1446, https://doi.org/10.1175/JHM-D-10-05022.1.
Or in the US:
Gourley, J. J., and Coauthors, 2017: The FLASH Project: Improving the Tools for Flash Flood Monitoring and Prediction across the United States. Bull. Amer. Meteor. Soc., 98, 361–372, https://doi.org/10.1175/BAMS-D-15-00247.1.
A comprehensive review can be found in
Rossa, A., Liechti, K., Zappa, M., Bruen, M., Germann, U., Haase, G., … Krahe, P. (2011). The COST 731 Action: a review on uncertainty propagation in advanced hydro-meteorological forecast systems. Atmospheric Research, 100(2-3), 150-167. https://doi.org/10.1016/j.atmosres.2010.11.016
There is a whole special Issue that might deserve your interest
https://www.sciencedirect.com/journal/atmospheric-research/vol/100/issue/2
A summary of an operational demonstration can be found in
Rotach, M. W., Arpagaus, M., Dorninger, M., Hegg, C., Montani, A., and Ranzi, R.: Uncertainty propagation for flood forecasting in the Alps: different views and impacts from MAP D-PHASE, Nat. Hazards Earth Syst. Sci., 12, 2439–2448, https://doi.org/10.5194/nhess-12-2439-2012, 2012.
For the blending of QPE and QPF there is an operational product described here
Sideris IV, Foresti L, Nerini D, Germann U. NowPrecip: localized precipitation nowcasting in the complex terrain of Switzerland. QJR Meteorol Soc. 2020; 146: 1768–1800. https://doi.org/10.1002/qj.3766
Nerini, D., L. Foresti, D. Leuenberger, S. Robert, and U. Germann, 2019: A Reduced-Space Ensemble Kalman Filter Approach for Flow-Dependent Integration of Radar Extrapolation Nowcasts and NWP Precipitation Ensembles. Mon. Wea. Rev., 147, 987–1006, https://doi.org/10.1175/MWR-D-18-0258.1.
Thank you. We will include some of the references suggested by the reviewer and search for more European references.
138: Table 1
Six events in a catchment are a rather ambitious basis to assess “BARRIERS TO OPERATIONAL FLOOD FORECASTING IN COMPLEX TERRAIN”. The title should reflect the contribution more realistically.
Thank you for this honest feedback. As we discussed in key answer point 1 at the beginning of this document, we will change the title to better describe a case study. We used the six events to evaluate our calibration methodology, as explained further in key answer point 3). But yes, we agree that it was presumptuous to use such a general statement in the title. This was a mistake that will be corrected in the revised manuscript.
156 – 160:
So, QPF are available with a three-hour lead time and you need them 6 hours in advance?
We intended to emphasize a medium range in which a flash flood forecast could be more useful (i.e., a minimum of 3 hours in advance of the event). The focus of our study was to demonstrate the uncertainties within the 3-hour lead time.
169 – 176: This seems to me a “brute force” blending of QPF and QPE.
A very advanced system for this is INCA:
Haiden, T., A. Kann, C. Wittmann, G. Pistotnik, B. Bica, and C. Gruber, 2011: The Integrated Nowcasting through Comprehensive Analysis (INCA) System and Its Validation over the Eastern Alpine Region. Wea. Forecasting, 26, 166–183, https://doi.org/10.1175/2010WAF2222451.1.
You can also find some efforts to extrapolate consistent weather radar fields here:
Germann, U., Berenguer, M., Sempere-Torres, D., & Zappa, M. (2009). REAL - ensemble radar precipitation estimation for hydrology in a mountainous region. Quarterly Journal of the Royal Meteorological Society, 135(639), 445-456. https://doi.org/10.1002/qj.375
We agree that more sophisticated nowcasting methodologies are available in the literature, and we justified our choice in key answer point 1). We will include in the discussion more comparisons among QPF ensemble methodologies. We will clarify that the intention of our QPF ensemble methodology was to bias-correct the HRRR forecast using the persistence of the MRMS spatial rainfall structure, which has limitations; we will address those limitations in the discussion section (a minimal sketch of this idea is given below). We hope that including new ensemble metrics (key answer point 2) to investigate the reliability of the ensemble will help demonstrate its overall performance.
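To make the intended procedure concrete, here is a minimal sketch of such a persistence-based bias correction, assuming hypothetical array names (hrrr_qpf, mrms_qpe, hrrr_hindcast) and simple lognormal perturbations; it illustrates the concept only and is not the manuscript's exact algorithm:

```python
import numpy as np

def blend_qpf_ensemble(hrrr_qpf, mrms_qpe, hrrr_hindcast, n_members=20,
                       noise_std=0.25, eps=0.1, seed=42):
    """Sketch: bias-correct an HRRR QPF grid with the latest MRMS QPE
    field and perturb the correction to form an ensemble.

    hrrr_qpf      : 2D array, forecast rainfall for the coming hour [mm]
    mrms_qpe      : 2D array, observed rainfall for the latest hour [mm]
    hrrr_hindcast : 2D array, HRRR rainfall valid at the MRMS hour [mm]
    """
    rng = np.random.default_rng(seed)
    # Multiplicative bias of the latest HRRR field against MRMS,
    # assumed to persist into the next forecast hour.
    bias = (mrms_qpe + eps) / (hrrr_hindcast + eps)
    bias = np.clip(bias, 0.2, 5.0)  # guard against extreme ratios
    members = []
    for _ in range(n_members):
        # Lognormal perturbation around the persisted bias field
        perturbation = rng.lognormal(mean=0.0, sigma=noise_std,
                                     size=hrrr_qpf.shape)
        members.append(np.maximum(hrrr_qpf * bias * perturbation, 0.0))
    return np.stack(members)  # shape: (n_members, ny, nx)
```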
196 – Section 2.5
I do not understand why you willingly decide to invalidate your calibration by omitting to define an independent validation period. On lines 378-379 you even declare “We also did not perform a separate streamflow validation against observed gauges for any other period than in calibration (2015-2020)”. All your events are in the calibration period. This weakens every finding you obtain from your data.
We will clarify in the manuscript that the statistics in Figure 8 result from the calibration period (2015-2018) and the validation period (2018-2020). We will present those statistics separately in an improved version of the manuscript, and we acknowledge this was a mistake on our part. Still, all the events analyzed are in the calibration period. We provided references in key answer point 3) showing that using the extreme event as part of the calibration process is a common practice for flood mapping. Our intention was not to demonstrate a calibration methodology better suited than the NWM for such extreme events, but to ensure that our model performed satisfactorily in simulating runoff and streamflow for the flood mapping methodology. We noticed that the model performance in a hypothetical near-real-time scenario (using HRRR QPF as the rainfall prediction) can also differ depending on the rainfall event. As suggested by reviewer 2, we will include more information about the initial conditions prior to each event, which will allow us to present more findings about each event's initial characteristics.
266 – Section 2.8
The metrics used are focused on deterministic outcomes. There is no added value in using an ensemble and then evaluating only the ensemble mean. Each member might have a different timing of the peak, and you simply average this out with the ensemble mean. A good source of metrics for probabilistic forecasts is found here:
Brown, J. D., Demargne, J., Seo, D. J., & Liu, Y. (2010). The Ensemble Verification System (EVS): A software tool for verifying ensemble forecasts of hydrometeorological and hydrologic variables at discrete locations. Environmental Modelling & Software, 25(7), 854-872.
And here
https://link.springer.com/referencework/10.1007/978-3-642-40457-3
300-301: From 2006:
Germann, U., Galli, G., Boscacci, M. and Bolliger, M. (2006), Radar precipitation measurement in a mountainous region. Q.J.R. Meteorol. Soc., 132: 1669-1692. https://doi.org/10.1256/qj.05.190
As we discussed in key answer point 2), we will include more metrics, such as the CRPSS and ROC, for rainfall and streamflow verification; a minimal sketch of the CRPS computation we have in mind follows below.
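As an illustration of the kind of ensemble verification we plan, here is a minimal sketch of the sample-based CRPS and the corresponding skill score; the function and variable names are ours for illustration, not from the manuscript:

```python
import numpy as np

def crps_ensemble(members, obs):
    """Sample-based CRPS for one forecast/observation pair.
    members: 1D array of ensemble values; obs: scalar observation.
    Lower is better; 0 indicates a perfect, sharp forecast."""
    members = np.asarray(members, dtype=float)
    term_obs = np.mean(np.abs(members - obs))
    term_spread = 0.5 * np.mean(np.abs(members[:, None] - members[None, :]))
    return term_obs - term_spread

def crpss(crps_forecast, crps_reference):
    """CRPS skill score against a reference forecast (e.g., climatology
    or persistence); 1 is perfect, 0 means no skill over the reference."""
    return 1.0 - crps_forecast / crps_reference
```

Averaging crps_ensemble over all verification times for both the ensemble and the reference, and then applying crpss, gives a single skill value per gauge or grid cell.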
318-319: In my experience, “overshooting” is linked to hail cells.
We can look for more information about hail cells in this event.
355-360: Why choose to operate an ensemble and then ignore it in the assessment of forecast quality?
As noted above and in key answer point 2), we will include more metrics, such as the CRPSS and ROC, for rainfall and streamflow verification.
360: Section 3.3 is interesting, but it would be more valuable to learn about validation results than to be guided through the outcomes of calibration.
We will clarify in the manuscript that the statistics in Figure 8 result from the calibration period (2015-2018) and the validation period (2018-2020) for one gauge only. We will present those statistics separately in an improved version of the manuscript.
382: Figure 7 is purely focused on the low-flow tail of the flow duration curve. As you are presenting a flood-related study, a focus on the flood tail would be more interesting.
We will calculate more statistics on the rise time of the hydrographs to their peaks and focus on the flood tail of the flow duration curve; a brief sketch of these computations is given below.
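As a brief sketch of the statistics mentioned above (hypothetical function names, with simple definitions assumed for illustration):

```python
import numpy as np

def rise_time_hours(q, dt_hours=1.0):
    """Time from the start of the rising limb (local minimum preceding
    the peak) to the hydrograph peak. q: 1D discharge series."""
    i_peak = int(np.argmax(q))
    i_start = int(np.argmin(q[:i_peak + 1]))  # crude rising-limb start
    return (i_peak - i_start) * dt_hours

def flood_tail(q, exceedance=0.05):
    """Flows exceeded less than `exceedance` of the time, i.e. the
    flood tail of the flow duration curve."""
    q_sorted = np.sort(np.asarray(q, dtype=float))
    return q_sorted[int(len(q_sorted) * (1.0 - exceedance)):]
```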
398: You write: “This reveals a certain level of uncertainty to simulate extreme flood events even when they were part of the calibration process that could be attributed to hydrologic heterogeneity.” It is basically a flaw to have evaluations for events that lie within the calibration period. Quantifying that “certain” level of uncertainty should be the goal of such a study.
We will clarify in this statement that we designed our calibration experiment to achieve a satisfactory streamflow performance for the flood mapping methodology. But we agree that this could be one more aspect explored in the paper. To address more levels of uncertainty, we will include information about the initial conditions of each event (e.g., antecedent soil moisture conditions) to contribute to the analysis of hydrologic heterogeneity, as suggested by reviewer 2.
412: Figure 8. Instead of putting 20 grey lines in the legend, you could put one and call it “ensemble members” (as is generally done in this kind of work).
We will consider that for the next round of revisions, thank you.
413: I fail to find Tables 6 and 7. They should have shown the evaluation of selected members. This should actually be avoided and replaced by an evaluation of the full ensemble, as is state-of-the-art in probabilistic forecasting.
This was a table cross-referencing issue in Overleaf; the ensemble mean evaluation is presented in Table 5. We will correct the links. We will also include more statistics, such as the CRPSS, for the streamflow ensemble, as mentioned above.
421: Section 3.5 is highly interesting. I like the step from flood forecast to inundation maps and the efforts in assessing the spatial skill. I also like the use of the HAND approach for such a task. As before, it is a pity that you just use the ensemble mean.
We will also calculate more ensemble metrics for the final flood map, as discussed in key answer point 2); a minimal sketch of how HAND depths and ensemble agreement could be derived is given below. Following the suggestion of reviewer 1, we will also add more detail about our flood mapping methodology and include water level plots along a longitudinal cross-section of the floodplain to compare across spatial resolutions.
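For context, here is a minimal sketch of how HAND-based inundation depths and an ensemble agreement map could be derived (assumed variable names; the manuscript's implementation may differ):

```python
import numpy as np

def hand_flood_ensemble(hand, stages):
    """Sketch of HAND inundation for an ensemble of forecast stages.

    hand   : 2D array, height above nearest drainage per cell [m]
    stages : 1D array, forecast water stage for the reach,
             one value per ensemble member [m]
    Returns per-member depth grids and the fraction of members
    flooding each cell (an exceedance-probability map)."""
    depths = np.maximum(stages[:, None, None] - hand[None, :, :], 0.0)
    flood_fraction = np.mean(depths > 0.0, axis=0)
    return depths, flood_fraction
```

Mapping flood_fraction instead of a single ensemble-mean extent would show where the members agree and where they diverge across spatial resolutions.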
We thank reviewer 3 for all the feedback to improve our paper.
REFERENCES
Hocini, N., Payrastre, O., Bourgin, F., Gaume, E., Davy, P., Lague, D., Poinsignon, L., and Pons, F.: Performance of automated methods for flash flood inundation mapping: a comparison of a digital terrain model (DTM) filling and two hydrodynamic methods, Hydrol. Earth Syst. Sci., 25, 2979–2995, https://doi.org/10.5194/hess-25-2979-2021, 2021.
Johnson, J. M., Blodgett, D. L., Clarke, K. C., et al.: Restructuring and serving web-accessible streamflow data from the NOAA National Water Model historic simulations, Sci. Data, 10, 725, https://doi.org/10.1038/s41597-023-02316-7, 2023a.
Johnson, J. M., Fang, S., Sankarasubramanian, A., Rad, A. M., Kindl da Cunha, L., Jennings, K. S., et al.: Comprehensive analysis of the NOAA National Water Model: A call for heterogeneous formulations and diagnostic model selection, J. Geophys. Res. Atmos., 128, e2023JD038534, https://doi.org/10.1029/2023JD038534, 2023b.
Citation: https://doi.org/10.5194/egusphere-2023-2088-AC3