the Creative Commons Attribution 4.0 License.
Can we reliably estimate precipitation with high resolution during disastrously large floods?
Abstract. A huge and dangerous flood occurred in September 2024 in the upper and middle Odra river basin, including mountainous areas in south-western Poland. The widespread precipitation lasted about four days, reaching more than 200 mm daily. To verify how precisely the precipitation field can be estimated, different measurement techniques were analysed: rain gauge data, weather radar-based, satellite-based, non-conventional (CML-based) and multi-source estimates. Apart from real-time and near-real-time data, later-available reanalyses based on satellite information (IMERG, PDIR-Now) and numerical mesoscale model simulations (ERA5, WRF) were also examined. Manual rain gauge data for daily accumulations and multi-source RainGRS estimates for hourly accumulations were used as references to evaluate the reliability of the various techniques for measuring and estimating precipitation accumulations. Statistical analyses and visual comparisons were carried out. Among the data available in real time, the best results were found for rain gauge measurements, radar data adjusted to rain gauges, and RainGRS estimates. Fairly good reliability was achieved by non-conventional CML-based measurements. In terms of offline reanalyses, mesoscale model simulations also demonstrated reasonably good agreement with the reference precipitation, while poorer results were obtained by all satellite-based estimates except IMERG.
Status: closed
RC1: 'Comment on egusphere-2025-1863', Nicolas Velasquez, 03 Jul 2025
Publisher’s note: the supplement to this comment was edited on 7 July 2025. The adjustments were minor without effect on the scientific meaning.
- AC1: 'Reply on RC1', Jan Szturc, 01 Aug 2025
RC2: 'Comment on egusphere-2025-1863', Anonymous Referee #2, 08 Jul 2025
Review of the paper
Can we reliably estimate precipitation with high resolution during disastrously large floods?
by
Jan Szturc, Anna Jurczyk, Katarzyna Osródka, Agnieszka Kurcz, Magdalena Szaton, Mariusz Figurski, Robert Pyrc
General
The authors analyse different precipitation data based on measurements from different sources as well as numerical weather prediction (NWP) model data. The measurement data include rain gauges, weather radar, satellite data and commercial microwave link (CML) data, or products based on these data. The precipitation data are analysed for an extreme precipitation event in the Odra river catchment which occurred during four days in September 2024.
The paper presents the different results in a clear manner and discusses the advantages and disadvantages of the different measurement/modelling principles in detail.
In particular, limitations of satellite and NWP products are demonstrated, while rain gauge and radar-based products perform the best.
Detailed discussion items
Line 124 ff: Reference for the analysis:
The authors discuss the relevance of a reference data set that is independent of the other data sets. Unfortunately, they later include precipitation products that are not independent of the reference data set.
- While for daily values the manual gauges are selected as reference - the best possible choice from the available data - the post-processed GRS Clim product, which is adjusted based on these reference station data, enters into the investigated methods. This should be avoided because it diverts the reader's attention from the relevant data to be compared.
- Concerning hourly values, the choice of an independent reference of high quality is not solvable with the existing precipitation products. Therefore, the choice made by the authors is understandable, in particular since they are pinpointing this methodological weakness.
Line 379ff: Selected metrics
When comparing gridded data to point data at the ground, location uncertainties may arise because rainfall observed at a certain height (or from space) does not necessarily fall at the point that is considered when overlaying grids and points. Furthermore, uncertainties from comparing a grid area average to a point value occur in particular in heavy rain - deviations in the order of 20% have been observed (Schellart et al., 2017). Can you please add a short section on how you take such uncertainties into account or, alternatively, which range of values has to be considered reliable?
Do the selected metrics show well the effects that you are most interested in, i.e. the best estimate for extreme intensities and also for the cumulated sums? Squared error indices tend to heavily penalize individual outliers, which may be one effect that you are after, but please discuss this aspect.
Line 553: Do you want to say that interpolated gauges are more reliable than adjusted radar data? Then you contradict yourself, because earlier you said that interpolated station data underestimate the true values.
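The outlier sensitivity mentioned above can be seen in a tiny numerical sketch (the error values are made up purely for illustration): a single large error dominates the RMSE while the MAE stays close to the typical error magnitude.

```python
import numpy as np

# Errors of a hypothetical estimator: mostly small, with one large outlier (mm).
errors = np.array([1.0, -2.0, 1.5, -1.0, 0.5, 30.0])

mae = np.abs(errors).mean()
rmse = np.sqrt((errors**2).mean())
print(f"MAE:  {mae:.1f} mm")   # 6.0 mm, close to the typical error
print(f"RMSE: {rmse:.1f} mm")  # 12.3 mm, dominated by the single outlier
```

Whether this penalization is desired depends on whether the focus is on extreme intensities or on accumulated sums, which is exactly the point the authors should discuss.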
Formal aspects
Line 14: please replace "... 200 mm daily" by "... 200 mm on one day at one rain gauge location"
Line 71ff: please give the explanation for each abbreviation before using it (RLAN, GPM, NOAA, MetOp, GAU, etc.)!
Line 191: please start the sentence with "In many locations, the daily precipitation ..." - the values in the tables given later suggest that this formulation is more precise.
Line 366: please rephrase to something like "... which we consider to be the most reliable values."
Line 479: according to Table 2, the Bias is -3.8 mm (not -3.6 mm) - one of the two should be corrected ...
Technical points
Lines 274-278: This explanation should be clarified - which other approaches were tested before selecting the final method, and in what way is it different from the others? Please also refer to the results from the COST OpenSense Action!
Line 309: You are writing "closest to reality" - however, this is correct for one point and is of limited value for areas. Please emphasize it here again, although you mentioned this earlier already.
Lines 329 - 331:
- "... satellite data as a base line and intercalibrates." What is intercalibrated here?
- "... other observations with international satellite constellation ..." Please note that GPM, as the Global Precipitation Measurement mission, is heavily based on satellite-based weather radars. The chosen formulation suggests that GPM does not include radar and that these data need to be retrieved from other sources.
Lines 347-348: How can you analyse short-lived phenomena if your resolution is not sufficient for convective cells? Please explain!
Lines 365 to 378: Do I understand correctly that the daily analysis relies on 112 data points (= all manual stations in the area) and the hourly analysis on statistics calculated from 44218 pixels? If so, please add the numbers here for better understanding!
Lines 411-412: What is the influence of data from the Czech territory? I do not understand.
Line 413: RAD data product: please eliminate the discussion of unadjusted radar data - otherwise readers may think that they can work with such data. Rather, a warning would be adequate never to use unadjusted radar data for any quantitative purpose, maybe with a reference to the WMO Operational Weather Radar Best Practice Guidance (WMO document no. 1257 - https://library.wmo.int/records/item/68834-guide-to-operational-weather-radar-best-practices?offset=5).
References:
- Schellart, A.N.A., Wang, L. and Onof, C. (2017): High resolution rainfall measurement and analysis in a small urban catchment. 9th International Workshop on Precipitation in Urban Areas: Urban Challenges in Rainfall Analysis, UrbanRain 2012, pp. 115-120.
Citation: https://doi.org/10.5194/egusphere-2025-1863-RC2
- AC3: 'Reply on RC2', Jan Szturc, 01 Aug 2025
RC3: 'Comment on egusphere-2025-1863', Anonymous Referee #3, 11 Jul 2025
General comments
The paper is well written, containing a good introduction for readers also outside the meteorological and hydrological community. The topic – uncertainties in predicting intensive flooding – is also very relevant in a modern society still sensitive to weather conditions. The geographical scope of this study was limited to Poland, but it can be expected that the results are applicable in southern and northern climates as well, especially in mountainous areas.
Specific comments
The experiments section involved verification of products that depend on inputs that are used as ground truth in the comparisons. This was clearly pointed out throughout the article (L377, L426, L485, L548, L556, L582, L621), which is of course appreciated. Nevertheless, interpreting the verification is problematic in these cases. On the one hand, for example, one could easily think of a multi-input algorithm design where the output is asymptotically forced to match the point measurements used also as reference – yielding zero error. Or, if the values do vary, it would be good to know the motivations in the design (for example, avoiding overfitting). Nevertheless, such evaluations carry little or no information in my opinion. On the other hand, experts in this area know the challenge of evaluating input sources (measurement technologies), none of which is perfect (L95). It is operationally tempting – if not inevitable – to combine inputs of various types (L123). But as to verification of performance, one should try to use reference data that are as independent as possible. Would it be easy to use some kind of cross-validation, dropping some ground observations (in turn) from the input, and to verify the results against those? This could, however, require more computational effort.
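The cross-validation suggested above could be sketched, for instance, as a leave-one-out loop over the gauge stations: each gauge is dropped in turn, a gauge-only field is re-estimated from the remaining stations, and the estimate is compared against the withheld value. The inverse-distance-weighting interpolator and the synthetic gauge network below are illustrative assumptions only, not the authors' actual interpolation scheme.

```python
import numpy as np

def idw_interpolate(xy_obs, z_obs, xy_target, power=2.0):
    """Inverse-distance-weighted estimate at target points from observations."""
    d = np.linalg.norm(xy_obs[None, :, :] - xy_target[:, None, :], axis=2)
    d = np.maximum(d, 1e-9)            # avoid division by zero at gauge sites
    w = 1.0 / d**power
    return (w @ z_obs) / w.sum(axis=1)

def loo_cross_validation(xy, z):
    """Leave each gauge out in turn, estimate it from the rest, return errors."""
    errors = np.empty(len(z))
    for i in range(len(z)):
        mask = np.arange(len(z)) != i
        est = idw_interpolate(xy[mask], z[mask], xy[i:i + 1])[0]
        errors[i] = est - z[i]
    return errors

# Synthetic gauge network: 50 stations with a spatial trend plus noise.
rng = np.random.default_rng(0)
xy = rng.uniform(0, 100, size=(50, 2))             # gauge coordinates (km)
z = 50 + 0.5 * xy[:, 0] + rng.normal(0, 5, 50)     # daily rainfall (mm)
err = loo_cross_validation(xy, z)
print(f"LOO bias: {err.mean():.2f} mm, RMSE: {np.sqrt((err**2).mean()):.2f} mm")
```

The resulting per-station errors give a verification score that is independent of the withheld observations, at the cost of re-running the estimation once per station.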
When evaluating predictions based on commercial microwave links (CML), a natural explanation (L423, L584-585) of deviations is the distance from the reference measurements (GAU manual). Could it be useful to study the effect of distance by measuring the correlation inside the reference data itself? Then input data from a separate system (like CML) could be compared against such a modelled, "theoretical" maximum – providing estimates of measurement uncertainty at least in the vicinity of the links. (This is more a suggestion for further work, not for this study, and could be of more interest for CML application developers.)
The article reports errors in predictions using input from radar and
especially, from satellites. Many potential error sources (L58, L59; L69) are well known – like measurement geometry or uncertainty of the water phase (in both radar and satellite measurements). It would be interesting to read the authors' views on which of the error sources have been critical in this study. A systematic analysis would certainly be outside the scope of this article, but perhaps just visual inspection could be used as a basis for a discussion on error sources.
Focusing separately on cases of intensive rainfall (Sec 4.4.) is well
motivated. When thresholding the cases by the reference (L505, L531), negative bias is reported for all the methods (Table 4), and it is also highlighted for radar in the text (L510, L546). Especially in verifying GAU against GAU Manual, I think that the reported underestimation (L573, L574) is a direct consequence of the applied thresholding! Consider two measurement devices of similar climatology some kilometres apart and the long-term statistics of (convective) rainfall: the measured rainfall is then similarly distributed around the mean value. But if the studied cases are limited by thresholding the data at ONE measurement location/device, the other still includes also the lower values, pushing its bias down! (Consider throwing two dice, comparing the averages of each, but limiting the studied cases by thresholding the first die.) I guess a similar effect can also be observed with radar when limiting the cases by thresholding the reference value. (The radar's bias could be basically zero, but "random noise", i.e. positive and negative deviation around the mean, is now caused by non-uniform vertical profiles of precipitation and advection, for example.) If you agree with me on this, I suggest you address and elaborate this in the text and/or the presented experiments.
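The dice argument above can be reproduced with a minimal synthetic simulation. All numbers here (the gamma rainfall climatology, the 5 mm measurement noise, the 60 mm threshold) are illustrative assumptions: two equally accurate, unbiased devices are compared, and conditioning on the reference device exceeding a threshold alone produces an apparent negative bias in the other.

```python
import numpy as np

# Two "gauges" measuring the same rainfall climatology with independent noise
# (a stand-in for two nearby devices, or gauge vs. radar). Neither is biased.
rng = np.random.default_rng(42)
n = 200_000
true_rain = rng.gamma(shape=2.0, scale=10.0, size=n)   # synthetic rainfall (mm)
gauge_a = true_rain + rng.normal(0, 5, n)              # reference device
gauge_b = true_rain + rng.normal(0, 5, n)              # evaluated device

# Unconditional bias of B relative to A is close to zero.
print("all cases:       ", (gauge_b - gauge_a).mean())

# Condition on the *reference* exceeding a high threshold: B now looks
# negatively biased, although both devices are equally accurate, because
# the selected cases preferentially include positive noise in A.
heavy = gauge_a > 60.0
print("A > 60 mm cases: ", (gauge_b - gauge_a)[heavy].mean())
```

The same selection effect would appear for radar versus gauges whenever cases are chosen by thresholding the reference value, exactly as argued above.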
Technical comments
RainGRS is mentioned several times before it is explained or referenced. It is also unclear what "RainGRS (GRS)" means compared to plain "RainGRS".
Long URLs embedded in the text (L328, L337, L345, L353) reduce readability a bit. If the publisher's style guide supports it, could they be moved to the references?
A minor detail: place names have a mixed style; English names should be preferred if they exist. (According to Wikipedia, Oder seems to be the established English name for the river Odra (PL). Also Sudetes (EN) and Sudety (PL) appear, but understandably, the smaller the locations/regions, the fewer English names exist! Anyway, I leave it to the authors to decide the naming policy.)
Citation: https://doi.org/10.5194/egusphere-2025-1863-RC3
- AC2: 'Reply on RC3', Jan Szturc, 01 Aug 2025