This work is distributed under the Creative Commons Attribution 4.0 License.
GPROF V7 and beyond: Assessment of current and potential future versions of the GPROF passive microwave precipitation retrievals against ground radar measurements over the continental US and the Pacific Ocean
Abstract. The Goddard Profiling Algorithm (GPROF) is used operationally for the retrieval of surface precipitation and hydrometeor profiles from the passive microwave (PMW) observations of the Global Precipitation Measurement (GPM) mission. Recent updates have led to GPROF V7, which entered operational use in May 2022. In parallel, development is underway to improve the retrieval by transitioning to a neural-network-based algorithm called GPROF-NN.
This study validates GPROF V7 and multiple configurations of the GPROF-NN retrieval against ground-based radar measurements over the conterminous United States (CONUS) and the tropical Pacific. GPROF retrievals from the GPM Microwave Imager (GMI) are validated over several years and their ability to reproduce regional precipitation characteristics and effective resolution is assessed. Moreover, the retrieval accuracy for several other sensors of the constellation is evaluated.
The validation of GPROF V7 indicates that the retrieval produces reliable precipitation estimates over CONUS. During all four assessed years, annual mean precipitation is within 8 % of gauge-corrected radar measurements. Although biases of up to 25 % are observed over sub-regions of CONUS and the tropical Pacific, the retrieval reproduces the principal precipitation characteristics of each region. The effective resolution of GPROF V7 is found to be 51 km over CONUS and 18 km over the tropical Pacific. GPROF V7 also produces robust precipitation estimates for the other sensors of the GPM constellation.
The evaluation further shows that the GPROF-NN retrievals have the potential to significantly improve the GPROF precipitation retrievals. GPROF-NN 1D, the most basic neural-network implementation of GPROF, improves the mean-squared error, mean absolute error, correlation, and symmetric mean absolute percentage error by about 20 % for GMI, while the effective resolution is improved to 31 km over land and 15 km over oceans. The two GPROF-NN retrievals that are based on convolutional neural networks can further improve the accuracy up to the level of the combined radar/radiometer retrievals from the GPM Core Observatory. However, these retrievals are found to overfit to the viewing geometry at the center of the swath, reducing their overall accuracy to that of GPROF-NN 1D. For the other sensors of the constellation, the GPROF-NN retrievals produce larger biases than GPROF V7, and only GPROF-NN 3D achieves consistent improvements compared to GPROF V7 in terms of the other assessed error metrics. This points to shortcomings in the hydrometeor profiles or radiative transfer simulations used in the training of the retrievals for the other sensors of the GPM constellation as a critical limitation for improving GPM PMW retrievals.
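For orientation, a minimal sketch of how the error metrics quoted above could be computed, assuming collocated retrieved and reference rain rates are available as NumPy arrays; the variable names are illustrative and the exact conditioning used in the study may differ:

```python
# Minimal sketch of the error metrics quoted in the abstract, assuming
# "retrieved" and "reference" are collocated 1-D arrays of surface
# precipitation in mm/h; the conditioning applied in the study may differ.
import numpy as np

def error_metrics(retrieved, reference):
    """Bias, MSE, MAE, correlation, and SMAPE of a precipitation retrieval."""
    diff = retrieved - reference
    bias = diff.mean()
    mse = (diff ** 2).mean()
    mae = np.abs(diff).mean()
    corr = np.corrcoef(retrieved, reference)[0, 1]
    # SMAPE normalizes errors by the mean magnitude of the two estimates;
    # it is 0 for a perfect retrieval and bounded above by 200 %.
    mean_magnitude = 0.5 * (np.abs(retrieved) + np.abs(reference))
    valid = mean_magnitude > 0  # skip pixels where both rates are zero
    smape = 100.0 * (np.abs(diff)[valid] / mean_magnitude[valid]).mean()
    return bias, mse, mae, corr, smape
```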
Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Interactive discussion
Status: closed
RC1: 'Comment on egusphere-2023-1310', Anonymous Referee #1, 27 Sep 2023
The manuscript provides an assessment of GPROF V7 capabilities, together with some comparisons with GPROF V5 and three new, not yet operationally implemented, machine-learning-based versions of GPROF (GPROF-NN). After the validation exercise, GPROF V7 is considered reliable over CONUS for both GMI and the other sensors of the GPM constellation. GPROF-NN introduces substantial improvements in the retrievals, but the viewing geometry reduces the accuracy towards the edges of the swath. Overall, the performance of all GPROF versions is highly influenced by the accuracy of the a priori database used (GPM CMB).
The manuscript is very well written; it provides a systematic validation of the precipitation products from many points of view: CONUS and the Pacific Ocean, regional, seasonal, and the diurnal cycle. The manuscript is definitely very dense and rich in information. I think the manuscript is ready for publication after some very minor revisions that might help clarify a couple of points:
- The whole validation is based on ‘validation measurements’ and ‘reference measurements’, these being ground-based radars and GPM CMB precipitation retrievals, respectively. In Section 3.1.1 there is sometimes a bit of confusion about how the different databases are addressed. Line 226 has ‘reference precipitation’, line 234 has ‘retrieval database’, line 270 has ‘database’, line 281 has ‘a-priori database’, line 284 has ‘database precipitation’, etc. I suggest adopting consistent terminology, since the whole validation is based on different but very similar databases (GPM CMB 2019 is the a priori database, GPM CMB for other years is just a comparison database, etc.).
- In Section 3.1.2 you compare different regions, and some of the explanations for high biases are attributed to winter precipitation (see line 331, for example). I am a bit confused about how you are dealing with winter precipitation, since snow-covered surfaces and MRMS frozen precipitation are excluded from the analysis. Please provide more context on how you analyze winter precipitation in the different regions.
Other suggestions:
Line 37: ‘resolution of 10 km’ - given the global nature of IMERG maybe 0.1x0.1deg is more appropriate?
Line 171: ‘neighboring pixels’ - is this the distance from the centers of neighboring pixels?
Line 217: ‘conditioned on the validation precipitation’ - do you mean the analysis is made only on pixels where it is precipitating according to the validation (MRMS) dataset?
Line 218: ‘GPROF a priori database’ - since this database is the same as a priori or training for NN, maybe use ‘a priori/training’.
Line 221: can the spread also be due to the preprocessing clustering?
Line 226: ‘conditioned on the reference precipitation’ - is this the same as line 217, ‘validation precipitation’? As mentioned in comment 1, there is a bit of confusion in the naming of the different datasets used.
Line 229: ‘GPROF V5 is based on a different a priori database’ - I suggest to specify that V5 was based on DPR over land and CMB over ocean.
Line 234: ‘retrieval database’ – which one is the retrieval database? I suppose you are referring to GPM CMB? This should be stated more clearly earlier in the section and be consistent throughout the manuscript.
Line 244: ‘introduced rain gauge correction’ – replace with ‘introduced by the rain gauge correction’.
Figure 2: I see a very interesting behavior in the low-value trend lines. The GPM CMB vs. a priori dataset (which is GPM CMB 2019) shows overestimation of GPM CMB 2019 compared to GPM CMB ‘other years’, while all the others show the opposite behavior. It also looks like the comparison with MRMS 2021 and 2022 shows a higher bias for low values. The trend for higher values is also worth attention. It might be nice to reference this behavior in the section and in the bias description, since it provides more information on the range of precipitation that has the most issues.
Line 260: ‘conventional GPROF’ – both V7 and V5?
Line 270: ‘When compared to the database’ – which database?
Line 294: ‘the fraction of confirmed raining pixels among those retrieved as raining’ – would this be ‘the fraction of confirmed raining pixels in the a priori database among those retrieved as raining by MRMS’?
Line 295: ‘the fraction of confirmed raining pixels that are detected by retrieval’ – would this be ‘the fraction of confirmed raining pixels in the a priori database that are detected by the retrieval’? I might have interpreted these last two sentences incorrectly, which underlines the importance of clarifying which datasets you are talking about (a sketch of my precision/recall reading of these two scores follows after this list).
Line 294-295: you talk about raining pixels. So frozen precipitation is excluded also from GPROF? I mean, it makes sense, but you mentioned it is excluded from MRMS earlier in the manuscript and never mentioned what you are doing for GPROF or CMB. I think this is a big point since it eliminates a lot of winter observations that, together with the winter precipitation mentioned in the regional analysis, needs to be clarified earlier in the manuscript.
Line 308: ‘For both the database and the MRMS’ – do you mean the a priori database?
Figure 6 caption: Panel (a) shows the detection skill for the database collocations – I would specify a priori database.
Figure 12: for better comparison I would suggest to add a column with the GMI results in these plots.
Line 464-465: I actually see more bias for GPROF V5 and V7 than from GPROF NN, am I missing something?
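Regarding the two comments on lines 294-295 above: my reading of the two scores corresponds to the standard precision and recall of a rain/no-rain classification. A minimal sketch, assuming boolean masks of raining pixels (variable names are hypothetical; the paper's own conditioning may differ):

```python
# Sketch of the two detection scores as I read them: precision (fraction of
# pixels retrieved as raining that are confirmed raining) and recall
# (fraction of confirmed raining pixels that the retrieval detects).
import numpy as np

def detection_scores(retrieved_raining, reference_raining):
    """Precision and recall for boolean rain/no-rain masks."""
    hits = np.logical_and(retrieved_raining, reference_raining).sum()
    precision = hits / retrieved_raining.sum()
    recall = hits / reference_raining.sum()
    return precision, recall
```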
Citation: https://doi.org/10.5194/egusphere-2023-1310-RC1
- AC1: 'Reply on RC1', Simon Pfreundschuh, 23 Oct 2023
RC2: 'Comment on egusphere-2023-1310', Anonymous Referee #2, 02 Oct 2023
Firstly, I would like to commend the authors for their excellent work. The study validates retrievals from the new operational GPROF V7 and the non-operational neural-network-based algorithm GPROF-NN against a ground-based radar network (validation measurement) and 2B-CMB (reference measurement). It is important to understand the performance improvements between different algorithm versions (GPROF V05 vs. GPROF V07), and the authors provide this at various spatial and temporal resolutions. Moreover, I appreciate the novel approach the authors provide in order to understand to what extent a priori database (2B-CMB) uncertainty contributes to the overall GPROF retrieval error. The manuscript is very well written, with very clearly defined questions. It is a comprehensive analysis. I have some minor comments:
- Abstract, Line 12-13: What does “retrieval reproduces the principal precipitation characteristics of each region” mean? Can you please elaborate?
- Abstract, Line 16-18: I appreciate that the authors are providing this significant finding here in the abstract; however, can you please be more specific about the time resolution of this comparison? Meaning, at what time resolution is GPROF-NN 1D improving the mean absolute error, correlation, etc.?
- Line 58-60: This is a very confusing sentence. Can you please reword it?
- Line 64: Can you please reword this question, something along the lines: “to what extent a priori database errors contribute to GPROF overall retrieval errors?”
- Line 65: I think it would be better to remove “even” from this question“… GPM PMW observations even when compared to …”
- Line 90: The authors mention that rain-gauge-corrected MRMS data are used. Can the authors please be more specific about which database they have used? The way it has been presented is slightly confusing: gauge-corrected MRMS precipitation magnitudes are accumulations, whereas the radar-only MRMS data provide precipitation rates at 2 min temporal intervals. Did the authors conduct their own gauge correction to the radar-only MRMS product?
- Line 211-212: Can the authors please clearly indicate whether mountain surfaces are excluded or included with a correction.
- Line 228: Can the authors please explain how they calculate the bias, or what the definition of the bias is? And at what temporal resolution (I am assuming this is annual, but it would be nice to indicate)?
- Line 239-242: To add to the explanation here (this is through a personal communication with an MRMS team member): “It is not documented on the Iowa website; however, in ~Oct 2020, the gauge correction methodology and associated products were changed.” This corresponds exactly to the 2021 water year that the authors are using in this study.
- Line 244: “… it is possible that the bias relative to MRMS is introduced by rain gauge correction” please include “by” in this sentence to make it clear.
- Line 266-268: Can the authors please describe why they decided to use the mean error, mean-squared error, and mean absolute error all together? What do they explain differently, and why did the authors need all of them together? Moreover, can the authors please describe the symmetric mean absolute percentage error in more detail, i.e., what does this score mean, what are its maximum and minimum values, etc.? (One common definition is sketched below.)
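For reference, one common definition of the symmetric mean absolute percentage error (an assumption on my part; the manuscript may use a variant) is

```latex
\mathrm{SMAPE} = \frac{100\,\%}{N} \sum_{i=1}^{N}
  \frac{\lvert y_i - x_i \rvert}{\left(\lvert x_i \rvert + \lvert y_i \rvert\right)/2},
```

where the x_i are the reference and the y_i the retrieved rain rates. Under this definition the score is 0 for a perfect retrieval and bounded above by 200 %.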
Citation: https://doi.org/10.5194/egusphere-2023-1310-RC2
- AC2: 'Reply on RC2', Simon Pfreundschuh, 23 Oct 2023
Authors: Simon Pfreundschuh, Clément Guilloteau, Paula J. Brown, Christian D. Kummerow, and Patrick Eriksson