the Creative Commons Attribution 4.0 License.
Downscaling precipitation over High Mountain Asia using Multi-Fidelity Gaussian Processes: Improved estimates from ERA5
Abstract. The rivers of High Mountain Asia provide freshwater to around 2 billion people. However, precipitation, the main driver of river flow, is still poorly understood due to limited direct measurements in this area. Existing tools to interpolate these measurements or downscale and bias-correct precipitation models have several limitations. To overcome these challenges, this paper uses a probabilistic machine learning approach called Multi-Fidelity Gaussian Processes (MFGPs) to downscale ERA5 climate reanalysis. The method is first validated by downscaling ERA5 precipitation data over data-rich Europe and then data-sparse Upper Beas and Sutlej River Basins in the Himalayas. We find that MFGPs are simpler to implement and more applicable to smaller datasets than other state-of-the-art machine learning models. MFGPs are also able to quantify and narrow the uncertainty associated with the precipitation estimates, which is especially needed over ungauged areas, and can be used to estimate the likelihood of extreme events that lead to floods or droughts. Over the Upper Beas and Sutlej River Basins, the precipitation estimates from the MFGP model are similar to or more accurate than available gridded precipitation products (APHRODITE, TRMM, CRU, bias-corrected WRF). The MFGP model and APHRODITE annual mean precipitation estimates generally agree with each other for this region, with the MFGP model predicting slightly higher average precipitation and variance. However, more significant spatial deviations between the MFGP model and APHRODITE over this region appear during the summer monsoon. The MFGP model also presents a more effective spatial resolution of precipitation, generating more structure at finer scales than ERA5 and APHRODITE. MFGP precipitation estimates for the Upper Beas and Sutlej Basins between 1980 and 2013 at a 0.0625° resolution (approx. 9 km) are jointly published with this paper.
Status: closed
-
RC1: 'Comment on egusphere-2023-2145', Anonymous Referee #1, 08 Apr 2024
This paper presents the use of multi-fidelity Gaussian processes (MFGPs) to downscale ERA5 precipitation data over High Mountain Asia. The method is validated via application to data-rich Europe and then to a “data-sparse” European scenario before applying it to a data-sparse region in the Himalayas. The MFGP method produced similar or more accurate results than existing gridded precipitation products and provided better spatial resolution.
Although I do not have expertise in either machine learning or climatology of Europe or the Himalayas, I have formal training and expertise in applied mathematics and probability and work with precipitation data in mountainous regions of the western U.S. on a daily basis. Thus, I feel qualified to review the mathematical aspects of the paper and its general applicability to the problem of downscaling climate products to mountainous regions.
Overall, this is one of the best scientific papers I have ever read, especially considering its highly technical nature. All aspects of the paper appear to be very carefully prepared, including citation of relevant literature, readability of the text by a broad audience, presentation of methods and results, and technical precision in presentation of the mathematics and statistics.
MFGPs are presented in section 2, which I found to be easy to read and precise. All mathematical formulae appeared to be correct in content and format. More detailed formulae were given in Appendix A. Both the text and Figure 2 provided a clear explanation of how the multi-fidelity process works. Specific methods and data sets were clearly described in sections 3 and 4, and the performance metrics were defined in Appendix B. Although these metrics are standard, explicit inclusion in the appendix makes the paper more accessible to a wider audience without detracting from readability of the text.
Results were clearly and concisely presented in Tables 1-3 and Figure 4. In addition, the use of power spectrum density to illustrate data resolution (Figure 6) was appropriate and informative. More detailed results were presented in appendices C and D, again providing more information without detracting from readability of the main text. Sections 6 and 7 concisely presented model applicability to other settings, including both advantages and disadvantages of the MFGP method. The authors have made the full dataset from their analysis available and have also made available their computer code.
I have no suggestions for improvement of the manuscript.
Citation: https://doi.org/10.5194/egusphere-2023-2145-RC1
AC1: 'Reply on RC1', Kenza Tazi, 11 Jun 2024
We thank Reviewer 1 for their thorough and positive feedback (e.g., “Overall, this is one of the best scientific papers I have ever read, especially considering its highly technical nature”). We appreciate their encouraging comments on the manuscript’s presentation, including the clarity and conciseness of the writing, the precision of the mathematical formulae and the efficacy of the figures (e.g., “All aspects of the paper appear to be very carefully prepared, including citation of relevant literature, readability of the text by a broad audience, presentation of methods and results, and technical precision in presentation of the mathematics and statistics”). We are also grateful they recognised our efforts towards contextualising this research within the existing literature and giving a balanced discussion about the advantages and disadvantages of the proposed method.
Citation: https://doi.org/10.5194/egusphere-2023-2145-AC1
-
RC2: 'Comment on egusphere-2023-2145', Anonymous Referee #2, 06 May 2024
The authors present a model based on Multi-Fidelity Gaussian Process (MFGP) to effectively extrapolate precipitation data over High Mountain Asia. They claim that the MFGP model outperforms recent state of the art models and traditional techniques for smaller study areas with sparse datasets. They also provide MFGP precipitation estimates for the study region at ~ 9 km resolution.
However, I am not very convinced by their analyses.
First, ECMWF also provides high-resolution reanalysis precipitation data (ERA5 Land, hourly, 0.1 degree, ~ 9 km), which is not considered in the manuscript. How do the generated MFGP precipitation estimates compare with ERA5 Land precipitation data?
Second, the authors only consider a very simple machine learning model, i.e., linear regression, and complex deep learning models that require a lot of training data, including Convolutional Conditional Neural Processes (ConvCNP) and Convolutional Gaussian Neural Processes (ConvGNP). They neglect simple machine learning methods that do not need much data, such as random forests and support vector machines.
Finally, in the model comparison shown in Tables 1-3, the GP is only trained on ERA5 data at station locations. Why not use all ERA5 data in the study region to train and test the model? I would expect better model performance even for GPs.
Specific comments:
I believe the authors use the Nash-Sutcliffe efficiency (NSE) in the manuscript, rather than R2, which should always be non-negative values.
Citation: https://doi.org/10.5194/egusphere-2023-2145-RC2
AC2: 'Reply on RC2', Kenza Tazi, 11 Jun 2024
We thank Reviewer 2 for their insightful comments and suggestions. We address the reviewer's concerns point by point below.
Comment 1
“First, ECMWF also provides high-resolution reanalysis precipitation data (ERA5 Land, hourly, 0.1 degree, ~ 9 km), which is not considered in the manuscript. How do the generated MFGP precipitation estimates compare with ERA5 Land precipitation data?”
ERA5-Land is a reanalysis dataset that provides a consistent view of the evolution of land variables at an enhanced spatial resolution of 0.1° x 0.1° (9 km) compared to ERA5’s resolution of 0.25° x 0.25° (31 km). It is produced by running a land surface model to regenerate some of the land components of the ERA5 climate reanalysis. For atmospheric forcing, it uses ERA5 atmospheric variables such as air temperature and precipitation, which are brought to a 0.1° resolution by linearly interpolating the driving variables to the ERA5-Land grid. Although other forcing variables are corrected, this is not the case for precipitation. For further details please see Muñoz-Sabater et al. (2021) and ECMWF (2024). Precipitation characteristics from ERA5-Land are therefore very similar to those of ERA5 (Gomis-Cebolla et al., 2023; Xu et al., 2022; Xin et al., 2022). ERA5-Land precipitation should also theoretically perform worse than the linear regression models presented in our paper (Tables 1-3), which additionally include elevation as a predictor.
Moreover, we would like to stress that we have carefully chosen four gridded precipitation datasets for the High Mountain Asia region to evaluate our model against (based on existing literature). These are:
- APHRODITE, a gridded rain-gauge interpolated dataset for Asia considered the gold standard for precipitation in High Mountain Asia.
- CRU-TS4, a global gridded rain-gauge interpolated dataset.
- A bias-corrected high-resolution regional climate model simulation, which used the Weather Research and Forecast (WRF) model at a spatial resolution of 5 km, with precipitation output corrected using local rain-gauge data for the region investigated in this manuscript.
- TRMM, a satellite-based precipitation dataset designed to improve our understanding of precipitation in the current climate.
To address the reviewer’s concern, we will make the connection between ERA5-Land and our linear regression models clearer in Section 4.
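The argument about linearly interpolated forcing can be illustrated with a small sketch. The snippet below applies standard bilinear interpolation to one coarse grid cell; the coordinates and precipitation values are made up for illustration, and the exact interpolation scheme used in ERA5-Land may differ in detail. The point it shows is that an interpolated value is a weighted average of the surrounding coarse values, so it cannot add fine-scale structure that the coarse field lacks.

```python
def bilinear(x, y, x0, x1, y0, y1, q00, q10, q01, q11):
    """Bilinear interpolation inside one grid cell.

    q00 is the value at (x0, y0), q10 at (x1, y0),
    q01 at (x0, y1) and q11 at (x1, y1).
    """
    tx = (x - x0) / (x1 - x0)  # fractional position along x, in [0, 1]
    ty = (y - y0) / (y1 - y0)  # fractional position along y, in [0, 1]
    bottom = q00 * (1 - tx) + q10 * tx
    top = q01 * (1 - tx) + q11 * tx
    return bottom * (1 - ty) + top * ty


# Hypothetical coarse cell (0.25 degree spacing) with made-up values in mm/day.
corners = {"q00": 2.0, "q10": 4.0, "q01": 3.0, "q11": 5.0}
fine_value = bilinear(77.1, 31.1, 77.0, 77.25, 31.0, 31.25, **corners)

# The interpolated value always lies between the smallest and largest corner.
is_bounded = min(corners.values()) <= fine_value <= max(corners.values())
print(fine_value, is_bounded)
```

Because the result is bounded by the corner values, any local maximum or minimum caused by terrain inside the cell is invisible to the interpolation, which is why an elevation-aware model has more information to work with.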
Gomis-Cebolla, J., et al. (2023), Evaluation of ERA5 and ERA5-Land reanalysis precipitation datasets over Spain (1951–2020), Atmos. Res., 284, https://doi.org/10.1016/j.atmosres.2023.106606.
Muñoz-Sabater, J., et al. (2021), ERA5-Land: A state-of-the-art global reanalysis dataset for land applications, Earth Syst. Sci. Data, 13, 4349–4383, https://doi.org/10.5194/essd-13-4349-2021.
ECMWF (2024). ERA5-Land: data documentation. Available at: https://confluence.ecmwf.int/display/CKB/ERA5-Land%3A+data+documentation
Xu, J., et al. (2022), Do ERA5 and ERA5-land precipitation estimates outperform satellite-based precipitation products? A comprehensive comparison between state-of-the-art model-based and satellite-based precipitation products over mainland China, J. Hydrol., 605, 127353, https://doi.org/10.1016/j.jhydrol.2021.127353.
Xin, Y., et al. (2022), Evaluation of IMERG and ERA5 precipitation products over the Mongolian Plateau, Sci. Rep., 12, 21776, https://doi.org/10.1038/s41598-022-26047-8.
Comment 2
“Second, the authors only consider a very simple machine learning model, i.e., linear regression, and complex deep learning models that require a lot of training data, including Convolutional Conditional Neural Processes (ConvCNP) and Convolutional Gaussian Neural Processes (ConvGNP). They neglect simple machine learning methods that do not need many data, such as random forest and support vector machines.”
The main goal of the paper was to provide calibrated uncertainty distributions for precipitation in this area. This information allows hydrologists to quantify the probabilities of extreme events and policymakers to make better decisions with limited resources, as highlighted in Section 1. Although there are many possible machine learning benchmarks that work on smaller datasets, Random Forests and Support Vector Regression are not inherently probabilistic models.
However, we appreciate that the case for the performance of Gaussian processes could be better contextualised by including these models. To address the reviewer’s concern, we implement Random Forests and Support Vector Regression for the validation experiments, which we present in a new appendix. These new results are shown below in Tables A-C. The Random Forests perform similarly to the GPs, doing better in the Beas and Sutlej Basins but worse over Europe. The Support Vector Regression models perform similarly to the linear regression models. Nevertheless, only the Gaussian Processes output calibrated uncertainty estimates.
Table A: Comparison of performance metrics for models trained on ERA5 data for the 'data rich' setup over Europe. We include a linear regression model, a Random Forest (RF), a Support Vector Regressor (SVR) with a Radial Basis Function (RBF) kernel, a GP using an RBF kernel with Automatic Relevance Determination (ARD), and a GP using a Matérn 5/2 kernel with ARD. The MFGP model, which is also trained on the gauge data, is shown for reference. The metrics include the average RMSE, the 5th percentile RMSE (RMSE5), the 95th percentile RMSE (RMSE95), and the R2 score. The bolded values represent the best scores amongst the model benchmarks (not including the MFGP model).
Model | RMSE [mm/day] | RMSE5 [mm/day] | RMSE95 [mm/day] | R2
---|---|---|---|---
Linear reg. | 1.72±0.46 | 1.75±0.18 | 5.21±1.55 | 0.04±0.06
RF | 1.17±0.46 | 0.81±0.42 | 2.72±1.43 | 0.59±0.11
SVR - RBF | 1.71±0.48 | 1.96±0.61 | 5.11±1.78 | 0.07±0.05
GP - RBF | 1.18±0.45 | 0.56±0.29 | 2.79±1.22 | 0.57±0.13
GP - Matérn 5/2 | 1.16±0.43 | 0.52±0.25 | 2.58±1.11 | 0.57±0.13
MFGP | 1.06±0.42 | 0.51±0.20 | 2.72±1.54 | 0.65±0.09
Table B: As Table A, but for the 'data sparse' setup over Europe.
Model | RMSE [mm/day] | RMSE5 [mm/day] | RMSE95 [mm/day] | R2
---|---|---|---|---
Linear reg. | 1.77±0.46 | 1.88±0.25 | 5.19±1.76 | -0.02±0.13
RF | 1.18±0.46 | 0.83±0.43 | 2.80±1.47 | 0.58±0.11
SVR - RBF | 1.74±0.51 | 2.08±0.74 | 5.11±1.81 | 0.04±0.04
GP - RBF | 1.19±0.45 | 0.60±0.30 | 2.82±1.20 | 0.56±0.14
GP - Matérn 5/2 | 1.21±0.46 | 0.59±0.29 | 2.84±1.17 | 0.55±0.14
MFGP | 1.13±0.47 | 0.57±0.23 | 3.02±1.62 | 0.62±0.11
Table C: As Table B, but for the Upper Beas and Sutlej Basins.
Model | RMSE [mm/day] | RMSE5 [mm/day] | RMSE95 [mm/day] | R2
---|---|---|---|---
Linear reg. | 4.21±0.99 | 2.21±0.45 | 14.33±4.03 | -0.08±0.05
RF | 3.07±0.65 | 0.77±0.55 | 7.51±3.23 | 0.36±0.30
SVR - RBF | 4.29±1.14 | 2.22±0.54 | 14.30±4.42 | -0.11±0.05
GP - RBF | 3.11±0.65 | 0.77±0.56 | 7.53±3.03 | 0.34±0.31
GP - Matérn 5/2 | 3.11±0.64 | 0.79±0.56 | 7.52±3.03 | 0.34±0.31
MFGP | 3.00±0.92 | 1.66±0.95 | 9.62±3.63 | 0.46±0.11
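For readers less familiar with the two GP benchmarks, the following is a minimal sketch of the textbook RBF and Matérn 5/2 covariance functions with Automatic Relevance Determination, i.e. one lengthscale per input dimension. It is not taken from the authors' code: the input features (latitude, longitude, elevation), the lengthscale values and the function names are hypothetical and chosen only for illustration.

```python
import math


def scaled_distance(x, x_prime, lengthscales):
    """Euclidean distance after dividing each dimension by its own lengthscale (ARD)."""
    return math.sqrt(
        sum(((a - b) / l) ** 2 for a, b, l in zip(x, x_prime, lengthscales))
    )


def rbf_ard(x, x_prime, lengthscales, variance=1.0):
    """RBF (squared exponential) kernel: variance * exp(-r^2 / 2)."""
    r = scaled_distance(x, x_prime, lengthscales)
    return variance * math.exp(-0.5 * r ** 2)


def matern52_ard(x, x_prime, lengthscales, variance=1.0):
    """Matern 5/2 kernel: variance * (1 + sqrt(5) r + 5 r^2 / 3) * exp(-sqrt(5) r)."""
    s = math.sqrt(5.0) * scaled_distance(x, x_prime, lengthscales)
    return variance * (1.0 + s + s ** 2 / 3.0) * math.exp(-s)


# Hypothetical inputs: (latitude [deg], longitude [deg], elevation [m]).
site_a = (31.0, 77.0, 1500.0)
site_b = (31.5, 77.0, 2500.0)

# A short elevation lengthscale makes the kernel sensitive to elevation differences;
# a long one makes elevation nearly irrelevant to the covariance.
k_short = matern52_ard(site_a, site_b, lengthscales=(1.0, 1.0, 1000.0))
k_long = matern52_ard(site_a, site_b, lengthscales=(1.0, 1.0, 100000.0))
k_rbf = rbf_ard(site_a, site_b, lengthscales=(1.0, 1.0, 1000.0))
print(k_short, k_long, k_rbf)
```

With ARD, a large fitted lengthscale signals that the corresponding input contributes little to the covariance, which is how these kernels weight predictors such as elevation differently from horizontal position.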
Comment 3
“Finally, in the model comparison shown in Table 1-3, GP is only trained on ERA5 data at station locations. Why not use all ERA5 data in the study region to train and test the model? I would expect better model performance even for GPs.”
The model has information about ERA5 at all the training and test locations for the validation experiments. These locations fall within the ERA5 grid boxes, so there is theoretically little to no additional information to be gained by including neighbouring grid box values. To check this, we ran an experiment that was trained on all ERA5 data for the Beas and Sutlej Basins.
Results from this experiment are shown in Table D (below) and confirm that there is no added benefit in including this data. However, we have clarified this in the revised manuscript by adding additional text in Section 3, which states ‘The experiments were also conducted with all the ERA5 data for the study area (not shown), but showed no significant improvement over using the ERA5 data at the station locations only’.
Note that we did not rerun these experiments over Europe as we would have needed to apply methodological approximations to overcome the memory and computational bottlenecks that come with this larger domain.
Table D: Comparison of MFGP performance using ERA5 for the whole study area (all ERA5) and using only ERA5 at the training and test site locations (limited ERA5) over Upper Beas and Sutlej Basins. The metrics include the average RMSE, the 5th percentile RMSE (RMSE5), the 95th percentile RMSE (RMSE95), the R2 score and the mean log loss (MLL). The bolded values highlight the best scores.
Model | RMSE [mm/day] | RMSE5 [mm/day] | RMSE95 [mm/day] | R2 | MLL
---|---|---|---|---|---
MFGP - limited ERA5 | 3.00±0.92 | 1.66±0.95 | 9.62±3.63 | 0.46±0.11 | 1.79±0.22
MFGP - all ERA5 | 5.16±2.51 | 0.84±0.56 | 19.48±9.79 | 0.32±0.27 | 1.68±0.34
Comment 4
“I believe the authors use the Nash-Sutcliffe efficiency (NSE) in the manuscript, rather than R2, which should always be non-negative values.”
We confirm that we are using the coefficient of determination, or R2 score. This metric is defined and explained in Appendix B and is given by:
$$ R^2 = 1 - \frac{SS_{\mathrm{res}}}{SS_{\mathrm{tot}}} = 1 - \frac{\sum_i (y_i - f_i)^2}{\sum_i (y_i - y_{\mathrm{avg}})^2} $$
where fi is the ith predicted value, yi is the ith observed value, and yavg is the mean of the observations. SSres is therefore the sum of the squared residuals and SStot is the total sum of squares. A negative R2 is possible and would indicate that the model is predicting worse than the precipitation mean. Although negative R2 scores are unlikely in interpolation settings, they are possible when making predictions outside of the training distribution. To address the reviewer's concerns, we make this interpretation of negative R2 clearer in the main body of the paper and Appendix B.
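A short worked example makes the sign behaviour concrete. The sketch below implements the definition above directly; the observed and predicted values are made up for illustration and are not taken from the paper.

```python
def r2_score(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((y - f) ** 2 for y, f in zip(observed, predicted))
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    return 1.0 - ss_res / ss_tot


observed = [1.0, 2.0, 3.0, 4.0]   # made-up observations, mm/day

perfect = list(observed)          # predictions identical to the observations
mean_only = [2.5] * 4             # always predict the observed mean
biased = [5.0, 6.0, 7.0, 8.0]     # right pattern, but a constant +4 mm/day bias

r2_perfect = r2_score(observed, perfect)
r2_mean = r2_score(observed, mean_only)
r2_biased = r2_score(observed, biased)
print(r2_perfect, r2_mean, r2_biased)
```

Predicting the observed mean gives a score of exactly zero, so any model whose squared residuals exceed the spread of the observations around their mean, such as the strongly biased predictions here, scores below zero.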
Citation: https://doi.org/10.5194/egusphere-2023-2145-AC2
Data sets
Downscaled ERA5 monthly precipitation data using Multi-Fidelity Gaussian Processes between 1980 and 2012 for the Upper Beas and Sutlej Basins, Himalayas Kenza Tazi https://doi.org/10.5285/b2099787-b57c-44ae-bf42-0d46d9ec87cc
Model code and software
mfgp Kenza Tazi https://github.com/kenzaxtazi/mfgp