the Creative Commons Attribution 4.0 License.
Evaluating the Performance of Objective Functions and Regional Climate Models for Hydrologic Climate Change Impact Studies: A Case Study in the Eastern Mediterranean
Abstract. The robustness of hydrological models used in projections of future fresh water resources is compromised due to non-stationary climate conditions. This study aims to (i) develop a method for selecting a skillful hydrological model parameterization under changing climate conditions and (ii) apply a calibrated hydrological model to assess streamflow projections for 38 mountain watersheds in the eastern Mediterranean island of Cyprus over the next decades (2030–2060). A matrix-based approach was developed to evaluate six objective functions by eight performance measures. Using the GR4J hydrological model, evaluation matrices were computed for multiple 5-year simulation runs covering 1980–2015. The matrices covered 14 model calibrations and 182 validations in total, as well as 4 sets of validations under different climate change conditions, for each watershed. Based on the matrix method, the Nash-Sutcliffe Efficiency with square-root transformed streamflow resulted in the best performance for streamflow simulations in Mediterranean watersheds experiencing drying trends. This method is transferable and can be applied in different climate regions to identify the most suitable objective function and model parameterization for hydrologic climate impact assessments. Eighteen Regional Climate Models (RCMs) were bias-corrected, downscaled to 1 km and used to simulate streamflow with GR4J for 1980–2010. Nine RCMs underestimated the fraction of wet period precipitation (60–73 % instead of 82 % of annual precipitation), causing streamflow biases up to 40 %. The remaining nine RCMs selected for the study simulated the seasonal precipitation cycle accurately. The median of future projections showed a 6 % reduction in precipitation and a 17 % reduction in streamflow. In the worst case, reductions could reach 16 % and 39 %, respectively. Notably, during the driest years, streamflow reductions could reach 70 % relative to the driest years in the past. 
Our findings suggest that terrestrial water resources in the eastern Mediterranean may significantly deteriorate in the coming decades.
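The objective function that the abstract identifies as best performing, the Nash-Sutcliffe Efficiency on square-root transformed streamflow, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `nse_sqrt`, the argument names, and the use of NumPy are assumptions, and the sketch presumes non-negative flows with non-constant observations.

```python
import numpy as np

def nse_sqrt(observed, simulated):
    """Nash-Sutcliffe Efficiency on square-root transformed streamflow.

    Hypothetical helper for illustration; assumes non-negative flows
    and that the observed series is not constant.
    """
    obs = np.sqrt(np.asarray(observed, dtype=float))
    sim = np.sqrt(np.asarray(simulated, dtype=float))
    # Sum of squared errors between transformed observed and simulated flows
    residual = np.sum((obs - sim) ** 2)
    # Spread of the transformed observations around their mean
    variance = np.sum((obs - obs.mean()) ** 2)
    return 1.0 - residual / variance
```

A value of 1 indicates a perfect fit, and 0 indicates a simulation no better than the mean of the transformed observations. The square-root transform reduces the weight of high flows relative to the untransformed efficiency, which is the usual motivation for using it in low-flow dominated regimes.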
Status: open (until 18 Dec 2025)
RC1: 'Comment on egusphere-2025-2478', Anonymous Referee #1, 10 Sep 2025
AC1: 'Reply on RC1', Ioannis Sofokleous, 20 Oct 2025
Our response to Reviewer 1's comments is uploaded as a supplement.
RC2: 'Comment on egusphere-2025-2478', Anonymous Referee #2, 21 Nov 2025
In their manuscript, Sofokleous et al. aim to investigate the impacts of climate change on streamflow in Cyprus. To this end, the authors test various metrics for model calibration. While their general scientific aim is valid and fits well with HESS, the article's current scientific quality requires major revisions. A revision is necessary to address the article's framing, the misleading statements in the abstract, and to provide a more precise reflection of the methodology and the terminology used to explain it.
First, I suggest the title be revised. You cannot evaluate the performance of objective functions – they are means towards evaluation. It can be a comparison of different objective functions for assessing different regional climate models. And then the focus is clearly on Cyprus, not the eastern MED; otherwise, this is highly misleading.
In general, it would also be beneficial if the authors reflect on the additional value of not only assessing model performance compared to the past, but also conducting a sensitivity analysis. For example, Wagener et al. (2022) (https://wires.onlinelibrary.wiley.com/doi/full/10.1002/wcc.772) explain how response-based evaluation can be a complementary strategy. Connected to this matter is the lack of a sensitivity analysis to prepare for calibration. Without knowledge of what parameters can and should be changed, there is limited value in comparing the calibration to different metrics. The calibration also requires the authors to state which parameters are changed clearly and the corresponding value ranges. This also requires providing evidence on what parameters should be changed in the first place, i.e., a sensitivity analysis.
Specific comments:
12: Compromised is the wrong word here. First, what does robustness mean here? Is it the ability to simulate with a similar skill under different circumstances? If that is the case, non-stationarity is something climate change is causing and might be a challenge, but it is not something that undermines model performance; rather, it questions whether models are fit for the right purpose.
16: But why would you evaluate objective functions? To check whether your calibration is good? However, that does not test the objective function; it tests how well your model was calibrated using different objective functions.
21: This is a misleading statement; you are simulating catchments in Cyprus, not in the MED. Similarly, your conclusion in line 30 is highly misleading as well.
40: Kang Ji 2023 Reference missing
55: This seems like a terrible idea. Why would you restrict model simulations to such a subjective space? Unfortunately, I am not able to find the publication in the references to understand this in more detail.
56 ff: Could you reflect on multi-objective function calibration here as well? Does this solve some of the problems, and if not, why not?
81: Please specify how the projections differ for the different RCPs with respect to the study ranges you cite here.
91: Performance limits in what regard?
96: Again, you cannot evaluate the performance of an objective function. You can evaluate the model performance of a model that has been calibrated with a specific objective function or a set of objective functions.
107: Why are you using these specific metrics and their transformations? What kind of behavior space are you covering with them? Why are you not also separating them into components that would specifically tell us something about the mass balance, peak behavior, etc.? And wouldn't this be more valuable to assess in a sensitivity analysis? This would also tell us which parameters are sensitive under which objective functions for different catchments in your assessment.
135: How sensitive is the calibration to picking this specific value? How does this assumption impact the results?
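As an illustration of the component separation raised in the comment on line 107, a performance metric can be split into parts that speak to timing, variability, and mass balance. The sketch below uses the Kling-Gupta Efficiency purely as a familiar example of such a decomposition; it is an assumption made for illustration, the comments above do not say that the manuscript uses this metric, and the function name `kge_components` is hypothetical.

```python
import numpy as np

def kge_components(observed, simulated):
    """Split the Kling-Gupta Efficiency into its three components.

    Illustrative only (hypothetical helper): r reflects timing and shape,
    alpha the variability ratio, beta the bias (mass balance) ratio.
    """
    obs = np.asarray(observed, dtype=float)
    sim = np.asarray(simulated, dtype=float)
    r = np.corrcoef(obs, sim)[0, 1]   # linear correlation
    alpha = sim.std() / obs.std()     # ratio of standard deviations
    beta = sim.mean() / obs.mean()    # ratio of means
    kge = 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
    return {"r": r, "alpha": alpha, "beta": beta, "kge": kge}
```

Reporting the three components alongside the aggregate score shows whether a poor value stems from a volume bias, from damped or exaggerated variability, or from mistimed flows.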
Citation: https://doi.org/10.5194/egusphere-2025-2478-RC2
In this manuscript the authors demonstrate a robust hydrological modelling method for simulating streamflow under projected future climate conditions.
While there is not anything particularly new in this manuscript, the overarching method is well considered and supported by comprehensive modelling experiments. The manuscript is well structured, and the scientific literature is well referenced throughout. Figures and tables are appropriate. The manuscript is generally well written but, in some cases, mixes tenses; it would benefit from a further proofread to improve clarity.
Subject to revision this manuscript would make a useful addition to the scientific literature.
Specific comments
Abstract
Ln 12 Insert the word “conceptual” (“…conceptual hydrological models…”). You need to make it clear early that you are referring to conceptual hydrological models here, i.e. physically based models may not be compromised by non-stationary climate conditions.
Ln 14 Is “assess” the correct word here?
Ln 18 Here and later, it is not clear to me how multiple 5-year windows between 1980 and 2015 resulted in 14 calibrations and 182 validations. A little more explanation is required in the main body of the manuscript, as it is not intuitive.
Ln 19 Matrix method. Reword, ambiguous.
Ln 24 “…used to simulate streamflow with GR4J…” Reword to make it clear that streamflow was simulated with GR4J using inputs from the RCMs.
Ln 30 Here and elsewhere I don’t really like the term deteriorate in this context. It is ambiguous. Could you be more specific e.g. mean annual streamflow will decrease.
Ln 69 Hageman et al. (2013) is more than 12 years old. Is there not a more recent study using CMIP models?
Ln 78 Which phase of the CMIP?
Ln 80 RCP? So this manuscript is using CMIP5 models. It does beg the question how different CMIP5 is from CMIP6 over the Mediterranean region? Hopefully this is covered in the discussion as it would be necessary to place the results of this study into context with the latest climate modelling.
Ln 82 insert word “mean”? e.g. “…highlighted a MEAN annual precipitation reduction of…”
Ln 80-100 The introduction doesn’t make clear to me what new scientific contribution this manuscript makes.
Data and methods
My main comment with respect to methods is that there is no justification for the adoption of the 5-year calibration (and validation) window length. I understand that one wants windows short enough to have distinct wet/dry phases, and I understand that models were selected based on calibration and validation performance, but why 5 years? Considering the principle of ’equifinality’, are 5 years sufficient for calibration? Would the results/conclusions be different for a longer window length (a minimum of 10 years is typically used)?
Ln 106 There are only two transformations not three. i.e. “ 1) no transformation; 2…”
Ln 127 Not clear how the validations were undertaken. Did they also have a 1-year warm-up period?
Ln 149 Method, not methodology. Methodology is a study of methods (e.g. a study of different farming systems is a methodology).
Ln 156 I think this needs rewording, as I don’t know what “…typical annual and interannual variability in precipitation of Mediterranean climates” means. South-eastern Australia and South Africa have Mediterranean climates, and they are among the most variable in the world.
Ln 159 I assume these are all unregulated with minimal landuse change over the experimental period? This isn’t stated anywhere.
Results
Figure 3 It is difficult to see change in the heat map. Can the gradient be modified to better show changes (i.e. by introducing a third colour into the colour ramp)?
Ln 366 Dam storage? I think dam yield would be more appropriate. Besides, changes in runoff and changes in dam yield under future climate projections are not always the same, so knowledge of the former doesn’t necessarily translate to the latter.
Figure 4 Reference evaporation is designated as ET in this manuscript, however, in this figure PET is used?
Discussion
My understanding is that this was based on CMIP5 data. How does CMIP5 data compare to CMIP6 data for this region? A brief discussion would be useful to put the results of this manuscript into the context of more recent CMIP6 data.
Ln 520 Yes, this is true for mid-to-high flows, where there is more uncertainty in future climate inputs, but for low flows it has been found that there is more uncertainty in the hydrological models than in the climate inputs; see e.g. Petheram et al. (2012) and Teng et al. (2012).
References
Petheram C, Rustomji P, McVicar TR, Cai WJ, Chiew FHS, Vleeshouwer J, Van Niel TG, Li LT, Creswell RG, Donohue RJ, Teng J, Perraud J-M (2012) Estimating the impact of projected climate change on runoff across the tropical savannas and semi-arid rangelands of northern Australia. Journal of Hydrometeorology 13(2), 483-503, doi:10.1175/JHM-D-11-062.1
Teng J, Vaze J, Chiew F, Wang B, Perraud J-M (2012) Estimating the relative uncertainties sourced from GCMs and hydrological models in modelling climate change impact on runoff. Journal of Hydrometeorology 13(1), 122-139, doi:10.1175/JHM-D-11-058.1