Bayesian parameter inference in hydrological modelling using a Hamiltonian Monte Carlo approach with a stochastic rain model
 ^{1}Institute of Computational Life Sciences, Zurich University of Applied Sciences, Wädenswil, Switzerland
 ^{2}Swiss Federal Institute of Aquatic Science and Technology, Dübendorf, Switzerland
Abstract. Conceptual models of the rainfall-runoff behaviour of hydrological catchments have proven to be useful tools for making probabilistic predictions. However, model parameters need to be calibrated to measured data and their uncertainty quantified. Bayesian statistics is a consistent framework for learning from observed data, in which knowledge about model parameters is described through probability distributions. One of the dominant sources of uncertainty in rainfall-runoff modelling is the true rainfall over the catchment, which often needs to be inferred from a few rain-gauge and runoff measurements. Modelling this uncertainty naturally leads to stochastic differential equation models, which render traditional inference algorithms such as the Metropolis algorithm infeasible due to their expensive likelihood functions. Therefore, in hydrology and other applied fields of research, error models are traditionally oversimplified for ease of inference as additive errors on the output, leading to biased parameter estimates and unreliable predictions. However, thanks to recent advancements in algorithms and computing power, full-fledged Bayesian inference with stochastic models is no longer off-limits for hydrological applications. We demonstrate this with a case study from urban hydrology, for which we employ a highly efficient Hamiltonian Monte Carlo inference algorithm with a timescale separation.
Simone Ulzega and Carlo Albert
Status: open (until 15 Feb 2023)

RC1: 'Comment on egusphere-2022-857', Anonymous Referee #1, 05 Oct 2022
General comments
The paper presents a Bayesian framework for forward and inverse problems in stochastic rain models based on time series observations of rainfall-runoff. The focus of the paper is on HMC as a scalable inference method. The paper provides a detailed study of the hydrological problem, giving a detailed description of the model, the parameters, and the priors. The discussion of the results is convincing.
Specific comments
 It would be good to explicitly list the contributions, making it easier for the reader to see the high-level differences between this paper and Albert et al. (2016).
 I wanted to clarify some of the results. Looking at Figs. 5-6 for predictions, it seems that the discharge data alone is mostly enough to provide good predictions for the rainfall. Is that generally true? How good would be the estimates of the parameters, and the predicted rainfall if you had no rain observation data? Would it be better or worse than the low-quality data of Sc2?
 In Fig 3, you show that with a more accurate dataset (Sc1) the estimates of the parameters are more sharply peaked (less uncertain), which makes sense. There seems to be a mismatch for some of these parameters, e.g. lambda and gamma. I guess the problem of inferring gamma and lambda jointly is ill-posed as they both define the transformation. If so, that could be an interesting point to discuss.
 I'm also generally curious what the end goal of this study is. For example, can these results be used to aid policy making? Are quantities like groundwater flow, or retention time important to know for planning purposes? Would there ever be a need to run a system like this in real-time?
Technical comments
 Some references need fixing, e.g. Line 510, some papers are missing titles.
 Line 290: "construct e reversible" -> "construct a reversible"

AC1: 'Reply on RC1', Simone Ulzega, 28 Nov 2022
We are grateful for the positive comments and the very interesting remarks and questions, which give us the opportunity to clarify some important points. We answer the specific comments one by one below.
Specific comments.

In Albert et al. (2016) we described a novel implementation of an HMC algorithm combined with a multiple timescale integration for Bayesian parameter inference with nonlinear stochastic differential equation models, and we demonstrated the performance of the method using a simple rainfall-runoff toy model and synthetic time series. For purely didactic purposes, the rain input to the system was modeled using a smooth sinusoidal function.
In the present work, instead, we apply the HMC method with the timescale separation approach from Albert et al. (2016) for the first time to a real-world case study in urban hydrology, using real time series of observed rainfall and outflows. Moreover, in this work we carry out the inference process using intentionally inaccurate rainfall observations and demonstrate the ability of the algorithm to reconstruct with great accuracy the unknown true average rainfall over the catchment. The reconstructed precipitation is then used to infer the hydrological model parameters, which are thus protected from the corrupting effect of the uncertainty on the rainfall observations.
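
For readers unfamiliar with HMC, the two generic building blocks of the method (a reversible leapfrog integrator and a Metropolis accept/reject step) can be sketched in Python for a toy one-dimensional Gaussian target. This is only a minimal sketch under stated assumptions, not the implementation of Albert et al. (2016): it omits the multiple timescale integration and the hydrological model entirely, and the function names, step size, trajectory length, and target density are all hypothetical choices made for illustration.

```python
import math
import random


def leapfrog(q, p, grad_potential, step, n_steps):
    """Integrate Hamiltonian dynamics (unit mass) with the leapfrog scheme."""
    p = p - 0.5 * step * grad_potential(q)      # initial half step in momentum
    for i in range(n_steps):
        q = q + step * p                        # full step in position
        if i < n_steps - 1:
            p = p - step * grad_potential(q)    # full step in momentum
    p = p - 0.5 * step * grad_potential(q)      # final half step in momentum
    return q, p


def hmc_sample(potential, grad_potential, q0, n_samples,
               step=0.2, n_steps=20, seed=0):
    """Draw samples from exp(-potential(q)) with plain HMC (illustrative only)."""
    rng = random.Random(seed)
    q = q0
    samples = []
    for _ in range(n_samples):
        p = rng.gauss(0.0, 1.0)                 # resample the momentum
        h_old = potential(q) + 0.5 * p * p
        q_new, p_new = leapfrog(q, p, grad_potential, step, n_steps)
        h_new = potential(q_new) + 0.5 * p_new * p_new
        # Metropolis correction for the integrator's energy error
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            q = q_new
        samples.append(q)
    return samples


# Toy target (an assumption for illustration): standard normal, U(q) = q^2 / 2
def potential(q):
    return 0.5 * q * q


def grad_potential(q):
    return q


samples = hmc_sample(potential, grad_potential, q0=3.0, n_samples=5000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```

For this toy target the sample mean and variance should come out close to 0 and 1. The leapfrog scheme is time-reversible: integrating forward, flipping the sign of the momentum, and integrating again returns to the starting point up to rounding error.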

This is indeed a very interesting point. We have run the HMC inference without rainfall data, obtaining both model parameter marginals and a predicted rainfall pattern that are substantially identical to those obtained with the inaccurate data of Sc2. Therefore, in Sc2 the HMC algorithm “learns” that the observed rain should be essentially ignored, thus producing results which are practically the same as if the inference were run without any rain data at all. However, in most applications the accuracy and reliability of the measured precipitation data are unknown a priori. We show here that in those cases the rainfall observations can be safely used in the inference process, since the algorithm itself will assess their accuracy and possibly disregard them in favor of a more reliable reconstructed rainfall.
We believe that this result is interesting and worth a remark. Therefore, we will certainly be glad to add a comment on it in the next revision of the paper. 
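
The intuition behind the algorithm "ignoring" inaccurate rain data can be illustrated with a precision-weighted Gaussian update. This is a deliberately simplified analogy, not the model used in the paper: it assumes a single scalar rainfall value, a Gaussian estimate derived from the discharge, and a Gaussian rain-gauge observation, and the function name and all numbers are hypothetical.

```python
def gaussian_update(prior_mean, prior_sd, obs, obs_sd):
    """Combine a Gaussian prior with one Gaussian observation.

    Each source is weighted by its precision (1 / variance), so an
    observation with a large error standard deviation barely moves the result.
    """
    w_prior = 1.0 / prior_sd ** 2
    w_obs = 1.0 / obs_sd ** 2
    post_var = 1.0 / (w_prior + w_obs)
    post_mean = post_var * (w_prior * prior_mean + w_obs * obs)
    return post_mean, post_var ** 0.5


# Hypothetical numbers: the discharge suggests 5 mm/h, the gauge reports 2 mm/h.
accurate_mean, _ = gaussian_update(5.0, 1.0, obs=2.0, obs_sd=0.1)     # Sc1-like
inaccurate_mean, _ = gaussian_update(5.0, 1.0, obs=2.0, obs_sd=10.0)  # Sc2-like
```

With a small gauge error the estimate follows the gauge, whereas with a large gauge error it stays close to the discharge-based value, which mirrors qualitatively the behaviour described above when the observational error of the rain is itself inferred.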
This is also an interesting point. The inferred posterior distribution does not show any correlations between the parameters, e.g., lambda and gamma. Therefore, the problem of inferring them does not seem to be ill-posed.
Instead, in Sc1 the HMC algorithm tunes the parameters of the rainfall potential transformation to match the (accurate) rainfall data. This is evident for the large precipitation peak near time = 60 min, as clearly visible in the lower halves of Figures 5 and 6. The smaller value of the inferred parameter gamma in Sc1 reflects exactly this attempt of the algorithm to find a better fit to the rain observations, especially where precipitation values are large. The smaller observational error for the precipitation in Sc1 is also an obvious consequence of a better match of predictions and data. All other parameter marginals exhibit much smaller discrepancies between Sc1 and Sc2. We will discuss this point more clearly in the next paper revision.

This work is intended to be a purely methodological study. Its main goal is to demonstrate that the HMC algorithm combined with a multiple timescale integration presented in Albert et al. (2016) can be successfully applied to solve real-world hydrological inference problems with computationally expensive stochastic models. This method is especially well suited to cases, far from rare in hydrology, where the precipitation data are inaccurate and unreliable. It considerably reduces the bias in the inferred parameters by shielding them from the deteriorating effect of the rainfall data inaccuracy, thus leading to more reliable runoff predictions. The knowledge of all model parameters, including the groundwater flow and the retention time, is essential for making robust probabilistic predictions, which can certainly be useful in planning and policy making. This method is definitely a powerful and versatile tool for Bayesian inference with expensive stochastic models, whereas it might not be the optimal solution for real-time control of hydrological systems, where faster algorithms might be preferable. This topic, however, is not discussed in detail here since it goes beyond the scope of this study.
Technical comments
We thank the reviewer for pointing out these two technical issues. We will fix them in the next revision.

RC2: 'Reply on AC1', Anonymous Referee #1, 30 Nov 2022
Thank you for the detailed reply, it clarifies all of my original questions. I agree it would be valuable to add your discussion of Q2 and Q3 to the main part of the paper.


RC3: 'Comment on egusphere-2022-857', Anonymous Referee #2, 12 Jan 2023
General comments
This manuscript presents a Bayesian framework for forward and inverse problems and presents the Hamiltonian Monte Carlo (HMC) as a scalable inference method for calibration of models to noisy time series. The paper details an application of this framework in stochastic rain models based on time series observations of rainfall-runoff. The paper provides a case study of a single storm event over a single catchment, and although the paper is technically well written, it currently reads more like a technical note rather than a research article. Overall, the implications of the study were unclear – floods are mentioned briefly but discussion of whether this approach holds up when considering 1. different hydrological modelling approaches, 2. climate variability and non-stationarity, 3. different catchment types and antecedent conditions, and 4. flash floods, could further strengthen the argument for using this novel approach.
This paper is more akin to a technical report and is therefore not entirely suited for HESS audiences as a research article in its current form. However, the approach detailed in the paper and its suitability to model real world hydrological impacts are of interest to HESS audiences. Thus, the manuscript could be strengthened with some moderate revisions and reframing; to demonstrate the superiority of this methodology and approach, where the application of such an approach is most beneficial, and, what the implications of using this approach are in terms of hydrological services to aid decision makers. The inclusion of the above would go most of the way to addressing “relevant scientific questions within the scope of HESS” as well as providing more tangible implications for the reader.
Unfortunately, as a reviewer I only have an option to choose between minor and major revisions so I chose major to reflect the fact that the effort required to address my comments would be greater than that of addressing minor comments. However, I suspect that the effort needed to revise this manuscript would fall somewhere in between – i.e. moderate revisions. I sincerely hope these comments and the more specific ones below are helpful to the authors.
Specific comments
 I was not sure what the benefits of this approach are versus other methods. For example, why is this approach beneficial over other hydrological modelling approaches, such as hydraulic or other physics-based approaches (e.g. flow routing) or conceptual models (e.g. pipe flow simulations) in terms of computational efficiency and accuracy? For example, could you run this simulation in real time for nowcasting?
 What catchment conditions is this method suitable for? I assume modelling storm water is the reason for choosing an urban catchment for a case study, but perhaps the paper could be strengthened by stating that explicitly and focusing on the difficulty of modeling storm water runoff accurately.
 What applications is this methodology suited for? Floods are briefly mentioned and storm water is the focus, but does this methodology enhance the modelled accuracy of any other flood impacts such as inundation?
 The methods section takes up the bulk of the paper, even allowing for the fact that the focus is on the novelty of the method. Could some of the details be put into supplementary information? It is unclear as to how novel the methodology or framework proposed is given that a prior paper on the HMC has been published. Could the authors please highlight what is “new”?
 The case study description is a bit light on detail. I could not discern the reasons why the single storm event and catchment were chosen from the case study description. A reader is likely to be skeptical as to the broad applicability of any method when only one catchment and event are modelled. Can the case study be expanded to include multiple events and/or multiple catchments? In addition, the reasons for the two ScX datasets could be made clearer to the reader in the case study description. Also, is there a third case that could be explored? No Sc1 or Sc2 data?
 Figure 3 could be further or more thoroughly explained in terms of the whys – e.g. why does gamma show the largest shift? Is it due to, for example, the event characteristics or catchment characteristics or both?
 The comparison between poor quality and good quality rainfall is a bit confusing (Figures 5-6) but looks like an interesting result? It appears that the discharge data alone is enough to provide good predictions for the rainfall. What is the purpose of using Sc2 then – can the authors please explain this in detail? Also, it would be good to know whether this is the case across the board (i.e. more than one event in one catchment).
 Results overall: A discussion of the limitations and applicability/suitability of the method along with the implications of its use would strengthen the paper. The figure discussions could be improved by relating the results to the characteristics of the event and catchment. Could the authors please detail a real-world application using this approach (e.g. nowcasting of storm water runoff during an event)?
 In considering the citations and reference list, it appears to me that the authors have considered all the major technical HMC and related publications (although I am far from a leading expert in the field of HMC), however I note here, that in addressing the above and general comments, more citations for the background and contextual information will need to be included.
Technical comments
Given that the paper could be improved and strengthened by reframing and providing more context for the reader, and thus would need moderate revisions, I have not gone through the manuscript with a fine-tooth comb; however, I have picked a couple of things up:
References: Some references need fixing, for example, some papers are missing titles.
Edits: Line 290: construct e reversible should be “construct a reversible”