the Creative Commons Attribution 4.0 License.
The Wasserstein distance as a hydrological objective function
Jared C. Magyar and Malcolm S. Sambridge
Abstract. When working with hydrological data, the ability to quantify the similarity of different datasets is useful. The choice of how to make this quantification has a direct influence on the results, with different measures of similarity emphasising particular sources of error (for example, errors in amplitude as opposed to displacements in time and/or space). The Wasserstein distance considers the similarity of mass distributions through a transport lens. In a hydrological context, it measures the ‘effort’ required to rearrange one distribution of water into the other. While the approach is more broadly applicable, particular attention is paid to hydrographs in this work. The Wasserstein distance is adapted for working with hydrographs in two different ways, and tested in the contexts of calibration and ‘averaging’ of hydrographs. This alternative definition of fit is shown to be successful in accounting for timing errors due to imprecise rainfall measurements. The averaging of an ensemble of hydrographs is shown to be suitable when differences among the members are in peak shape and timing, but not in total peak volume, where the traditional mean works well.

Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Journal article(s) based on this preprint
work required to rearrange one mass distribution into the other. We also use similar mathematical techniques for defining a type of average between water distributions.
Jared C. Magyar and Malcolm S. Sambridge
Interactive discussion
Status: closed

RC1: 'Comment on egusphere-2022-1117', Uwe Ehret, 28 Nov 2022
Dear Editor, dear Authors,
Please see my comments in the attachment.
Yours sincerely, Uwe Ehret

AC1: 'Reply on RC1', Jared Magyar, 24 Jan 2023
We thank the reviewer for their helpful feedback. Our replies to these comments are displayed in italics.
Title: The paper discusses how the Wasserstein distance can be used i) as an objective function and ii) to construct ensemble representatives, but only the first aspect is reflected in the title. I suggest changing the title such that it contains both aspects.
The authors agree with this point. We suggest the new title of ‘Hydrological objective functions and ensemble averaging with the Wasserstein distance’ to better reflect the contents of the paper.
Use of the word "metric": In the article, the authors use the term "metric" to refer to distance measures in general, not to the strict definition of a metric being a distance measure with the following properties:
 d(a,b) >= 0
 if d(a,b) = 0, then a is identical to b
 d(a,c) <= d(a,b) + d(b,c) (triangle inequality).
As the authors go quite deep into the derivation of the Wasserstein distance, and the discussion of its properties, I suggest they should i) restrict usage of the term "metric" to true metrics only, and ii) mention if Wasserstein distance is a true metric or not.
The Wasserstein distance is a true metric on the space of density functions. We will therefore more explicitly note this property in line 125 with an appropriate reference.
Similarly, it will be helpful for the reader to discuss if the Wasserstein distance is a symmetrical or non-symmetrical distance function, i.e. if d(a,b) = d(b,a) or not. In other words, does Wasserstein compare the distance of some "model" to a "reference truth", or the distance between two objects on equal terms?
The Wasserstein distance is indeed symmetric, including the modified versions (mass-based penalty, and hydrograph-Wasserstein distance). Again, we will state this property more explicitly in lines 125–126.
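The metric properties discussed above can be checked numerically in a simple setting. The following is a minimal sketch, not the authors' code: it treats two non-negative series as unit-mass densities over time and computes a one-dimensional Wasserstein distance from the difference between their inverse cumulative distributions. The function names, the Gaussian test pulses and the number of quantiles are assumptions chosen for illustration, and the modified versions (mass-based penalty, hydrograph-Wasserstein distance) are not reproduced here.

```python
import numpy as np

def quantile_function(t, q, p):
    """Inverse cumulative distribution of the non-negative series q(t) at probabilities p."""
    q = np.asarray(q, dtype=float)
    cdf = np.cumsum(q) / np.sum(q)  # normalise the series to unit mass
    return np.interp(p, cdf, t)

def wasserstein_1d(t, f, g, order=2, n_quantiles=2000):
    """1D Wasserstein distance of the given order between normalised series f and g."""
    p = (np.arange(n_quantiles) + 0.5) / n_quantiles  # mid-point probabilities in (0, 1)
    residual = quantile_function(t, f, p) - quantile_function(t, g, p)
    return float(np.mean(np.abs(residual) ** order) ** (1.0 / order))

def pulse(t, centre, width):
    """Hypothetical single-peak series (Gaussian pulse), for illustration only."""
    return np.exp(-0.5 * ((t - centre) / width) ** 2)

t = np.linspace(0.0, 100.0, 1001)
a = pulse(t, 30.0, 5.0)
b = pulse(t, 40.0, 5.0)   # same shape as a, delayed by 10 time units
c = pulse(t, 55.0, 8.0)   # later and broader

d_ab = wasserstein_1d(t, a, b)
d_ba = wasserstein_1d(t, b, a)
d_bc = wasserstein_1d(t, b, c)
d_ac = wasserstein_1d(t, a, c)

print("d(a,b), d(b,a):", d_ab, d_ba)            # symmetry
print("d(a,a):", wasserstein_1d(t, a, a))       # zero for identical inputs
print("d(a,c) <= d(a,b) + d(b,c):", d_ac <= d_ab + d_bc + 1e-9)
```

In this sketch the two equal-width pulses differ only by a 10-unit delay, so d(a,b) should come out close to 10, which matches the interpretation of the distance as the displacement needed to move one distribution onto the other.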
A comment (no changes requested): Apart from the illustrative results from the synthetic test cases, for the hydrological community it would be very useful to see some applications to real-world, and long, hydrographs. These will be the cases that are relevant from a practical point of view, but where the limitations of the Wasserstein "mass-problem" will become most obvious. I do not request extra work here, as the authors mention in lines 78–83 that the main purpose of this paper is to introduce the concept and show some illustrative examples, and I think this is enough to justify the paper. Nevertheless it would be helpful to provide such applications at least in a follow-up paper. This will greatly increase the chances that the method will be picked up by the community.
The authors agree with this comment, and also agree that it is beyond the scope and purpose of this particular contribution.
Sect. 3.1: The authors apply the Wasserstein and RMSE-optimization to the system with the rainfall timing errors. It would be helpful to initially mention that when using the true rainfall input, both Wasserstein and RMSE-optimization would perfectly identify the true model parameters (which I assume should be the case).
As with RMSE, the Wasserstein distance will only be zero (global minimum) when the inputs are identical. Under the assumption that the solution is unique, errorless data would indeed give us the true model parameters when the Wasserstein distance is optimised. This will be expressed explicitly in the text in line 307.
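As a rough illustration of this point, the sketch below compares RMSE with a first-order Wasserstein distance for a single synthetic peak and time-shifted copies of it. It is not the calibration experiment of Sect. 3.1: the Gaussian pulse, the grid, the shift values and all function names are assumptions, and the distance is computed from the area between the two cumulative distributions, which is one standard way of evaluating the first-order distance in 1D.

```python
import numpy as np

def pulse(t, centre, width=3.0):
    """Hypothetical single-peak hydrograph (Gaussian pulse), for illustration only."""
    return np.exp(-0.5 * ((t - centre) / width) ** 2)

def rmse(f, g):
    """Root-mean-square error between two series on the same time grid."""
    return float(np.sqrt(np.mean((f - g) ** 2)))

def wasserstein1_cdf(t, f, g):
    """First-order Wasserstein distance: area between the two normalised CDFs."""
    dt = t[1] - t[0]                      # assumes a uniform time grid
    cdf_f = np.cumsum(f) / np.sum(f)
    cdf_g = np.cumsum(g) / np.sum(g)
    return float(np.sum(np.abs(cdf_f - cdf_g)) * dt)

t = np.linspace(0.0, 200.0, 2001)
observed = pulse(t, 60.0)

shifts = np.array([0.0, 5.0, 20.0, 40.0, 80.0])   # assumed timing errors
rmse_values = np.array([rmse(observed, pulse(t, 60.0 + s)) for s in shifts])
w1_values = np.array([wasserstein1_cdf(t, observed, pulse(t, 60.0 + s)) for s in shifts])

for s, r, w in zip(shifts, rmse_values, w1_values):
    print(f"shift={s:5.1f}  RMSE={r:.4f}  W1={w:.3f}")
```

Both measures are zero at zero shift, consistent with the reply above. Under these assumptions the RMSE stops changing once the two peaks no longer overlap, whereas the Wasserstein value keeps growing in proportion to the shift, which is why it remains informative about timing errors.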
Eq. 1: at this point, it is unclear why two different sets of data x and y are required for functions f and g. Either explain or replace y by x.
Thank you, this was an error in the equation and will be fixed as suggested. Indeed, it is important that the data points are the same for both f and g.
Citation: https://doi.org/10.5194/egusphere-2022-1117-AC1

RC2: 'Comment on egusphere-2022-1117', Luk Peeters, 04 Jan 2023
The manuscript presents an excellent introduction to the Wasserstein distance and explores its relevance for hydrological modelling. The paper is well structured, the figures are relevant and of high quality and the writing is clear. I have no reservation to recommend the paper to be published. I do have a few minor comments as detailed below.
1. References to hydrological literature. The literature review is comprehensive and relevant, but it would benefit from some references to hydrological literature on objective functions. Below are a couple of references I believe to be relevant. I leave it to the authors to decide if they want to include them (never include a reference just because a reviewer suggests it).
2. The Wasserstein distance appears to have concepts in common with flow anamorphosis (http://www.stat.boogaart.de/Publications/g1901.pdf). Can you explore or comment on how the Wasserstein distance compares to flow anamorphosis?
References
Schaefli, B., & Kavetski, D. (2017). Bayesian spectral likelihood for hydrological parameter inference. Water Resources Research, 53(8), 6857–6884. https://doi.org/10.1002/2016WR019465
Schoups, G., & Vrugt, J. A. (2010). A formal likelihood function for parameter and predictive inference of hydrologic models with correlated, heteroscedastic, and non-Gaussian errors. Water Resources Research, 46, W10531. https://doi.org/10.1029/2009WR008933
Viney, N. R., Perraud, J. M., Vaze, J., Chiew, F. H. S., Post, D. A., & Yang, A. (2009). The usefulness of bias constraints in model calibration for regionalisation to ungauged catchments. 18th World IMACS Congress and MODSIM09 International Congress on Modelling and Simulation, 13–17. https://mssanz.org.au/modsim09/I7/viney_I7a.pdf
Vrugt, J. A., de Oliveira, D. Y., Schoups, G., & Diks, C. G. H. (2022). On the use of distribution-adaptive likelihood functions: Generalized and universal likelihood functions, scoring rules and multi-criteria ranking. Journal of Hydrology, 615, 128542. https://doi.org/10.1016/j.jhydrol.2022.128542
Citation: https://doi.org/10.5194/egusphere-2022-1117-RC2
AC2: 'Reply on RC2', Jared Magyar, 24 Jan 2023
We thank the reviewer for their thoughtful comments on our manuscript. Replies to the comments are displayed in italics.
The manuscript presents an excellent introduction to the Wasserstein distance and explores its relevance for hydrological modelling. The paper is well structured, the figures are relevant and of high quality and the writing is clear. I have no reservation to recommend the paper to be published. I do have a few minor comments as detailed below.
1. References to hydrological literature. The literature review is comprehensive and relevant, but it would benefit from some references to hydrological literature on objective functions. Below are a couple of references I believe to be relevant. I leave it to the authors to decide if they want to include them (never include a reference just because a reviewer suggests it).
The authors agree that some additional reference to hydrological literature would be suitable. Priority thus far has been placed on deterministic objective functions, as significant discussion of Bayesian approaches may confuse the reader on the proposed use of the Wasserstein distance. We also do not propose any way of building a likelihood function from the Wasserstein distance, so we believe any additions referring to likelihood functions should be kept brief.
However, we will add a short note after line 77 about the need for a likelihood function in Bayesian inference, and how this is distinct from an objective function because it depends on understanding the noise statistics of the data. It is worth noting, however, that likelihood functions such as those of Schoups & Vrugt (2010) and Vrugt et al. (2022), which account for temporally correlated errors, operate directly on the pointwise residuals, while we use a different definition of residual (see Fig. 7c, where residuals are between the inverse cumulative distributions). Designing a Wasserstein-based likelihood function is therefore not straightforward and is beyond the scope of this manuscript.
2. The Wasserstein distance appears to have concepts in common with flow anamorphosis (http://www.stat.boogaart.de/Publications/g1901.pdf). Can you explore or comment on how the Wasserstein distance compares to flow anamorphosis?
This is an interesting concept, and the authors thank you for bringing it to our attention. While there are similarities in the transport-based methods, the emphasis for the given flow anamorphosis method is the mapping between distributions, rather than quantifying the work required to do so as is the case with the Wasserstein distance. While van den Boogaart et al. (2015) do use a form of transport map via Lagrangian trajectories, whether this is the optimal transport map from which the Wasserstein distance is defined in Eq. 6 is not immediately clear.
We should note the potential use of the optimal transport map for this application. In particular, the Sliced Wasserstein distance (Rabin et al. 2011) may find use in mapping points in high dimensions to a multivariate normal to allow sampling of arbitrary distributions through the inverse mapping (Magyar, unpublished work).
While there are interesting links and applications here, much of it is beyond the scope of this manuscript, so there are two additions that we suggest:
 When introducing the optimal transport map, mentioning the similarity to sampling transformations (it is the same as the inverse transform sampling method in 1D, and similar in concept to flow anamorphosis for higher dimensions). We will also make very clear that the Wasserstein distance is defined for the optimal (work-minimising) map, not just any map between the distributions.
 In the conclusions, mentioning the untapped potential of the optimal transport map (rather than just the Wasserstein distance) for a two-way mapping between collected data and a reference distribution (e.g. multivariate Gaussian) in hydrological applications. Further details however are beyond the scope of this contribution.
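The 1D case mentioned in the first point can be sketched as follows. This is an illustrative example only, not taken from the manuscript: it builds the monotone map T(x) = G^{-1}(F(x)) between an empirical distribution F and a standard normal reference G, together with its inverse, which is the inverse transform sampling construction. The gamma-distributed test data, the plotting positions (i + 0.5)/n and the function names are all assumptions, and the higher-dimensional (e.g. sliced) variants are not attempted.

```python
import numpy as np
from statistics import NormalDist

_reference = NormalDist()  # assumed reference distribution: standard normal

def fit_map(samples):
    """Return the sorted data and their empirical cumulative probabilities."""
    x_sorted = np.sort(np.asarray(samples, dtype=float))
    n = x_sorted.size
    p = (np.arange(n) + 0.5) / n  # plotting positions, strictly inside (0, 1)
    return x_sorted, p

def to_reference(x, x_sorted, p):
    """Forward map T(x) = G^{-1}(F(x)): data space to the normal reference."""
    u = np.interp(np.atleast_1d(x), x_sorted, p)
    return np.array([_reference.inv_cdf(ui) for ui in u])

def from_reference(z, x_sorted, p):
    """Inverse map F^{-1}(G(z)): normal reference back to data space."""
    u = np.array([_reference.cdf(zi) for zi in np.atleast_1d(z)])
    return np.interp(u, p, x_sorted)

rng = np.random.default_rng(0)
data = rng.gamma(shape=2.0, scale=1.5, size=500)  # skewed, positive synthetic data

x_sorted, p = fit_map(data)
z = to_reference(data, x_sorted, p)          # approximately standard normal scores
recovered = from_reference(z, x_sorted, p)   # round trip back to the data values

# New draws in data space via the inverse map applied to normal samples.
new_samples = from_reference(rng.standard_normal(1000), x_sorted, p)
print(np.mean(z), np.std(z), np.max(np.abs(recovered - data)))
```

Because the map is monotone, it preserves the ordering of the data, and in 1D this monotone rearrangement is also the work-minimising map. Draws generated through the inverse map are limited to the range of the observed data in this simple empirical version.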
References:
Schoups, G., & Vrugt, J. A. (2010). A formal likelihood function for parameter and predictive inference of hydrologic models with correlated, heteroscedastic, and non-Gaussian errors. Water Resources Research, 46, W10531. https://doi.org/10.1029/2009WR008933
Vrugt, J. A., de Oliveira, D. Y., Schoups, G., & Diks, C. G. H. (2022). On the use of distribution-adaptive likelihood functions: Generalized and universal likelihood functions, scoring rules and multi-criteria ranking. Journal of Hydrology, 615, 128542. https://doi.org/10.1016/j.jhydrol.2022.128542
Rabin, J., Peyré, G., Delon, J., & Bernot, M. (2011). Wasserstein barycenter and its application to texture mixing. International Conference on Scale Space and Variational Methods in Computer Vision, 435–446.
Citation: https://doi.org/10.5194/egusphere-2022-1117-AC2

Model code and software
The Wasserstein distance as a hydrological objective function Jared Magyar https://doi.org/10.5281/zenodo.7217989
Jared C. Magyar and Malcolm S. Sambridge
Viewed
 HTML: 308
 PDF: 129
 XML: 14
 Total: 451
 BibTeX: 7
 EndNote: 4