https://doi.org/10.5194/egusphere-2025-1699
05 May 2025
Status: this preprint is open for discussion and under review for Hydrology and Earth System Sciences (HESS).

When physics gets in the way: an entropy-based evaluation of conceptual constraints in hybrid hydrological models

Manuel Álvarez Chaves, Eduardo Acuña Espinoza, Uwe Ehret, and Anneli Guthke

Abstract. Merging physics-based with data-driven approaches in hybrid hydrological modeling offers new opportunities to enhance predictive accuracy while addressing challenges of model interpretability and fidelity. Traditional hydrological models, developed using physical principles, are easily interpretable but often limited by their rigidity and assumptions. In contrast, machine learning (ML) methods, such as Long Short-Term Memory (LSTM) networks, offer exceptional predictive performance but are often criticized for their black-box nature. Hybrid models aim to reconcile these approaches by imposing physics to constrain and understand what the ML part of the model does. This study introduces a quantitative metric based on Information Theory to evaluate the relative contributions of physics-based and data-driven components in hybrid models. Through synthetic examples and a large-sample case study, we examine the role of physics-based conceptual constraints: can we actually call the hybrid model "physics-constrained", or does the data-driven component overwrite these constraints for the sake of performance? We test this on arguably the most constrained form of hybrid models, i.e., we prescribe structures of typical conceptual hydrological models and allow an LSTM to modify only their parameters over time, as learned during training against observed discharge data. Our findings indicate that performance predominantly relies on the data-driven component, with the physics constraint often adding minimal value or even making the prediction problem harder. This observation challenges the assumption that integrating physics should enhance model performance by informing the LSTM. Even more alarming, the data-driven component is able to avoid (parts of) the conceptual constraint by driving certain parameters to insensitive constants or value sequences that effectively cancel out certain storage behavior.
Our proposed approach helps to analyse such conditions in depth, which provides valuable insights into model functioning, case study specifics, and the power or problems of prior knowledge prescribed in the form of conceptual constraints. Notably, our results also show that hybrid modeling may offer hints towards parsimonious model representations that capture dominant physical processes, but avoid illegitimate constraints. Overall, our framework can (1) uncover the true role of constraints in presumably "physics-constrained" machine learning, and (2) guide the development of more accurate representations of hydrological systems through careful evaluation of the utility of expert knowledge to tackle the prediction problem at hand.
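
As a rough illustration of the kind of quantity an entropy-based evaluation works with, the sketch below estimates the Shannon entropy of a time-varying parameter sequence from a fixed-bin histogram. It is not the metric proposed in the preprint, whose definition is not given in this abstract. The function name `binned_entropy_bits`, the bin count, the normalised parameter range [0, 1], and the two synthetic sequences are all assumptions made for the example.

```python
import numpy as np


def binned_entropy_bits(values, bins=20, value_range=(0.0, 1.0)):
    """Shannon entropy (in bits) of a fixed-bin histogram of `values`.

    Hypothetical helper for illustration only; not taken from the preprint.
    """
    counts, _ = np.histogram(values, bins=bins, range=value_range)
    probs = counts[counts > 0] / counts.sum()  # drop empty bins, normalise
    return float(np.sum(probs * np.log2(1.0 / probs)))


rng = np.random.default_rng(0)
n_steps = 5000

# Assumed stand-ins for a parameter sequence produced by the data-driven part:
# one that varies strongly over time, one that has collapsed to a constant.
varying_param = rng.uniform(0.0, 1.0, size=n_steps)
constant_param = np.full(n_steps, 0.37)

h_varying = binned_entropy_bits(varying_param)
h_constant = binned_entropy_bits(constant_param)

print(f"entropy of varying parameter:  {h_varying:.2f} bits")
print(f"entropy of constant parameter: {h_constant:.2f} bits")
```

Under these assumptions, a parameter driven to a constant occupies a single bin and has zero entropy, while a strongly varying one approaches the upper bound of log2(20) bits for 20 bins. Such a contrast only hints at how an information-theoretic measure can flag parameters that the data-driven component has effectively switched off.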

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Status: open (until 21 Jun 2025)


Latest update: 15 May 2025
Short summary
This study evaluates hybrid hydrological models that combine physics-based and data-driven components, using Information Theory to measure their relative contributions. When testing conceptual models with LSTMs that adjust parameters over time, we found performance primarily comes from the data-driven component, with physics constraints adding minimal value. We propose a quantitative tool to analyse this behaviour and suggest a workflow for diagnosing hybrid models.