the Creative Commons Attribution 4.0 License.
Bakaano-Hydro (v1.1). A distributed hydrology-guided deep learning model for streamflow prediction
Abstract. Reliable streamflow prediction is fundamental to hydrological forecasting, water resources planning, and climate adaptation. However, existing data-driven approaches often lack physical interpretability and struggle to incorporate spatial heterogeneity and hydrological connectivity. Conversely, traditional process-based models are limited by high calibration demands and structural uncertainty, especially in data-scarce regions. These challenges underscore the need for hybrid frameworks that combine the strengths of physically based modeling with the predictive capacity of machine learning. Here, I present Bakaano-Hydro, a distributed hydrology-guided deep learning model for streamflow prediction. The model integrates a gridded runoff generation method, a topographic flow routing scheme, and a temporal convolutional network to capture both spatial and temporal hydrological dynamics. This architecture enables incorporation of spatial heterogeneity and explicitly represents hydrological connectivity, while using neural networks to learn streamflow dynamics and enhance predictive performance. Bakaano-Hydro’s performance is evaluated across six river basins spanning four continents, encompassing diverse climate zones, land-use patterns, and hydrological regimes. Results indicate that Bakaano-Hydro demonstrates robust performance in humid and snow-fed basins where saturation-excess runoff dominates, while revealing key limitations in arid and semi-arid regions characterized by infiltration-excess processes. Bakaano-Hydro advances the state of the art in data-driven hydrological modeling by integrating physical realism with deep learning. Its modular and fully automated pipeline enables rapid deployment in data-scarce regions, while maintaining high reliability and interpretability. 
These features make Bakaano-Hydro a promising tool for operational forecasting, climate risk assessment, and adaptation planning across diverse hydrological and socio-environmental contexts. The model code is publicly available at https://github.com/confidence-duku/bakaano-hydro to facilitate reproducibility and community-driven development.
Status: final response (author comments only)
- RC1: 'Comment on egusphere-2025-1633', Anonymous Referee #1, 30 Jun 2025
- RC2: 'Comment on egusphere-2025-1633', Anonymous Referee #2, 30 Dec 2025
General Assessment
This manuscript presents Bakaano-Hydro, a hybrid modeling framework that combines a gridded process-based runoff generation scheme (VegET), topographic flow routing, and a deep learning architecture (TCN with attention and FiLM conditioning) to simulate distributed streamflow. The paper is ambitious in scope, technically detailed, and addresses an important and timely problem in hydrology: how to reconcile physical realism, spatial heterogeneity, and predictive skill within data-driven modeling frameworks.
The manuscript is generally well written, clearly structured, and accompanied by open-source code, which is a strong asset. The author demonstrates a deep understanding of both hydrological theory and modern machine learning architectures. The evaluation across six large basins spanning multiple hydroclimatic regimes is a notable strength, as is the explicit diagnostic discussion of where and why the model fails.
However, despite these strengths, I have substantial concerns regarding the framing of novelty, the strength of some claims (especially concerning data-scarce regions), the choice and rigidity of the runoff generation mechanism, and the lack of comparison against relevant baselines. In its current form, the manuscript would benefit from significant revision to better position Bakaano-Hydro within the rapidly evolving literature on physics-guided and hybrid hydrological machine learning, and to more carefully delimit the conditions under which the model is genuinely advantageous.
Overall, I believe this work has clear potential for publication, but major revisions are required before it can be considered for acceptance.
Major Comments
I. Novelty and Positioning Relative to Existing Hybrid and Physics-Guided ML Models
The manuscript repeatedly emphasizes that Bakaano-Hydro addresses limitations of “state-of-the-art data-driven hydrological models” by incorporating spatial heterogeneity and hydrological connectivity. While this motivation is valid, the literature review and framing do not sufficiently acknowledge how much progress has already been made in this direction.
In recent years, numerous studies have: 1. incorporated physical constraints directly into neural networks, 2. used distributed or graph-based representations of river networks, 3. combined conceptual or process-based runoff modules with ML-based routing or correction layers, 4. explicitly targeted spatial generalization and ungauged basins. As a result, statements implying that most existing data-driven models are lumped and physically uninterpretable are overly broad and somewhat outdated.
The manuscript would benefit from: 1. A clearer comparison between Bakaano-Hydro and other physics-guided or hybrid frameworks, not only lumped LSTM baselines. 2. Explicit discussion of why Bakaano-Hydro’s serial hybridization (VegET → routing → TCN) offers conceptual or practical advantages over alternative coupling strategies. Without these additions, the novelty risks being perceived as incremental rather than transformative, especially given that the runoff generation and routing components are externally imposed rather than learned or dynamically coupled.
II. Strong Claims About Applicability in Data-Scarce Regions Are Not Fully Supported
The manuscript repeatedly argues that Bakaano-Hydro is well suited for data-scarce regions and may outperform traditional process-based models under such conditions. However, the presented results do not convincingly support this claim.
Key concerns: 1. All experiments rely on GRDC stations with at least three years of data, many of them in well-monitored basins. 2. The neural network is trained on multi-station, multi-year observations, which are precisely what data-scarce regions lack. 3. Performance degrades substantially in arid and semi-arid basins (e.g., Orange), which are often the most data-scarce and management-critical regions.
In fact, the results suggest the opposite: where hydrology is complex, threshold-driven, or poorly aligned with the assumed runoff mechanism, the model struggles.
I strongly recommend that the author: 1. Substantially tone down claims regarding superiority or robustness in data-scarce regions, or 2. Provide explicit experiments demonstrating performance under reduced training data availability (e.g., leave-one-basin-out, reduced station density, or short-record training).
As written, the claim that Bakaano-Hydro is “particularly well-suited” for data-scarce regions is not adequately justified.
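To make the suggested leave-one-basin-out experiment concrete, the following is a minimal sketch of how such splits could be generated. The station-to-basin mapping, the station ids, and the basin names other than Orange are hypothetical placeholders, not the manuscript's data or code.

```python
from collections import defaultdict


def leave_one_basin_out(station_basins):
    """Yield (held_out_basin, train_ids, test_ids), one split per basin.

    station_basins maps a station id to its basin name (hypothetical layout).
    """
    by_basin = defaultdict(list)
    for station, basin in station_basins.items():
        by_basin[basin].append(station)

    for held_out in sorted(by_basin):
        # All stations of the held-out basin form the test set.
        test_ids = sorted(by_basin[held_out])
        # Stations of every other basin form the training set.
        train_ids = sorted(
            s for basin, ids in by_basin.items() if basin != held_out for s in ids
        )
        yield held_out, train_ids, test_ids


# Placeholder stations; only "Orange" is a basin named in the review.
stations = {"S1": "Orange", "S2": "Orange", "S3": "BasinB", "S4": "BasinC"}
splits = list(leave_one_basin_out(stations))
```

Training once per split and reporting skill on the held-out basin would indicate how well the model transfers to a basin with no local training data.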
III. Structural Dependence on Saturation-Excess Runoff Is a Major Limitation
The manuscript commendably provides an honest diagnosis of model weaknesses, especially the reliance on VegET’s saturation-excess runoff formulation. However, this limitation is not merely a secondary detail—it is structural and foundational.
Key issues: 1. VegET does not represent infiltration-excess (Hortonian) runoff, transmission losses, or event-scale runoff generation. 2. Routing ignores travel time, channel storage, and attenuation. 3. The neural network can only correct patterns that are present in the routed runoff signal; it cannot invent missing hydrological processes.
As a result: 1. The model systematically underperforms in arid, semi-arid, and regulated basins. 2. Peak flows and flash responses are poorly captured. 3. The claim of “hydrology-guided” modeling becomes ambiguous when the guiding physics are incomplete for many real-world systems.
I encourage the author to: 1. More explicitly acknowledge that Bakaano-Hydro is not process-agnostic, but rather strongly tailored to humid, saturation-dominated systems. 2. Reframe the contribution as a successful prototype architecture, rather than a broadly applicable solution. 3. Discuss whether alternative runoff generators (or multiple regimes) could be modularly integrated, and what that would imply for training and interpretability.
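The difference between the two runoff mechanisms can be illustrated with a deliberately simplified sketch. These are textbook-style bucket formulations chosen for illustration; they are assumptions and do not reproduce VegET's actual equations.

```python
def saturation_excess(precip, storage, capacity):
    """Runoff is generated only once the soil store is full (depths in mm).

    Returns (runoff, updated_storage) for one time step.
    """
    new_storage = storage + precip
    runoff = max(0.0, new_storage - capacity)
    return runoff, min(new_storage, capacity)


def infiltration_excess(precip_rate, infiltration_capacity):
    """Hortonian runoff: rainfall intensity above infiltration capacity (mm/h)."""
    return max(0.0, precip_rate - infiltration_capacity)


# An intense 40 mm storm falling in one hour on dry soil (20 mm stored of 100 mm capacity):
sat_runoff, sat_storage = saturation_excess(40.0, 20.0, 100.0)
hort_runoff = infiltration_excess(40.0, 10.0)
```

In this example the saturation-excess store absorbs the whole storm and produces no runoff, while the infiltration-excess formulation yields 30 mm. A downstream network receiving only the first signal has no event to correct, which is the concern raised above for arid basins.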
IV. Lack of Benchmark Comparisons Limits Interpretability of Performance
While the manuscript presents extensive diagnostic metrics (NSE, KGE, log-transformed variants, decomposition terms), there is no direct comparison to meaningful baselines such as: 1. A lumped LSTM or TCN, 2. A conceptual hydrological model (e.g., VIC model), 3. A routing-only ML model using the same inputs.
Without such benchmarks, it is difficult to assess: 1. How much predictive skill comes from the neural network versus the runoff generator, 2. Whether spatial routing materially improves performance, 3. Whether the additional complexity is justified.
Even a limited comparison in one or two basins would significantly strengthen the manuscript and help readers contextualize the reported skill levels.
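For readers less familiar with the metrics named above, a minimal sketch of NSE and KGE follows, so that any benchmark comparison can use identical scoring. The KGE shown is the original 2009 form with correlation, variability ratio, and bias ratio; whether the manuscript uses this or a modified variant is an assumption here.

```python
import numpy as np


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 matches the observed mean."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2)


def kge(obs, sim):
    """Kling-Gupta efficiency (2009 form) from r, alpha, and beta."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    r = np.corrcoef(obs, sim)[0, 1]   # linear correlation
    alpha = sim.std() / obs.std()     # variability ratio
    beta = sim.mean() / obs.mean()    # bias ratio
    return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
```

Applying both functions to the physics-only output (runoff generator plus routing) and to the full model at the same stations would separate the skill contributed by the neural network from that of the runoff generator.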
V. Conceptual Ambiguity Between “Physical Guidance” and “Preprocessing”
The paper repeatedly describes Bakaano-Hydro as “physically guided” or “physics-informed.” However, in practice: 1. The physics-based components operate entirely upstream of the neural network, 2. There is no physical constraint enforced during learning, 3. The network does not feed back into runoff generation or routing.
This raises a conceptual question: Is Bakaano-Hydro a physics-guided model, or a data-driven model with physically motivated preprocessing?
This distinction matters, especially for readers comparing this framework to: 1. Physics-informed neural networks (PINNs), 2. Differentiable hydrological models, 3. Hybrid models with joint optimization.
Clarifying this distinction would improve conceptual clarity and prevent overinterpretation of the model’s physical grounding.
Minor Comments:
The introduction occasionally overgeneralizes shortcomings of process-based models (e.g., calibration demands, structural uncertainty). These statements would benefit from more nuanced phrasing.
The rationale for choosing a 365-day lookback window is not fully justified. Is this optimal across all basins?
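A sensitivity check on the lookback length could reuse a simple windowing helper such as the sketch below. The function name and the univariate input are illustrative assumptions, not the model's actual data loader.

```python
import numpy as np


def make_windows(series, lookback):
    """Stack overlapping windows of length `lookback` from a 1-D daily series."""
    series = np.asarray(series, float)
    n_windows = len(series) - lookback + 1
    if n_windows <= 0:
        raise ValueError("series is shorter than the lookback window")
    # Row i holds days i .. i + lookback - 1.
    return np.stack([series[i:i + lookback] for i in range(n_windows)])
```

Repeating training with, for example, 90, 180, and 365 days per basin would show whether the one-year window is needed everywhere or mainly in basins with long memory such as snow-fed systems.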
The choice of Hargreaves PET should be briefly justified, given its known limitations in humid or energy-limited environments.
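For context, the standard Hargreaves-Samani equation depends only on temperature and extraterrestrial radiation, which explains both its appeal in data-scarce settings and its limitations where humidity and wind matter. The sketch below assumes the common 1985 form; the exact implementation used in Bakaano-Hydro is not confirmed here.

```python
import math


def hargreaves_pet(tmin, tmax, ra_mm_per_day):
    """Hargreaves-Samani reference evapotranspiration in mm/day.

    tmin, tmax: daily minimum and maximum air temperature in degrees Celsius.
    ra_mm_per_day: extraterrestrial radiation as evaporation equivalent (mm/day).
    """
    tmean = 0.5 * (tmin + tmax)
    temp_range = max(tmax - tmin, 0.0)  # guard against inverted inputs
    pet = 0.0023 * ra_mm_per_day * (tmean + 17.8) * math.sqrt(temp_range)
    return max(pet, 0.0)  # clamp negative values in very cold conditions
```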
The description of the three-branch vs. two-branch architectures could be streamlined; the practical implications of choosing one over the other remain somewhat unclear.
Figures are generally clear, but some captions (e.g., Fig. 5) are overly dense and could be simplified.
Technical and Editorial Suggestions:
Consider adding a concise table summarizing model assumptions, including runoff generation, routing, and learning components.
Explicitly state computational requirements (runtime, memory) for basin-scale applications.
Minor grammatical issues are present but do not impede understanding.
Ensure consistent use of terminology (e.g., “hydrology-guided,” “physics-based,” “process-based”).
Citation: https://doi.org/10.5194/egusphere-2025-1633-RC2
Viewed
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 1,934 | 119 | 16 | 2,069 | 28 | 41 |
General comments
The pre‑print presents Bakaano‑Hydro (v1.1), a fully distributed hybrid framework combining VegET‑based grid‑cell runoff generation, MFD routing and a Temporal Convolutional Network (TCN) with attention + FiLM conditioning. The code is open, the design modular and the evaluation spans six hydro‑climatic basins.
The chief shortcoming is the lack of an empirical benchmark against the data‑driven approaches that motivate the study. At minimum the authors should compare against (i) a lumped LSTM trained on catchment‑aggregated forcings and (ii) ideally a Conv‑LSTM fed with the same gridded inputs; a physics‑only baseline (VegET + routing) would further contextualise gains. Without these, neither the added predictive value nor computational overhead of the proposed architecture can be quantified.
Specific comments
Technical corrections