Bakaano-Hydro (v1.1). A distributed hydrology-guided deep learning model for streamflow prediction

Duku, Confidence

doi:10.5194/egusphere-2025-1633

Preprints

https://doi.org/10.5194/egusphere-2025-1633

05 May 2025

Status: this preprint is open for discussion and under review for Geoscientific Model Development (GMD).

Abstract. Reliable streamflow prediction is fundamental to hydrological forecasting, water resources planning, and climate adaptation. However, existing data-driven approaches often lack physical interpretability and struggle to incorporate spatial heterogeneity and hydrological connectivity. Conversely, traditional process-based models are limited by high calibration demands and structural uncertainty, especially in data-scarce regions. These challenges underscore the need for hybrid frameworks that combine the strengths of physically based modeling with the predictive capacity of machine learning. Here, I present Bakaano-Hydro, a distributed hydrology-guided deep learning model for streamflow prediction. The model integrates a gridded runoff generation method, a topographic flow routing scheme, and a temporal convolutional network to capture both spatial and temporal hydrological dynamics. This architecture enables incorporation of spatial heterogeneity and explicitly represents hydrological connectivity, while using neural networks to learn streamflow dynamics and enhance predictive performance. Bakaano-Hydro’s performance is evaluated across six river basins spanning four continents, encompassing diverse climate zones, land-use patterns, and hydrological regimes. Results indicate that Bakaano-Hydro demonstrates robust performance in humid and snow-fed basins where saturation-excess runoff dominates, while revealing key limitations in arid and semi-arid regions characterized by infiltration-excess processes. Bakaano-Hydro advances the state of the art in data-driven hydrological modeling by integrating physical realism with deep learning. Its modular and fully automated pipeline enables rapid deployment in data-scarce regions, while maintaining high reliability and interpretability. 
These features make Bakaano-Hydro a promising tool for operational forecasting, climate risk assessment, and adaptation planning across diverse hydrological and socio-environmental contexts. The model code is publicly available at https://github.com/confidence-duku/bakaano-hydro to facilitate reproducibility and community-driven development.
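The temporal convolutional network mentioned in the abstract is built, in its generic form, from causal dilated convolutions. The following is a minimal NumPy sketch of that building block only, not the Bakaano-Hydro implementation: the function names, kernel values, and dilation schedule are illustrative assumptions, and the receptive-field formula assumes one convolution per dilation level.

```python
import numpy as np

def causal_dilated_conv1d(x, kernel, dilation=1):
    """Causal 1-D convolution: out[t] uses only x[t], x[t-d], x[t-2d], ..."""
    x = np.asarray(x, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    pad = (kernel.size - 1) * dilation
    # Left-pad with zeros so the output keeps the input length
    # and no output step can see future inputs.
    padded = np.concatenate([np.zeros(pad), x])
    out = np.zeros_like(x)
    for j in range(kernel.size):
        # kernel[j] weights the input lagged by j * dilation steps
        start = pad - j * dilation
        out += kernel[j] * padded[start:start + x.size]
    return out

def receptive_field(kernel_size, dilations):
    """Time steps (current one included) seen by a stack of causal layers,
    assuming one convolution per dilation level."""
    return 1 + (kernel_size - 1) * sum(dilations)
```

Stacking layers with growing dilations (for example 1, 2, 4, 8) is what lets such a network cover long input sequences with few parameters.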

Received: 07 Apr 2025 – Discussion started: 05 May 2025

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: open (extended)

RC1: 'Comment on egusphere-2025-1633', Anonymous Referee #1, 30 Jun 2025
General comments
The pre‑print presents Bakaano‑Hydro (v1.1), a fully distributed hybrid framework combining VegET‑based grid‑cell runoff generation, MFD routing and a Temporal Convolutional Network (TCN) with attention + FiLM conditioning. The code is open, the design modular and the evaluation spans six hydro‑climatic basins.

The chief shortcoming is the lack of an empirical benchmark against the data‑driven approaches that motivate the study. At minimum the authors should compare against (i) a lumped LSTM trained on catchment‑aggregated forcings and (ii) ideally a Conv‑LSTM fed with the same gridded inputs; a physics‑only baseline (VegET + routing) would further contextualise gains. Without these, neither the added predictive value nor computational overhead of the proposed architecture can be quantified.
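A lumped baseline of the kind requested above needs the gridded forcings collapsed to one series per catchment. A minimal sketch of that aggregation step follows, assuming a `(time, rows, cols)` array and a boolean catchment mask; the function name `catchment_mean` and the optional `cell_area` weights are hypothetical, and missing values are not handled.

```python
import numpy as np

def catchment_mean(grid_series, mask, cell_area=None):
    """Collapse a (time, rows, cols) forcing array to one value per time step.

    mask      : boolean (rows, cols), True inside the catchment
    cell_area : optional (rows, cols) weights; equal weights if omitted
    """
    grid_series = np.asarray(grid_series, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    if cell_area is None:
        weights = np.ones(mask.shape)
    else:
        weights = np.asarray(cell_area, dtype=float)
    # Cells outside the catchment get zero weight.
    weights = np.where(mask, weights, 0.0)
    total = weights.sum()
    if total == 0:
        raise ValueError("mask selects no cells")
    # Weighted spatial mean for every time step.
    return (grid_series * weights).sum(axis=(1, 2)) / total
```

The resulting one-dimensional series is what a lumped model would receive in place of the gridded inputs.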
Specific comments
Abstract: state training (1989–2016) and evaluation (1982–1988) periods and, once baselines are added, give a headline improvement (e.g. median ΔKGE vs lumped LSTM).

Baseline experiment: implement and report at least one well‑tuned baseline (lumped LSTM, Conv‑LSTM or both) on the same split; a small table of KGE/NSE and wall‑time for 2–3 basins suffices.

Runoff generation (Sect 2.1): clarify whether VegET parameters are default or calibrated and explain beforehand why saturation‑excess may fail in arid basins.

Neural network (Sect 2.3): list trainable parameters for both variants and typical wall‑time per basin on CPU/GPU (batch size, optimiser). Justify the 365‑day look‑back or summarise sensitivity tests (this mirrors Kratzert 2018 and provides a natural benchmark).

Data split & basin stats (Sect 4.1): justify the 1989 cut‑off and supply a table with initial vs final stations, mean record length, missing‑data threshold.

Diagnostics (Sect 4.3): add an extreme‑flow metric (Peak Flow Bias, FHV/FLV) and fix the truncated Figure 5 caption.

Figures/layout: ensure ≥ 300 dpi and greyscale‑safe colour palettes, consistent across plots.
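The skill scores named in the comments above (NSE, KGE, and the high-flow bias FHV) can be computed as in the sketch below. It assumes the standard NSE, the original three-component KGE (correlation, variability ratio, bias ratio), and FHV taken as the percent volume bias over the top 2 % of flows; the function names and the `top_fraction` parameter are illustrative, and the preprint's own implementation may differ.

```python
import numpy as np

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 matches the observed mean."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return 1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2)

def kge(obs, sim):
    """Kling-Gupta efficiency from correlation, variability and bias ratios."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    r = np.corrcoef(obs, sim)[0, 1]      # linear correlation
    alpha = sim.std() / obs.std()        # variability ratio
    beta = sim.mean() / obs.mean()       # bias ratio
    return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

def fhv(obs, sim, top_fraction=0.02):
    """Percent bias in the high-flow segment of the flow duration curve."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    n = max(1, int(round(top_fraction * obs.size)))
    # Rank each series separately, largest flows first.
    obs_high = np.sort(obs)[::-1][:n]
    sim_high = np.sort(sim)[::-1][:n]
    return 100.0 * (sim_high.sum() - obs_high.sum()) / obs_high.sum()
```

Reporting these per basin, alongside wall-time, would give the compact comparison table the referee asks for.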

Technical corrections
Complete Figure 5 caption and define all metrics.

Standardise “VegET” capitalisation; use km², mm day⁻¹, etc.

Line 200: “relavant” → “relevant”.

Define β‑KGE and α‑NSE at first mention.

Format all DOIs with https://doi.org/…
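For reference, the decomposition terms the referee asks to have defined are conventionally written as below. This assumes the preprint follows the usual definitions, with \(\mu\) and \(\sigma\) the mean and standard deviation of simulated (\(s\)) and observed (\(o\)) flow and \(r\) their linear correlation; the manuscript should state its own definitions explicitly.

```latex
\begin{align}
  \mathrm{KGE} &= 1 - \sqrt{(r-1)^2 + (\alpha-1)^2 + (\beta_{\mathrm{KGE}}-1)^2}, \\
  \alpha &= \frac{\sigma_s}{\sigma_o}, \qquad
  \beta_{\mathrm{KGE}} = \frac{\mu_s}{\mu_o}, \qquad
  \beta_{\mathrm{NSE}} = \frac{\mu_s - \mu_o}{\sigma_o}, \\
  \mathrm{NSE} &= 2\,\alpha\, r - \alpha^2 - \beta_{\mathrm{NSE}}^2 .
\end{align}
```

Under these definitions \(\alpha\) (the α‑NSE term) is ideal at 1, \(\beta_{\mathrm{KGE}}\) is ideal at 1, and \(\beta_{\mathrm{NSE}}\) is ideal at 0.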

Citation: https://doi.org/10.5194/egusphere-2025-1633-RC1

Viewed

Total article views: 2,025 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
1,912	101	12	2,025	27	40

Views and downloads (calculated since 05 May 2025)

Month	HTML	PDF	XML	Total
May 2025	109	19	3	131
Jun 2025	72	8	3	83
Jul 2025	29	11	2	42
Aug 2025	333	9	1	343
Sep 2025	1,251	18	1	1,270
Oct 2025	81	8	1	90
Nov 2025	20	16	0	36
Dec 2025	17	12	1	30

Viewed (geographical distribution)

Total article views: 2,020 (including HTML, PDF, and XML) Thereof 2,020 with geography defined and 0 with unknown origin.

Latest update: 19 Dec 2025

Short summary

Reliable streamflow prediction is vital for managing floods, droughts, and water resources, yet remains challenging due to data limitations and complex hydrological processes. Traditional models require intensive calibration, while many machine learning methods lack physical realism. Bakaano-Hydro integrates physical hydrology with machine learning to improve interpretability, generalizability, and performance, offering a robust approach for streamflow prediction in data-scarce regions.

