Bakaano-Hydro (v1.1). A distributed hydrology-guided deep learning model for streamflow prediction
Abstract. Reliable streamflow prediction is fundamental to hydrological forecasting, water resources planning, and climate adaptation. However, existing data-driven approaches often lack physical interpretability and struggle to incorporate spatial heterogeneity and hydrological connectivity. Conversely, traditional process-based models are limited by high calibration demands and structural uncertainty, especially in data-scarce regions. These challenges underscore the need for hybrid frameworks that combine the strengths of physically based modeling with the predictive capacity of machine learning. Here, I present Bakaano-Hydro, a distributed hydrology-guided deep learning model for streamflow prediction. The model integrates a gridded runoff generation method, a topographic flow routing scheme, and a temporal convolutional network to capture both spatial and temporal hydrological dynamics. This architecture enables incorporation of spatial heterogeneity and explicitly represents hydrological connectivity, while using neural networks to learn streamflow dynamics and enhance predictive performance. Bakaano-Hydro’s performance is evaluated across six river basins spanning four continents, encompassing diverse climate zones, land-use patterns, and hydrological regimes. Results indicate that Bakaano-Hydro demonstrates robust performance in humid and snow-fed basins where saturation-excess runoff dominates, while revealing key limitations in arid and semi-arid regions characterized by infiltration-excess processes. Bakaano-Hydro advances the state of the art in data-driven hydrological modeling by integrating physical realism with deep learning. Its modular and fully automated pipeline enables rapid deployment in data-scarce regions, while maintaining high reliability and interpretability. 
These features make Bakaano-Hydro a promising tool for operational forecasting, climate risk assessment, and adaptation planning across diverse hydrological and socio-environmental contexts. The model code is publicly available at https://github.com/confidence-duku/bakaano-hydro to facilitate reproducibility and community-driven development.
General comments
The preprint presents Bakaano-Hydro (v1.1), a fully distributed hybrid framework that combines VegET-based grid-cell runoff generation, MFD routing, and a Temporal Convolutional Network (TCN) with attention and FiLM conditioning. The code is open, the design is modular, and the evaluation spans six hydro-climatically diverse basins.
The chief shortcoming is the lack of an empirical benchmark against the data-driven approaches that motivate the study. At a minimum, the authors should compare against (i) a lumped LSTM trained on catchment-aggregated forcings and, ideally, (ii) a Conv-LSTM fed with the same gridded inputs; a physics-only baseline (VegET + routing) would further contextualise the gains. Without these comparisons, neither the added predictive value nor the computational overhead of the proposed architecture can be quantified.
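To illustrate what "catchment-aggregated forcings" means for the lumped LSTM baseline in (i), the sketch below collapses a gridded forcing field to a single value per time step by taking an (optionally area-weighted) mean over the cells inside a basin mask. It is a minimal illustration only, not code from the Bakaano-Hydro repository: the function name `catchment_mean`, its arguments, and the toy grid are all hypothetical, and a real benchmark would read the same gridded inputs the authors already use.

```python
def catchment_mean(grid_series, mask, cell_area=None):
    """Collapse gridded forcing to one lumped value per time step.

    grid_series: list of 2-D grids (one per time step) as nested lists
    mask:        2-D grid of booleans, True inside the catchment
    cell_area:   optional 2-D grid of cell areas used as weights
                 (hypothetical input; equal weights if omitted)
    """
    rows, cols = len(mask), len(mask[0])
    # Collect (row, col, weight) for every cell inside the catchment.
    cells = [
        (r, c, 1.0 if cell_area is None else cell_area[r][c])
        for r in range(rows)
        for c in range(cols)
        if mask[r][c]
    ]
    total_weight = sum(w for _, _, w in cells)
    if total_weight == 0:
        raise ValueError("mask selects no cells")
    # Weighted mean over the masked cells, one value per time step.
    return [
        sum(grid[r][c] * w for r, c, w in cells) / total_weight
        for grid in grid_series
    ]


# Toy example: two time steps on a 2x2 grid, three cells inside the basin.
precip = [
    [[1.0, 2.0], [3.0, 100.0]],
    [[0.0, 0.0], [6.0, 100.0]],
]
basin_mask = [[True, True], [True, False]]
lumped = catchment_mean(precip, basin_mask)
```

The resulting series is what a lumped LSTM would receive as input, so the spatial detail that the distributed model exploits is deliberately discarded; comparing skill between the two therefore isolates the value of the spatial representation.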
Specific comments
Technical corrections