Enhancing flood forecasting reliability in data-scarce regions with a distributed hydrology-guided neural network framework
Abstract. Flood early warning systems are critical for reducing disaster impacts, yet their effectiveness remains limited in data-scarce regions such as Africa and South America. Existing global platforms, including GloFAS and the Google Flood Hub, exhibit low reliability in these areas, particularly for rare flood events and under strict timing constraints. Here, I demonstrate the potential of a distributed, hydrology-guided neural network framework, Bakaano-Hydro, to enhance flood forecasting reliability in data-scarce regions. The framework integrates process-based runoff generation, topographic routing, and a Temporal Convolutional Network for streamflow simulation. Using a hindcast-based evaluation across 470 gauging stations from 1982 to 2016, I benchmark Bakaano-Hydro's flood detection skill against GloFAS and the Google AI model across multiple return periods (1-, 2-, 5-, and 10-year) and timing tolerances (0–2 days). Results show that Bakaano-Hydro consistently achieves a higher Critical Success Index (CSI), a lower False Alarm Rate (FAR), and a higher Probability of Detection (POD), even under exact-day (0-day) timing constraints. Its median CSI scores at 0-day tolerance match or exceed those of GloFAS and the Google AI model under more lenient timing tolerances. These performance gains are statistically significant across diverse hydroclimatic regions, including arid and tropical basins, demonstrating the model's capacity for spatial generalization. By coupling physical realism with machine learning generalizability, Bakaano-Hydro provides a reliable, interpretable, and open-source tool for enhancing flood forecasting in the regions most vulnerable to climate extremes and least equipped with observational infrastructure.
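To make the detection metrics concrete, the following is a minimal sketch of how CSI, POD, and FAR could be computed for one station under a given timing tolerance. It is an illustration under stated assumptions, not the evaluation code used in this study: the function name `flood_detection_scores`, the representation of events as integer day indices, the one-to-one matching rule, and the use of false alarms divided by hits plus false alarms for FAR are all assumptions introduced here.

```python
from typing import Dict, List


def flood_detection_scores(
    observed_days: List[int],
    forecast_days: List[int],
    tolerance: int = 0,
) -> Dict[str, float]:
    """Score forecast flood events against observed ones (hypothetical helper).

    Events are integer day indices on which a return-period threshold is
    exceeded. A forecast event counts as a hit if it falls within
    +/- `tolerance` days of an observed event not yet matched.
    """
    obs = sorted(observed_days)
    fc = sorted(forecast_days)

    # Two-pointer pass over both sorted lists; each event is matched at most once.
    hits = 0
    i = j = 0
    while i < len(obs) and j < len(fc):
        if abs(obs[i] - fc[j]) <= tolerance:
            hits += 1
            i += 1
            j += 1
        elif fc[j] < obs[i]:
            j += 1  # forecast too early to match any remaining observation
        else:
            i += 1  # observation too early to match any remaining forecast

    misses = len(obs) - hits
    false_alarms = len(fc) - hits

    def ratio(numerator: int, denominator: int) -> float:
        # Undefined scores (no events in the denominator) are returned as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "POD": ratio(hits, hits + misses),
        "FAR": ratio(false_alarms, hits + false_alarms),
        "CSI": ratio(hits, hits + misses + false_alarms),
    }


# Illustrative (made-up) event days, scored at 0-day and 1-day tolerance.
strict = flood_detection_scores([10, 50, 120], [11, 50, 90], tolerance=0)
lenient = flood_detection_scores([10, 50, 120], [11, 50, 90], tolerance=1)
```

With these made-up event days, the exact-day case yields one hit, two misses, and two false alarms (CSI of 0.2), while a 1-day tolerance turns the near miss on day 11 into a hit (CSI of 0.5). This shows why scores at 0-day tolerance are the strictest test of timing skill.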