Use of nonlinear principal components of CHIRPS precipitation data and ocean-atmospheric variables for streamflow forecasting in an area of scarce data. Case study, Tocaría river basin – Orinoquia Colombiana
Abstract. Accurate streamflow forecasting is critical for mitigating the impacts of hydrological extremes and guiding sustainable water resource management, particularly in poorly gauged tropical catchments. This study presents a hybrid forecasting framework that integrates neural network seasonal autoregressive integrated moving average models with exogenous variables (NN-SARIMAX) with nonlinear principal components (NLPCs) derived from CHIRPS precipitation data and large-scale ocean–atmosphere indices (macroclimatic variables, MVs). Four monthly models were developed and tested for the Tocaría River basin in the Colombian Orinoquía region: (1) a baseline SARIMA(4,0,4)(0,0,3)₁₂ model; (2) SARIMAX with exogenous MVs; (3) NN-SARIMAX with NLPCs; and (4) a hybrid NN-SARIMAX combining both MVs and NLPCs. The hybrid model performed best, with an R² of 0.78 during the validation period. These results underscore the effectiveness of integrating local precipitation variability and large-scale climatic drivers to enhance forecast accuracy under data-scarce conditions. The proposed methodology offers a transferable approach for operational forecasting in ungauged or sparsely monitored basins, contributing to early warning systems, drought preparedness, and adaptive water governance in vulnerable tropical regions.
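For readers unfamiliar with the notation, the baseline specification can be written out explicitly. What follows is a sketch under the standard Box-Jenkins convention, not necessarily the exact parameterization used in this study. The symbols y_t (monthly streamflow), B (backshift operator), c, phi_i, theta_j, Theta_k, and epsilon_t (white-noise error) are introduced here for illustration only. With non-seasonal orders (p,d,q) = (4,0,4), seasonal orders (P,D,Q) = (0,0,3), and a seasonal period of 12 months, no differencing is applied and there are no seasonal autoregressive terms:

```latex
% Baseline SARIMA(4,0,4)(0,0,3)_{12}, assuming the usual sign convention
\left(1 - \sum_{i=1}^{4} \phi_i B^{i}\right) y_t
  = c + \left(1 + \sum_{j=1}^{4} \theta_j B^{j}\right)
        \left(1 + \sum_{k=1}^{3} \Theta_k B^{12k}\right) \varepsilon_t

% One common way to add exogenous inputs (an assumption, not confirmed by the abstract):
% regression on a hypothetical input vector x_t, with errors u_t following the process above
y_t = \boldsymbol{\beta}^{\top} \mathbf{x}_t + u_t
```

In the second expression, x_t would collect the MVs, the NLPCs, or both, depending on the model variant. Whether the study uses this regression-with-SARIMA-errors form or another formulation is not stated in the abstract.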