Uncovering Key Predictors for Enhancing Daily Streamflow Simulation Using Machine Learning
Abstract. The sequence of droughts and wetter periods in Australia poses challenges for long-term hydrologic modelling. This paper develops a novel approach based on machine learning (ML) to uncover key predictors that improve daily streamflow predictions during and after the Millennium Drought (1997 to 2009) in 39 gauged sub-catchments in Western Victoria, Australia.
For this purpose, a hybrid approach is adopted: simulations from GR4J, a hydrological model widely used operationally in Australia, are combined with physical data as forcing (predictors) for multiple ML algorithms to identify the key predictors for improving streamflow prediction. ML models that included predictors representing the long-term runoff coefficient and short-term runoff and rainfall showed the greatest improvement in streamflow predictions, particularly for low flows. This suggests that GR4J has limited ability to capture short- and long-term persistence, and that model enhancement should therefore focus on these shortcomings. All ML algorithms improved streamflow prediction, with the Multilayer Perceptron (MLP) consistently yielding the highest Nash-Sutcliffe Efficiency and Random Forest showing the strongest improvement in low-flow prediction. The long-term runoff coefficient predictor and the ML approach were most effective in catchments with lower long-term runoff coefficients. Overall, this study provides insights for water resources management in drought-prone regions, highlighting the key predictors in the combination of ML and hydrological modelling to improve streamflow predictions during and after droughts.
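As an illustration only, the sketch below shows one way the predictor set described above could be assembled and how the Nash-Sutcliffe Efficiency is computed. It is not the study's implementation. The function names (nse, trailing_sum, build_predictors), the 7-day and 365-day windows, and the use of past observed flow to form the runoff terms are all assumptions made for this example; the abstract does not specify them.

```python
from math import nan


def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 is a perfect fit, 0 is no better than the mean of obs."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def trailing_sum(x, window):
    """Sum of the `window` values preceding each day; nan until enough history exists."""
    out = [nan] * len(x)
    for t in range(window, len(x)):
        out[t] = sum(x[t - window:t])
    return out


def build_predictors(rain, flow, gr4j_flow, short=7, long=365):
    """Return one row per day: [GR4J flow, short-term rain, short-term runoff, long-term runoff coefficient].

    Window lengths and the use of past observed flow are assumptions of this sketch.
    """
    rain_s, flow_s = trailing_sum(rain, short), trailing_sum(flow, short)
    rain_l, flow_l = trailing_sum(rain, long), trailing_sum(flow, long)
    rows = []
    for t in range(len(rain)):
        # Long-term runoff coefficient: accumulated runoff over accumulated rainfall.
        rc = flow_l[t] / rain_l[t] if rain_l[t] > 0 else nan
        rows.append([gr4j_flow[t], rain_s[t], flow_s[t], rc])
    return rows
```

Rows of this kind, with days lacking enough history dropped, could then be passed to any regression algorithm (for example an MLP or a Random Forest) trained against observed streamflow, and the result compared with the GR4J simulation alone using nse.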