https://doi.org/10.5194/egusphere-2025-1845
05 May 2025
Status: this preprint is open for discussion and under review for The Cryosphere (TC).

Improving forecasts of snow water equivalent with hybrid machine learning

Oriol Pomarol Moya, Madlene Nussbaum, Siamak Mehrkanoon, Philip D. A. Kraaijenbrink, Isabelle Gouttevin, Derek Karssenberg, and Walter W. Immerzeel

Abstract. Accurate characterization of snow water equivalent (SWE) is important for water resource management in large parts of the Northern Hemisphere, but its large spatio-temporal variability and the limited observational data make it difficult to quantify. Complex physically-based models have been developed that allow long-term SWE prediction, including in scenarios without snowpack observations or for future events. However, these models still suffer from large simulation errors, have long run times at large scales, and make it challenging to integrate observational data. Attempts to use machine learning (ML) to improve SWE forecasting from meteorological data have shown promising results, but data scarcity and concerns about the ability to extrapolate in time and space remain. In this study, we evaluate two hybrid setups that integrate physically-based simulations and ML. The first setup, referred to as post-processing, follows a common approach in which the simulated outputs from a numerical snow model, Crocus, are used as predictors for the ML component in addition to the meteorological data. The second setup, named data augmentation, involves an ML model trained not only on measured SWE but also on Crocus-simulated SWE at additional locations. These approaches are deployed using in-situ meteorological and SWE measurements available at ten stations throughout the Northern Hemisphere, and are compared to Crocus and to an ML setup using measured data only. The results show that the post-processing setup outperforms all other approaches when predicting on left-out years at the training stations, but performs poorly compared to Crocus when extrapolating to other locations. Adding a large set of Crocus-simulated variables besides SWE to the post-processing setup results in similar performance for left-out years but exacerbates the spatial extrapolation issue. On the other hand, the data-augmentation setup performs slightly worse on the left-out years but shows much better transferability to new locations, greatly improving on the other ML-based setups and reducing the RMSE relative to Crocus by more than 10%. The feature importances of the ML models are consistent with physical knowledge, despite unusual deviations at extreme values, which could be further improved with the data-augmentation setup. Lastly, adding lagged variables improves the results, but only lags of up to a week are relevant. These results demonstrate the usefulness of hybrid models, and particularly of the data-augmentation setup, for SWE prediction even in data-scarce domains, with the potential to improve forecasts of SWE at unprecedented spatio-temporal scales.
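The difference between the two hybrid setups described in the abstract comes down to how the training set is assembled. The following is a minimal sketch under stated assumptions, not the authors' implementation: all arrays are synthetic stand-ins, the variable and function names are hypothetical, the data-augmentation predictors are assumed to be the meteorological variables alone, and a plain least-squares fit replaces whatever ML model the study actually uses.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs, n_extra, n_met = 200, 600, 3  # hypothetical sample and feature counts

# Synthetic stand-ins (NOT real data): meteorological predictors, measured SWE,
# and Crocus-simulated SWE at the measured stations and at extra locations.
true_w = np.array([1.0, -0.5, 0.2])
meteo_obs = rng.normal(size=(n_obs, n_met))
swe_obs = meteo_obs @ true_w + rng.normal(scale=0.1, size=n_obs)
crocus_obs = swe_obs + rng.normal(scale=0.3, size=n_obs)
meteo_extra = rng.normal(size=(n_extra, n_met))
crocus_extra = meteo_extra @ true_w + rng.normal(scale=0.3, size=n_extra)

# Post-processing: Crocus output becomes an extra predictor column;
# the target is still the measured SWE.
X_post = np.column_stack([meteo_obs, crocus_obs])
y_post = swe_obs

# Data augmentation: same predictors (assumed: meteorology only), but extra
# rows whose targets are Crocus-simulated SWE at additional locations.
X_aug = np.vstack([meteo_obs, meteo_extra])
y_aug = np.concatenate([swe_obs, crocus_extra])


def add_lags(x, lags):
    """Append copies of x delayed by each lag (in rows, e.g. days).

    The first max(lags) rows are dropped because their lagged values
    are not available.
    """
    max_lag = max(lags)
    cols = [x[max_lag:]] + [x[max_lag - k: len(x) - k] for k in lags]
    return np.hstack(cols)


def fit_linear(X, y):
    """Least-squares fit with an intercept (stand-in for the ML model)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(coef, X):
    """Apply a model returned by fit_linear to new predictor rows."""
    return coef[0] + X @ coef[1:]


def rmse(y_true, y_pred):
    """Root-mean-square error between two 1-D arrays."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


model_post = fit_linear(X_post, y_post)
model_aug = fit_linear(X_aug, y_aug)
```

In this sketch the post-processing model needs a Crocus simulation at any site where it predicts, whereas the data-augmentation model needs only meteorological inputs once it has been trained.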

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Status: open (until 20 Jun 2025)


Viewed

Total article views: 59 (including HTML, PDF, and XML)
  • HTML: 34
  • PDF: 22
  • XML: 3
  • Total: 59
  • BibTeX: 3
  • EndNote: 3

Latest update: 15 May 2025
Short summary
Two hybrid machine learning (ML) approaches using meteorological data and snowpack simulations from the Crocus snow model were evaluated for daily snow water equivalent (SWE) prediction at ten locations in the Northern Hemisphere, where they improved on both Crocus and traditional ML approaches. In particular, a hybrid setup that augments the measured data with Crocus simulations considerably enhanced prediction at unseen locations, paving the way for better long-term SWE monitoring.