This work is distributed under the Creative Commons Attribution 4.0 License.
Learning to filter: Snow data assimilation using a Long Short-Term Memory network
Abstract. Trustworthy estimates of snow water equivalent and snow depth are essential for water resource management in snow-dominated regions. While ensemble-based data assimilation techniques, such as the Ensemble Kalman Filter (EnKF), are commonly used in this context to combine model predictions with observations and thereby improve model performance, these ensemble methods are computationally demanding and thus face significant challenges when integrated into time-sensitive operational workflows. To address this challenge, we present a novel approach for data assimilation in snow hydrology based on Long Short-Term Memory (LSTM) networks. By leveraging data from seven diverse study sites across the world to train the network on the output of an EnKF, the proposed framework aims to further unlock the use of data assimilation in snow hydrology by balancing computational efficiency and complexity.
We found that an LSTM-based data assimilation framework achieves performance comparable to EnKF-based state estimation in improving open-loop estimates, with only a small RMSE increase for snow water equivalent (+6 mm on average) and snow depth (+6 cm). All but 2 of the 14 site-specific LSTM configurations improved on the open-loop estimates. The inclusion of a memory component further enhanced LSTM stability and performance, particularly in situations of data sparsity. When trained on long datasets (25 years), this LSTM data assimilation approach also showed promising spatial transferability, with less than a 20 % reduction in accuracy for snow water equivalent and snow depth estimation.
Once trained, the framework is computationally efficient, achieving a 70 % reduction in computational time compared to a parallelized EnKF. Training this new data assimilation approach on data from multiple sites showed that its performance is robust across various climate regimes, during dry and average water-year types, with only a limited drop in performance compared to the EnKF (+6 mm RMSE for SWE and +18 cm RMSE for snow depth). This work paves the way for the use of deep learning for data assimilation in snow hydrology and provides novel insights into an efficient, scalable, and less computationally demanding modeling framework for operational applications.
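The core idea, stated plainly, is supervised learning of the analysis step: the LSTM receives the model forecast, meteorological forcing, and the available observation, and is trained to reproduce the EnKF analysis state. A minimal sketch is given below, assuming PyTorch; the network size, input variables, and tensor shapes are illustrative assumptions, not the authors' configuration.

```python
# Minimal sketch: train an LSTM to emulate the EnKF analysis step.
# All names, shapes, and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

class DALSTM(nn.Module):
    def __init__(self, n_inputs: int, n_states: int, hidden: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(n_inputs, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_states)

    def forward(self, x):
        # x: (batch, time, n_inputs) = forecast states, forcing, observation
        h, _ = self.lstm(x)
        return self.head(h)  # (batch, time, n_states) analysis estimate

# Synthetic stand-ins: 32 sequences of 365 daily steps with 4 inputs
# (e.g. SWE forecast, snow depth forecast, forcing, observation).
inputs = torch.randn(32, 365, 4)
enkf_analysis = torch.randn(32, 365, 2)  # EnKF analysis states as labels

model = DALSTM(n_inputs=4, n_states=2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for epoch in range(5):  # toy training loop
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), enkf_analysis)
    loss.backward()
    optimizer.step()
```

Once trained, inference reduces to a single forward pass per sequence, which is where the reported 70 % runtime reduction over a parallelized EnKF would come from.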
Competing interests: One author is an editor of The Cryosphere.
Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.
Status: open (until 02 Apr 2025)
RC1: 'Comment on egusphere-2025-423', Anonymous Referee #1, 13 Mar 2025
This work developed a surrogate for EnKF-DA using an LSTM network. The introduction and methods sections are well written and structured. However, the results contain several statements that are inconsistent with the plots. More importantly, the results lack sufficient explanation and analysis of why the LSTM performs differently from the EnKF at different sites or in different scenarios. The discussion could benefit from additional comparisons with previous studies and a deeper analysis of the results. Currently, it leans more toward reinforcing the need for LSTMs in data assimilation, which somewhat repeats points already made in the introduction. Therefore, I recommend a major revision before publication.
Line 103-105: What is the source of the meteorological forcing data? Are they derived from gridded datasets?
Table 2: The data time span for each site needs to be mentioned.
Line 171: Forecasted model state is x_k^f
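For reference, the standard EnKF analysis step distinguishes the forecast state x_k^f from the analysis state x_k^a (textbook notation; the paper's own symbols should be checked against this):

```latex
x_k^a = x_k^f + K_k \left( y_k - H_k x_k^f \right),
\qquad
K_k = P_k^f H_k^\top \left( H_k P_k^f H_k^\top + R_k \right)^{-1},
```

where y_k are the observations, H_k the observation operator, P_k^f the forecast error covariance, and R_k the observation error covariance.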
Line 254: Double “the”
Line 271: “predictions”
Line 277-278: Please clarify how the data were split: by individual data points or by continuous time spans?
Line 276: Please clarify what the site-specific limits are here.
Line 280: The inline formula here should not include the star, as the star was previously used to represent the LSTM output, not the input from S3M. Please keep the notation consistent.
Line 288-290: Please use a formula to clarify this configuration. Do you mean that x^f and forcing at both time steps k and k-1 are used as LSTM inputs in the second test? Please refer to Figure 2 for clarity.
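For instance, if the second test indeed feeds the forecast state and forcing at both time steps, a formula along these lines (illustrative notation only, with u denoting the forcing and y_k the assimilated observation) would remove the ambiguity:

```latex
\hat{x}_k = \mathrm{LSTM}\left( x_k^{f},\, x_{k-1}^{f},\, u_k,\, u_{k-1},\, y_k \right).
```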
Line 292-294: This part is confusing. What is the difference between Configuration 1 and Configuration 3? Was a single LSTM selected from Configuration 1 and then applied to other sites? Please clarify.
Line 299-300: Is there a specific reason to randomly sample water years for data splitting rather than using a continuous historical time span to train the model and a continuous future time span to test it? Random sampling can create artificially easier test conditions by allowing test data (time period) to fall between training water years, which may provide the model with indirect information about future conditions.
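A minimal sketch of the distinction (hypothetical years and an illustrative 80/20 split, not taken from the manuscript):

```python
# Contrast random water-year sampling with a continuous chronological split.
import numpy as np

rng = np.random.default_rng(seed=0)
water_years = np.arange(1995, 2020)  # 25 hypothetical water years

# Random split: held-out years are scattered between training years,
# so each test year is bracketed by conditions the model has seen.
shuffled = rng.permutation(water_years)
train_random, test_random = shuffled[:20], shuffled[20:]

# Continuous split: train strictly on the past, test on a later block.
train_cont, test_cont = water_years[:20], water_years[20:]

print(sorted(test_random.tolist()))  # scattered through the record
print(test_cont.tolist())            # contiguous "future" years
```

With the random split, interpolation between adjacent training years can make the test artificially easy compared to a genuine forecast setting.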
The LSTM structure and hyperparameters were not described in this work.
Line 309-311 (Figure 3): Is this result from testing or operational testing? Please clarify.
Line 313-314: It is somewhat difficult to distinguish the EnKF-DA and LSTM boxes in the plots. If the last box in each panel represents LSTM-DA, it suggests that the RMSE values of LSTM-DA for KHT, RME, and FMI-ARC increased compared to EnKF-DA, with KHT showing the largest increase. This appears inconsistent with the narrative presented here. Please check.
Figures 3 & 4: The Nash-Sutcliffe coefficient can be used as a score to evaluate the accuracy of the time series in panels (a)–(d).
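For reference, the Nash-Sutcliffe efficiency is defined as

```latex
\mathrm{NSE} = 1 - \frac{\sum_{t=1}^{T} \left( x_t^{\mathrm{sim}} - x_t^{\mathrm{obs}} \right)^{2}}
                        {\sum_{t=1}^{T} \left( x_t^{\mathrm{obs}} - \bar{x}^{\mathrm{obs}} \right)^{2}},
```

which equals 1 for a perfect match and drops below 0 when the simulation performs worse than the mean of the observations.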
Line 321-324: Why is the LSTM trained with outputs (states) from EnKF-DA more sensitive to the sparsity of observation data? Could you explain this here? Including observation data as an input may introduce artificial errors when filling in missing data in the input.
Line 336-337: Only Figure 5b shows an improvement with the memory component, not panels c and d.
Line 342-348: Cite Figure 6 here.
Line 344: 0.5 m? The reduction shown in Figure 6f is not that large.
Line 346: These strategies were not mentioned or explained in the methods.
Section 3.3: This result does not seem meaningful, as the spatial transferability of all models appears to be poor. Please consider removing it.
Line 370-371: Any explanation for this result?
Section 3.4: Instead of presenting the spatial transferability of a single model, it might be more meaningful to compare and discuss the site-specific LSTM and the multi-site LSTM.
Please refer to the following (this is not my work and there is no need to cite it): Kratzert, Frederik, Martin Gauch, Daniel Klotz, and Grey Nearing. "HESS Opinions: Never train a Long Short-Term Memory (LSTM) network on a single basin." Hydrology and Earth System Sciences 28, no. 17 (2024): 4187-4201.
Line 410: No results were shown to support this.
Line 415: 7 sites?
Citation: https://doi.org/10.5194/egusphere-2025-423-RC1