This work is distributed under the Creative Commons Attribution 4.0 License.
FROSTBYTE: A reproducible data-driven workflow for probabilistic seasonal streamflow forecasting in snow-fed river basins across North America
Abstract. Seasonal streamflow forecasts provide key information for decision-making in sectors such as water supply management, hydropower generation, and irrigation scheduling. The predictability of streamflow on seasonal timescales relies heavily on initial hydrological conditions, such as the presence of snow and the availability of soil moisture. In high-latitude and high-altitude headwater basins in North America, snowmelt serves as the primary source of runoff generation. This study presents and evaluates a data-driven workflow for probabilistic seasonal streamflow forecasting in snow-fed river basins across North America (Canada and the USA). The workflow employs snow water equivalent (SWE) measurements as predictors and streamflow observations as predictands. Gap filling of SWE datasets is accomplished using quantile mapping from neighboring SWE and precipitation stations, and Principal Component Analysis is used to identify independent predictor components. These components are then utilized in a regression model to generate ensemble hindcasts of streamflow volumes for 75 nival basins with limited regulation from 1979 to 2021, encompassing diverse geographies and climates. A user-oriented hindcast evaluation approach provides key insights for snow monitoring experts, forecasters, decision-makers, and workflow developers. The analysis presented here unveils a wide spectrum of predictability and offers a glimpse into potential future changes in predictability. Late-season snowpack emerges as a key factor for predicting spring/summer volumes, while high precipitation during the target period presents challenges to forecast skill and streamflow predictability. Notably, we can predict lower and higher than normal streamflows during the spring to early summer with up to five months lead time in some basins.
Our workflow is available on GitHub as a collection of Jupyter Notebooks, facilitating broader applications in cold regions and contributing to the ongoing advancement of methodologies.
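The core steps described in the abstract (quantile-mapping gap filling of SWE records, PCA on station predictors, and a regression model producing hindcasts with a held-out year) can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the function names, the single-donor gap filling, and the choice of two principal components are assumptions for demonstration only.

```python
import numpy as np

def quantile_map_fill(target, donor):
    """Fill gaps (NaNs) in a target SWE series via quantile mapping from an
    overlapping donor station (hypothetical helper; the paper's gap filling
    draws on neighboring SWE and precipitation stations)."""
    paired = ~np.isnan(target) & ~np.isnan(donor)
    # Empirical CDFs built from the overlapping record
    t_sorted = np.sort(target[paired])
    d_sorted = np.sort(donor[paired])
    filled = target.copy()
    gaps = np.isnan(target) & ~np.isnan(donor)
    # Map each donor value to its quantile, then to the target's CDF
    q = np.searchsorted(d_sorted, donor[gaps]) / len(d_sorted)
    idx = np.clip((q * len(t_sorted)).astype(int), 0, len(t_sorted) - 1)
    filled[gaps] = t_sorted[idx]
    return filled

def pca_regression_hindcast(swe, volumes, test_year, n_components=2):
    """Hold out one year, run PCA on the remaining station SWE (predictors),
    fit ordinary least squares on the leading components, and predict the
    held-out year's seasonal streamflow volume."""
    train = np.arange(len(volumes)) != test_year
    X = swe[train]
    # Standardize stations, then project onto leading principal components
    mu, sd = X.mean(axis=0), X.std(axis=0)
    Xs = (X - mu) / sd
    _, _, vt = np.linalg.svd(Xs, full_matrices=False)
    pcs = Xs @ vt[:n_components].T
    # OLS fit of volumes on the principal components (with intercept)
    A = np.column_stack([np.ones(len(pcs)), pcs])
    coef, *_ = np.linalg.lstsq(A, volumes[train], rcond=None)
    x_test = (swe[test_year] - mu) / sd @ vt[:n_components].T
    return coef[0] + x_test @ coef[1:]
```

The sketch returns only a central estimate for one held-out year; how the workflow generates its ensemble spread, and how the leave-one-out loop is organized, is described in the paper and in the published notebooks.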
Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Interactive discussion
Status: closed
RC1: 'Comment on egusphere-2023-3040', Anonymous Referee #1, 19 Feb 2024
Review comments for FROSTBYTE: A reproducible data-driven workflow for probabilistic seasonal streamflow forecasting in snow-fed river basins across North America by Louise Arnal et al.
In this paper, the authors have presented a data-driven workflow for ensemble seasonal streamflow forecasting using snow water equivalent as predictors. The findings offer valuable insights relevant to various stakeholders, such as forecasters and decision-makers, effectively merging scientific precision with practical workflow development insights. The subject matter is of current interest and contributes insights to hydrological forecasting, both the workflow, and the knowledge of the predictability of streamflow from late-season snowpack. For a deeper comprehension of the study, I propose additional discussions with the authors, detailed below.
Line 169 and 206, could you clarify whether an independent regression model is employed for each target period at every initialization date, or if a single model is capable of generating multiple outputs for all target periods at the same initialization date?
Line 221, could you provide the average explained variance of the PCs?
Line 293, it would be interesting to know the general reliability of other methods, to better understand how much improvement the proposed method obtains.
Line 300, please consider rescaling the y axis to make the difference more visible.
Line 310, here the authors mentioned the limitation of comparing between systems with different ensemble members. Would it be more comparable to use fairCRPSS instead?
Line 321, two “and” here.
Line 323, please specify which basins are the ones that display low to no skill throughout all initialization dates. And does this refer back to Fig. 6, since there is no initialization date information in Fig. 7?
Line 332, an additional interesting pattern from this figure is that the peak skill for each target period appears when the initialization month is at the start of the target period, e.g. for the April to September target period, the peak occurs when initialized on April 1st.
Citation: https://doi.org/10.5194/egusphere-2023-3040-RC1
AC1: 'Reply on RC1', Louise Arnal, 25 Apr 2024
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2024/egusphere-2023-3040/egusphere-2023-3040-AC1-supplement.pdf
RC2: 'Comment on egusphere-2023-3040', Anonymous Referee #2, 21 Mar 2024
Review comment: FROSTBYTE: A reproducible data-driven workflow for probabilistic seasonal streamflow forecasting in snow-fed river basins across North America by Louise Arnal et al.
The manuscript by Louise Arnal et al. presents a new data-driven workflow for probabilistic seasonal streamflow forecasting in North America, based on snow water equivalent (SWE) as the sole predictor for streamflow forecasting. A special emphasis of the work is put on the reproducibility of the workflow, which can be found not only in the elaborate description of the workflow and the graphical methods but also in the collection of open-source Jupyter Notebooks for every workflow step. The probabilistic forecasting system was used to create ensemble hindcasts for different target periods, which are relevant for different users such as snow monitoring experts, forecasters, and decision-makers, and were assessed using deterministic and probabilistic metrics. The discussion reviewed relevant insights and findings from the analysis in a refreshing setup, focusing again on the specific users, giving suggestions for future improvements and opportunities to utilize the presented workflow, as well as offering practical guidance. Overall, this work not only presents a promising probabilistic forecasting system for local streamflow forecasting in snow-fed river basins and a well-documented workflow that creates the opportunity for easy implementation by end users, but is also a great example of how research can be presented in a transparent and thorough manner, following principles of open and collaborative science.
The following points, remarks, and questions are mostly raised for further clarification; there are no major comments.
Minor comments:
- section 2.1.1, line 81: could you elaborate on how catchments with 'limited regulations' are defined? Are there specific or more general criteria that label catchments as 'regulated'? And how does this vary for the different catchments throughout North America?
- In line with the previous comment: line 85 references a screening approach by Whitfield et al. (2012) for Canada, but it would be interesting for the reader to know whether the classification of catchments as regulated or unregulated is comparable to that of the USGS data set.
- Figure 1d): to clarify, does this show all stations of SCDNA, even the ones that were not used in the study? As only precipitation data is considered for the manuscript, would it not be clearer to only show the incorporated stations?
- Figure 1c): some of the SWE data products seem to overlap in some locations. Was the data for these locations compared to get a general feeling of the SWE data and its quality? Just curious.
-Figure 2: this is a very informative and well designed overview figure! I would suggest referencing it more often in the manuscript (e.g. the volume aggregation of the target periods in line 169)
- section 2.1.1, line 127: while it is mentioned that the SWE and streamflow data are used regardless of whether the years have complete records (due to the following gap filling process, with the maximum allowable gap length listed as 15 days in line 164), I wonder whether there was a limit on how much missing data was seen as acceptable in total? A few days per year, or even a few weeks or months throughout the total record? Figure A2 in the Appendix suggests that some series were heavily gap filled compared to the original timeseries?
- section 2.1.1, line 145 and section 2.2.2: gap filling through linear interpolation, could the authors elaborate on the potential limitations of this approach for both streamflow and SWE? And the potential consequences of those limitations for the regime classification approach using the streamflow, as well as for defining the statistics for the CDF construction in the case of the SWE gap filling using quantile mapping later?
- section 2.2.3, line 190: for clarification, the 'original' SWE data gets gap filled twice in different ways? First by linear interpolation to be able to get the statistics for the CDF construction, and then the 'original' SWE data gets gap filled with a separate quantile mapping approach again? Or were there specific values that could not be gap filled before?
Was there a specific reason (other than that SWE is used for the PCA) that streamflow did not undergo the same two-step gap filling process as SWE (linear and then quantile mapping)?
- section 2.2.4, line 212: for clarification, "comprising ten years for training the regression model and an additional year for generating the hindcast, using the leave-one-out cross-validation approach."
Is this the definition of the leave-one-out cross-validation approach?
- section 2.2.4, line 215: are the total 11 years used for the PCA, or the split dataset (10-1)? Line 226 suggests the former, but just to check.
- section 2.2.4, line 224: "We conduct a PCA and fit a new model for each predictor-predictand combination" – does this mean an OLS model for every target period? Or just one OLS model per location for all target periods?
- section 2.2.5, line 240: for clarification: the target periods listed in line 169 are not the same as the 'periods of interest' introduced in this line? And will the verification be on the 'periods of interest' or on the initially introduced target periods? (The KGE result description suggests the latter.)
- section 2.2.5, line 241: with every nival basin potentially having different ‘periods of interest’ does this have an effect on the hindcast verification if general or averaged results over the 62 stations are presented as not every ‘period of interest’ has the same number of samples?
- Table 1: once again a very informative and clear table that I am sure many readers will appreciate!
- Figure 4 and corresponding description (lines 280-285): as Figure 4 is the first figure in that specific presentation style, it might be nice for the reader to get a more in-depth guide on how to interpret it for the different target periods and lead periods presented (despite lines 280-285, there was still some confusion when analyzing it the first time)
- section 4: nice to see a refreshing take on a discussion
- section 4.2: maybe a reference back to both hypotheses in line 244 would be good to remind the reader of them
- the presented work is focusing on catchments with limited regulations and the discussion includes a separate focus on decision-makers: do the authors think that this work can also be helpful for decision-makers (e.g. water managers) working in more regulated catchments?
This question is also based on the explanation in section 2.2.2 in line 160, where it is stated that streamflow was “converted into volumes that capture the spring freshet and that may be of interest of water users (e.g., for water supply management, hydropower generation, irrigation scheduling, early warnings of floods and droughts)”.
Or is there a category of catchments that would fall in between unregulated and regulated where the suggested probabilistic framework could still work?
Citation: https://doi.org/10.5194/egusphere-2023-3040-RC2
AC2: 'Reply on RC2', Louise Arnal, 25 Apr 2024
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2024/egusphere-2023-3040/egusphere-2023-3040-AC2-supplement.pdf
Model code and software
FROSTBYTE: Forecasting River Outlooks from Snow Timeseries: Building Yearly Targeted Ensembles. L. Arnal, V. Vionnet, and M. Clark. https://doi.org/10.5281/zenodo.10310683
Interactive computing environment
FROSTBYTE: Forecasting River Outlooks from Snow Timeseries: Building Yearly Targeted Ensembles. L. Arnal, D. R. Casson, M. P. Clark, and A. N. Thiombiano. https://github.com/CH-Earth/FROSTBYTE
Co-authors: Martyn P. Clark, Alain Pietroniro, Vincent Vionnet, David R. Casson, Paul H. Whitfield, Vincent Fortin, Andrew W. Wood, Wouter J. M. Knoben, Brandi W. Newton, and Colleen Walford