the Creative Commons Attribution 4.0 License.
Stable Stream Temperature Prediction for Different Basins Using Time Series Encoding and Temporal Convolutional Networks
Abstract. Stream temperature prediction is essential for assessing the health of river ecosystems. When water temperature is predicted across different river basins, especially in different climatic regions, the available water temperature data sets are often inconsistent. At the same time, spatial heterogeneity within river basins significantly complicates water temperature prediction, which makes it challenging to establish a prediction model with strong generalization capabilities and stable results. To address this problem, moving average encoding and day-of-year (DOY) encoding of time series data were integrated into a temporal convolutional network, yielding a temporal convolutional network model with time series data encoding (TimENC-TCN). The model effectively captured multimodal features of dynamic water temperature data from complex random time series and produced stable prediction results in different river basins. Thirteen hydrological stations across four major rivers (Thames, Colorado, Mississippi and Sacramento) were used to test the proposed TimENC-TCN model and to compare its performance with reference models (Air2Stream, NARX, GRU and Gboost). The results showed that the enhanced features performed well in rivers subject to human intervention, and that air temperature and DOY were important variables influencing water temperature prediction. In cross-basin water temperature prediction tasks, the proposed model delivered more stable and accurate predictions (average RMSE on the test set at least 8.7 % better than the comparison models). Considering its characteristics and performance, the proposed model is a promising approach for reconstructing stream temperatures in several river basin data accumulation areas.
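The abstract names two input encodings, a moving average encoding and a DOY encoding, without giving formulas on this page. The sketch below shows one common way such features might be built, assuming a cyclic sine/cosine DOY encoding and a trailing moving average of air temperature. The function names, the 3-day window, and the sample values are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def doy_encoding(day_of_year, period=365.25):
    """Map day-of-year onto the unit circle so that 31 Dec and 1 Jan are close.

    Assumption: a sine/cosine pair; the manuscript may use a different scheme.
    """
    angle = 2.0 * np.pi * np.asarray(day_of_year, dtype=float) / period
    return np.stack([np.sin(angle), np.cos(angle)], axis=-1)

def trailing_moving_average(series, window):
    """Mean of the current value and up to window-1 previous values.

    Uses only past and present samples, so no future information leaks in.
    The first entries average over the shorter history that is available.
    """
    series = np.asarray(series, dtype=float)
    csum = np.cumsum(np.insert(series, 0, 0.0))
    end = np.arange(1, series.size + 1)
    start = np.maximum(end - window, 0)
    return (csum[end] - csum[start]) / (end - start)

# Hypothetical daily air temperatures (degrees C) for days 1 to 5 of a year.
air_temp = np.array([10.0, 12.0, 14.0, 13.0, 11.0])
doy = np.array([1, 2, 3, 4, 5])

# Feature matrix: raw air temperature, its 3-day trailing mean, sin(DOY), cos(DOY).
features = np.column_stack([
    air_temp,
    trailing_moving_average(air_temp, window=3),
    doy_encoding(doy),
])
```

Each row of `features` could then serve as one time step of the input sequence passed to a sequence model.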
Status: open (until 10 Apr 2026)
- RC1: 'Comment on egusphere-2025-4550', Anonymous Referee #1, 23 Jan 2026
- CEC1: 'Comment on egusphere-2025-4550 - No compliance with the policy of the journal', Juan Antonio Añel, 11 Feb 2026
Dear authors,
Unfortunately, after checking your manuscript, it has come to our attention that it does not comply with our "Code and Data Policy".
https://www.geoscientific-model-development.net/policies/code_and_data_policy.html
You have not published openly and without restrictions all the models, code, input data, and output data relative to your study. This does not comply with the policy of the journal, which explicitly requests the publication of all the elements necessary to replicate a manuscript in one of the repositories accepted according to the policy, before its submission to the journal. Because of this, your manuscript should never have been accepted for Discussions or peer review in our journal. Therefore, the current situation is irregular, as the GMD review process depends on reviewers and community commentators being able to access, during the discussion phase, the code and data on which a manuscript depends.
Please, therefore, publish your code and data in one of the appropriate repositories and reply to this comment with the relevant information (link and a permanent identifier for it (e.g. DOI)) as soon as possible. We cannot have manuscripts under discussion that do not comply with our policy.
The 'Code and Data Availability' section must also be modified to cite the new repository locations, and corresponding references added to the bibliography.
I must note that if you do not fix this problem, we cannot continue with the peer-review process or accept your manuscript for publication in GMD.
Juan A. Añel
Geosci. Model Dev. Executive Editor
Citation: https://doi.org/10.5194/egusphere-2025-4550-CEC1
- AC1: 'Reply on CEC1', Lichen Su, 15 Feb 2026
Dear Prof. Añel,
Thank you very much for your careful review of our submission and for bringing this issue to our attention.
We sincerely apologize for not fully complying with the GMD code and data policy at the time of submission. We understand the importance of ensuring unrestricted public access to all model code, input data, and output data required to reproduce our results.
We have now taken the following actions to address this issue:
- Model code has been deposited in Zenodo and is publicly accessible at:
  DOI: https://doi.org/10.5281/zenodo.18650137
- Input data and output data required to reproduce the simulations have been archived at:
  DOI: https://doi.org/10.5281/zenodo.18650137
- Raw data: The raw data can be obtained from the following links:
The observed data for river stations in the UK are available at: https://environment.data.gov.uk/hydrology/explore#/landing (last access: 23 September 2025)
The observed data for river stations in the USA are available at: https://maps.waterdata.usgs.gov/mapper/index.html (last access: 23 September 2025)
The observed climate data for meteorological stations in the UK are available at: https://www.metoffice.gov.uk/research/climate/maps-and-data/historic-station-data (last access: 23 September 2025)
The observed climate data for meteorological stations in the USA are available at: https://www.ncei.noaa.gov/ (last access: 23 September 2025)
Some air temperature data were obtained from the National Aeronautics and Space Administration (NASA) Langley Research Center's Prediction Of Worldwide Energy Resources (POWER) project funded through the NASA Earth Science Division. The data were obtained from the POWER Project's Hourly v2.7.2 version on 2025/6/4.
During our testing process, we discovered that some computers were unable to access Zenodo. Consequently, we have provided supplementary GitHub links (URL: https://github.com/LcSu2025/Stable-Stream-Temperature-Prediction-TimENC-TCN-model) as an alternative. All repositories provide unrestricted public access.
The revised Code and Data Availability section now reads as follows:
The model generating the results presented herein is archived on Zenodo: https://doi.org/10.5281/zenodo.18650137, 2026 (Su, 2026a) and GitHub: https://github.com/LcSu2025/Stable-Stream-Temperature-Prediction-TimENC-TCN-model (Su, 2026b). Further guidance on running the model and conducting ablation experiments is provided in the code repository's README file. The complete pre-processed dataset used for model training is archived on Zenodo: https://doi.org/10.5281/zenodo.18650137, 2026 (Su, 2026a), alongside experimental results.
We have also added the corresponding references to the bibliography, as required:
Su, L.: Stable Stream Temperature Prediction for Different Basins Using Time Series Encoding and Temporal Convolutional Networks: TimENC-TCN model. Zenodo. https://doi.org/10.5281/zenodo.18650137, 2026a.
Su, L.: Stable-Stream-Temperature-Prediction-TimENC-TCN-model. GitHub. https://github.com/LcSu2025/Stable-Stream-Temperature-Prediction-TimENC-TCN-model, 2026b.
We greatly appreciate the opportunity to correct this oversight. Thank you again for your guidance and for maintaining the high standards of GMD.
With best regards,
Lichen Su
On behalf of all co-authors
Citation: https://doi.org/10.5194/egusphere-2025-4550-AC1
- CEC2: 'Reply on AC1', Juan Antonio Añel, 15 Feb 2026
Dear authors,
Many thanks for your reply. Unfortunately, it does not fully address the issues previously pointed out. We cannot accept links to the web pages of NOAA, the Met Office, and similar sources as data references. You must store the specific data you have used in one of the repositories that we accept, not link to generic web pages.
Also, you state that some computers have problems accessing Zenodo. I do not quite understand this. Problems accessing Zenodo are obviously not on Zenodo's side. In any case, we cannot accept GitHub pages, as they are not trusted for long-term scientific archival. In fact, GitHub itself recommends using Zenodo for such purposes. Therefore, please remove any link to GitHub, as it does not serve the purpose of the policy of the journal.
Please reply to this comment addressing the above-mentioned issues, and include in your reply the text for a new "Code and Data Availability" section for your manuscript with the information for the permanent repositories for all the code and data necessary to replicate your study.
I must insist that if you do not solve the mentioned issues, we cannot continue with the review and publication process in GMD.
Juan A. Añel
Geosci. Model Dev. Executive Editor
Citation: https://doi.org/10.5194/egusphere-2025-4550-CEC2
- AC2: 'Reply on CEC2', Lichen Su, 16 Feb 2026
Dear Prof. Añel,
We are most grateful for your valuable feedback. We have implemented the following changes:
1. All original data and their sources are now archived on Zenodo: https://doi.org/10.5281/zenodo.18654575 (Su, 2026b), replacing the web links.
2. The relevant GitHub links have been removed.
The revised Code and Data Availability section now reads as follows:
The model generating the results presented herein is archived on Zenodo: https://doi.org/10.5281/zenodo.18650137, 2026 (Su, 2026a). Further guidance on running the model and conducting ablation experiments is provided in the code repository's README file. The original dataset is archived on Zenodo: https://doi.org/10.5281/zenodo.18654575 (Su, 2026b). The complete pre-processed dataset used for model training is archived on Zenodo: https://doi.org/10.5281/zenodo.18650137, 2026 (Su, 2026a), alongside experimental results.
We have also added the corresponding references to the bibliography, as required:
Su, L.: Stable Stream Temperature Prediction for Different Basins Using Time Series Encoding and Temporal Convolutional Networks: TimENC-TCN model. Zenodo. https://doi.org/10.5281/zenodo.18650137, 2026a.
Su, L.: Stable Stream Temperature Prediction for Different Basins Using Time Series Encoding and Temporal Convolutional Networks: TimENC-TCN model Original Dataset. Zenodo. https://doi.org/10.5281/zenodo.18654575, 2026b.
Once again, thank you for your support and assistance.
With best regards,
Lichen Su
On behalf of all co-authors
Citation: https://doi.org/10.5194/egusphere-2025-4550-AC2
- CEC3: 'Reply on AC2', Juan Antonio Añel, 16 Feb 2026
Dear authors,
Many thanks for addressing the outstanding issues. We can now consider the current version of your manuscript to be in compliance with the Code and Data Policy of the journal.
Juan A. Añel
Geosci. Model Dev. Executive Editor
Citation: https://doi.org/10.5194/egusphere-2025-4550-CEC3
Viewed
- HTML: 154
- PDF: 42
- XML: 28
- Total: 224
- BibTeX: 11
- EndNote: 21
This manuscript addresses the important problem of stream water temperature prediction across multiple river basins, a task that is highly relevant for river ecosystem assessment and management. The authors propose an enhanced temporal convolutional network (TimENC-TCN) that incorporates moving average encoding and day-of-year (DOY) encoding to improve model generalization under data-scarce and heterogeneous conditions. The model is evaluated using observations from multiple hydrological stations across four major river basins and is compared against several established reference models. Overall, the study tackles a timely and challenging topic and presents a modeling framework with potential applicability to cross-basin stream temperature prediction. This is a good study with appropriate methods, and the manuscript is generally clear and easy to follow. I believe it should be suitable for publication once the following issues are addressed.
Major: Several key conclusions in the manuscript appear insufficiently supported by the presented results. For example, the statement on line 273 (“Therefore, it is reasonable to infer that at these stations where performance has declined, human factors have masked the effects of natural factors”) does not appear fully justified. This inference does not hold consistently across stations in the Sacramento River. Specifically, station J (RMSE = 0.969 °C) contradicts this pattern, and the RMSE values at stations H (1.29 °C) and B (1.189 °C) are quite similar, undermining a clear distinction. Additionally, the manuscript lacks clarity on the precise locations of station I (Verona) and the other two Sacramento River stations, which is necessary to assess the validity of this conclusion.
Similarly, the claim on line 285—that the TimENC-TCN model demonstrates better ability to handle spatial heterogeneity in basins influenced by natural factors compared to those influenced by human factors—relies heavily on observations from a single station. This limits the strength of the conclusion. Moreover, the lower model performance at this station might be due to other factors such as measurement errors in water or air temperature data, or differences in input data volume, rather than human influence alone. Overall, clarifications on station locations, more consistent evidence across multiple stations, and a cautious interpretation of results are needed to strengthen these conclusions.
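The comparisons above rest on RMSE values, so a brief sketch of how such figures might be computed and compared may help. The station values are those quoted in this comment. The helper names and the 1.0 and 0.913 inputs to `relative_improvement` are hypothetical, chosen only to illustrate a percentage of the size reported in the abstract.

```python
import numpy as np

def rmse(observed, predicted):
    """Root mean square error between two equally long series."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((observed - predicted) ** 2)))

def relative_improvement(rmse_reference, rmse_candidate):
    """Percentage by which the candidate RMSE is lower than the reference RMSE."""
    return 100.0 * (rmse_reference - rmse_candidate) / rmse_reference

# RMSE values (degrees C) quoted in the review for three stations.
station_rmse = {"H": 1.29, "B": 1.189, "J": 0.969}

# Absolute gaps between stations: H and B differ far less than H and J.
gap_h_b = station_rmse["H"] - station_rmse["B"]
gap_h_j = station_rmse["H"] - station_rmse["J"]

# Hypothetical reference and candidate RMSE values, for illustration only.
example_gain = relative_improvement(1.0, 0.913)
```

Reporting such gaps alongside the station locations would make it easier to judge whether a difference of about 0.1 °C between stations H and B supports a distinction between human and natural influence.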
Minor:
Lines 43–45: I suggest rephrasing by removing the initial “And” in the second sentence for better flow.
Line 75: I suggest including a figure showing the locations of the stations to help readers visualize the geographical differences among them. This would clarify how the stations’ distinct environments may influence the results.
Line 86, Table 1: I suggest including the watershed area for each station to provide additional context on the catchment characteristics.
Line 100: In the Methods section, it is important to include the approach used to evaluate model performance with gradual sample removal, detailing how this analysis was conducted.
Line 190: References are missing for the developers of Air2Stream and the other models mentioned. Please include appropriate citations to acknowledge the original sources.
Line 259: The table showing “Option Feature input Feature output” appears to be missing a caption. I believe this is Table 3 and recommend adding an appropriate caption for clarity.
Line 315: The authors tested the model’s generalization performance by removing samples at Purfleet station (total data volume = 5718). While this approach is appropriate, the study’s conclusions would be strengthened by including at least one additional station in the generalization analysis, such as Cisco station (total data volume = 2100).
Line 337: I suggest rephrasing this sentence to: “Meanwhile, the case of the Verona station and experiments on the drivers of stream temperature changes (Alger et al., 2021; Wade et al., 2023) indicate that introducing other features might be necessary at stations with significant human interference.”
For Figures 2, 5, 7, and 8, I suggest increasing the font size of the axis values to improve readability.
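Regarding the comments on lines 100 and 315, the kind of methodological detail requested for the gradual sample removal analysis could be conveyed with a short procedure such as the one sketched below. This is purely illustrative: the manuscript's actual protocol is not described here, the linear air-to-water regression is a stand-in for the TimENC-TCN model, and the data, retained fractions, and function names are all assumptions.

```python
import numpy as np

def rmse(observed, predicted):
    """Root mean square error between two equally long series."""
    diff = np.asarray(observed, dtype=float) - np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def sample_removal_curve(x_train, y_train, x_test, y_test, keep_fractions):
    """Refit a model on progressively fewer training samples and score each fit.

    Stand-in model: ordinary least squares line (water temp vs. air temp).
    The test set is held fixed so that scores are comparable across fractions.
    """
    scores = {}
    for fraction in keep_fractions:
        n_keep = max(2, int(round(fraction * len(x_train))))
        slope, intercept = np.polyfit(x_train[:n_keep], y_train[:n_keep], deg=1)
        scores[fraction] = rmse(y_test, slope * x_test + intercept)
    return scores

# Synthetic data (assumption): water temperature as a noisy linear function of air temperature.
rng = np.random.default_rng(0)
air = rng.uniform(0.0, 30.0, size=600)
water = 2.0 + 0.8 * air + rng.normal(0.0, 1.0, size=600)

curve = sample_removal_curve(
    air[:500], water[:500], air[500:], water[500:],
    keep_fractions=[1.0, 0.5, 0.25, 0.1],
)
```

Stating the retained fractions, whether samples are removed chronologically or at random, and that the test set stays fixed would cover the main points a reader needs in order to repeat the analysis at a second station.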