the Creative Commons Attribution 4.0 License.
Fusing Satellite Embeddings to Improve Streamflow Reconstruction Across River Networks
Abstract. Reconstructing streamflow across river networks is increasingly challenging in the context of heavily modified land surface conditions. Here we present a Data Integration model with Satellite Embeddings (DISE), a reach-scale residual-learning framework that integrates Google Satellite Embeddings (SE; compact learned vector representations of satellite imagery) from the AlphaEarth Foundation Model with a recently developed discharge simulation (GRADES-hydroDL) by learning corrections toward gauge observations. We evaluate DISE at 41 gauging stations in the Yangtze River Basin using leave-one-station-out cross-validation, with embeddings aggregated over each reach’s contributing subcatchment. Simulations incorporating SE consistently outperform the GRADES-hydroDL baseline, with mean aggregation emerging as the most balanced strategy. Improvements are most pronounced for magnitude and bias: compared to GRADES-hydroDL, median KGE increases from 0.485 to 0.594 and median NSE from 0.301 to 0.533, while correlation gains remain modest, suggesting SE primarily help the model capture streamflow volume and variability rather than timing. Control experiments further show that SE enhance spatial generalization beyond both meteorological forcings and traditional hydro-environmental reach attributes (RiverATLAS): compared to the base configuration without spatial context, adding SE alone increases median KGE from 0.473 to 0.594; when SE are further added on top of RiverATLAS, median KGE increases from 0.497 to 0.567. Once SE are included, adding RiverATLAS can even slightly reduce performance. Embedding-driven gains weaken where streamflow is governed by processes not directly visible from surface imagery, particularly complex reservoir operations. Nevertheless, SE can still provide useful information when forcing-based corrections are limited. 
These results demonstrate that SE provide analysis-ready, information-rich representations of land surface heterogeneity that measurably strengthen streamflow reconstruction across river networks. DISE offers a scalable pathway to inject high-resolution Earth observation context into river-network modeling, improving predictions in basins where conventional forcings and hydro-environmental descriptors are often insufficient.
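The abstract reports skill as median KGE and NSE and notes that correlation gains are modest while magnitude and bias improve. A minimal sketch of how these metrics separate those effects is given below. It assumes the original Kling-Gupta formulation (correlation `r`, variability ratio `alpha`, bias ratio `beta`); the abstract does not state which KGE variant the authors use, and the function names and toy series are hypothetical.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus squared error over observed variance."""
    mu_obs = _mean(obs)
    num = sum((s - o) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mu_obs) ** 2 for o in obs)
    return 1.0 - num / den


def kge_components(obs, sim):
    """Return (r, alpha, beta): correlation, variability ratio, bias ratio."""
    mu_o, mu_s = _mean(obs), _mean(sim)
    n = len(obs)
    sd_o = math.sqrt(sum((o - mu_o) ** 2 for o in obs) / n)
    sd_s = math.sqrt(sum((s - mu_s) ** 2 for s in sim) / n)
    cov = sum((o - mu_o) * (s - mu_s) for o, s in zip(obs, sim)) / n
    r = cov / (sd_o * sd_s)   # timing / pattern agreement
    alpha = sd_s / sd_o       # variability ratio
    beta = mu_s / mu_o        # volume (bias) ratio
    return r, alpha, beta


def kge(obs, sim):
    """Kling-Gupta efficiency (assumed 2009 form)."""
    r, alpha, beta = kge_components(obs, sim)
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Toy example (hypothetical numbers): a simulation with perfect timing
# but half the observed volume and variability.
obs = [10.0, 20.0, 30.0, 40.0, 50.0]
biased = [0.5 * o for o in obs]
r, alpha, beta = kge_components(obs, biased)
```

In this toy case `r` is 1 while `alpha` and `beta` are both 0.5, so all of the KGE penalty comes from volume and variability. A correction that mainly repairs `alpha` and `beta` can therefore raise KGE and NSE substantially with little change in correlation, which is the pattern the abstract describes.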
Status: closed
- RC1: 'Comment on egusphere-2026-1834', Anonymous Referee #1, 16 Jun 2026
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2026/egusphere-2026-1834/egusphere-2026-1834-RC1-supplement.pdf
Citation: https://doi.org/10.5194/egusphere-2026-1834-RC1
- AC1: 'Reply on RC1', Haomei Lin, 28 Jun 2026
- RC2: 'Comment on egusphere-2026-1834', Anonymous Referee #2, 17 Jun 2026
General comments
The authors present the development of a Data Integration model with Satellite Embeddings (DISE), which reconstructs streamflow by combining existing discharge products with Google Satellite Embeddings via a machine learning approach. Overall, the manuscript provides valuable insights into generating streamflow data for ungauged river reaches, a critical component of water resource management, and I believe this timely research is well suited for HESS.
However, I recommend a ‘major revision’ of the manuscript, primarily due to the following concerns: (1) a lack of detailed explanation regarding the methodology, (2) an incomplete review of relevant existing literature, and (3) a weak core argument demonstrating the necessity and novelty of this research.
Detailed comments are provided below.
Major comments
1. Abstract: Clearly stating the motivation behind streamflow reconstruction in the abstract would better engage readers and highlight the study's importance.
2. Introduction: The Introduction would benefit from a more comprehensive literature review, particularly regarding the specific data fusion approaches employed in this study, emphasizing the novelty and strengths of the current approach compared to existing ones.
3. Introduction: I recommend providing more information on how the reconstructed streamflow data can be applied in scientific research, engineering practices, and decision-making processes (supported by relevant references). This would significantly strengthen the justification for why this work is necessary.
4. Line 49: While Satellite Embeddings (SE) are explained in the Data and Method section, providing more foundational context in the Introduction would significantly improve readability. Specifically, I recommend clarifying what type of satellite information is compressed within SE, as well as discussing how SE have been utilized in previous studies, supported by relevant references.
5. Lines 34 to 48: To ensure a complete and thorough literature review, a brief overview of data assimilation approaches should also be integrated into the Introduction.
6. Line 56: “fuses satellite embeddings”. The introduction would benefit from a brief discussion contrasting data fusion and data assimilation approaches. Clearly outlining their similarities, differences, strengths, and limitations will provide readers with a stronger conceptual foundation.
7. Line 204: “residual correction”. It is currently unclear whether this study proposes a completely novel method for residual error correction, or if it applies an existing approach with further enhancements. The authors should explicitly clarify their specific methodological contributions and ground them by introducing relevant previous studies in this domain.
8. Lines 361 to 364: To prevent potential misunderstanding, the authors should explicitly state that the Satellite Embeddings (SE) capture temporally static spatial characteristics, at least within a given year, since the SE in this study are used on an annual basis. It should be made clear that these embeddings do not reflect highly dynamic hydrological variables such as soil moisture, vegetation canopy, snow cover, or water level variations.
Editorial comments
1. Line 37: Please provide the full expansion for 'PCR-GLOBWB' when it is first introduced. More generally, many acronyms throughout the manuscript lack definitions upon first use; please ensure all abbreviations are spelled out when they first appear in the text.
2. Line 249: “embeddings”. I recommend using either 'SE' or 'embeddings' consistently throughout the manuscript to minimize confusion for readers who may not be familiar with this terminology.
Citation: https://doi.org/10.5194/egusphere-2026-1834-RC2
- AC2: 'Reply on RC2', Haomei Lin, 28 Jun 2026
Viewed
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 280 | 101 | 20 | 401 | 20 | 24 |