the Creative Commons Attribution 4.0 License.
Continental-scale prediction of hydrologic signatures and processes
Abstract. Understanding how dominant hydrologic processes and their drivers vary across diverse continental-scale landscapes is critical for hydrologic modeling and water management applications. Our research addresses this question by synthesizing large-sample watershed datasets, Caravan and GAGES-II, and developing random forest models to identify patterns in hydrologic behavior. We assessed dominant processes by examining hydrologic signatures (summary indicators of watershed behavior derived from hydroclimatic time series) and random forest models across 14,146 gauged U.S. watersheds. The results reveal clear continental-scale gradients in hydrologic processes, including baseflow, overland flow, storage, and water balance losses. Our map of dominant processes highlights, for example, the transition from baseflow to fast responses and back to baseflow along the elevation gradient from the Appalachian spine, through the Piedmont, to the Eastern Coastal Plain; a distinct outer ring around the Great Lakes region; and sharp contrasts between coastal and inland processes in the West. Variable importance analysis from random forest models shows that processes in the western U.S. are primarily controlled by climate, whereas in the eastern U.S., soil, geology, and topography play larger roles, with distinct human influences apparent in urban areas. Our estimates of dominant processes and their drivers provide a framework to extend process knowledge from research watersheds to the continental scale, assess current hydrological understanding, and evaluate hydrological model structures.
Competing interests: We would like to disclose a potential competing interest. Author Hilary McMillan is currently an Executive Editor of HESS.
Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.
Status: open (until 09 Feb 2026)
- RC1: 'Comment on egusphere-2025-6156', Anonymous Referee #1, 19 Jan 2026
- RC2: 'Comment on egusphere-2025-6156', Anonymous Referee #2, 22 Jan 2026
Araki et al. investigate hydrologic processes (signatures) and their drivers in 14,146 watersheds across the contiguous US. The study is a highly relevant contribution to an advanced understanding of hydrologic processes at the US continental scale. While the study is well designed and executed overall, the authors may want to address the following points before final publication.
Major comment
In the results section 4.2, the short description of the corresponding region in the relevant subsections is useful. However, many parts of section 4.2 (and partly 4.3) include interpretations (“suggest…”, “indicate…”) and references that would rather fit into the discussion (e.g., section 5.1). A clear separation of results and discussion would make it easier for the reader to follow the main points presented.
Minor comments
Line 13: In general, “hydrologic behavior” always sounds quite humanized to me. Maybe change to “hydrologic functioning” or similar (here and at other relevant places in the manuscript).
Line 21: I am not sure whether the study can be referred to as a “framework”. The authors may want to rewrite this sentence and leave out the “framework” part.
Lines 38-49: This paragraph is supposed to focus on hydrologic processes, but it describes more of the method or approaches (“likelihood of…”, “statistics can help…”). It would be interesting to read more about the actual findings of the studies investigating these processes across the CONUS (even if not at a continental scale).
Lines 93-94: Many large-sample studies are conducted across large climatic gradients, which may “override” landscape attributes and thus lead to comparably weak predictive power of the latter?
Line 99: Consider adding the study of do Nascimento et al. (2025), dealing with geological maps in large-sample studies.
Lines 168-169: Any ideas on why the RF models yielded lower performance on all Caravan watersheds compared to other subset combinations? This may be something to add to Text S2.
Line 172: Looking at the distribution of streamflow observation length in Caravan, a minimum of 5 years seems to be quite short. Why did the authors choose this length? A quick visualization of the observation length distribution of all data sources used (e.g., added to S1) would be helpful.
Lines 168-179: The training sample is derived based on various (subjective?) quality criteria. How sensitive are the predicted hydrologic signatures to changes in the quality criteria? A short explanation in the discussion would be beneficial.
Lines 195-204: This rather long paragraph seems to be unnecessary, as Table 1 already provides a good overview.
Line 202: Please indicate the meaning of “small p-values” in this context.
Line 258: Nice figure!
Line 273: Introductory sentence may be deleted.
Line 317: Please avoid using “very” as a filler word. If statistically significant, consider using “significantly” instead.
Lines 389-393: Clay and silt fraction are often collinear. Can the authors discuss mechanistic reasons (e.g., surface crusting, …) for the silt fraction (and less so clay) emerging as a dominant driver of signatures in the Northeast and South? Maybe add a few sentences to the discussion (lines 465-466).
Lines 426-434: In my view, this summary paragraph is not relevant to the discussion and may be omitted or moved to the last section of the manuscript.
Line 445: Up until this line, Section 5.1 is not about “new process understanding” but rather about the sample size itself. The authors may want to consider renaming this section.
Line 452: Consider adding the study of Stein et al. (2021) to this section, dealing with flood generation processes across the CONUS.
Line 538: Please renumber to “6 Conclusion”.
Technical points
As the Method section comprises various steps, the authors may want to consider creating a simple flowchart containing the major working steps.
References
do Nascimento, T. V., Rudlang, J., Gnann, S., Seibert, J., Hrachowitz, M., & Fenicia, F. (2025). How do geological map details influence the identification of geology-streamflow relationships in large-sample hydrology studies? Hydrol. Earth Syst. Sci., 29(24), 7173-7200. https://doi.org/10.5194/hess-29-7173-2025
Stein, L., Clark, M. P., Knoben, W. J., Pianosi, F., & Woods, R. A. (2021). How do climate and catchment attributes influence flood generating processes? A large‐sample study for 671 catchments across the contiguous USA. Water Resources Research, 57(4), e2020WR028300. https://doi.org/10.1029/2020WR028300
Citation: https://doi.org/10.5194/egusphere-2025-6156-RC2
Araki et al. (2026) present a very interesting paper, which incorporates new methods and new interpretations and further develops the large-sample hydrology (LSH) field. The authors move from a more standard prediction-only approach to one more focused on processes, which is a very promising direction. Although I truly believe that the contribution should eventually be accepted for publication in the journal, it still needs some minor revisions before it is ready.
Major comments:
1. Did the authors filter out basins from the sample based on time-series quality (apart from time-series length)? I think this would be a very important step to make clear for readers and to base the interpretations on.
2. Also, to my understanding, the authors computed the signatures as long-term averages over each full time series. Although I fully acknowledge that this is valid and I see no problem at all with this methodological choice, I think it would be interesting to highlight and briefly discuss this in the paper. I know from experience that many catchments are undergoing change, and perhaps a long-term average could mask some of those changes? Again, there is no need to change anything; in my opinion, only some discussion is needed.
3. I am not used to subsections in the introduction. Although I do not think they are a problem, could the authors check the HESS requirements? I would choose not to use them, but again, this is a matter of taste (in case HESS has nothing against it).
4. After reading, I was a bit confused. Did you classify the dominant process based only on the 4 processes? Could you elaborate on that?
Minor comments:
Section 2.1: Could you clarify the number of gauged versus ungauged basins in the study?
Section 2.2: What was the motivation for merging Caravan and GAGES, rather than using only GAGES (or vice versa), for example? Perhaps I missed it!
L181: Could you please clarify which quality standards you mean here?
L188: I found this confusing. Did you not use 4 signatures per process? Could you elaborate on that?
L192: Does it mean that we can get more than one dominant process per signature?
Figure 2: What does "unclassified" mean? I thought it was when no process dominates, but I see gray lines together with the white lines. I am a bit confused here. Could you clarify this in more detail in the text?
Figure 2: Why is the "seasonal variability" not present there?
Figure 2: Would it make sense to add the three regions here? While reading the text, I found myself going back and forth a lot to match the regions and boundaries. If it does not clutter this (by the way, very pretty) figure, could you add them?
4.3 L351: Sorry for being a bit picky, but could you perhaps just add that your hypothesis is that the RF and Shapley values can be interpreted as controls? This is just so that readers are aware that this is a hypothesis. The correlations, predictions, and performances could always be purely mathematical (although I believe they do reflect processes, as you do!).