the Creative Commons Attribution 4.0 License.
Controls on Steppe River Metabolism Vary by Scale and Network Location
Abstract. We explore geographic scaling of metabolism estimates derived from dissolved oxygen measurements with data from 75 sites in three corresponding ecoregions across the temperate steppes of Mongolia and the United States. We used a nested analysis with descending spatial scales (country, ecoregion, river basin, upper or lower watershed, and wide or constrained valleys) to assess spatial heterogeneity. We then linked estimates of metabolism with reach-to-watershed-scale metrics representing geomorphology, topography, climate, and anthropogenic activity to provide possible explanations for spatial scaling dependencies. We evaluated gross primary production (GPP) and ecosystem respiration (ER) at in-situ water temperature and after standardizing them to 20 °C (GPP20 and ER20). There was no significant effect of scaling on ER, and river basin explained only modest variation in GPP. In contrast, GPP20 varied significantly with ecoregion, river basin, basin position (upper vs. lower), and valley morphology (constrained vs. wide). ER20 had no significant spatial predictors. Best regression models for GPP included positive relationships with water velocity and median basin slope and for ER included mean basin air temperature, percentage of urban land use in the basin, and GPP. Best subset regression models for GPP20 included depth, water velocity, and basin slope and for ER20 included depth and mean basin air temperature. The proportion of the watershed in urban or cropland was explanatory of ER, but not GPP. We conclude that fundamental components of ecosystem metabolism respond to different watershed scales and to distinct environmental controls. Thus, macrosystem-scale studies require multi-scale assessment to predict and capture variation in aquatic metabolism. This suggests that universal models of river metabolism are unlikely to perform as well as models built to match the specific scale of inquiry or management.
Status: final response (author comments only)
- RC1: 'Comment on egusphere-2026-555', Anonymous Referee #1, 31 Mar 2026
- RC2: 'Comment on egusphere-2026-555', Anonymous Referee #2, 16 Apr 2026
This study explored the spatial scales at which stream metabolism and its hypothesized drivers vary, ranging from sub-watershed scales up to countries. Through a combination of nested ANOVAs and regressions of metabolism against possible predictors, the authors found that GPP varied at the river basin scale while GPP/ER varied at multiple scales and ER had no spatial predictors, highlighting air temperature, land use, and geomorphic characteristics as possible explanations for these spatial patterns. This study fills important knowledge gaps in our understanding of metabolism, both in terms of spatial scaling and geographic coverage (and the amount of field and metabolism modeling work that went into this is no small feat!). My feelings are generally quite positive about this paper. There are a couple discrepancies in how results are being reported in different sections of the paper, and I’d like to see a bit more detail on some of the methods (all explained in specific comments below). Once those things are sufficiently addressed, I look forward to seeing these findings make it out into the world!
Specific comments:
Title: “Vary by scale and network location” comes across as redundant, as network location is one of the scales in question.
Abstract:
Line 28: “GPP varied significantly with ecoregion, river basin…” – In the results, GPP and GPP20 only vary by basin. Is this sentence meant to describe the GPP/ER results? This also made me realize GPP/ER is never mentioned in the abstract even though a fair bit of the results and discussion centers on this ratio.
Introduction:
Line 57 (in the context of Figure 1): “given the wide variety of conditions that may occur across biomes” – Could this be worded more precisely to better introduce the predictor variables you end up including in your models? The first clear mention of climate and land use as potential predictors of metabolism is in the description of what you did in this study in line 80; they weren’t explicitly mentioned anywhere beforehand.
Lines 70-72: “A major challenge in comparing studies… Schechner et al. 2021” – This idea seems like it belongs elsewhere, maybe in the methods and/or the second intro paragraph on why scaling has been challenging. As much as I love any opportunity to highlight that study, putting it here breaks up the flow of ideas. The premise of using consistent methodology could still be kept here, just stated more simply.
Lines 82-84: “While many metabolic measurements are available for the US, metabolic rates for Mongolia’s rivers have not, to our knowledge, been quantified prior to our work.” – First, while I think this is an important contribution to emphasize, it disrupts the flow of ideas a bit here. Maybe it could be moved earlier in the paragraph? Second, the wording is slightly misleading since your group has published metabolism data from some of the Mongolian sites at least twice already (Schechner et al. 2021, Tromboni et al. 2022). Maybe rephrase as something like “rates have not been extensively quantified,” while citing previous works?
Methods:
Line 100: “A detailed description of the FPZ delineation methodology… provided previously” – Could you summarize the general idea of the methodology in a sentence here, then say that the specifics can be found elsewhere? That would make it easier for the reader to conceptualize what information is encompassed in each FPZ.
Figure 2: Do you have site photos (or photos from the general regions) you could include alongside the map to give a sense of what these different ecoregions look like? Perhaps you could include representative photos of the three US regions along one side of the map and the three Mongolian ecoregions along the other side. That would help give a sense of how comparable we might expect ecoregions to be across countries.
Line 138: “Long-term (1977-2000) climate data” – Do we have reason to believe that air temperature from 26+ years ago would be predictive of current stream metabolism, given the reality of climate change? While I acknowledge that this is a fairly recent data product, I would still like to see more justification for the inclusion of historic rather than present air temperature as a predictor, as well as more transparency in places like Figure 1 about the fact that this is historic temperature.
Line 163: “Deployments ranged from 24 hours to 1 week” – Were all the deployments at a similar time of year, and were they all within a single year? It would be helpful to know that to assess whether any of the spatial patterns could be biased by time of sampling.
Line 176: “We re-ran sites where we were unable to get a good model fit… with the Riley and Dodds 2012 model.” – A few questions here. 1) Can you add a brief explanation of why the Riley and Dodds model is better able to estimate metabolism for sites with high k? 2) If there is an alternate model that you think performs better than BASE for the more challenging sites, why not just run all the sites with that model? Presumably if the model works well for challenging sites, then it would also work for less problematic sites. 3) How do model outputs from BASE and the Riley and Dodds model compare at the less problematic sites? It would be helpful to see a comparison of metabolism estimates from the two models to know if a comparison of data from different models is valid. 4) How many sites were run with the Riley and Dodds model versus BASE? Similar to my last question about ruling out sampling bias, I’m curious if the sites that required an alternate model were concentrated in a particular spatial grouping (e.g., if they were all upper watershed sites).
Line 182: “This left us with 75 sites out of 89 where we were able to model metabolism successfully” – Could you include a table in the supplement that has more of the specifics of the QAQC metrics you used and how many sites passed each metric? While I don’t think it needs to be in the main text, there should still be some record of what thresholds you’re accepting for things like ER vs k correlation, r-hat (assuming this is what you mean by “metrics of fit associated with the model”), modeled vs observed correlation, etc. so we can track consistency of QAQC practices across studies. And on a positive note, kudos on getting that many sites to behave!
Line 205: “separated by upstream and downstream sites followed by a finer FPZ scale” – The wording here and in the next sentence makes it sound like there is an additional level of organization in between upstream/downstream and wide/constrained.
Results:
Section 3.1: BASE estimates metabolism on a per volume basis, but GPP and ER are reported in areal units throughout your results. Either correct the units throughout the results, or explain in the metabolism methods how you converted to units of g O2/m2/d.
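One common way to make this conversion, sketched here as an assumption rather than as the authors' actual procedure, is to multiply the volumetric rate by mean reach depth, since 1 mg O2 per litre equals 1 g O2 per cubic metre. The function name and the input values are illustrative only.

```python
def volumetric_to_areal(rate_mg_o2_per_l_per_day: float, mean_depth_m: float) -> float:
    """Convert a volumetric rate (mg O2 L^-1 d^-1) to an areal rate (g O2 m^-2 d^-1).

    1 mg/L is the same as 1 g/m^3, so multiplying by mean depth in metres
    yields grams of O2 per square metre per day.
    """
    if mean_depth_m <= 0:
        raise ValueError("mean depth must be positive")
    return rate_mg_o2_per_l_per_day * mean_depth_m


# Hypothetical values for illustration (not taken from the manuscript).
gpp_areal = volumetric_to_areal(4.0, 0.5)
print(gpp_areal)  # 2.0
```

Stating the depth used (for example, mean depth over the reach) alongside the converted rates would let readers reproduce the areal values.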
Line 225 and Figure 4: “…GPP varied marginally significantly, only by river basin (Figure 4)” – It wasn’t immediately intuitive that Figure 4 was related to this sentence, as there is no mention of river basin in Figure 4. On second glance it makes sense that it’s showing no significant difference across the broader spatial scales. However, if all the figure is showing (in the context of the ANOVA results) is that GPP20 didn’t vary at the scale of country or ecoregion, it makes me wonder why there isn’t a similar plot of ER20. Could Figure 4 instead be a two-panel figure that’s a side by side of GPP20 and ER20?
Line 263: “We explored the top explanatory variables… using ANOVAs” – Explanation of this analysis should be included at the end of section 2.4 rather than introducing it here. Also, statistics are only reported for the percent urban and percent cropland models in the supplement; I’d expect to see results for the other variables reported as well.
Discussion:
Line 295: “We analyzed … scaling factors and continuous predictors separately because…” – This should be explained in the methods rather than the discussion.
Line 356: “We explained a modest amount of the variation in metabolism” – I think this understates the fact that the adjusted r2 values in the linear models were all on the order of 0.03-0.1. Those r2 values suggest to me that there is a lot of variation in metabolism still not being captured, so I think some disclaimer along the lines of “there is still work to do to be able to more reliably predict metabolism” should be included.
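For readers weighing these values, the standard adjusted r2 formula can be sketched as below. The sample size of 75 matches the number of sites reported; the raw r2 and the number of predictors are hypothetical inputs chosen for illustration, not values from the manuscript.

```python
def adjusted_r2(r2: float, n_obs: int, n_predictors: int) -> float:
    """Standard adjusted R^2: penalizes R^2 for the number of predictors."""
    if n_obs - n_predictors - 1 <= 0:
        raise ValueError("need more observations than predictors + 1")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Hypothetical example: raw R^2 of 0.10 from a 3-predictor model fit to 75 sites.
print(round(adjusted_r2(0.10, 75, 3), 3))  # 0.062
```

With values in this range, more than 90% of the variation in metabolism remains unexplained by the model.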
Line 377: “ecoregions significantly influenced GPP” – in the results, GPP only significantly varied at the river basin scale.
Line 379: “Climate drives ecoregion characteristics (e.g., riparian canopy)” – This comes across as a little contradictory with section 4.3, where it was suggested that local riparian cover might not be the most important control of GPP.
Line 384: “grazing in Mongolia… effects of cropland” – More of a curiosity question than a request to edit: do you have a sense of whether/how the designation of land as “cropland” in Mongolia accounts for nomadic herding practices? I’m wondering if it might be more challenging to pin down the effects of land use in Mongolia when one of the land use pressures involves roaming herds of livestock as opposed to, say, farms tied to specific locations (although perhaps the herding locations are more consistent than I’m aware of).
Conclusions:
Line 409-410: “respiration was related to long-term climate conditions… ER may be more sensitive to global warming” – Consider revisiting this conclusion depending on your response to my previous comment on use of historic temperature data.
Supplement:
Tables S4, S6, S14, S15: These tables don’t have the same note as the other ANOVAs about “we ran 4 tests… p<0.0125 for significance” as the GPP and ER tables. Were the models constructed differently, or was the note just forgotten?
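The threshold quoted in that note is consistent with a Bonferroni correction, assuming a family-wise alpha of 0.05 (the alpha is inferred from 0.0125 × 4, not stated in the quoted note). A minimal sketch of that arithmetic, with an illustrative function name:

```python
def bonferroni_threshold(family_alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return family_alpha / n_tests


# Assumed family-wise alpha of 0.05 spread over the 4 tests mentioned in the note.
threshold = bonferroni_threshold(0.05, 4)
print(threshold)  # 0.0125
```

If the tables without the note involved a different number of tests, the per-test threshold would differ accordingly.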
Technical corrections:
Introduction:
Figure 1: Change the largest scale from “continent” to “country” to match how it’s discussed throughout the rest of the text.
Methods:
Line 130: Clearer if worded as “…hydrologic and geomorphic measurements in reaches based on…”?
Line 152: Clearer if worded as 10% and 5% of reach length rather than 0.1 and 0.05 of reach length?
Line 168: “animal disturbance” – Thank you for the beautiful mental image of goats munching on PAR sensors. (Purely speculation on my part, but it seems like something they would do!)
Results:
Line 228: “found in figure S1” – The data distributions are currently figure S2 in the supplement (but should be S1 if I’m reading correctly).
Line 231: Include O2 in units for rates.
Figures 3 and 4: Check units on axes – assuming these should be g O2 instead of mg O2?
Line 273: “…also matched at the upper-lower and percentage cropland scales” – should this be the ecoregion scale?
Discussion:
Line 283: Should be Figure S2 instead of S1? Check order here and in the supplement.
Line 287: “five of 76 river sites” – Should this be 75?
Supplement:
Table S7: “marked correlations are significant” – I’m not seeing any markings? Double-check that this table is displaying how you intended it to.
Citation: https://doi.org/10.5194/egusphere-2026-555-RC2
This is a timely and interesting paper examining the controls on stream metabolism across a large range of spatial scales for dry, temperate rivers in the USA and Mongolia. The Mongolian data - and the comparisons with the US data - are most important given the near lack of such information on streams from this region. I enjoyed reading it.
The paper is well-written and the methodology and key findings are clearly explained. The figures are appropriate for illustrating the related points in the text, and presuming they are reproduced in the original colors, visually acceptable.
I do have several comments and suggestions the authors may wish to consider when revising the text. These are generally in order of appearance in the paper rather than in order of importance.
In the Introduction, it may be worth adding that this methodology may be used at these - or similar - sites to assess how climate change effects may temper the regional controls on metabolism, especially noting the later comment on the role of GPP as an effective indicator of stream health.
I am not familiar at all with rivers in Mongolia. Are they clear water or turbid? I'm presuming the USA sites are clear water - is that correct? In particular, is water clarity sufficient for PAR to reach the benthos in the thalweg, or just in littoral zones? Although stream metabolism doesn't distinguish benthic and pelagic metabolism, the importance of benthic production can be inferred from knowing whether water depth is less or greater than euphotic depth.
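The depth comparison described above can be sketched as follows, assuming the common convention that the euphotic zone ends where PAR falls to 1% of its surface value and that light attenuates exponentially with a diffuse attenuation coefficient Kd. The Kd and depth values are hypothetical, and the function names are illustrative.

```python
import math


def euphotic_depth_m(kd_per_m: float, light_fraction: float = 0.01) -> float:
    """Depth (m) at which PAR drops to `light_fraction` of the surface value.

    Assumes exponential attenuation: I(z) = I0 * exp(-Kd * z).
    """
    if kd_per_m <= 0:
        raise ValueError("Kd must be positive")
    return -math.log(light_fraction) / kd_per_m


def benthos_is_lit(water_depth_m: float, kd_per_m: float) -> bool:
    """True when the stream bed lies within the euphotic zone."""
    return water_depth_m <= euphotic_depth_m(kd_per_m)


# Hypothetical turbid reach with Kd = 2.0 m^-1: euphotic depth is about 2.3 m.
print(round(euphotic_depth_m(2.0), 2))
print(benthos_is_lit(1.0, 2.0), benthos_is_lit(3.0, 2.0))
```

Even spot measurements of light attenuation or turbidity at each site would allow this kind of check.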
Line 163: I would like to see some commentary on the limitations and possible data artifacts, arising from the very short (1 - 7 day) deployment at each site, that may affect subsequent modelling and meta-analysis. I totally understand the logistical constraints. How is a very overcast day, or a run of consecutive overcast days, handled, for example? This will result in GPP being suppressed well below 'normal' behavior at the site. Is this overcome to some extent simply by the number of sites examined?
Line 175: What were these metrics? (i.e. what criteria?)
Line 184: What is "unreasonably high"? Vague
Line 217: Nutrients are discussed later but I would have liked to see some data on concentrations of bio-available N and P as well as DOC in the rivers, as factors potentially directly affecting GPP and ER. Is there much of a difference across the various spatial scales? Even spot measurements during the deployments would provide some insight. These can be related to, but not inferred from, land use for example.
Line 252: How many sites are there with urban influence? How is this defined? Is it the same definition in both countries? Urbanization is not listed amongst the factors in lines 246-249. Is that just an oversight?
Line 267: Was the significant water depth effect related to turbidity and euphotic depth, as noted above?
Line 278: 'strong' meaning here?
Line 329: Suggest citing the finding of Hall (2013) that a substantial fraction (estimate of 44% on average) of newly created organic matter is respired before it is taken up by higher trophic levels. Freshwater Science, 32(2): 507-516 (2013)
Line 350: Are the constrained valley effects (partly) through topographic shading?
Line 378: Turbidity again. "...for metabolism in drier temperate areas, results from one country can likely be applied to another". I'm not convinced at all about this. My thinking - and I'm very happy to be convinced otherwise - is that parts of Central Asia, southwest USA, Australia, southern Africa etc where light penetration into a turbid water column is a major constraint on GPP may disrupt this universality of behavior.