the Creative Commons Attribution 4.0 License.
Toward merging MOPEX and CAMELS hydrometeorological datasets: compatibility and statistical comparison
Abstract. This study compares two large hydrometeorological datasets, the Model Parameter Estimation Experiment (MOPEX), and the Catchment Attributes and Meteorology for Large-sample Studies (CAMELS), focusing on 47 shared watersheds within the continental United States. The evaluation spans daily, monthly, seasonal, and annual scales for the overlapping water years of 1981 to 2000. Spatial aggregations are conducted based on Köppen-Geiger climate regions along with annual Budyko evaporative and aridity indices. Results indicate significant differences between the datasets at daily timesteps, highlighting the challenge of high temporal resolution data reconciliation; however, compatibility markedly improves with temporal aggregation at monthly, seasonal, and annual scales. While MOPEX shows a warm bias for temperature and CAMELS shows a wet bias for precipitation, statistical analyses demonstrate that both datasets are representative of climatic conditions and extreme events. Our findings validate the results of previous research employing either dataset. Furthermore, this study serves as a foundation for the merging and extension of MOPEX and CAMELS datasets.
Status: open (until 23 May 2025)
RC1: 'Comment on egusphere-2024-4182', Anonymous Referee #1, 17 Feb 2025
Overall, this is an interesting study comparing two commonly utilized catchment data sources. The analysis for the continental US (CONUS) appears to demonstrate statistically significant differences in the aggregate regarding temperature and precipitation. The differences shown are important; however, greater attention to explaining the differences and their significance would substantially improve the manuscript. The use of machine learning is not clearly articulated in the work, and its significance is not yet clear. Greater attention should also be paid to discussing the impacts of these differences on future modeling efforts.
The manuscript would also benefit from a clear statement of the goals of the research, i.e., is the goal to show that the two data sets are equivalent and therefore can be merged? Or is it to identify where the two data sets differ and to explain why they are different, with the goal of adjusting one or the other to allow merging? Line 45 is the first place this is made clear in the text. I would suggest clearly stating this in the abstract as well.
Line 103. How will this study address “uncertainties within the data sets”? This statement is unclear.
Lines 140-150. This is a confusing paragraph for those not intimately familiar with either data set. You state there are large discrepancies between the CAMEL SAC model ET and CAMEL-WB. Why is this important when comparing CAMELS to MOPEX, the goal of this work? Please expand this section and make it clear why these differences in ET with CAMELS are important to the goal of this work.
Tables 4 and 5: These tables need far more explaining. The text indicates that they are internal variability of the two data sets, yet in each case, only a single mean is presented. The text is unclear as the tables do not provide the reader with any form of comparison here. The text indicates “within” the data sets, but the tables appear to provide “between” the data sets. Please expand section 4.1 to be clearer here.
Line 272. Is the fact that averaging over greater temporal scales reduces the dispersion a major finding here? It would seem that this is an expected result.
Line 323. It’s not surprising that the variation in arid region precipitation is greater, but what does “remain the most consistent” in the text mean? Consistent between data sets? Please be specific.
Line 375: Some discussion of why these differences exist would be valuable here. A bit of speculation would be helpful and appropriate.
Line 630, Section 4.4: It is not fully apparent why machine learning validation was undertaken for this work and how it helps in the analysis. Please justify its use more clearly.
Citation: https://doi.org/10.5194/egusphere-2024-4182-RC1
AC1: 'Reply on RC1', Katharine Sink, 10 Mar 2025
Thank you for your time and suggestions. Please see the attached pdf document for our responses to each comment.
RC2: 'Comment on egusphere-2024-4182', Anonymous Referee #2, 22 Apr 2025
Summary
This manuscript presents a detailed comparison between two widely used streamflow and meteorological datasets for the continental United States, MOPEX and CAMELS, investigating their consistency and discrepancies from daily to annual scales. The study is based on a carefully designed statistical analysis and is relevant to the hydrological modeling and large-sample hydrology communities. The work is rigorous, and the results are clearly communicated and well discussed. I have a few remarks and suggestions for improvement that the authors might find useful.
Specific comments
- In the abstract and elsewhere, the term ‘bias’ is used to describe the differences between MOPEX and CAMELS. Since bias is typically defined with respect to a reference or ground truth, it would be helpful to clarify that this refers to relative bias (i.e., systematic differences between datasets), rather than absolute error. While this becomes clearer within the manuscript, the abstract might mislead readers into thinking that MOPEX is definitively too warm or CAMELS too wet.
- The manuscript could benefit from a more in-depth discussion of which dataset may be more reliable under certain conditions. Lines 685–687 touch upon this subject but could be expanded. For instance, CAMELS uses Daymet meteorological forcing, which could potentially be considered more reliable for regional hydrological analyses. However, its evapotranspiration values are derived from the SAC-SMA hydrologic model and, as the authors show, can exhibit implausible behavior. These trade-offs, i.e., between modern gridded meteorological inputs and model-based ET estimates, deserve a more explicit discussion to help guide dataset selection for different hydrological applications.
- Line 725: Please provide a citation for the NCDC COOP and SNOTEL datasets used in MOPEX. Additionally, a brief explanation of the nature of these data sources, including their observational basis and common sources of uncertainty, would help readers better understand the reliability and limitations of the meteorological data used in these databases.
- Figure 2: Could the authors clarify the meaning of the blue color in the map? It's not evident from the caption or figure description.
- Section 3.2.2: Please include references for all the statistical tests used (e.g., Fligner-Killeen test, Welch’s t-test).
Citation: https://doi.org/10.5194/egusphere-2024-4182-RC2
AC2: 'Reply on RC2', Katharine Sink, 24 Apr 2025
Thank you for your time and feedback on our manuscript. We appreciate your suggestions. Please refer to the attached pdf for our responses.