<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "https://jats.nlm.nih.gov/nlm-dtd/publishing/3.0/journalpublishing3.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article" specific-use="SMUR" dtd-version="3.0" xml:lang="en">
<front>
<journal-meta>
<journal-id journal-id-type="publisher">EGUsphere</journal-id>
<journal-title-group>
<journal-title>EGUsphere</journal-title>
<abbrev-journal-title abbrev-type="publisher">EGUsphere</abbrev-journal-title>
<abbrev-journal-title abbrev-type="nlm-ta">EGUsphere</abbrev-journal-title>
</journal-title-group>
<issn pub-type="epub"></issn>
<publisher><publisher-name>Copernicus Publications</publisher-name>
<publisher-loc>Göttingen, Germany</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.5194/egusphere-2026-3272</article-id>
<title-group>
<article-title>Setting the Bar: Benchmarks for Model Performances in Large-Sample Hydrology</article-title>
</title-group>
<contrib-group><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Seibert</surname>
<given-names>Jan</given-names>
<ext-link>https://orcid.org/0000-0002-6314-2124</ext-link>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Vis</surname>
<given-names>Marc</given-names>
<ext-link>https://orcid.org/0000-0002-5589-2611</ext-link>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Pool</surname>
<given-names>Sandra</given-names>
<ext-link>https://orcid.org/0000-0001-9399-9199</ext-link>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
</contrib>
</contrib-group><aff id="aff1">
<label>1</label>
<addr-line>University of Zurich, Department of Geography, Winterthurerstrasse 190, 8057 Zurich, Switzerland</addr-line>
</aff>
<aff id="aff2">
<label>2</label>
<addr-line>Eawag, Swiss Federal Institute of Aquatic Science and Technology, Department Water Resources and Drinking Water, Überlandstrasse 133, 8600 Dübendorf, Switzerland</addr-line>
</aff>
<pub-date pub-type="epub">
<day>11</day>
<month>06</month>
<year>2026</year>
</pub-date>
<volume>2026</volume>
<fpage>1</fpage>
<lpage>23</lpage>
<permissions>
<copyright-statement>Copyright: &#x000a9; 2026 Jan Seibert et al.</copyright-statement>
<copyright-year>2026</copyright-year>
<license license-type="open-access">
<license-p>This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this licence, visit <ext-link ext-link-type="uri"  xlink:href="https://creativecommons.org/licenses/by/4.0/">https://creativecommons.org/licenses/by/4.0/</ext-link></license-p>
</license>
</permissions>
<self-uri xlink:href="https://egusphere.copernicus.org/preprints/2026/egusphere-2026-3272/">This article is available from https://egusphere.copernicus.org/preprints/2026/egusphere-2026-3272/</self-uri>
<self-uri xlink:href="https://egusphere.copernicus.org/preprints/2026/egusphere-2026-3272/egusphere-2026-3272.pdf">The full text article is available as a PDF file from https://egusphere.copernicus.org/preprints/2026/egusphere-2026-3272/egusphere-2026-3272.pdf</self-uri>
<abstract>
<p>The availability of large-sample hydrometeorological datasets, now widespread across many regions worldwide, has changed hydrological catchment modelling. Assessing model performance is an essential component of any modelling exercise, and an important question is how to interpret performance measure values. The performance of uncalibrated bucket-type models varies significantly across regions and can reach NSE values of 0.8 or higher, particularly in humid or snow-dominated catchments. This implies that using a fixed value of a performance measure to judge model performance, as sometimes suggested in the literature, is inappropriate. Instead, one should consider that, given local hydroclimatic conditions and the quality of the available data, the performance we should expect from any model in a particular catchment can vary widely. At the same time, a perfect fit (NSE value of 1) is usually impossible to achieve due to errors and uncertainties in the model and data. Therefore, it is helpful to compare model performances to lower and upper benchmarks.</p>
<p>The purpose of this study was two-fold. First, we examined how to compute lower benchmarks, including determining appropriate ensemble sizes, assessing the effects of parameter ranges, deciding whether to use random or regional parameter sets, and evaluating how best to aggregate the ensemble of simulations. We also examined the relationships between lower and upper benchmarks and catchment characteristics. Second, we used these findings to compute both lower and upper benchmarks for many of the existing large-sample datasets. By providing these values to the modelling community, we aim to facilitate the broader use of lower and upper benchmarks in large-sample hydrological modelling studies. We argue that these values are valuable because they provide a basis for evaluating model performance across the various large-sample datasets. This will allow model performance to be assessed in light of what one could and should expect for a particular catchment.</p>
</abstract>
<counts><page-count count="23"/></counts>
</article-meta>
</front>
<body/>
<back>
</back>
</article>