<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "https://jats.nlm.nih.gov/nlm-dtd/publishing/3.0/journalpublishing3.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article" specific-use="SMUR" dtd-version="3.0" xml:lang="en">
<front>
<journal-meta>
<journal-id journal-id-type="publisher">EGUsphere</journal-id>
<journal-title-group>
<journal-title>EGUsphere</journal-title>
<abbrev-journal-title abbrev-type="publisher">EGUsphere</abbrev-journal-title>
<abbrev-journal-title abbrev-type="nlm-ta">EGUsphere</abbrev-journal-title>
</journal-title-group>
<issn pub-type="epub"></issn>
<publisher><publisher-name>Copernicus Publications</publisher-name>
<publisher-loc>Göttingen, Germany</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.5194/egusphere-2025-3006</article-id>
<title-group>
<article-title>A data-driven method for identifying climate drivers of agricultural yield failure from daily weather data</article-title>
</title-group>
<contrib-group><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Sweet</surname>
<given-names>Lily-belle</given-names>
<ext-link>https://orcid.org/0000-0001-9971-6102</ext-link>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Müller</surname>
<given-names>Christoph</given-names>
<ext-link>https://orcid.org/0000-0002-9491-3550</ext-link>
</name>
<xref ref-type="aff" rid="aff3">
<sup>3</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Jägermeyr</surname>
<given-names>Jonas</given-names>
</name>
<xref ref-type="aff" rid="aff3">
<sup>3</sup>
</xref>
<xref ref-type="aff" rid="aff4">
<sup>4</sup>
</xref>
<xref ref-type="aff" rid="aff5">
<sup>5</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Zscheischler</surname>
<given-names>Jakob</given-names>
<ext-link>https://orcid.org/0000-0001-6045-1629</ext-link>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<xref ref-type="aff" rid="aff6">
<sup>6</sup>
</xref>
</contrib>
</contrib-group><aff id="aff1">
<label>1</label>
<addr-line>Department of Compound Environmental Risks, Helmholtz Centre for Environmental Research - UFZ, Leipzig, Germany</addr-line>
</aff>
<aff id="aff2">
<label>2</label>
<addr-line>Department of Hydro Sciences, TUD Dresden University of Technology, Dresden, Germany</addr-line>
</aff>
<aff id="aff3">
<label>3</label>
<addr-line>Potsdam Institute for Climate Impact Research (PIK), Member of the Leibniz Association, Potsdam, Germany</addr-line>
</aff>
<aff id="aff4">
<label>4</label>
<addr-line>Columbia University, Climate School, New York, NY, USA</addr-line>
</aff>
<aff id="aff5">
<label>5</label>
<addr-line>NASA Goddard Institute for Space Studies (GISS), New York, NY, USA</addr-line>
</aff>
<aff id="aff6">
<label>6</label>
<addr-line>Center for Scalable Data Analytics and Artificial Intelligence (ScaDS.AI), Dresden-Leipzig, Germany</addr-line>
</aff>
<pub-date pub-type="epub">
<day>26</day>
<month>08</month>
<year>2025</year>
</pub-date>
<volume>2025</volume>
<fpage>1</fpage>
<lpage>56</lpage>
<permissions>
<copyright-statement>Copyright: &#x000a9; 2025 Lily-belle Sweet et al.</copyright-statement>
<copyright-year>2025</copyright-year>
<license license-type="open-access">
<license-p>This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this licence, visit <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">https://creativecommons.org/licenses/by/4.0/</ext-link></license-p>
</license>
</permissions>
<self-uri xlink:href="https://egusphere.copernicus.org/preprints/2025/egusphere-2025-3006/">This article is available from https://egusphere.copernicus.org/preprints/2025/egusphere-2025-3006/</self-uri>
<self-uri xlink:href="https://egusphere.copernicus.org/preprints/2025/egusphere-2025-3006/egusphere-2025-3006.pdf">The full text article is available as a PDF file from https://egusphere.copernicus.org/preprints/2025/egusphere-2025-3006/egusphere-2025-3006.pdf</self-uri>
<abstract>
<p>Climate-related impacts, such as agricultural yield failure, often occur in response to a range of specific weather conditions taking place across different time periods, such as during the growing season. Identifying which weather conditions and timings are most strongly associated with a certain impact is difficult because of the overwhelming number of possible predictor combinations from different aggregation periods. Here we address this challenge and introduce a method for identifying a small number of climate drivers of an impact from high-resolution meteorological data. Based on the principle that causal drivers should generalize across different environments, our proposed two-stage approach systematically generates, tests, and discards candidate features using machine learning and then generates a set of robust drivers. We evaluate the method using simulated US maize yield data from two process-based global gridded crop models and rigorous out-of-sample testing (using approximately 30 years of early 20th-century climate and yield data for training and over 70 years of subsequent data for testing). The climate drivers identified align with crop model mechanisms and consistently use only the weather variables that are taken as input by the respective models. Logistic regression models using ten drivers as predictors show strong predictive performance on the held-out test period even under shifting climatic conditions, achieving correlations of 0.70&#x02013;0.85 between predicted and true annual proportions of grid cells experiencing yield failure. This approach circumvents the limitations of post-hoc interpretability in black-box machine learning models, allowing researchers to use parsimonious statistical models to explore relationships between climate and impacts, while still harnessing the predictive power of high-resolution, multivariate weather data.
We demonstrate this method in the context of agricultural yield failure, but it is also applicable to other climate-related impacts such as forest die-off, wildfire incidents, landslides, or flooding.</p>
</abstract>
<counts><page-count count="56"/></counts>
<funding-group>
<award-group id="gs1">
<funding-source>Helmholtz-Gemeinschaft</funding-source>
<award-id>VH-NG-1537</award-id>
</award-group>
</funding-group>
</article-meta>
</front>
<body/>
<back>
</back>
</article>