the Creative Commons Attribution 4.0 License.
Multi-annual predictions of hot, dry and hot-dry compound extremes
Abstract. Hot-dry compound extremes have recently gained increasing attention due to their potential impacts on environments and societies. For this reason, assessing climate predictions is essential to providing reliable information on such extremes. However, although several studies have examined compound extremes in the past climate and in climate projections, little is known about them on a multi-annual timescale. In this regard, decadal climate predictions have been produced to provide useful information for this specific timescale. Thus, we evaluate the ability of the CMIP6 multi-model decadal climate hindcast to predict hot-dry compound extremes, as well as their hot and dry univariate counterparts, for forecast years 2–5. The multi-model ensemble skillfully predicts hot-dry compound extremes and hot extremes over most land regions, while the skill is more limited for dry extremes. However, we find only small and spatially limited improvements from the initialisation of the hindcasts, especially for the hot-dry compound extremes, with most of the skill coming from external forcings, especially long-term trends. Finally, we find that the decadal hindcast is able to reproduce the connections between the compound extremes and their hot and dry univariate components. Evaluations of decadal hindcasts such as this one are an essential tool for establishing the potential and the limits of these products, a necessary step towards providing reliable and useful information on such impactful extremes.
Status: final response (author comments only)
RC1: 'Comment on egusphere-2025-940', Anonymous Referee #1, 02 Jun 2025
Review: Multi-annual predictions of hot, dry and hot-dry compound extremes
The study by Aranyossy et al. investigates the predictability of hot-dry compound extremes and their univariate components (hot and dry events) on a multi-annual scale (forecast years 2–5), using decadal climate hindcasts from the CMIP6 Decadal Climate Prediction Project (DCPP). The study evaluates the skill of initialized forecasts compared to historical simulations, explores the relative contributions of external forcings versus initial conditions, and assesses whether the model ensemble can reproduce observed relationships between compound and univariate extremes.
This manuscript addresses a timely and scientifically relevant topic—the decadal predictability of compound hot-dry extremes—and provides valuable insights using a multi-model ensemble of CMIP6 decadal forecasts. However, the current version has some substantial shortcomings that limit its scientific clarity and impact. I believe the study requires major revision to strengthen the methodological framework, sharpen the interpretation of key results, and improve its overall scientific accuracy before it can be considered for publication.
Major Notes:
Unclear Justification for Compound Event Definition: The definition of hot-dry compound extremes is based on overlapping thresholds (TX90p and SPI3dry or SPEI3dry), but the manuscript does not adequately discuss how sensitive the results are to these thresholds or to the chosen accumulation window. The absence of sensitivity analysis raises questions about the robustness of the results.
Overstatement of Skill Based on Trend Agreement: The study repeatedly refers to “skillful prediction” of compound extremes, but a substantial portion of this skill stems from long-term trend agreement rather than the successful prediction of interannual variability. In many regions, the DCPP ensemble appears to simply capture externally forced warming trends, which correlate with observed trends in hot extremes. However, this does not necessarily equate to predictive skill in a practical, decision-relevant sense. This distinction is mentioned, but not emphasized sufficiently in the framing of the results or the conclusion. The authors must clearly distinguish between correlation due to trend matching and actual initialized predictive skill.
Presentation and Clarity: The text is dense and often difficult to follow due to inconsistent terminology and lengthy, complex sentences. Key methodological steps are underexplained or relegated to figure captions.
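The distinction drawn in the notes above between trend agreement and initialised skill can be illustrated with a minimal sketch. It compares the correlation of two series before and after a linear trend is removed from each. The synthetic series, the function names, and the choice of linear detrending are assumptions made for illustration only; they are not taken from the manuscript, which may separate forced and initialised skill differently.

```python
def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def linear_detrend(series):
    """Remove the least-squares linear trend (fitted against the time index)."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    slope = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series)) / sum(
        (t - t_mean) ** 2 for t in range(n)
    )
    return [y - (y_mean + slope * (t - t_mean)) for t, y in enumerate(series)]


# Synthetic (hypothetical) data: both series share a warming trend,
# but their year-to-year variations are unrelated.
years = range(20)
observed = [1.0 * t + (-1) ** t for t in years]
hindcast = [1.0 * t + (1 if t % 4 < 2 else -1) for t in years]

raw_r = pearson(observed, hindcast)
detrended_r = pearson(linear_detrend(observed), linear_detrend(hindcast))

print(f"raw correlation:       {raw_r:.2f}")
print(f"detrended correlation: {detrended_r:.2f}")
```

In this toy case the raw correlation is high only because of the shared trend, while the detrended correlation is close to zero, which is the situation the referee asks the authors to separate from genuine initialised skill.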
Minor Notes:
- Figure captions could benefit from clearer labeling and direct interpretation; currently, they are overly technical.
- l.87 add “as” (… we define months with drought conditions as all months …)
- Several commas are missing throughout the text
- Geographical areas are not always written in the same way (for example, the word “northern” is sometimes written in lower case and sometimes in upper case)
- l.238 delete “seen”
Citation: https://doi.org/10.5194/egusphere-2025-940-RC1
AC1: 'Reply on RC1', Alvise Aranyossy, 14 Sep 2025
On behalf of all the authors of this manuscript, I would like to take this occasion to thank the reviewer for taking the time to read the draft and for the valuable suggestions put forward.
I attached the document with the comments and our responses.
CC1: 'Comment on egusphere-2025-940', Rhea Gaur, 06 Jun 2025
This paper argues that assessing decadal climate predictions is essential to providing reliable information on hot-dry compound extremes because of their potential impacts on environments and societies. It notes that previous research at this timescale has focused mostly on climate variables or on the prediction of univariate climate extremes, and the study aims to fill that gap for hot-dry compound extremes. It evaluates the ability of the CMIP6 multi-model decadal climate hindcasts to predict hot-dry climate extremes, as well as their hot and dry univariate counterparts, for forecast years 2-5; investigates the added value of model initialization by comparing the forecasts to historical simulations; and compares the modeled correlations between compound and univariate extremes with the observation-based datasets ERA5 and GPCC-BEST (as referred to in the paper).
While the paper addresses a relevant gap and performs a worthwhile correlational analysis, much revision is required to make this study well-explained and well-interpreted.
Major Points -- Scientific Methods
Strengths:
- Two different CMIP6 experiment types (DCPP MME and Hist MME) are a meaningful way to explore the skill contributions for initialization and external forcing.
- Good choice to use multiple reference datasets (GPCC-BEST and ERA5).
Critiques:
- Model initialization details should be discussed (e.g. initialized over land/sea). Model selection and some information/justification concerning the models should be mentioned instead of simply referring to the appendix table.
- The potential oversimplification introduced by all of the ensemble averaging should be discussed further.
- "Observational uncertainty" is only mentioned in passing; discussion is needed, at the very least to explain the generally higher agreement with GPCC-BEST than with ERA5.
- The compound extremes definition is somewhat rigid and may miss more complex co-occurrences. Consideration should be given to the simultaneity of event dynamics.
- A 3-month accumulation window is chosen; this is appropriate for meteorological drought but may not capture many other drought timescales. The sensitivity to the accumulation period should be discussed.
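The sensitivity check requested in the critiques above could look like the following sketch, which counts compound hot-dry months for several accumulation windows and dryness thresholds. Everything here is an assumption for illustration: the 12-month series are invented, the hot-month flags stand in for a TX90p-based criterion, and a simple z-score of accumulated precipitation replaces a properly fitted SPI or SPEI. It is not the manuscript's index calculation.

```python
def rolling_sum(values, window):
    """Trailing sums over `window` months; the first window-1 entries are None."""
    out = [None] * len(values)
    for i in range(window - 1, len(values)):
        out[i] = sum(values[i - window + 1 : i + 1])
    return out


def zscore(values):
    """Standardise the non-missing entries (a crude stand-in for a fitted SPI)."""
    valid = [v for v in values if v is not None]
    mean = sum(valid) / len(valid)
    std = (sum((v - mean) ** 2 for v in valid) / len(valid)) ** 0.5
    return [None if v is None else (v - mean) / std for v in values]


def count_compound(hot_month, dry_index, dry_threshold):
    """Count months flagged hot whose dryness index is at or below the threshold."""
    return sum(
        1
        for hot, idx in zip(hot_month, dry_index)
        if hot and idx is not None and idx <= dry_threshold
    )


# Hypothetical monthly precipitation (mm) and hot-month flags.
precip = [80, 60, 20, 10, 5, 15, 70, 90, 85, 30, 10, 5]
hot_month = [False, False, True, True, True, False,
             False, False, False, True, True, True]

for window in (1, 3, 6):
    index = zscore(rolling_sum(precip, window))
    for threshold in (-0.5, -1.0):
        n = count_compound(hot_month, index, threshold)
        print(f"window={window} threshold={threshold}: {n} compound months")
```

If the count of compound months changes strongly across windows or thresholds, conclusions drawn from a single definition would need to be qualified accordingly.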
Major Points -- Explanations
Strengths:
- Model-dependent variability is mentioned.
Critiques:
- Explanation of indices calculation relies on references and is otherwise unclear/not justified (e.g. why use the analysis period as reference period for TX90p and are observation years used for the comparisons aligned with forecast years 2-5?). Should include some explanation similar to references (e.g. for SPI/SPEI calculation and standardization processes).
- Methodological details feel condensed or fragmented:
- It is mentioned that PET is calculated using the Hargreaves method, but no justification is provided.
- Description of percentile estimation using a 5-day running window is noted but how seasons are handled should be discussed.
- Missing data mentioned in figure captions but not discussed.
- A statement about why hot-wet compound extremes are not discussed would strengthen the focus of this analysis.
- There is mention in the introduction that dependence among univariate variables of a compound extreme can decrease the return period of such events but this is not discussed or mentioned again any further.
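For readers unfamiliar with the PET method named in the critiques above, the widely used Hargreaves-Samani (1985) form needs only daily minimum and maximum temperature plus extraterrestrial radiation, which is one common reason for choosing it when radiation, humidity, and wind data are unavailable. The sketch below assumes this standard form; whether the manuscript uses exactly this variant is an assumption, and the function name and input values are hypothetical.

```python
import math


def hargreaves_pet(tmin_c, tmax_c, ra_mm_per_day):
    """Daily PET (mm/day) from the Hargreaves-Samani (1985) temperature-based formula.

    ra_mm_per_day is extraterrestrial radiation expressed as equivalent
    evaporation (mm/day); computing it from latitude and day of year is omitted.
    """
    if tmax_c < tmin_c:
        raise ValueError("tmax_c must be >= tmin_c")
    tmean_c = (tmax_c + tmin_c) / 2
    return 0.0023 * ra_mm_per_day * (tmean_c + 17.8) * math.sqrt(tmax_c - tmin_c)


# Hypothetical warm-season day.
pet = hargreaves_pet(tmin_c=15.0, tmax_c=29.0, ra_mm_per_day=15.0)
print(f"PET = {pet:.2f} mm/day")
```

Because the formula depends so strongly on temperature, an SPEI built on it inherits part of the warming trend, which is relevant to the referees' concern about trend-driven skill.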
Major Points -- Conclusions/Interpretation of Results
A general comment about the statement of results: overall it reads a bit as a list, with a lack of sufficient meaningful connections made or real reasons explored for greater "skill" in particular regions compared to others.
Critiques:
- The minimal added skill from initialization is important, if a real artifact, but under-discussed -- what does this imply for the use of decadal forecasts in operational contexts?
- The regional and seasonal variations are mentioned as future work but some exploration into this would give more content/meaning to the study. A preliminary investigation would be interesting and relevant given the spatial nature of the data and existing literature.
- Statements about better/under-performing regions are made but connections between them and between areas for similar analyses should be explored. These really should be accompanied by considerations of the frequency and intensity of these extreme events in those areas (e.g. high skill in California important for multi-day droughts in the region).
Major Points -- Recommendations for Further Discussion/Improvement
- Aforementioned justifications/clarifications should be included.
- Schematic for event selection (compound index calculation) would enhance clarity.
- Consideration of other compound event types, such as humid heatwaves or lagged dependences (or an explanation of why these specific events were chosen), would expand the framework's applicability.
- The socio-economic impacts should be mentioned and discussed, since the motivation centers on "high-impact" extremes but impacts are never discussed again. Linking the compound events to impact datasets could strengthen the relevance and make the closing statement more convincing.
- Specific mentions of events should be made rather than "compound extremes during 2003, 2010, 2015 and 2018 in Europe stand as an example".
- Sensitivity analysis on SPI/SPEI accumulation periods could reveal the robustness of event definitions.
- Skillful areas should be connected to real examples to emphasize the importance of the system's performance.
- Similarly, discussion of how lack of skill in extreme-prone areas is a major limitation (e.g. North Africa where certain types of hot-dry events are more common).
Minor Points
- Figures should be made larger and less crowded.
- A small point: the discussion contains too many references to the appendix, mostly for ERA5 results that are not presented in the main body.
- Bias correction and model drift should be noted. This is mentioned in one of the authors' other papers referred to in the Data and Methods section.
Grammatical/Written Structure
- Generally clear, scientific tone and use of domain-specific terminology.
- Good integration of citations.
- Minor spelling/grammatical errors.
- Incorrect figure references (e.g. "Figure 2d" instead of correct "Figure 2c").
- The text contains many run-on sentences, which can be very confusing and can seem to contradict statements made before or after.
- Paragraph transitions need to be made smoother and often miss key points -- topic sentences should be prominent with a clear message while more concluding sentences need to be added to complete the idea of the paragraph.
Citation: https://doi.org/10.5194/egusphere-2025-940-CC1
AC3: 'Reply on CC1', Alvise Aranyossy, 14 Sep 2025
RC2: 'Comment on egusphere-2025-940', Anonymous Referee #2, 15 Jul 2025
First and foremost, I would like to mention that researching compound extremes in the context of decadal climate prediction is a valuable contribution to the field of inter-annual predictions. The paper presents the added value of initialized climate predictions well, but also shows that this added value is limited to specific variables and regions.
Typos:
4 - „IN this regard“
145 - circumscribed - is there a better word possible, or just leave it out?
General remarks:
0 - Is the code with which the results are produced available publicly?
70 - Did you apply any bias correction/calibration to the MME or to the individual models? If not, why? Wouldn't a non-linear calibration add value to the model outputs?
107 - The authors focus their study on lead times 2-5, which is in my opinion a fair choice. Could you give a reason why you specifically chose this lead time? And, if available, please present the findings for other lead times in the way you did for the comparison between GPCC-BEST/ERA5. Decadal predictions extend out to ten years, so I would be curious what happens to the signal of the initialization in this MME context.
135 - In a production scenario, if you only had time to use one reference data set, would you recommend using gridded observations or reanalysis?
149 - Given that interpreting the skill found in certain regions is difficult, I'm curious why Greenland and "Central Asia" stand out as showing increased skill due to initialization. Do you have any idea why that might be?
Citation: https://doi.org/10.5194/egusphere-2025-940-RC2
AC2: 'Reply on RC2', Alvise Aranyossy, 14 Sep 2025
I would like to thank, on behalf of all the authors of this study, the reviewer for their time in reviewing the draft and for their valuable suggestions to improve it.
I have attached to this reply the document containing the reviewer's comments and our responses.