Collective risk modelling for understanding the correlation between multi-peril accumulated losses

Jones, Toby P.; Stephenson, David B.; Priestley, Matthew D. K.

doi:10.5194/egusphere-2025-3031

Preprints

https://doi.org/10.5194/egusphere-2025-3031

27 Jun 2025

Abstract. Hazards such as storms can create multiple perils, such as windstorms and floods, that have correlated annual losses. To better understand the drivers of such correlations, this study explores three collective risk frameworks with varying complexity.

Mathematical expressions are derived explaining how this correlation depends on parameters such as event dispersion (clustering), and the joint distribution of the two hazard variables. Hazard variables are first assumed independent, inducing a positive correlation due to the shared positive dependence on the total number of events. The next framework allows for correlation between the hazard variables, which can then capture negative correlation between accumulated losses. The final framework builds on this by allowing for between-year correlation caused by interannual modulation of the hazard variables.
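The first two frameworks described above can be sketched numerically. The example below is a minimal illustration, not the authors' implementation: it assumes annual totals S_X and S_Y are sums over N events of hazard pairs (X_i, Y_i) that are identically distributed, independent between events, and independent of N. Under those assumptions the law of total covariance gives Cov(S_X, S_Y) = E[N] Cov(X, Y) + Var(N) E[X] E[Y]. The function names, the unit-variance bivariate normal hazards, and all parameter values are hypothetical choices made for illustration only.

```python
import numpy as np


def theoretical_corr(mean_n, var_n, mu_x, mu_y, sd_x, sd_y, rho_xy):
    """Correlation of annual totals implied by the law of total covariance."""
    cov = mean_n * rho_xy * sd_x * sd_y + var_n * mu_x * mu_y
    var_sx = mean_n * sd_x**2 + var_n * mu_x**2
    var_sy = mean_n * sd_y**2 + var_n * mu_y**2
    return cov / np.sqrt(var_sx * var_sy)


def simulated_corr(mean_n, var_n, mu_x, mu_y, rho_xy, n_years=20000, seed=0):
    """Monte Carlo estimate using unit-variance bivariate normal hazards (an assumption)."""
    rng = np.random.default_rng(seed)
    if var_n > mean_n:
        # Overdispersed (clustered) counts: negative binomial matched to mean and variance.
        counts = rng.negative_binomial(
            mean_n**2 / (var_n - mean_n), mean_n / var_n, size=n_years
        )
    else:
        # Poisson counts, for which Var(N) = E[N]; pass var_n == mean_n in this case.
        counts = rng.poisson(mean_n, size=n_years)
    # Label every event with the year it belongs to, then sum hazards per year.
    year = np.repeat(np.arange(n_years), counts)
    cov = [[1.0, rho_xy], [rho_xy, 1.0]]
    xy = rng.multivariate_normal([mu_x, mu_y], cov, size=year.size)
    s_x = np.bincount(year, weights=xy[:, 0], minlength=n_years)
    s_y = np.bincount(year, weights=xy[:, 1], minlength=n_years)
    return np.corrcoef(s_x, s_y)[0, 1]


if __name__ == "__main__":
    # Independent hazards with clustered counts: correlation is positive.
    print(theoretical_corr(10, 20, 2, 2, 1, 1, 0.0))
    print(simulated_corr(10, 20, 2, 2, 0.0))
    # Strongly anti-correlated hazards with small means: correlation turns negative.
    print(theoretical_corr(10, 10, 0.5, 0.5, 1, 1, -0.9))
```

With independent hazards (rho_xy = 0) the covariance reduces to Var(N) E[X] E[Y], which is positive whenever both means are positive, so the shared event count alone induces positive correlation. A negative correlation between annual totals requires the term E[N] Cov(X, Y) to outweigh that count-driven term.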

These frameworks are illustrated using European windstorm gust speeds and precipitation reanalyses from 1980–2000. They are used to diagnose why the correlation between annual wind and precipitation severity indices decreases as thresholds are increased. Only the framework with interannual modulation of the hazard variables quantitatively captures the negative correlations over Europe at high thresholds. We propose that one plausible driver for the modulation is the transit time that storms spend near locations.
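The effect of interannual modulation can be illustrated with a small simulation. This is a sketch under stated assumptions, not the framework as specified in the paper: a hypothetical yearly latent variable z (standing in for a driver such as storm transit time) shifts the mean of one hazard up and the mean of the other down, while the hazards are independent within a year given z. The function name, the normal distributions, the Poisson counts, and the direction and size of the modulation are all illustrative choices.

```python
import numpy as np


def annual_corr_with_modulation(amplitude, mean_n=10.0, mu=2.0,
                                n_years=20000, seed=1):
    """Correlation of annual totals when a yearly latent z modulates hazard means."""
    rng = np.random.default_rng(seed)
    counts = rng.poisson(mean_n, size=n_years)   # events per year
    z = rng.standard_normal(n_years)             # hypothetical yearly modulator
    # Label every event with the year it belongs to.
    year = np.repeat(np.arange(n_years), counts)
    # Within a year, hazards are independent given z; z shifts their means
    # in opposite directions (direction chosen arbitrarily for illustration).
    x = rng.normal(mu + amplitude * z[year], 1.0)
    y = rng.normal(mu - amplitude * z[year], 1.0)
    s_x = np.bincount(year, weights=x, minlength=n_years)
    s_y = np.bincount(year, weights=y, minlength=n_years)
    return np.corrcoef(s_x, s_y)[0, 1]


if __name__ == "__main__":
    print(annual_corr_with_modulation(0.0))  # no modulation
    print(annual_corr_with_modulation(1.0))  # strong opposing modulation
```

With amplitude 0 the only link between the two totals is the shared event count, so the correlation is positive. With a sufficiently large amplitude the opposing year-to-year shifts dominate and the correlation between annual totals becomes negative, even though the hazards are independent within each year.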

Received: 25 Jun 2025 – Discussion started: 27 Jun 2025

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: closed

RC1:
'Comment on egusphere-2025-3031', Anonymous Referee #1, 29 Jul 2025

This paper explores the correlation between wind and precipitation at different thresholds using three different frameworks. It aims to understand what may drive the negative correlation at higher thresholds. I think the paper is well written, and there are some interesting findings presented. However, I would like to raise some key points of concern that the authors can address during this review process.
Title and terminology
I was intrigued by the title of the paper, which states that it investigates multi-peril accumulated losses. I was therefore surprised to find that no loss data are in fact used in the paper. The paper only looks at ‘severity’ based purely on ERA5 wind and precipitation data. ‘Losses’ may lead the reader to expect a more quantifiable impact, such as damages, or a loss function that includes some form of vulnerability or exposure. I understand that the authors are using severity as a proxy for loss; however, the hazard already seems to be a proxy for severity through the use of a function. Perhaps the authors could consider changing the word ‘losses’ to ‘severity’ to avoid further confusion.
What makes the correlation “correct”
In Line 173, the authors mention that “negative correlation over the northwest of mainland Europe is correctly captured” and only framework C is able to capture the correlation at each threshold. It seems that it is ‘correct’ because it matches the sample correlation? What makes the sample correlation from Jones et al. 2024 ‘correct’?
Data selection for the study
I wonder why only data from 1980–2000 have been investigated, when data are available up to the present. The authors state that data prior to 1980 were not included because of data quality; however, this should not be an issue for recent data. Does the cutoff in 2000 mean that the present-day climate is not reflected in the results?
Significance of the research
The introduction of the paper reads very well and highlights the general need for the proposed research. I find the findings on the negative correlations and on the difference between storms of short and long duration interesting. However, as a reader I am left wondering exactly why these findings are important. Who are they relevant for? How might these results improve our multi-peril risk management? Additionally, a negative correlation between wind and rain can already be deduced from Figure 5, which also provides enough insight into the relationship between hazard intensity and duration. Can the authors explain the need for extensively testing the three frameworks that are presented as the main focus of the paper?

Citation: https://doi.org/10.5194/egusphere-2025-3031-RC1
- AC3: 'Reply on RC1', Toby Jones, 12 Sep 2025
  
  We would like to thank the reviewer for their helpful comments on our manuscript. Please see attached PDF for our response.
  
  Citation: https://doi.org/10.5194/egusphere-2025-3031-AC3
RC2:
'Comment on egusphere-2025-3031', Anonymous Referee #2, 30 Jul 2025

Please see attached pdf.

Citation: https://doi.org/10.5194/egusphere-2025-3031-RC2
- AC1: 'Reply on RC2', Toby Jones, 12 Sep 2025
  
  We would like to thank the reviewer for their helpful comments on our manuscript. Please see attached PDF for our response.
  
  Citation: https://doi.org/10.5194/egusphere-2025-3031-AC1
RC3:
'Comment on egusphere-2025-3031', Anonymous Referee #3, 08 Aug 2025
This paper presents a novel and very interesting approach that can diagnose the drivers of correlations between hazards and quantify their relative influence. The approach is applied to annual wind and rainfall severity indices across Europe and demonstrates the capability to reproduce their correlation by accounting for three components including within-year dependence between hazards, the dispersion in the annual counts of events, and interannual dependency between seasonally accumulated hazards. This paper provides a timely and practical contribution to the literature that builds on previous research and provides a tool that should open an avenue for future research and enable our understanding of multivariate extreme events. The paper is generally very well written and I have only minor comments/suggestions to add. I would be very pleased to recommend this paper for publication once these comments are addressed.
Methodology section:
I think this section would benefit from some clarification and simplification. The description of the approach is quite mathematical and may be difficult for some readers of the journal to follow. Given the potential value of this method as a tool for the wider community, simplifying certain explanations could encourage broader uptake. These are mostly suggestions, though there are places where I feel I could not replicate the method without making assumptions. I leave it to the authors to decide whether these changes would strengthen the text.
P5 L108-111, the subscript j is introduced without explanation. Furthermore, why use Cov() when var(X) is used later on L126? Although they are interchangeable, using only one would be consistent and enhance readability.
P5, L113-114: For the correlation , should this not be ? Likewise for P5, L131. I perceive as the correlation of one pair of X and Y values which is obviously not the case. I assume that the correlation is calculated between all X and Y values? Furthermore, it would be helpful to use , to indicate individual events where appropriate, and X and Y to indicate the random variables.
P5, L114: Which measure of correlation is used?
P5, L125: Please clarify how and are calculated. I assume it is via the same equations on P6, L138 without conditioning on Z.
P6, L136-139: I suggest providing some plain explanation on what these terms represent.
Minor comments:
P6 L146-148: Could you clarify what you mean by ‘smaller scale features’? ERA5 is unlikely to reproduce certain features such as sting jets and does not resolve convective rainfall, which are the features I would consider smaller scale. Also, I would double check the cited papers here as I don’t see a mention of ERA5 within them, and one paper predates ERA5.
P6 L156: Do you mean the maximum 3-second wind gust instead of the ‘3-second maximum’?
P6 L160: Please clarify which measure of correlation is used.
P6 L163: Are the thresholds applied to the event metrics and not the hourly?
Figure 2: Could you include the motivation for using absolute thresholds instead of percentile thresholds? Also, you mention in the conclusions that the results are insensitive to this choice (P13 L246-247), but it would be informative to provide a map in the supplementary material showing which percentile the absolute thresholds used here correspond to. I can’t tell whether 10 mm or 20 mm in a storm is extreme or not.
P8 L176: Framework C indicates the presence of a distinct land-sea contrast in the correlations for joint exceedance of 20 m s^-1 and 20 mm, while there are indications of this in the sample correlation. Can you comment on why we see the land-sea contrast and why the sample correlations are noisier?
P8 L180: Could you clarify what is meant by ‘simultaneous correlation’? Do you mean the local correlation at each grid cell?
P8 L187-188: Should 3g and 3i be 3d and 3f?
P8 L197-198: I think there is a typo here: “The positive within year dependency component (solid thin line) is largely compensated at all temperature thresholds by the negative within year dependency component”, i.e. “within year dependency component” appears twice.
Figure 4 and 5:
The negative within-year correlations (dashed line) appear much stronger over this area of France than the grid-cell correlations shown in Figure 3d–f, which are close to zero across the domain. Could you comment on why this might be the case? Does the larger domain introduce spatial correlations between wind and rainfall? This is an interesting result and may relate to the findings of Manning et al. (2024), already cited here, given the cancelling effect on the dispersion component. For example, one might expect a negative spatial correlation on an event basis, as the highest rainfall typically occurs to the northeast of a cyclone centre, while the strongest winds occur to the south. This spatial separation could explain the cancelling influence on N.

You see a similar effect of the relative positioning of wind and rainfall extremes in cyclones in Figures 5b and 5c. Most wind-event tracks pass to the north of the domain, exposing a larger portion of the domain to the part of the cyclone that generates strong winds. Conversely, rainfall-extreme tracks tend to lie farther south, meaning a greater part of the domain overlaps with the cyclone sector producing heavy rainfall. See Figure 3 in Manning et al. (2024) and related discussion that shows similar results.

Overall, I think the manuscript would benefit from further discussion linking the statistical results to the underlying physical processes, as well as examining the sensitivity of the findings to domain size (e.g., grid cells in Figure 3 versus the larger domains in Figures 4 and 5). The paper presents interesting results on the propagation speed of systems, but I believe there are further nuances to discuss, such as storm positioning, as noted above. It would also be valuable to extend the discussion to consider the influence of the jet stream, which is a dominant driver of propagation speed.
Title: I suggest amending the title to highlight its application to wind and rainfall extremes. While the paper presents an excellent tool, it also makes a valuable contribution to the understanding of multivariate wind and rainfall extremes which might otherwise be overlooked.
Citation: https://doi.org/10.5194/egusphere-2025-3031-RC3
- AC2: 'Reply on RC3', Toby Jones, 12 Sep 2025
  
  We would like to thank the reviewer for their helpful comments on our manuscript. Please see attached PDF for our response.
  
  Citation: https://doi.org/10.5194/egusphere-2025-3031-AC2

Viewed

Total article views: 1,125 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
950	143	32	1,125	24	37

Views and downloads (calculated since 27 Jun 2025)

Month	HTML	PDF	XML	Total
Jun 2025	68	8	3	79
Jul 2025	81	16	9	106
Aug 2025	188	23	1	212
Sep 2025	471	17	5	493
Oct 2025	46	25	6	77
Nov 2025	34	18	3	55
Dec 2025	31	21	5	57
Jan 2026	31	15	0	46

Viewed (geographical distribution)

Total article views: 1,113 (including HTML, PDF, and XML) Thereof 1,113 with geography defined and 0 with unknown origin.

Latest update: 15 Jan 2026

Short summary

Some hazards bring multiple perils, meaning their yearly losses are correlated. For example, storms cause losses from both wind and rain damage each year. This study explores three models for understanding the drivers of the relationship between these yearly losses. The models can be applied to other hazards, but this study focuses on understanding drivers of wind and rain from windstorms. Storm duration near a location is important, having a positive effect on wind speed and a negative effect on rainfall.
