Warnings based on risk matrices: a coherent framework with consistent evaluation
Abstract. Risk matrices are widely used across a range of fields and have found increasing utility in warning decision practices globally. However, their application in this context presents challenges, which range from potentially perverse warning outcomes to a lack of objective verification (i.e., evaluation) methods. This paper introduces a coherent framework for generating multi-level warnings from risk matrices to address these challenges. The proposed framework is general, is based on probabilistic forecasts of hazard severity or impact and is compatible with the Common Alerting Protocol (CAP). Moreover, it includes a family of consistent scoring functions for objectively evaluating the predictive performance of risk matrix assessments and the warnings they produce. These scoring functions enable the ranking of forecasters or warning systems and the tracking of system improvements by rewarding accurate probabilistic forecasts and compliance with warning service directives. A synthetic experiment demonstrates the efficacy of these scoring functions, while the framework is illustrated through warnings for heavy rainfall based on operational ensemble prediction system forecasts for Tropical Cyclone Jasper (Queensland, Australia, 2023). This work establishes a robust foundation for enhancing the reliability and verifiability of risk-based warning systems.
Status: closed
RC1: 'Comment on egusphere-2025-323', Samar Momin, 12 Apr 2025
General Comments:
This paper introduces a mathematically rigorous framework for issuing and evaluating multi-level warnings derived from risk matrices. It addresses critical weaknesses in current risk matrix-based warning systems, such as inconsistency, lack of objectivity, and absence of formal verification mechanisms. The framework is probabilistic, hazard-agnostic, and compatible with the Common Alerting Protocol (CAP), making it widely applicable in disaster risk management.
The manuscript is technically strong, well-written, and well-structured. It clearly explains the conceptual foundation and mathematical formulation, with practical examples and synthetic experiments demonstrating real-world and theoretical robustness, and provides an open-source Python-based code.
Strengths:
1. Innovation and Relevance:
The paper presents a coherent warning framework that resolves known inconsistencies in traditional risk matrices. The risk matrix score and warning score are introduced as consistent, theoretically grounded methods for evaluation.
2. Operational Usability:
The framework is flexible and compatible with real-time systems (e.g., CAP-based alerting), and can be applied across hazards and domains.
3. Synthetic Experiment and Case Study:
The use of six distinct synthetic forecasters in a probabilistic setup illustrates the scoring method’s discriminative power. The Tropical Cyclone Jasper case study shows practical feasibility in a high-impact, real-world scenario.
4. Clarity and Depth:
The manuscript does an excellent job explaining the logic behind severity-certainty structuring, lead-time sensitivity, and score weighting using realistic examples.
5. Open-Source Tooling:
Providing a Python implementation in the scores package adds major value and supports reproducibility.
Specific Comments:
1. Terminology and Framing:
While the mathematical rigor is a strength, early sections could benefit from briefly reinforcing why these inconsistencies in risk matrices matter for public safety and policy credibility. Consider simplifying the initial explanation of “forecast directive” and “warning directive” for non-technical readers.
2. Comparison with Existing Systems:
The distinction from the UK Met Office (UKMO) and other operational frameworks is clear, but it might help to include a side-by-side visual comparison in an appendix or supplementary material (if possible).
3. Evaluation Weights:
The method for deriving weights from stakeholder input (e.g., community consultation on false alarm vs. miss costs) is strong. However, a brief reflection on the subjectivity and variability in such consultations would add depth.
4. Scalability to Multi-Hazard Systems:
Although the framework is hazard-agnostic, a discussion of how it could scale or adapt to multi-hazard interactions (e.g., flood + wind) would strengthen its applicability. In addition, it would be helpful to shed light on how this framework could be extended to earthquake hazards, given their growing frequency (if possible).
5. Lead Time Scaling:
The use of distinct matrices for LONG-, MID-, and SHORT-range phases is excellent. It would be helpful to mention how this could be dynamically updated as new ensemble data arrives.
Citation: https://doi.org/10.5194/egusphere-2025-323-RC1
AC1: 'Reply on RC1', Robert Taggart, 06 May 2025
Thank you for taking the time to review the manuscript and provide critical feedback.
Below, we reproduce your comments/suggestions in bold font, followed by our response in non-bold font. Italicized text indicates proposed additional material that will be inserted into the revised manuscript.
While the mathematical rigor is a strength, early sections could benefit from briefly reinforcing why these inconsistencies in risk matrices matter for public safety and policy credibility. Consider simplifying the initial explanation of “forecast directive” and “warning directive” for non-technical readers.
Thanks for this suggestion. We will add a couple of extra sentences in the manuscript to help orientate the reader with the "forecast directive" terminology. When the term "forecast directive" is first introduced (L25), we will give the following simple example:
For example, a forecast directive for a warning service for damaging wind gusts might be "Issue a warning if and only if the probability of a wind gust exceeding 90 km/h is at least 10%".
We will also elaborate on why directives are important by inserting an additional sentence after the existing sentence starting at L43:
When the warning decision process lacks adequate definition, two forecasters with identical probabilistic assessments of the hazard could issue two different warning levels. This may lead to warning messages that fluctuate unnecessarily, compromising both public safety and service credibility.
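To make the directive example above concrete, here is a minimal Python sketch of such a rule. The function name and the estimation of the exceedance probability as an ensemble member fraction are illustrative assumptions, not part of the manuscript or of the scores package API.

```python
import numpy as np

def wind_gust_directive(ensemble_gusts_kmh, threshold_kmh=90.0, min_prob=0.10):
    # Estimate the exceedance probability as the fraction of ensemble
    # members whose gust forecast exceeds the severity threshold.
    prob_exceed = np.mean(np.asarray(ensemble_gusts_kmh) > threshold_kmh)
    # The directive: warn if and only if that probability is at least min_prob.
    return bool(prob_exceed >= min_prob)

# 2 of 10 members exceed 90 km/h, so p = 0.2 >= 0.1 and a warning is issued.
ensemble = [62, 71, 78, 80, 83, 85, 88, 89, 95, 102]
print(wind_gust_directive(ensemble))  # True
```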
The distinction from the UK Met Office (UKMO) and other operational frameworks is clear, but it might help to include a side-by-side visual comparison in an appendix or supplementary material (if possible).
We think implementing this suggestion will be helpful for the reader. We have prepared a side-by-side visual comparison which fits naturally in Section 2.2 as a new figure. The text of Section 2.2 will also be updated to reference this visual comparison.
The method for deriving weights from stakeholder input (e.g., community consultation on false alarm vs. miss costs) is strong. However, a brief reflection on the subjectivity and variability in such consultations would add depth.
We believe that a detailed discussion of this is beyond the scope of the current work. However, we will include a brief sentence at the end of Section 3.1 to note that the development of our framework motivates further research:
Although the process for determining weights in this fictitious flood example was presented straightforwardly, this framework motivates further research into developing best practices for eliciting thresholds and weights through stakeholder consultation.
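As a purely schematic illustration of how elicited weights could enter an evaluation (this is not the risk matrix score or warning score defined in the manuscript), a per-case penalty might weight misses and false alarms asymmetrically. All names and numerical values below are hypothetical.

```python
def warning_penalty(warned, event_occurred, w_miss=4.0, w_false_alarm=1.0):
    # A miss (event occurs but no warning) is penalised w_miss; a false
    # alarm (warning but no event) is penalised w_false_alarm; correct
    # decisions score zero. The ratio w_miss / w_false_alarm encodes a
    # stakeholder judgement that misses are worse than false alarms.
    if event_occurred and not warned:
        return w_miss
    if warned and not event_occurred:
        return w_false_alarm
    return 0.0

# Mean penalty over four cases: hit, false alarm, miss, correct rejection.
cases = [(True, True), (True, False), (False, True), (False, False)]
print(sum(warning_penalty(w, o) for w, o in cases) / len(cases))  # 1.25
```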
Although the framework is hazard-agnostic, a discussion of how it could scale or adapt to multi-hazard interactions (e.g., flood + wind) would strengthen its applicability. In addition, it would be helpful to shed light on how this framework could be extended to earthquake hazards, given their growing frequency (if possible).
We will include the following comment near the end of Section 2.1 on the applicability of the framework to a generic index, which may account for multi-hazard interactions:
More generally, the framework could be applied to an index that itself represents complex multi-hazard interactions. An example of such an index is the Fire Behaviour Index (FBI) used in the Australian Fire Danger Ratings System (AFDRS), which combines weather and fuel state information to determine the severity of fire behaviour.
Although the framework is applicable to earthquake hazards, we believe it is not appropriate to discuss this in detail, as earthquakes lie outside the authors' area of expertise.
The use of distinct matrices for LONG-, MID-, and SHORT-range phases is excellent. It would be helpful to mention how this could be dynamically updated as new ensemble data arrives.
How the arrival of new ensemble data impacts the warning issuance process will depend on how each warning service is designed. Going into such details is beyond the scope of this manuscript, but they could be explored using concrete warning service examples in a follow-up paper. Nonetheless, we note here that there are at least two factors at play. One is where the lead-time phases are a function of the time until onset of the severe phenomena, and new ensemble data shifts the predicted time of onset sufficiently to change the phase. The other is where new ensemble data leads to a re-evaluation of the likelihood and/or severity of the phenomena, which may prompt an update of the warning based on pre-defined amendment criteria for the warning service.
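As a schematic illustration of the two factors described above, the following Python sketch shows one way a phase change or a re-assessed warning level could trigger an update. The phase boundaries and function names are placeholder assumptions, not values taken from the manuscript or any operational service.

```python
from datetime import timedelta

def lead_time_phase(time_to_onset,
                    short_range=timedelta(hours=24),
                    mid_range=timedelta(hours=72)):
    # Map the time remaining until hazard onset to a lead-time phase.
    # The 24 h and 72 h boundaries are placeholders only.
    if time_to_onset <= short_range:
        return "SHORT"
    if time_to_onset <= mid_range:
        return "MID"
    return "LONG"

def warning_needs_update(current_level, reassessed_level, old_phase, new_phase):
    # New ensemble data prompts an update if it shifts the lead-time phase
    # or changes the warning level implied by the risk matrix assessment.
    return new_phase != old_phase or reassessed_level != current_level

# New data brings onset forward from 30 h to 20 h: MID -> SHORT, so the
# warning is re-issued even though the assessed level is unchanged.
print(warning_needs_update(2, 2,
                           lead_time_phase(timedelta(hours=30)),
                           lead_time_phase(timedelta(hours=20))))  # True
```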
Citation: https://doi.org/10.5194/egusphere-2025-323-AC1
RC2: 'Comment on egusphere-2025-323', Anonymous Referee #2, 18 Apr 2025
This paper proposes a probabilistic framework for multi-level warnings based on risk matrices and illustrates it with an example for Tropical Cyclone Jasper. The paper is well organized and well written. It also provides open-source code and presents all the mathematical algorithms in the appendix, making the paper clear and concise.
Citation: https://doi.org/10.5194/egusphere-2025-323-RC2
AC2: 'Reply on RC2', Robert Taggart, 06 May 2025
Thank you for reading the manuscript and for your positive review.
Citation: https://doi.org/10.5194/egusphere-2025-323-AC2
RC3: 'Comment on egusphere-2025-323', Anonymous Referee #3, 20 Apr 2025
This is an excellent paper, which outlines an innovative method for presentation and evaluation of warning predictions. It is strongly based on theoretical concepts but also provides a methodology that is intuitive. I highly recommend publication in EGUsphere.
Citation: https://doi.org/10.5194/egusphere-2025-323-RC3
AC3: 'Reply on RC3', Robert Taggart, 06 May 2025
Thank you for reading the manuscript and for your positive review.
Citation: https://doi.org/10.5194/egusphere-2025-323-AC3
Data sets
Data and code for risk matrix score paper, Robert J. Taggart, http://doi.org/10.5281/zenodo.14668723
Model code and software
Data and code for risk matrix score paper, Robert J. Taggart, http://doi.org/10.5281/zenodo.14668723
Viewed
- HTML: 185
- PDF: 64
- XML: 15
- Total: 264
- BibTeX: 12
- EndNote: 21