Probabilistic and Machine Learning Methods for Uncertainty Quantification in Power Outage Prediction due to Extreme Events

Arora, Prateek; Ceferino, Luis

doi:https://doi.org/10.5194/egusphere-2022-975

Preprints

10 Oct 2022

Abstract. Strong hurricane winds damage power grids and cause cascading power failures. Statistical and machine learning models have been proposed to predict the extent of power disruptions due to hurricanes. Existing outage models use inputs that include power system information and environmental and demographic parameters. This paper reviews the existing power outage models, highlighting their strengths and limitations. Existing models were developed and validated with data from a few utility companies and regions, which limits their applicability across geographies and hurricane events. In contrast, we train and validate these existing outage models using power outage data for multiple regions and hurricanes, including Hurricanes Harvey (2017), Michael (2018), and Isaias (2020), in 1,833 cities along the U.S. coastline. The dataset includes outage data from 39 utility companies in Texas, 5 in Florida, 5 in New Jersey, and 11 in New York. We discuss the limited ability of state-of-the-art machine learning models to (1) make bounded outage predictions, (2) extrapolate predictions to high winds, and (3) account for physics-informed outage uncertainties at low and high winds. For example, we observe that existing models can predict outages as high as 25 times more than the number of customers and cannot adequately capture the outage variance for wind speeds over 70 m/s. Finally, we present a Beta regression outage modeling framework to address the shortcomings of existing power outage models.
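
As a rough illustration of points (1) and (3), the sketch below uses the mean-precision parameterization commonly used for Beta regression. It is not the authors' fitted model: the logit link, the coefficients `b0` and `b1`, the precision `phi`, and the function names are hypothetical choices made only for this example, with wind speed assumed to be in m/s.

```python
import math


def logistic(x):
    """Map any real number to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def outage_fraction_moments(wind_speed, b0=-6.0, b1=0.12, phi=20.0):
    """Mean and variance of a Beta-distributed outage fraction.

    b0, b1, and phi are hypothetical, illustrative values (not fitted).
    """
    mu = logistic(b0 + b1 * wind_speed)       # mean fraction, always in (0, 1)
    variance = mu * (1.0 - mu) / (1.0 + phi)  # Beta variance, mean-precision form
    return mu, variance


def beta_shape_parameters(mu, phi):
    """Convert (mean, precision) to the usual Beta(alpha, beta) shapes."""
    return mu * phi, (1.0 - mu) * phi


if __name__ == "__main__":
    for wind in (10.0, 50.0, 90.0):
        mu, var = outage_fraction_moments(wind)
        print(f"wind={wind:5.1f} m/s  mean={mu:.4f}  variance={var:.6f}")
```

Under these placeholder values, the predicted mean fraction of customers without power stays strictly between 0 and 1 at every wind speed, so the implied outage count cannot exceed the number of customers. The variance is largest at mid-range winds and shrinks toward zero at both low and high winds.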

Received: 22 Sep 2022 – Discussion started: 10 Oct 2022

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Download & links

Preprint (PDF, 1480 KB)

Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Supplement (832 KB)

Journal article(s) based on this preprint

03 May 2023

Probabilistic and machine learning methods for uncertainty quantification in power outage prediction due to extreme events

Prateek Arora and Luis Ceferino

Nat. Hazards Earth Syst. Sci., 23, 1665–1683, https://doi.org/10.5194/nhess-23-1665-2023, 2023

Interactive discussion

Status: closed

RC1: 'Comment on egusphere-2022-975', Anonymous Referee #1, 11 Nov 2022

Review for “Probabilistic and Machine Learning Methods for Uncertainty Quantification in Power Outage Prediction due to Extreme Events”

This is a comprehensive work that compares different machine learning methods. I find the methods and presentation solid, and the authors did a nice job of summarizing the substantial work they have completed. However, some critical information is missing. Namely, the manuscript needs a comparison of the performance of all ML models on separate testing data, because it would show whether the models overfit and how they perform on data they have never encountered. The authors also need to explain in more detail how their model input data are obtained and what uncertainties are associated with them. Based on these points, I suggest a major revision of this version. Please see the detailed suggestions below.

Line 34, “including hurricane”: is this the wind field or something else? Please clarify.

Uncertainty in the poweroutage.us data.

Lines 105 to 111: what are the possible uncertainties in interpolating all covariates to the city scale? Which interpolation method is used? Please be specific.

Line 113: number of outages, is it the same as customers without power?

Line 125: Uncertainty in wind speed estimates since the sizes of cities vary.

Line 147: Which rescaling technique is used for this one?

Line 175: Please fix the citation.

Table 2: how to interpret the difference between R2DEV and R2ψ?

Why is random forest used only for the fraction of customers without power? Is the number of power outages not suitable for the RF algorithm?

Figure 5: what are R2 and other error statistics in the holdout test? It will be helpful to report them in the same figure.

Do you have any prediction vs observation plot for the RF model like Figure 5?

Section 8.2: there is a heavy discussion of how winds control power outages in the models. However, the relationship between precipitation and power outages is not shown, even though precipitation is the second most important variable in the RF model. You have demonstrated some nonlinear relationships between wind speed and power outage fraction. It is therefore worthwhile to show precipitation’s relationship to outage fraction, or to show precipitation and wind jointly with the outage prediction in a separate partial dependence plot. That may explain some of the nonlinear relationships in Figure 7b.

Section 9: the authors mention that beta regression may perform better, but no comparison is made with the previous methods. I suggest shortening the arguments after line 454 because there is no evidence in the paper supporting them.

Line 469 to 470, unlike linear models, RF does not have the assumption of non-collinearity.

Citation: https://doi.org/10.5194/egusphere-2022-975-RC1
- AC1: 'Reply on RC1', Prateek Arora, 19 Jan 2023
  
  Dear Reviewer,
  We are grateful for your review and comments. Based on your suggestions, we have made changes to the manuscript.
  Please find attached detailed response to your comments.
  
  Citation: https://doi.org/10.5194/egusphere-2022-975-AC1
RC2: 'Comment on egusphere-2022-975', Anonymous Referee #2, 15 Dec 2022

This paper investigates the limitations of existing power outage models, including bounded prediction, out-of-distribution prediction, and physics-aware uncertainties. The authors found that some of the existing state-of-the-art models may generate unrealistic predictions and cannot generalize well to extreme events that are not sufficiently represented in the training datasets. The authors discuss some potential ways to address the shortcomings of these models. I have some major comments that the authors need to address before publication:

1. The problems mentioned by the authors, including limited generalization ability, unbounded predictions, and unreasonable uncertainty variations, are common problems for machine learning models in general. Many researchers in the machine learning community have proposed different methods to address these problems. How unique and critical are they for power outage predictions?

2. There is now a variety of more complex power outage prediction models [1]. Are there any specific reasons why the authors chose to evaluate traditional machine learning models? These traditional models are known to be less representative.

3. It is unclear to me why beta regression should perform well in general cases. I think it also has its own problems, such as a strict distributional assumption, and it does not address the representativeness issues that eventually cause the poor generalization problem. Could you provide justification and a performance comparison showing why Beta regression should be used?

[1] Xie, Jian, Inalvis Alvarez-Fernandez, and Wei Sun. "A review of machine learning applications in power system resilience." In , pp. 1-5. IEEE, 2020.

Citation: https://doi.org/10.5194/egusphere-2022-975-RC2
- AC2: 'Reply on RC2', Prateek Arora, 19 Jan 2023
  
  Dear Reviewer,
  We appreciate your review of our manuscript. Based on your suggestions, we have incorporated changes in the manuscript.
  Please find attached detailed response to your comments.
  
  Citation: https://doi.org/10.5194/egusphere-2022-975-AC2

Peer review completion

AR: Author's response | RR: Referee report | ED: Editor decision | EF: Editorial file upload

ED: Reconsider after major revisions (further review by editor and referees) (20 Jan 2023) by Vitor Silva

AR by Prateek Arora on behalf of the Authors (17 Feb 2023) Author's response Author's tracked changes Manuscript

ED: Referee Nomination & Report Request started (22 Feb 2023) by Vitor Silva

RR by Anonymous Referee #1 (27 Feb 2023)

RR by Anonymous Referee #2 (27 Mar 2023)

ED: Publish subject to minor revisions (review by editor) (27 Mar 2023) by Vitor Silva

AR by Prateek Arora on behalf of the Authors (27 Mar 2023) Author's response Author's tracked changes Manuscript

ED: Publish as is (28 Mar 2023) by Vitor Silva

ED: Publish as is (31 Mar 2023) by Philip Ward (Executive editor)

AR by Prateek Arora on behalf of the Authors (31 Mar 2023) Manuscript

Supplement

https://doi.org/10.5194/egusphere-2022-975-supplement

Prateek Arora and Luis Ceferino

Viewed

Total article views: 791 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	Supplement	BibTeX	EndNote
503	270	18	791	47	8	3

Views and downloads (calculated since 10 Oct 2022)

Month	HTML	PDF	XML	Total
Oct 2022	145	47	4	196
Nov 2022	145	42	5	192
Dec 2022	39	28	2	69
Jan 2023	53	37	6	96
Feb 2023	54	40	1	95
Mar 2023	30	32	0	62
Apr 2023	34	44	0	78
May 2023	3	0	0	3
Jun 2023 to Sep 2024	0	0	0	0

Viewed (geographical distribution)

Total article views: 785 (including HTML, PDF, and XML) Thereof 785 with geography defined and 0 with unknown origin.

Latest update: 06 Sep 2024

Short summary

Power outage models can help utilities manage the risk of outages from hurricanes. Our article reviews existing hurricane outage models and highlights their strengths and limitations. Existing models can give erroneous estimates, with outage predictions larger than the number of customers; they struggle with predictions for catastrophic hurricanes; and they do not represent the uncertainties of infrastructure failure well. We conceptualize a new model that overcomes these challenges.


Total:	0
HTML:	0
PDF:	0
XML:	0

Probabilistic and Machine Learning Methods for Uncertainty Quantification in Power Outage Prediction due to Extreme Events

Journal article(s) based on this preprint

Interactive discussion

Interactive discussion

Peer review completion

Suggestions for revision or reasons for rejection

Journal article(s) based on this preprint

Supplement

Viewed

Viewed (geographical distribution)