the Creative Commons Attribution 4.0 License.
Methods for evaluating the significance and importance of differences amongst probabilistic seismic hazard results for engineering and risk analyses: A review and insights
Abstract. When new seismic hazard models are published it is natural to compare them to existing models for the same location. This type of comparison routinely indicates differences between the assessed hazards in the various models. The question that then arises is whether these differences are scientifically significant, given the large epistemic uncertainties inherent in all seismic hazard models, or practically important, given the use of hazard models as inputs to risk and engineering calculations. A difference that exceeds a given threshold could mean that building codes may need updating, risk models for insurance purposes may need to be revised, or emergency management procedures revisited. In the current literature there is little guidance on what constitutes a significant or important difference, which can lead to lengthy discussions amongst hazard analysts and end users. This study reviews proposals in the literature on this topic and examines how applicable these proposals are for several sites considering various seismic hazard models for each site, including the two European Seismic Hazard Models of 2013 and 2020. The implications of differences in hazard for risk and engineering purposes are also examined to understand how important such differences are for potential end users of seismic hazard models. Based on this, we discuss the relevance of such methods to determine the scientific significance and practical importance of differences between seismic hazard models and identify some open questions. Finally, we make some recommendations for the future.
Withdrawal notice
This preprint has been withdrawn.
Interactive discussion
Status: closed
RC1: 'Comment on egusphere-2023-991', Anonymous Referee #1, 10 Jul 2023
The paper reviews the methods for establishing whether the differences amongst PSHA results, based on different hazard models, can be deemed significant or important. Significant has a clear scientific meaning in the statistical context, important much less so.
Although more could perhaps be expected from the knowledgeable authors, most exercises performed in the paper are trivial, and why they need a journal paper remains a question after the review.
The main issue is that the paper provides a limited innovative contribution. More specifically, Section 2 recalls the methods available in the literature for evaluating whether the differences between PSHA results are important or significant. Such methods are then applied in some exercises presented in Section 3. However, the objective of Section 3 is not clear, apart from comparing site-specific hazard curves from some known PSHA models.
Another comment pertains to lines 63-75. The usefulness of the discussion about contouring is questionable. It is quite obvious that contouring is used (only) for representation purposes, and that this may hide differences between PSHA results, at the same site, based on different hazard models. In fact, the ground motion intensity to be used for seismic design is taken from the PSHA numerical results, which are generally provided to users. This reviewer suggests reducing/removing this part.
Section 4 proposes a new method for evaluating whether changes in PSHA results at one site, due to different hazard models, can be deemed important. The implementation of such a method needs fragility functions and therefore it seems to depend on the building stock that is considered at the site (this is explained in Section 4.2). Indeed, the procedure allows one to establish whether “small changes on the seismic hazard can lead to important differences on risk metrics” (lines 467-468). There is an apparent ambiguity, which can be summarized in the following question: does the study investigate the “importance” of the difference between hazard or risk results? This reviewer finds that this should be clearly stated in the abstract and introduction.
The next comment is partly related to the previous one. In fact, it is not clear why the effect of the difference amongst PSHA models should be explored considering (only) risk (as it appears from the paper) rather than hazard results. Is there a specific reason?
Lines 415-420 discuss that the differences between the five considered hazard models are not important because the average annual probability of collapse (AAPC) for mid-rise RC buildings in Beznau, designed according to the different PSHA models, is always below a pre-determined threshold. In the proposed exercise, the authors assume an AAPC threshold equal to 2 x 10^-4, a typically considered value. However, the metric and the threshold value for establishing the importance/non-importance of differences between results are completely arbitrary.
Lines 436-437: “For the European hazard, there is only one city where the hazard change from ESHM13 to ESHM20 can be deemed important, shown in red in the table”. There is no red text in the tables. Do the authors mean the bold text in Table 4?
Citation: https://doi.org/10.5194/egusphere-2023-991-RC1
AC1: 'Reply on RC1', John Douglas, 07 Nov 2023
Reviewer #1
We thank the reviewer for their detailed and constructive comments on the original version of this manuscript. In the following, we reply point by point to their comments. In addition, we provide an annotated version of the manuscript. Finally, we made some additional minor changes following our own internal reviews.
------
“The paper reviews the methods for establishing whether the differences amongst PSHA results, based on different hazard models, can be deemed significant or important. Significant has a clear scientific meaning in the statistical context, important much less so.
Although more could perhaps be expected from the knowledgeable authors, most exercises performed in the paper are trivial, and why they need a journal paper remains a question after the review.”
- We agree that some of what we present may seem quite simple but as far as we are aware this topic has never been thoroughly investigated in the literature despite being a topic of considerable interest to various end users. This work was motivated by problems that we are currently facing. We have added two sentences at the end of the introduction to address this comment.
“The main issue is that the paper provides a limited innovative contribution. More specifically, Section 2 recalls the methods available in the literature for evaluating whether the differences between PSHA results are important or significant. Such methods are then applied in some exercises presented in Section 3. However, the objective of Section 3 is not clear, apart from comparing site-specific hazard curves from some known PSHA models.”
- It is important in articles to discuss previous studies tackling similar questions (Section 2). Some of this literature is not well known and hence it is necessary to highlight it. Section 3 applies these proposals to some example hazard models to illustrate them and to understand their advantages and disadvantages. We have added some clarification about our objectives at the start of Section 3.
“Another comment pertains to lines 63-75. The usefulness of the discussion about contouring is questionable. It is quite obvious that contouring is used (only) for representation purposes, and that this may hide differences between PSHA results, at the same site, based on different hazard models. In fact, the ground motion intensity to be used for seismic design is taken from the PSHA numerical results, which are generally provided to users. This reviewer suggests reducing/removing this part.”
- We do not agree with this comment. For example, at least in our experience (e.g. most European countries) it is the published contoured map that is used for design, not the exact numbers, which are not published. In Europe, to our knowledge, only in Italy are exact numbers used for seismic design codes rather than contoured values. Even if contoured maps are used only for representation, this representation can lead to questions from non-technical end users if they do not look at the actual numbers. We have added some text to the manuscript on this topic.
“Section 4 proposes a new method for evaluating whether changes in PSHA results at one site, due to different hazard models, can be deemed important. The implementation of such a method needs fragility functions and therefore it seems to depend on the building stock that is considered at the site (this is explained in Section 4.2). Indeed, the procedure allows one to establish whether “small changes on the seismic hazard can lead to important differences on risk metrics” (lines 467-468). There is an apparent ambiguity, which can be summarized in the following question: does the study investigate the “importance” of the difference between hazard or risk results? This reviewer finds that this should be clearly stated in the abstract and introduction.”
- We agree that this concept was not sufficiently clear in the previous version of the manuscript. “Importance” is related to how changes in the seismic hazard might affect the risk/engineering results, and consequently decisions that are based on these results. In our opinion, since the only variable that is causing changes in the risk results is the different hazard models, we can still state that we are investigating the practical importance of changes in the seismic hazard. We have modified the introduction to clarify this aspect.
“The next comment is partly related to the previous one. In fact, it is not clear why the effect of the difference amongst PSHA models should be explored considering (only) risk (as it appears from the paper) rather than hazard results. Is there a specific reason?”
- Risk is more useful because differences in the hazard do not necessarily mean much on their own: hazard itself is not what end users are ultimately interested in. We have added some text at the beginning of Section 4 to emphasise this.
“Lines 415-420 discuss that the differences between the five considered hazard models are not important because the average annual probability of collapse (AAPC) for mid-rise RC buildings in Beznau, designed according to the different PSHA models, is always below a pre-determined threshold. In the proposed exercise, the authors assume an AAPC threshold equal to 2 x 10^-4, a typically considered value. However, the metric and the threshold value for establishing the importance/non-importance of differences between results are completely arbitrary.”
- Two references were included to explain the origin of the 2 x 10^-4 value [an Informative Annex of the updated Eurocode 8 and ASCE (2010)]. Therefore, this choice is not arbitrary. We have added some text to explain that whilst collapse is not explicitly a design parameter, it is used to verify the design.
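To make this acceptance check concrete, here is a minimal sketch of how an AAPC value can be computed from a hazard curve and a collapse fragility function, and then compared against the 2 x 10^-4 threshold. The power-law hazard curve and the lognormal fragility parameters below are purely hypothetical illustrations, not values from any of the models discussed.

```python
import math

# Hypothetical inputs -- illustrative only, not taken from any model in the paper.
def afe(pga):
    """Annual frequency of exceedance of a given PGA (g): toy power-law hazard curve."""
    return 1e-2 * (pga / 0.05) ** -2.5

def p_collapse(pga, median=1.0, beta=0.5):
    """Lognormal collapse fragility: P(collapse | PGA) with assumed median capacity and dispersion."""
    return 0.5 * (1.0 + math.erf(math.log(pga / median) / (beta * math.sqrt(2.0))))

# AAPC = integral over im of P(collapse | im) * |d(AFE)/d(im)| d(im),
# evaluated with the trapezoidal rule on a log-spaced intensity grid.
ims = [10.0 ** (-2.0 + 2.5 * i / 400) for i in range(401)]  # 0.01 g to ~3.16 g
aapc = sum(
    0.5 * (p_collapse(a) + p_collapse(b)) * (afe(a) - afe(b))
    for a, b in zip(ims, ims[1:])
)

threshold = 2e-4  # value cited from the Eurocode 8 Informative Annex and ASCE (2010)
print(f"AAPC = {aapc:.2e}; below threshold: {aapc < threshold}")
```

Repeating this calculation with the hazard curve from each candidate model, while holding the fragility fixed, would show directly whether a change in hazard pushes the risk metric across the threshold.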
“Lines 436-437: ‘For the European hazard, there is only one city where the hazard change from ESHM13 to ESHM20 can be deemed important, shown in red in the table’. There is no red text in the tables. Do the authors mean the bold text in Table 4?”
- Yes, we meant bold type. This has been corrected.
Citation: https://doi.org/10.5194/egusphere-2023-991-AC1
RC2: 'Comment on egusphere-2023-991', Anonymous Referee #2, 09 Oct 2023
I enjoyed reading this manuscript, and it discusses an ‘important’ issue of high relevance for Earth scientists and earthquake engineers: How to deal with hazard values that are changing from one PSHA to the next. The topic is well suited for the special issue and nicely adds perspective to the ESHM2020 model. The manuscript is well-written and generally clear, and while I have several suggestions for the authors to consider, detailed below, I believe these are minor, and the paper is close to being published.
In my opinion, the manuscript's main shortcoming is that it remains inconclusive in answering the question it raises: When is a change significant, and when is it practically important? The manuscript nicely reviews the available methods (which are few) but ends somewhat open-ended in the conclusions and suggestions. I like the generally carefully phrased suggestions and the balanced review, and the application to selected case studies is important and original; however, the manuscript does not propose a unique or innovative new method or workflow for assessing significance and practical importance, and it is limited to a few selected sites and models. The different methods applied also deliver different conclusions, leaving the reader slightly at a loss as to what to conclude. In that sense, the work does not represent a breakthrough. The discussion and conclusion sections could be written more forcefully, suggesting a clear preference of the (distinguished) team of authors on how to move forward in this thorny issue of great relevance. I have several detailed comments:
- The four Swiss models are, to my best knowledge, indeed only partially comparable with respect to the site conditions without further adjustments to a common 'rock' condition (not sure about the Italian ones). This fact is initially acknowledged but later not re-discussed, and it limits interpretation in the Swiss context.
- The ENSI review team rejected the PRP model, so in that sense, it is not truly an 'accepted' model. The model used by ENSI was the so-called 'hybrid' one, a max of SUIHaz Source and PRP GMM. Would not a comparison with the hybrid model be more accurate?
As a consequence of 1 and 2, I suggest emphasising more strongly that the paper's focus is mainly a comparison of methods based on selected cases, but with limited implications for the actual sites. This limitation is even more relevant when computing risk, since your risk calculation is very simplistic compared to the Swiss risk model released in 2023.
- The PSHA community would generally argue that an SSHAC level III or IV site-specific study should be more reliable than a national PSHA. So, in addition to asking if a study is (statistically) significantly different, or importantly different based on a risk/building code metric, one could also ask if a study is 'better'. Of course, 'better' is even harder to justify, yet it is often used as an argument for replacing older studies. Better could indicate that it is based on substantially more data, newer and improved methodologies, a broader expert group, more thorough uncertainty quantification, etc. In my opinion, a new hazard study must quantify the difference from past studies and justify why the new study should be considered superior and adopted as the next standard or used for a specific site (in the case of NPPs). The authors could also discuss this aspect of the generation of PSHA. A somewhat more heretic view would be: Is it relevant whether a study is significantly different or practically important if it is believed to be superior/better? The newest model, representing the current state of the art, should be used in any case.
- I am surprised that for the case study Italy, you only compare MPS04 and MPS19. Why did you not include also ESHM13 and ESHM20, analog to the Swiss case? I think this would broaden the implications and analysis.
- One of the most striking features of your Figures 1, 3 and 4 is how different ESHM13 is to the other four models. This can be in part explained by ESHM13 not being well calibrated for this region, for lack of time, but you may want to comment on it in the spirit of significant and important differences. A new model coming along that is significantly different should be scrutinised in detail before being released, otherwise it risks jumping by a lot in the next model generation (as did ESHM13 --> 20).
- The study mainly refers to differences in PGA; this simplification should be justified. Would it not be better to define significant differences in the spectral domain? Are models also different at 1 Hz, 5 Hz, etc.? Using PGA as the basis to compute risk is a substantial simplification for risk-related studies. Is it warranted (the authors should know better than me)? Likewise, building codes, at least in the EC8 context, I believe do not use PGA; how would this impact your results?
- You selected five cities for the European model comparison and four for the Italian one. Please make sure the coordinates you used are given for reproducibility. It would also be good to explain the choice of these cities – or, even better, show a difference map between ESHM13/20 in terms of significance. Such maps would allow identifying in an objective way areas of largest and smallest differences: I believe they are planned in the ESHM20 publication in any case, so you might refer to them.
- The second point you raise in the discussion resonates well with me: increasing the ‘resilience’ of building codes to changes in the hazard, which are likely to come, and establishing this on a cost-benefit analysis. An explicitly included safety buffer in the design of buildings, instead of building right at the ‘limit’, would be a wise choice for society. To me this is a better concept than allowing for non-compliance for a longer period or hoping that conservatism was already built in (your point 1).
- Figure 1 is difficult to see and not very attractive; the subsequent Figures 3 and 4 are more appealing.
- For Tables 1 and 2, please state how the models are compared (what does a negative % mean, i.e. which model is higher?)
Citation: https://doi.org/10.5194/egusphere-2023-991-RC2
AC2: 'Reply on RC2', John Douglas, 07 Nov 2023
Reviewer #2
We thank the reviewer for their detailed and constructive comments on the original version of this manuscript. In the following, we reply point by point to their comments. In addition, we provide an annotated version of the manuscript. Finally, we made some additional minor changes following our own internal reviews.
------
I enjoyed reading this manuscript, and it discusses an ‘important’ issue of high relevance for Earth scientists and earthquake engineers: How to deal with hazard values that are changing from one PSHA to the next. The topic is well suited for the special issue and nicely adds perspective to the ESHM2020 model. The manuscript is well-written and generally clear, and while I have several suggestions for the authors to consider, detailed below, I believe these are minor, and the paper is close to being published.
In my opinion, the manuscript's main shortcoming is that it remains inconclusive in answering the question it raises: When is a change significant, and when is it practically important? The manuscript nicely reviews the available methods (which are few) but ends somewhat open-ended in the conclusions and suggestions. I like the generally carefully phrased suggestions and the balanced review, and the application to selected case studies is important and original; however, the manuscript does not propose a unique or innovative new method or workflow for assessing significance and practical importance, and it is limited to a few selected sites and models. The different methods applied also deliver different conclusions, leaving the reader slightly at a loss as to what to conclude. In that sense, the work does not represent a breakthrough. The discussion and conclusion sections could be written more forcefully, suggesting a clear preference of the (distinguished) team of authors on how to move forward in this thorny issue of great relevance.
- We have added three recommendations at the end of the conclusions to address this comment.
I have several detailed comments:
- The four Swiss models are, to my best knowledge, indeed only partially comparable with respect to the site conditions without further adjustments to a common 'rock' condition (not sure about the Italian ones). This fact is initially acknowledged but later not re-discussed, and it limits interpretation in the Swiss context.
- We acknowledge the concern about the models' comparability in terms of site conditions, as we do not attempt to adjust them to a standard Swiss 'rock' condition. While this limitation is initially mentioned, we will change the text to emphasize it more explicitly. Adapting them to a common 'rock' would be time-consuming, and we lack the necessary data. Furthermore, any adjustment is expected to influence the outcomes by less than 5%, leaving the primary conclusions unchanged. To be clear, the surface 'rock' condition for PGA at Beznau is 1800 m/s in PRP/PEGASOS, 1100 m/s in SuiHaz15, and 800 m/s in ESHM13 and ESHM20. Using the conversion factors from Danciu and Fäh (2017), the difference between PRP/PEGASOS and SuiHaz15 corresponds to around 5%. The difference between ESHM13/20 and SuiHaz15 is likely to be of a similar order.
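The size of such a site-condition adjustment can be illustrated with a toy calculation: if the adjustment acts as a constant amplification factor on the intensity axis of a power-law hazard curve, the ground motion read off at a fixed exceedance frequency scales by the same factor. The 1.05 factor and the hazard-curve constants below are assumptions for illustration only, not the actual Danciu and Fäh (2017) conversion factors.

```python
# Toy check of how a site-condition (Vs30) amplification factor propagates to
# the PGA taken from a hazard curve at a fixed annual frequency of exceedance.
# All constants are illustrative assumptions, not published conversion factors.

def pga_at_afe(target_afe, factor=1.0, k=1e-2, ref=0.05, slope=2.5):
    """Invert AFE(pga) = k * (pga / (factor * ref)) ** -slope for pga (in g)."""
    return factor * ref * (k / target_afe) ** (1.0 / slope)

base = pga_at_afe(1e-4)                    # reference rock condition
adjusted = pga_at_afe(1e-4, factor=1.05)   # same curve with a 5% amplification
change_pct = 100.0 * (adjusted - base) / base
print(f"PGA at 10^-4: {base:.3f} g vs {adjusted:.3f} g ({change_pct:+.1f}%)")
```

Because the factor simply rescales the intensity axis, the hazard value shifts by the same ~5%, which is why such an adjustment would not change the primary conclusions.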
- The ENSI review team rejected the PRP model, so in that sense, it is not truly an 'accepted' model. The model used by ENSI was the so-called 'hybrid' one, a max of SUIHaz Source and PRP GMM. Would not a comparison with the hybrid model be more accurate?
- A comparison with ENSI's 'hybrid' model might indeed have been more insightful. A hybrid model is, however, not a standard seismic hazard product. In addition, we are unsure whether the decision of ENSI is public. Therefore, we have not made any change to the manuscript.
As a consequence of 1 and 2, I suggest emphasising more strongly that the paper's focus is mainly a comparison of methods based on selected cases, but with limited implications for the actual sites. This limitation is even more relevant when computing risk, since your risk calculation is very simplistic compared to the Swiss risk model released in 2023.
- We have added two sentences to the introduction of Section 4 to highlight that the examples shown are illustrations and that case-specific calculations would be required for actual applications. These sentences complement additions made at the end of Section 1 and the beginning of Section 3 in response to earlier comments.
- The PSHA community would generally argue that an SSHAC level III or IV site-specific study should be more reliable than a national PSHA. So, in addition to asking if a study is (statistically) significantly different, or importantly different based on a risk/building code metric, one could also ask if a study is 'better'. Of course, 'better' is even harder to justify, yet it is often used as an argument for replacing older studies. Better could indicate that it is based on substantially more data, newer and improved methodologies, a broader expert group, more thorough uncertainty quantification, etc. In my opinion, a new hazard study must quantify the difference from past studies and justify why the new study should be considered superior and adopted as the next standard or used for a specific site (in the case of NPPs). The authors could also discuss this aspect of the generation of PSHA. A somewhat more heretic view would be: Is it relevant whether a study is significantly different or practically important if it is believed to be superior/better? The newest model, representing the current state of the art, should be used in any case.
- Site-specific studies are inherently more complex and comprehensive in comparison to national and regional hazard models. In general, site-specific hazard models are known for their complexity, particularly due to their detailed focus and the extensive effort required to capture a wide range of data, assumptions, and models. In recent decades, national/regional hazard models such as UCERF3, SuiHaz15 and ESHM20 have become increasingly complex and adaptable to updates. Each type has different audiences and applications, resulting in diverse ranges of validity. More precisely, national models may not be accurate when it comes to extremely low probabilities such as 10^-5 or 10^-6, which are essential for conducting site-specific hazard studies for nuclear power plants. At such low probabilities, surface site condition uncertainties and aleatory variabilities become dominant.
Furthermore, when comparing a site-specific model with its national/regional counterpart, the analysis goes beyond just differences in results. It is crucial to thoroughly examine datasets, underlying assumptions, constructed models, expert elicitation and input, computational models, as well as the practical applications of the results. A meaningful comparison can only be achieved if the models are clear, thoroughly documented, and the underlying data and methodologies are both easily accessible and capable of being replicated. We have added some text in the conclusions about this.
- I am surprised that for the case study Italy, you only compare MPS04 and MPS19. Why did you not include also ESHM13 and ESHM20, analog to the Swiss case? I think this would broaden the implications and analysis.
- As we are using hazard models as examples, we have decided not to also include ESHM13 and ESHM20 for Italy. The MPS models are the ones proposed for seismic design in Italy and hence we think they are the most relevant for that part of the analysis. In addition, including two additional hazard models would make the presentation of the results for Italy much more complicated, as there would be six pairwise comparisons rather than just one. We do note, however, that Syracuse is used in both the Italian and European comparisons. We have not made any changes to the manuscript for this point.
- One of the most striking features of your Figures 1, 3 and 4 is how different ESHM13 is to the other four models. This can be in part explained by ESHM13 not being well calibrated for this region, for lack of time, but you may want to comment on it in the spirit of significant and important differences. A new model coming along that is significantly different should be scrutinised in detail before being released, otherwise it risks jumping by a lot in the next model generation (as did ESHM13 --> 20).
- We have added some sentences about this in the revised manuscript.
- The study mainly refers to differences in PGA; this simplification should be justified. Would it not be better to define significant differences in the spectral domain? Are models also different at 1 Hz, 5 Hz, etc.? Using PGA as the basis to compute risk is a substantial simplification for risk-related studies. Is it warranted (the authors should know better than me)? Likewise, building codes, at least in the EC8 context, I believe do not use PGA; how would this impact your results?
- Whilst we could have extended the study to other intensity measure types, we do not feel that the additional effort (which may require digitising hazard curves) would be justified given we are only using the calculations to illustrate the methods.
Using PGA for building codes is standard in the current EC8, which is why we used it in our example. It is true that the future version of EC8 will bypass PGA and provide spectral accelerations directly, so we note in the methodology in Section 4.2 that one can also employ spectral acceleration as the seismic design parameter.
We used PGA in the risk calculations so that we could make use of the hazard curves presented previously. We agree with the reviewer that other intensity measure types would typically be employed in a risk assessment, but as mentioned above, these applications have mainly illustrative purposes.

- You selected five cities for the European model comparison and four for the Italian one. Please make sure the coordinates you used are given for reproducibility. It would also be good to explain the choice of these cities – or, even better, show a difference map between ESHM13/20 in terms of significance. Such maps would allow identifying in an objective way areas of largest and smallest differences: I believe they are planned in the ESHM20 publication in any case, so you might refer to them.
- We have added the coordinates of the cities and included some discussion of why these locations were chosen.
- The cities were selected because their hazard is relevant for this investigation. See Figure 6 in Danciu, L., et al. (2022), The 2020 European Seismic Hazard Model: Milestones and Lessons Learned, in: Vacareanu, R., Ionescu, C. (eds), Progresses in European Earthquake Engineering and Seismology, ECEES 2022, Springer Proceedings in Earth and Environmental Sciences, Springer, Cham. https://doi.org/10.1007/978-3-031-15104-0_1
- The second point you raise in the discussion resonates well with me: increasing the ‘resilience’ of building codes to changes in the hazard, which are likely to come, and establishing this on a cost-benefit analysis. An explicitly included safety buffer in the design of buildings, instead of building right at the ‘limit’, would be a wise choice for society. To me this is a better concept than allowing for non-compliance for a longer period or hoping that conservatism was already built in (your point 1).
- We thank the reviewer for their positive feedback.
- Figure 1 is difficult to see and not very attractive; the subsequent Figures 3 and 4 are more appealing.
- Figure 1 has been redrawn in the same style as Figures 3 and 4.
- For Tables 1 and 2, please state how the models are compared (what does a negative % mean, i.e. which model is higher?)
- The captions of these tables have been expanded to explain the meaning of the numbers.
Citation: https://doi.org/10.5194/egusphere-2023-991-AC2
Interactive discussion
Status: closed
-
RC1: 'Comment on egusphere-2023-991', Anonymous Referee #1, 10 Jul 2023
The paper reviews the methods for establishing whether the differences amongst PSHA results, based on different hazard models, can be deemed significant or important. Significant has a clear scientific meaning in the statistical context, important much less so.
Despite more could be perhaps expected from the knowledgeable authors, most exercises performed in of the paper are trivial and why they need a journal paper remains a question after the review.
The main issue with this is that it provides a limited innovative contribution. More specifically, Section 2 recalls the methods available in literature for evaluating whether the differences between PSHA results are important or significant. Such methods are then applied in some exercises presented in Section 3. However, the objective of Section 3 is not clear, apart from comparing site-specific hazard curves from some known PSHA models.
Another comment pertains to lines 63-75. The usefulness of the discussion about contouring is questionable. It is quite obvious that contouring is used (only) for representation purposes, and that this may hide differences between PSHA results, at the same site, based on different hazard models. In fact, the ground motion intensity to be used for seismic design is taken from the PSHA numerical results, which are generally provided to users. This reviewer suggests reducing/removing this part.
Section 4 proposes a new method for evaluating whether changes on PSHA results at one site, due to different hazard models, can be deemed important. The implementation of such method needs fragility functions and therefore it seems to depend on the building stock that is considered at the site (this is explained in Section 4.2). Indeed, the procedure allows one to establish whether “small changes on the seismic hazard can lead to important differences on risk metrics” (lines 467-468). There is an apparent ambiguity, which can be summarized in the following question: does the study investigates the “importance” of the difference between hazard or risk results? This reviewer finds that this should be clearly stated in the abstract and introduction.
The next comment is partly related to the previous one. In fact, it is not clear why the effect of the difference amongst PSHA models should be explored considering (only) risk (as it appears from the paper) rather than hazard results. Is there a specific reason?
Lines 415-420 discuss that the differences between the five considered hazard models are not important because the average annual probability of collapse (AAPC) for mid-rise RC buildings in Beznau, designed according to the different PSHA models, is always below a pre-determined threshold. In the proposed exercise, the authors assume an AAPC threshold equal to 2 × 10⁻⁴, a typically considered value. However, the metric and the threshold value for establishing the importance/non-importance of differences between results are completely arbitrary.
Lines 436-437: “For the European hazard, there is only one city where the hazard change from ESHM13 to ESHM20 can be deemed important, shown in red in the table”. There is no red edit in the tables. Do authors mean the bold edit in Table 4?
Citation: https://doi.org/10.5194/egusphere-2023-991-RC1
AC1: 'Reply on RC1', John Douglas, 07 Nov 2023
Reviewer #1
We thank the reviewer for their detailed and constructive comments on the original version of this manuscript. In the following, we reply point by point to their comments. In addition, we provide an annotated version of the manuscript. Finally, we made some additional minor changes following our own internal reviews.
------
“The paper reviews the methods for establishing whether the differences amongst PSHA results, based on different hazard models, can be deemed significant or important. Significant has a clear scientific meaning in the statistical context, important much less so.
Although more could perhaps be expected from the knowledgeable authors, most exercises performed in the paper are trivial, and why they need a journal paper remains a question after the review.”
- We agree that some of what we present may seem quite simple but as far as we are aware this topic has never been thoroughly investigated in the literature despite being a topic of considerable interest to various end users. This work was motivated by problems that we are currently facing. We have added two sentences at the end of the introduction to address this comment.
“The main issue with this is that it provides a limited innovative contribution. More specifically, Section 2 recalls the methods available in literature for evaluating whether the differences between PSHA results are important or significant. Such methods are then applied in some exercises presented in Section 3. However, the objective of Section 3 is not clear, apart from comparing site-specific hazard curves from some known PSHA models.”
- It is important in articles to discuss previous studies tackling similar questions (Section 2). Some of this literature is not well known and hence it is necessary to highlight it. Section 3 applies these proposals to some example hazard models to illustrate them and to understand their advantages and disadvantages. We have added some clarification about our objectives at the start of Section 3.
“Another comment pertains to lines 63-75. The usefulness of the discussion about contouring is questionable. It is quite obvious that contouring is used (only) for representation purposes, and that this may hide differences between PSHA results, at the same site, based on different hazard models. In fact, the ground motion intensity to be used for seismic design is taken from the PSHA numerical results, which are generally provided to users. This reviewer suggests reducing/removing this part.”
- We do not agree with this comment. For example, at least in our experience (e.g. most European countries) it is the published contoured map that is used for design not the exact numbers, which are not published. In Europe, to our knowledge only in Italy are exact numbers used for seismic design codes and not contoured values. Even if contoured maps are used only for representation, this representation can lead to questions from non-technical end users if they do not look at the actual numbers. We have added some text to the manuscript on this topic.
“Section 4 proposes a new method for evaluating whether changes in PSHA results at one site, due to different hazard models, can be deemed important. The implementation of such a method needs fragility functions and therefore seems to depend on the building stock that is considered at the site (this is explained in Section 4.2). Indeed, the procedure allows one to establish whether “small changes on the seismic hazard can lead to important differences on risk metrics” (lines 467-468). There is an apparent ambiguity, which can be summarized in the following question: does the study investigate the “importance” of the difference between hazard or risk results? This reviewer finds that this should be clearly stated in the abstract and introduction.”
- We agree that this concept was not sufficiently clear in the previous version of the manuscript. “Importance” is related to how changes in the seismic hazard might affect the risk/engineering results, and consequently decisions that are based on these results. In our opinion, since the only variable that is causing changes in the risk results is the different hazard models, we can still state that we are investigating the practical importance of changes in the seismic hazard. We have modified the introduction to clarify this aspect.
“The next comment is partly related to the previous one. In fact, it is not clear why the effect of the difference amongst PSHA models should be explored considering (only) risk (as it appears from the paper) rather than hazard results. Is there a specific reason?”
- Risk is more useful because differences in the hazard do not necessarily mean much on their own; hazard itself is not what end users are really interested in. We have added some text at the beginning of Section 4 to emphasise this.
“Lines 415-420 discuss that the differences between the five considered hazard models are not important because the average annual probability of collapse (AAPC) for mid-rise RC buildings in Beznau, designed according to the different PSHA models, is always below a pre-determined threshold. In the proposed exercise, the authors assume an AAPC threshold equal to 2 × 10⁻⁴, a typically considered value. However, the metric and the threshold value for establishing the importance/non-importance of differences between results are completely arbitrary.”
- Two references were included to explain the origin of the 2 × 10⁻⁴ threshold [an Informative Annex of the updated Eurocode 8 and ASCE (2010)]. Therefore, this choice is not arbitrary. We have added some text to explain that whilst collapse is not explicitly a design parameter, it is used to verify the design.
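As an aside for readers following this exchange, the AAPC check under discussion can be sketched numerically: convolve a collapse fragility curve with the absolute slope of the seismic hazard curve and compare the result to the 2 × 10⁻⁴ acceptance threshold. The hazard curve, fragility median, and dispersion below are hypothetical placeholders, not values from any of the models in the paper:

```python
import numpy as np
from scipy.integrate import trapezoid
from scipy.stats import lognorm

# Hypothetical hazard curve: annual rate of exceedance vs PGA (in g).
# Purely illustrative numbers, not taken from any model discussed here.
im = np.logspace(-2, 0.5, 400)          # PGA grid, 0.01 g to ~3.2 g
rate = 1e-2 * (im / 0.05) ** -2.5       # toy power-law hazard curve

# Hypothetical lognormal collapse fragility (median 0.9 g, beta = 0.5).
p_collapse = lognorm.cdf(im, s=0.5, scale=0.9)

# AAPC = integral of P(collapse | im) * |d(rate)/d(im)| d(im)
aapc = trapezoid(p_collapse * np.abs(np.gradient(rate, im)), im)

threshold = 2e-4  # acceptance value cited from the EC8 Annex / ASCE (2010)
print(f"AAPC = {aapc:.2e} -> {'acceptable' if aapc < threshold else 'too high'}")
```

Repeating this calculation with hazard curves from each candidate model, and checking whether any of them pushes the AAPC over the threshold, is the kind of importance test the reply describes.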
“Lines 436-437: “For the European hazard, there is only one city where the hazard change from ESHM13 to ESHM20 can be deemed important, shown in red in the table”. There is no red edit in the tables. Do authors mean the bold edit in Table 4?”
- Yes, we meant bold type. This has been corrected.
Citation: https://doi.org/10.5194/egusphere-2023-991-AC1
RC2: 'Comment on egusphere-2023-991', Anonymous Referee #2, 09 Oct 2023
I enjoyed reading this manuscript, and it discusses an ‘important’ issue of high relevance for Earth scientists and earthquake engineers: How to deal with hazard values that are changing from one PSHA to the next. The topic is well suited for the special issue and nicely adds perspective to the ESHM2020 model. The manuscript is well-written and generally clear, and while I have several suggestions for the authors to consider, detailed below, I believe these are minor, and the paper is close to being published.
In my opinion, the manuscript's main shortcoming is that it remains inconclusive in answering the question it raises: When is a change significant, and when is it practically important? The manuscript nicely reviews the available methods (which are few) but ends somewhat open-ended in the conclusions and suggestions. I like the generally carefully phrased suggestions and the balanced review, and the application to selected case studies is important and original; however, the manuscript does not propose a unique or innovative new method or workflow for assessing significance and practical importance, and it is limited to a few selected sites and models. The different methods applied also deliver different conclusions, leaving the reader slightly at a loss as to what to conclude. In that sense, the work does not represent a breakthrough. The discussion and conclusion sections could be written more forcefully, suggesting a clear preference of the (distinguished) team of authors on how to move forward in this thorny issue of great relevance. I have several detailed comments:
- The four Swiss models are, to my best knowledge, indeed only partially comparable with respect to the site conditions without further adjustments to a common 'rock' condition (not sure about the Italian ones). This fact is initially acknowledged but later not re-discussed, and it limits interpretation in the Swiss context.
- The ENSI review team rejected the PRP model, so in that sense, it is not truly an 'accepted' model. The model used by ENSI was the so-called 'hybrid' one, a max of SUIHaz Source and PRP GMM. Would not a comparison with the hybrid model be more accurate?
As a consequence of 1 and 2, I suggest emphasising more strongly that the paper's focus is mainly a comparison of methods based on selected cases but with limited implications for the actual sites. This limitation is even more relevant when computing risk, since your risk calculation is very simplistic compared to the Swiss risk model released in 2023.
- The PSHA community would generally argue that an SSHAC level III or IV site-specific study should be more reliable than a national PSHA. So, in addition to asking if a study is (statistically) significantly different or importantly different based on a risk/building-code metric, one could also ask if a study is 'better'. Of course, 'better' is even harder to justify, yet it is often used as an argument for replacing older studies. Better could indicate that it is based on substantially more data, newer and improved methodologies, a broader expert group, more thorough uncertainty quantification, etc. In my opinion, a new hazard study must quantify the difference from past studies and justify why the new study should be considered superior and adopted as the next standard or used for a specific site (in the case of NPP). The authors could also discuss this aspect of the generation of PSHAs. A somewhat more heretic view would be: Is it relevant if a study is significantly different or practically important if it is believed to be superior/better? The newest model, representing the current state of the art, should be used in any case.
- I am surprised that for the case study Italy, you only compare MPS04 and MPS19. Why did you not also include ESHM13 and ESHM20, analogous to the Swiss case? I think this would broaden the implications and analysis.
- One of the most striking features of your Figures 1, 3 and 4 is how different ESHM13 is to the other 4 models. This can be partly explained by ESHM13 not being well calibrated for this region, for lack of time, but you may want to comment on it in the spirit of significant and important differences. A new model coming along that is significantly different should be scrutinised in detail before being released; otherwise it risks jumping by a lot in the next model generation (as did ESHM13 --> 20).
- The study mainly refers to differences in PGA; this simplification should be justified. Would it not be better to define significant differences in the spectral domain? Are models also different at 1 Hz, 5 Hz etc.? Using PGA as the basis to compute risk is a substantial simplification for risk-related studies. Is it warranted (the authors should know better than me)? Likewise, building codes, at least in the EC8 context, I believe do not use PGA; how would this impact your results?
- You selected five cities for the European model comparison and four for the Italian one. Please make sure the coordinates you used are given for reproducibility. It would also be good to explain the choice of these cities – or, even better, show a difference map between ESHM13/20 in terms of significance. Such maps would allow identifying in an objective way the areas of largest and smallest differences: I believe they are planned in the ESHM20 publication in any case, so you might refer to them.
- The second point you raise in the discussion resonates well with me: increasing the ‘resilience’ of building codes to changes in the hazard, which are likely to come, and establishing this on a cost-benefit analysis. An explicitly included safety buffer in the design of buildings, instead of building right at the ‘limit’, would be a wise choice for society. To me this is a better concept than allowing non-compliance for a longer period or hoping that conservatism was already built in (your point 1).
- Figure 1 is difficult to see and not very attractive; the subsequent Figures 3 and 4 are more appealing.
- For Tables 1 and 2, please state how models are compared (what does a negative % mean, which model is higher?)
Citation: https://doi.org/10.5194/egusphere-2023-991-RC2
AC2: 'Reply on RC2', John Douglas, 07 Nov 2023
Reviewer #2
We thank the reviewer for their detailed and constructive comments on the original version of this manuscript. In the following, we reply point by point to their comments. In addition, we provide an annotated version of the manuscript. Finally, we made some additional minor changes following our own internal reviews.
------
I enjoyed reading this manuscript, and it discusses an ‘important’ issue of high relevance for Earth scientists and earthquake engineers: How to deal with hazard values that are changing from one PSHA to the next. The topic is well suited for the special issue and nicely adds perspective to the ESHM2020 model. The manuscript is well-written and generally clear, and while I have several suggestions for the authors to consider, detailed below, I believe these are minor, and the paper is close to being published.
In my opinion, the manuscript's main shortcoming is that it remains inconclusive in answering the question it raises: When is a change significant, and when is it practically important? The manuscript nicely reviews the available methods (which are few) but ends somewhat open-ended in the conclusions and suggestions. I like the generally carefully phrased suggestions and the balanced review, and the application to selected case studies is important and original; however, the manuscript does not propose a unique or innovative new method or workflow for assessing significance and practical importance, and it is limited to a few selected sites and models. The different methods applied also deliver different conclusions, leaving the reader slightly at a loss as to what to conclude. In that sense, the work does not represent a breakthrough. The discussion and conclusion sections could be written more forcefully, suggesting a clear preference of the (distinguished) team of authors on how to move forward in this thorny issue of great relevance.
- We have added three recommendations at the end of the conclusions to address this comment.
I have several detailed comments:
- The four Swiss models are, to my best knowledge, indeed only partially comparable with respect to the site conditions without further adjustments to a common 'rock' condition (not sure about the Italian ones). This fact is initially acknowledged but later not re-discussed, and it limits interpretation in the Swiss context.
- We acknowledge the concern about the models' comparability in terms of site conditions, as we do not attempt to adjust them to a standard Swiss 'rock' condition. While this limitation is initially mentioned, we will change the text to emphasize it more explicitly. Adapting them to a common 'rock' would be time-consuming, and we lack the necessary data. Furthermore, any adjustment is expected to influence the outcomes by less than 5%, leaving the primary conclusions unchanged. To be clear, the surface ‘rock’ condition for PGA at Beznau is 1800 m/s in PRP/PEGASOS, 1100 m/s in SuiHaz15, and 800 m/s in ESHM13 and ESHM20. Using the conversion factors from Danciu and Fäh (2017), the difference between PRP/PEGASOS and SuiHaz15 corresponds to around 5%. The difference between ESHM13/20 and SuiHaz15 is likely to be of a similar order.
- The ENSI review team rejected the PRP model, so in that sense, it is not truly an 'accepted' model. The model used by ENSI was the so-called 'hybrid' one, a max of SUIHaz Source and PRP GMM. Would not a comparison with the hybrid model be more accurate?
- A comparison with ENSI's 'hybrid' model might indeed have been more insightful. A hybrid model is, however, not a standard seismic hazard product. In addition, we are unsure whether the decision of ENSI is public. Therefore, we have not made any change to the manuscript.
As a consequence of 1 and 2, I suggest emphasising more strongly that the paper's focus is mainly a comparison of methods based on selected cases but with limited implications for the actual sites. This limitation is even more relevant when computing risk, since your risk calculation is very simplistic compared to the Swiss risk model released in 2023.
- We have added two sentences to the introduction of Section 4 to highlight that the examples shown are illustrations and that case-specific calculations would be required for actual applications. These sentences complement additions made at the end of Section 1 and the beginning of Section 3 in response to earlier comments.
- The PSHA community would generally argue that an SSHAC level III or IV site-specific study should be more reliable than a national PSHA. So, in addition to asking if a study is (statistically) significantly different or importantly different based on a risk/building-code metric, one could also ask if a study is 'better'. Of course, 'better' is even harder to justify, yet it is often used as an argument for replacing older studies. Better could indicate that it is based on substantially more data, newer and improved methodologies, a broader expert group, more thorough uncertainty quantification, etc. In my opinion, a new hazard study must quantify the difference from past studies and justify why the new study should be considered superior and adopted as the next standard or used for a specific site (in the case of NPP). The authors could also discuss this aspect of the generation of PSHAs. A somewhat more heretic view would be: Is it relevant if a study is significantly different or practically important if it is believed to be superior/better? The newest model, representing the current state of the art, should be used in any case.
- Site-specific studies are inherently more complex and comprehensive than national and regional hazard models, particularly because of their detailed focus and the extensive effort required to capture a wide range of data, assumptions, and models. In recent decades, national/regional hazard models such as UCERF3, SuiHaz15 and ESHM20 have become increasingly complex and adaptable to updates. Each type has different audiences and applications, resulting in diverse ranges of validity. More precisely, national models may not be accurate when it comes to extremely low probabilities such as 10⁻⁵ or 10⁻⁶, which are essential for conducting site-specific hazard studies for nuclear power plants. At such low probabilities, surface site condition uncertainties and aleatory variabilities become dominant.
Furthermore, when comparing a site-specific model with its national/regional counterpart, the analysis goes beyond just differences in results. It is crucial to thoroughly examine datasets, underlying assumptions, constructed models, expert elicitation and input, computational models, as well as the practical applications of the results. A meaningful comparison can only be achieved if the models are clear, thoroughly documented, and the underlying data and methodologies are both easily accessible and capable of being replicated. We have added some text in the conclusions about this.
- I am surprised that for the case study Italy, you only compare MPS04 and MPS19. Why did you not also include ESHM13 and ESHM20, analogous to the Swiss case? I think this would broaden the implications and analysis.
- As we are using hazard models as examples we have decided not to also include ESHM13 and 20 for Italy. The MPS models are the ones proposed for seismic design in Italy and hence we think they are the most relevant for that part of the analysis. In addition, including an additional two hazard models would make presentation of the results for Italy much more complicated as there would be six pairwise comparisons rather than just one. We do note, however, that Syracuse is used in both the Italian and Europe comparisons. We have not made any changes to the manuscript for this point.
- One of the most striking features of your Figures 1, 3 and 4 is how different ESHM13 is to the other 4 models. This can be partly explained by ESHM13 not being well calibrated for this region, for lack of time, but you may want to comment on it in the spirit of significant and important differences. A new model coming along that is significantly different should be scrutinised in detail before being released; otherwise it risks jumping by a lot in the next model generation (as did ESHM13 --> 20).
- We have added some sentences about this in the revised manuscript.
- The study mainly refers to differences in PGA; this simplification should be justified. Would it not be better to define significant differences in the spectral domain? Are models also different at 1 Hz, 5 Hz etc.? Using PGA as the basis to compute risk is a substantial simplification for risk-related studies. Is it warranted (the authors should know better than me)? Likewise, building codes, at least in the EC8 context, I believe do not use PGA; how would this impact your results?
- Whilst we could have extended the study to other intensity measure types, we do not feel that the additional effort (which may require digitising hazard curves) would be justified given we are only using the calculations to illustrate the methods.
Using PGA for buildings codes is standard in the current EC8, which is why we used it in our example. It is true that the future version of EC8 will bypass PGA and provide spectral accelerations directly so we note in the methodology in Section 4.2 that one can also employ spectral acceleration as the seismic design parameter.
We used PGA in the risk calculations so that we could make use of the hazard curves presented previously. We agree with the reviewer that other intensity measure types would typically be employed in a risk assessment, but as mentioned above, these applications have mainly illustrative purposes.
- You selected five cities for the European model comparison and four for the Italian one. Please make sure the coordinates you used are given for reproducibility. It would also be good to explain the choice of these cities – or, even better, show a difference map between ESHM13/20 in terms of significance. Such maps would allow identifying in an objective way the areas of largest and smallest differences: I believe they are planned in the ESHM20 publication in any case, so you might refer to them.
- We have added the coordinates of the cities and included some discussion of why these locations were chosen.
- The cities were selected because their hazard is relevant for this investigation. See Figure 6 in Danciu, L. et al. (2022). The 2020 European Seismic Hazard Model: Milestones and Lessons Learned. In: Vacareanu, R., Ionescu, C. (eds) Progresses in European Earthquake Engineering and Seismology. ECEES 2022. Springer Proceedings in Earth and Environmental Sciences. Springer, Cham. https://doi.org/10.1007/978-3-031-15104-0_1
- The second point you raise in the discussion resonates well with me: increasing the ‘resilience’ of building codes to changes in the hazard, which are likely to come, and establishing this on a cost-benefit analysis. An explicitly included safety buffer in the design of buildings, instead of building right at the ‘limit’, would be a wise choice for society. To me this is a better concept than allowing non-compliance for a longer period or hoping that conservatism was already built in (your point 1).
- We thank the reviewer for their positive feedback.
- Figure 1 is difficult to see and not very attractive; the subsequent Figures 3 and 4 are more appealing.
- Figure 1 has been redrawn in the same style as Figures 3 and 4.
- For Tables 1 and 2, please state how models are compared (what does a negative % mean, which model is higher?)
- The captions of these tables have been expanded to explain the meaning of the numbers.
Citation: https://doi.org/10.5194/egusphere-2023-991-AC2