the Creative Commons Attribution 4.0 License.
Sensitivity of marine heatwaves metrics to SST products, focusing on the Tropical Pacific
Abstract. Marine heatwaves (MHWs) are increasingly studied in climate sciences for their ecological impacts, for which accurate real-time bulletins and forecasts are essential. Yet, methodological choices in their detection affect metric estimates, underlining the need to better assess these sensitivities. This study provides a thorough assessment of the impact of Sea Surface Temperature (SST) product choice on MHW statistics, focusing on the tropical Pacific. MHW detection was performed on six daily gridded SST datasets: four widely used blended satellite observational products, one ocean reanalysis, and a multi-dataset ensemble mean computed from the four observational products. Sensitivity to SST products was evaluated for six MHW metrics (MHW days per year, number of events per year, duration, maximum intensity, cumulative intensity and onset rate) and for the Degree Heating Weeks (DHW), a widely used proxy for coral bleaching. Inter-product comparisons revealed a significant dispersion among MHW metric estimates, with the reanalysis GLORYS12v1 detecting fewer, longer and less intense MHWs while OISST detected more MHWs of shorter duration and higher intensity, likely related to their weak and strong high-frequency SST variability (periods shorter than 2 weeks) respectively. The sensitivity analysis showed that the onset rate was the most sensitive metric to SST product choice and the maximum intensity the most robust. Metrics uncertainties were quantified inside seven regions of the basin and were largest in the western Pacific Warm Pool. Co-occurrence analyses of MHWs revealed that, over the basin, 10 to 80 % of MHW days were detected simultaneously by all products, with the western Pacific Warm Pool showing the lowest agreement (10–40 %). Filtering MHWs by size also revealed that the detection of large-scale MHWs (> 5°x5°) was more consistent across products than smaller ones. 
Finally, over the studied period, inter-product differences tend to decrease with time. The DHW also revealed to be sensitive to SST products, with inter-product differences on DHW annual maximum reaching more than 1°C.weeks-1 and percentages of bleaching alert days (DHW ≥ 4°C.weeks-1) in common across products reaching 70 % at most across much of the basin. These findings contribute to a better understanding of how methodological choices affect the characterization of MHWs and DHW, and their associated uncertainties.
Status: final response (author comments only)
- RC1: 'Comment on egusphere-2025-5417', Anonymous Referee #1, 19 Nov 2025
- RC2: 'Comment on egusphere-2025-5417', Anonymous Referee #2, 30 Dec 2025
General comments
This manuscript evaluates how the choice of observational or observation-based sea surface temperature datasets influences the comparison of marine heatwave metrics. The topic is timely and of relevance to the marine heatwave community, as the results demonstrate that commonly used metrics can be sensitive to dataset selection. This is an important issue, given that many marine heatwave studies rely on only one observational dataset. The study is generally well designed and the analysis is of good quality, but some aspects of the methodology and the clarity of presentation would benefit from revision. Addressing the specific and technical comments below would substantially strengthen the manuscript. Although the individual comments may not necessitate extensive changes, taken together they represent substantial revisions. On this basis, it is recommended that the manuscript be reconsidered after major revisions.
Specific Comments
- Given the focus on the tropical Pacific, where ENSO is a dominant driver of sea surface temperature variability, the role of ENSO in shaping the identified marine heatwaves warrants clearer and more explicit consideration. Additional analysis separating marine heatwaves that are associated with El Niño or La Niña from those that are independent of ENSO would be a valuable extension. While such an analysis is not essential for publication, it would help place the dataset-dependent differences in marine heatwave metrics into a clearer physical context. Some ENSO-related relationships are already evident in Section 3.3.3, and building on these would enhance the manuscript.
- The datasets are regridded to a common 0.25° grid prior to the calculation of marine heatwave metrics. While this approach allows the products to be more readily compared, further discussion of the potential implications of the regridding step would strengthen the methodological transparency of the study. In particular, regridding higher-resolution products to a coarser grid may smooth small-scale temperature features, which could in turn reduce peak marine heatwave intensities or alter event characteristics such as duration and spatial extent. It would be helpful to discuss whether this smoothing could contribute to some of the inter-dataset differences reported, and whether the results are sensitive to the chosen target resolution. An alternative approach, such as regridding coarser products to a finer grid, or computing marine heatwave metrics at native resolution prior to regridding, could be briefly considered, even if not implemented. Clarifying the rationale for the chosen approach, and its potential impact on the results, would improve confidence in the robustness of the conclusions.
- The inclusion of a “composite” dataset, defined as the mean of the four observational products, is potentially informative as a diagnostic or reference product. However, its subsequent inclusion in ensemble-based analyses raises concerns. In particular, incorporating the composite alongside the original observational datasets in the calculation of the ensemble mean effectively results in the four observational products being counted twice, thereby skewing the ensemble statistics. A more statistically consistent approach would be to exclude the composite from ensemble calculations, while retaining it as a separate product for comparison.
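The weighting concern in the last comment can be illustrated with a minimal sketch. It assumes that the ensemble statistic is a plain arithmetic mean and that the composite behaves linearly, i.e. that it equals the mean of the four observational products. This holds for SST itself but only approximately for MHW metrics derived from it. The function name and its parameters are hypothetical and are not taken from the manuscript.

```python
from fractions import Fraction

def effective_weights(n_obs=4, n_other=1, include_composite=True):
    """Weight each product carries in an arithmetic ensemble mean.

    Assumption: the composite is the plain mean of the n_obs observational
    products and is treated as one extra ensemble member.
    """
    n_members = n_obs + n_other + (1 if include_composite else 0)
    member_w = Fraction(1, n_members)
    # Each observational product contributes directly, plus 1/n_obs of the composite.
    obs_w = member_w + (member_w / n_obs if include_composite else 0)
    return obs_w, member_w  # (weight per observational product, weight per other product)

with_comp = effective_weights(include_composite=True)
without_comp = effective_weights(include_composite=False)
print(with_comp)     # (Fraction(5, 24), Fraction(1, 6))
print(without_comp)  # (Fraction(1, 5), Fraction(1, 5))
```

Under these assumptions, each observational product carries a weight of 5/24 rather than 1/5 and the reanalysis 1/6 rather than 1/5, so including the composite tilts the ensemble mean toward the observational products.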
Technical Corrections
(Note: the notation “L16” below refers to “line 16”, as an example)
- L15: Here and throughout, “sea surface temperature” does not need to be capitalised. Likewise for “degree heating weeks”.
- L31: The phrase “methodological choices” may be misleading here, as the analysis appears to focus primarily on differences from dataset selection. Clarification or rephrasing may be helpful.
- L49: “requires” might be a better word than “implies”.
- L49: Not just the length (i.e., duration) of the baseline, but also the start year.
- L50-53: The structure of this sentence is unclear. Rephrasing is recommended to distinguish between the agreed methodology and the remaining sources of variability.
- L56-58: Please rephrase. It seems this sentence should not begin with “if”.
- L61: Please rephrase “show variability between themselves”. Perhaps: “and consequently exhibit differences”.
- L67: “basinwide” or “oceanwide” scales might be more accurate than “regional”, given the text that follows.
- L71: How is it that understanding of SST product differences would improve MHW forecasts?
- L93: It may be misleading to state that six datasets were used, since one of these is the composite derived from the four observational products. For transparency, this could be rephrased (see the relevant Specific Comment above).
- L106: It seems (from L134) that the re-gridding was performed before calculating the MHW metrics, but it would be helpful to discuss the potential impact of the re-gridding (see the relevant Specific Comment above). For example, regridding to a coarser resolution may smooth small-scale features and potentially reduce maximum intensities. Could an alternative be to regrid the coarser products to a finer scale?
- L114: Please rephrase the description for the composite product in the table for clarity, e.g., Mean of the four SST analysis products, having regridded C3S, CRW and OSTIA to the OISST 0.25° grid.
- L118: “not symmetric” -> about the equator.
- L120: “detailed” -> conducted?
- L131: This figure shows the composite in panel (a) and the ensemble mean in panels (b-g). It might be better to make the panels consistent.
- L136: “Gaps of less than two days… were ignored”. Do you really mean that they were ignored? I.e., not taken into consideration for calculating the duration or cumulative intensity? Or do you mean that they did not split the event into two (as is usually done)?
- L151: Should the units of DHW be °C.weeks, and not °C.weeks-1?
- L166 and L168: “maps were defined at each pixel”. Please rephrase, since the wording implies that maps are created at each pixel, rather than pixel values within a map.
- L169-176: Including the composite dataset in ensemble mean effectively results in the four observational products being counted twice. For statistical consistency, it would be fairer to exclude the composite from the ensemble mean (see the relevant Specific Comment above).
- L187-188: As above, the counting of simultaneous events should exclude the composite, otherwise the statistics are skewed.
- L194: “time start” -> onset?
- L204: “defined in Eq. 3” -> defined as, since the equation follows directly after.
- L207: “p_value” -> p-value. And elsewhere throughout the text.
- L207: “inferior” -> less than. Alternatively, all of this sentence could be rephrased in terms of the statistical significance at the 99% confidence level.
- L214: In the PEQD region, the MHWs are typically El Niño events. ENSO is mentioned in this paragraph, but not El Niño specifically.
- L243-250: The large differences in counts of MHW days per year across products are somewhat surprising. Given the use of a 90th percentile threshold, one might expect around 10% of days to qualify as MHW days, with deviations arising from the persistence criterion and asymmetries in temperature variability. However, the magnitude of the differences shown here appears larger than might be expected. It would be helpful to explore this further, for example by discussing whether differences in long-term trends, variability, or dataset characteristics could contribute to the tendency for some products (i.e., CRW, OSTIA, OISST) to underestimate, and others (i.e., C3S and GLORYS) to overestimate, MHW days. A brief discussion that considers recent findings (Brunner, L., Voigt, A. Pitfalls in diagnosing temperature extremes. Nat Commun 15, 2087 (2024). https://doi.org/10.1038/s41467-024-46349-x) could help to contextualise these results.
- L262-265: The methodology used to derive the “ranking” is not entirely clear, and would benefit from further clarification. Taking MHW days as an example, GLORYS appears to have the highest mean value and is therefore assigned a value of 1, but it is not clear how the corresponding value for OISST (approximately 0.17) is determined. Is this based on a domain-averaged quantity? Furthermore, the term “ranking” may be misleading here, as it typically implies an ordinal ordering (e.g. 1 to 6), whereas the values shown appear to represent a continuous or scaled metric. Clarifying both the calculation and the terminology would improve transparency.
- L288: Some care may be needed in the interpretation that the “onset rate is the most sensitive metric”. In this context, “sensitivity” could be interpreted as having a physical meaning, whereas the result appears to indicate that the onset rate exhibits the largest normalised dispersion across products. Clarifying this distinction, or rephrasing the statement, would help avoid potential ambiguity.
- L388: What does “SP7” refer to?
- L451: “go” -> be?
- L459-464: This single-pixel analysis is a good example of where the effects of regridding (as noted in the Specific Comments) should be considered.
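The question raised for L136 (whether gap days are dropped or merely bridged) can be made concrete with a minimal sketch. It assumes a commonly used convention: an event needs at least five consecutive exceedance days, and events separated by two days or fewer are merged. The function name, the defaults and the synthetic series are illustrative assumptions and do not reproduce the manuscript's implementation.

```python
def detect_events(exceed, min_duration=5, max_gap=2):
    """Return inclusive (start, end) index pairs of detected events.

    exceed: sequence of booleans, True where SST exceeds the threshold.
    Runs shorter than min_duration are dropped; surviving runs separated
    by max_gap days or fewer are merged into a single event.
    """
    runs, start = [], None
    for day, hot in enumerate(list(exceed) + [False]):  # sentinel closes a trailing run
        if hot and start is None:
            start = day
        elif not hot and start is not None:
            if day - start >= min_duration:
                runs.append((start, day - 1))
            start = None
    merged = []
    for s, e in runs:
        if merged and s - merged[-1][1] - 1 <= max_gap:
            merged[-1] = (merged[-1][0], e)  # bridge the short gap
        else:
            merged.append((s, e))
    return merged

# Synthetic series: 6 hot days, a 2-day gap, 5 hot days, 10 cool days, 3 hot days.
exceed = [True] * 6 + [False] * 2 + [True] * 5 + [False] * 10 + [True] * 3
events = detect_events(exceed)
start, end = events[0]
duration_with_gap = end - start + 1             # gap days counted in the duration
exceedance_days = sum(exceed[start:end + 1])    # gap days excluded
print(events, duration_with_gap, exceedance_days)  # [(0, 12)] 13 11
```

The two readings of “ignored” give durations of 13 and 11 days for the same event, and cumulative intensity would differ in the same way, which is why the wording matters.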
Citation: https://doi.org/10.5194/egusphere-2025-5417-RC2
MHW detection and characterization (metrics) still have some open issues in their settings and criteria, such as the use or not of detrended SST data, baseline climatology, spatio-temporal constraints… In this work, the authors run an extensive analysis of the impact of the SST database selection on MHW detection and metrics in the Tropical Pacific Area.
The authors analyse a complete set of MHW metrics over different areas of the Pacific Ocean, using several satellite and reanalysis SST datasets. They find that the characteristics of the calculated metrics, their regional patterns and their dispersion all depend on the selected dataset. They also analyse the temporal evolution of regionally averaged MHW metrics.
The results in the manuscript show that the best results for different metrics are obtained with different databases. Likewise, different results are obtained for different MHW sizes depending on the SST product used. The authors also observe that the variability of results between databases has been decreasing in recent years. No database clearly stands out from the others, nor does any single one perform best for most metrics.
The work shown in the manuscript is methodologically consistent and provides interesting results on the impact of SST databases on MHW analysis. My recommendation is to publish the manuscript with minor revisions.
Main comments
My main concern is the methodology chosen to detect all pixels constituting an MHW in Section 2.2.3 (Filtering MHWs by size). This is an important issue, as the authors separate MHWs into micro and macro scales, and it needs to be better explained. Please provide more details and the rationale on how “joint pixels” are detected. What is the impact of the methodology (based on Bonino) on MHW detection? Have you tried any other methodology? Please see the references below (global and Mediterranean scales) and discuss why you chose the methodology in Bonino, which was used in the Mediterranean, where scales are much smaller than in the Pacific.
Sun, D., Jing, Z., Li, F. & Wu, L. Characterizing global marine heatwaves under a spatio-temporal framework. Prog. Oceanogr. 211, 102947 (2023).
Pastor, F., Paredes-Fortuny, L. & Khodayar, S. Mediterranean marine heatwaves intensify in the presence of concurrent atmospheric heatwaves. Communications Earth & Environment 5, 797 (2024).
You mention a possible impact of regridding on the MHW analysis, but have you checked the impact of regridding on the dataset characteristics? Some simple statistics, correlations… describing this impact would be interesting to include in the manuscript, maybe as supplementary material.
Is the climatology period (1993-2021) the same as the full study period? I understand that the climatology covers the full period analysed, but this has to be clearly stated in the manuscript.
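A simple check of this kind can be sketched as follows. The example assumes that regridding is done by averaging non-overlapping blocks of fine cells, which may differ from the interpolation actually used in the manuscript. The grid size, the factor of 5 (for instance 0.05° to 0.25°) and the anomaly values are synthetic placeholders.

```python
import math
import random
from statistics import fmean

def block_mean(field, factor):
    """Coarsen a 2-D list of lists by averaging non-overlapping factor x factor blocks."""
    ny, nx = len(field), len(field[0])
    if ny % factor or nx % factor:
        raise ValueError("grid shape must be divisible by factor")
    return [
        [
            fmean(field[j][i]
                  for j in range(j0, j0 + factor)
                  for i in range(i0, i0 + factor))
            for i0 in range(0, nx, factor)
        ]
        for j0 in range(0, ny, factor)
    ]

random.seed(0)
# Synthetic SST anomalies (degC) on a fine grid, with one sharp small-scale warm feature.
fine = [[random.gauss(0.0, 0.3) for _ in range(20)] for _ in range(20)]
fine[7][12] += 3.0
coarse = block_mean(fine, 5)

fine_max = max(map(max, fine))
coarse_max = max(map(max, coarse))
fine_mean = fmean(v for row in fine for v in row)
coarse_mean = fmean(v for row in coarse for v in row)
print(fine_max, coarse_max)                                  # peak is damped on the coarse grid
print(math.isclose(fine_mean, coarse_mean, abs_tol=1e-12))   # area mean is preserved
```

Comparing maxima, variances or correlations before and after coarsening in this way would show how much of the inter-product difference in peak intensity could come from the regridding step alone.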
The authors separate MHW events into micro and macro scales, smaller or greater than 5°x5°. How is this size threshold determined? Have you checked and compared results for other thresholds? An MHW of 4°x4° occupies an extensive area, especially in the case of marginal seas. I would like to see some figures on the mean size of micro-events, their dispersion and percentiles that justify 5°x5° as a good choice. Some micro events can be almost as big as some macro events.
Maybe your threshold is appropriate for the open ocean, but this choice needs to be better justified. Check the methodology in Pastor (2023) to identify MHW area.
Pastor, F. & Khodayar, S. Marine heat waves: Characterizing a major climate impact in the Mediterranean. Science of the Total Environment 861, (2023).
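To make the size question concrete, here is a minimal sketch of one possible way to group “joint pixels” and apply a size threshold. It assumes 4-connected labelling on the 0.25° grid and interprets the 5°x5° criterion as a contiguous area of at least 400 pixels, ignoring the latitude dependence of cell area. This is an illustrative assumption and not the Bonino method or the manuscript's implementation.

```python
from collections import deque

def label_sizes(mask):
    """Sizes (in pixels) of 4-connected True regions in a 2-D boolean grid."""
    ny, nx = len(mask), len(mask[0])
    seen = [[False] * nx for _ in range(ny)]
    sizes = []
    for j0 in range(ny):
        for i0 in range(nx):
            if not mask[j0][i0] or seen[j0][i0]:
                continue
            seen[j0][i0] = True
            queue, size = deque([(j0, i0)]), 0
            while queue:  # breadth-first flood fill of one region
                j, i = queue.popleft()
                size += 1
                for dj, di in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    jj, ii = j + dj, i + di
                    if 0 <= jj < ny and 0 <= ii < nx and mask[jj][ii] and not seen[jj][ii]:
                        seen[jj][ii] = True
                        queue.append((jj, ii))
            sizes.append(size)
    return sizes

RES = 0.25                          # grid spacing in degrees
MIN_PIXELS = int((5 / RES) ** 2)    # 5 deg x 5 deg -> 20 x 20 = 400 pixels

# Synthetic MHW mask for one day: a 5.5 deg x 5.5 deg patch and a 4 deg x 4 deg patch.
mask = [[False] * 60 for _ in range(40)]
for j in range(5, 27):
    for i in range(5, 27):
        mask[j][i] = True           # 22 x 22 = 484 pixels
for j in range(20, 36):
    for i in range(40, 56):
        mask[j][i] = True           # 16 x 16 = 256 pixels

sizes = label_sizes(mask)
macro = [s for s in sizes if s >= MIN_PIXELS]
micro = [s for s in sizes if s < MIN_PIXELS]
print(sizes, macro, micro)          # [484, 256] [484] [256]
```

The 4°x4° patch is classified as micro even though it covers more than half the area of the macro patch. A histogram or percentiles of `sizes` over the full record would show whether the 400-pixel cut falls in a natural gap of the size distribution.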
Minor comments
Line 294 “for the maximum intensity (total MHW days) (Fig. 4b,f)”. Correct if necessary.
2.3.2 Temporal evolution
“The year attribution of a MHW was based on its time start”. Why not use the central date? Have you checked how many MHWs start and end in different years? And how many days of such events fall in the end year?
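The checks suggested here can be sketched in a few lines. The event dates below are hypothetical and the helper names are illustrative; the sketch only shows how the two attribution rules can disagree and how event days split across calendar years.

```python
from datetime import date, timedelta

def year_by_onset(start, end):
    """Attribute the event to the year of its first day."""
    return start.year

def year_by_central_date(start, end):
    """Attribute the event to the year of its midpoint."""
    return (start + (end - start) / 2).year

def days_per_year(start, end):
    """Number of event days falling in each calendar year (end day inclusive)."""
    counts = {}
    day = start
    while day <= end:
        counts[day.year] = counts.get(day.year, 0) + 1
        day += timedelta(days=1)
    return counts

# Hypothetical event spanning a year boundary.
start, end = date(2015, 12, 20), date(2016, 2, 10)
print(year_by_onset(start, end))         # 2015
print(year_by_central_date(start, end))  # 2016
print(days_per_year(start, end))         # {2015: 12, 2016: 41}
```

In this example the onset rule assigns the whole 53-day event to 2015 although 41 of its days fall in 2016. Counting how often this happens across the record would indicate whether the choice of rule affects the annual statistics.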