This preprint is distributed under the Creative Commons Attribution 4.0 License.
Understanding pattern scaling errors across a range of emissions pathways
Abstract. The regional impacts of multiple possible future emission scenarios can be estimated by combining a few Earth System Model (ESM) simulations with a linear pattern scaling model such as MESMER which uses the pattern of local temperature responses per degree global warming. Here we use MESMER to emulate the future regional pattern of surface temperature response based on historical single-forcer and future Shared Socioeconomic Pathway (SSP) CMIP6 simulations. Pattern scaling errors are decomposed into two components: differences in scaling patterns between scenarios, and intrinsic timeseries differences between local and global responses in the target scenario. The timeseries error is relatively small for high-emissions scenarios, contributing around 20 % of the total error, but is similar in magnitude to the pattern error for lower-emission scenarios. This irreducible timeseries error limits the efficacy of pattern scaling for emulating strong mitigation pathways and reduces the dependence on the predictor pattern used. The results help guide the choice of predictor scenarios and where to target introducing other dependent variables beyond global surface temperature into pattern scaling models.
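The error decomposition described in the abstract can be made concrete with a short sketch. This is an editorial illustration, not the authors' code: the variable names (`y_local`, `g_global`, `beta_pred`, `beta_targ`) are hypothetical, and the split shown is one algebraically consistent reading of the abstract, in which the total emulation error at a grid cell separates exactly into a pattern term and a timeseries term.

```python
def decompose_error(y_local, g_global, beta_pred, beta_targ):
    """Split the pattern-scaling error at one grid cell into two parts.

    y_local   : local annual temperatures in the target scenario (K)
    g_global  : global-mean temperatures in the target scenario (K)
    beta_pred : scaling pattern (K/K) fitted on the predictor scenario(s)
    beta_targ : scaling pattern (K/K) fitted on the target scenario itself
    """
    total = [y - beta_pred * g for y, g in zip(y_local, g_global)]
    # Pattern error: difference between predictor and target patterns.
    pattern = [(beta_targ - beta_pred) * g for g in g_global]
    # Timeseries error: local response not captured even by the target's own pattern.
    tseries = [y - beta_targ * g for y, g in zip(y_local, g_global)]
    return total, pattern, tseries

# Made-up numbers; the split is exact by construction: total = pattern + tseries.
total, pattern, tseries = decompose_error(
    y_local=[1.1, 1.9, 3.2], g_global=[1.0, 2.0, 3.0],
    beta_pred=1.2, beta_targ=1.0)
```

The timeseries term is independent of the predictor pattern, which is why the abstract calls it irreducible within pure pattern scaling.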
Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Interactive discussion
Status: closed
-
RC1: 'Comment on egusphere-2022-914', Mathias Hauser, 15 Dec 2022
Review of “Understanding pattern scaling errors across a range of emissions pathways”
Wells et al. consider errors in pattern scaling for different emission scenarios. They decompose the error into timeseries and pattern errors to better understand their sources. This is highly relevant due to the emergence of climate emulators used to estimate local impacts of emissions - often for scenarios the emulators were not originally trained on. Overall the manuscript is well written and clear. However, there are some points I would like the authors to clarify.
Main points
In the data section I missed a statement that you calculate anomalies of the predictor and target variables, and I also see no mention of the reference period. Please also explain how you deal with ensemble members. Do you use one or many per model? How do you estimate the local slope for models with many ensemble members? How do you avoid giving more weight to models with more ensemble members?
Please emphasize more that you use only part of the full MESMER emulator.
You mention several times that using patterns to extrapolate is worse than using them to interpolate, and cite a number of studies showing this. However, I miss a citation of Beusch et al. (2022), who also discuss this. Further, the MESMER emulator has already been extensively evaluated (Beusch et al., 2020a, 2020b), and I think your paper would benefit from discussing this and showing how it goes beyond the state of the art.
It’s interesting to see that scenarios with a peak in the global mean temperatures show local time lags and would profit from additional predictors. Can you speculate how much the missing MESMER components (i.e. the auto regression) would help alleviate this problem?
Consider changing the way you show significance. I was at first confused why you would subtract the standard deviation from your difference signal - I had overlooked the word “magnitude”. I therefore suggest you do one of the following:
(i) Switch to showing significance with a test statistic and hatch the non-significant areas in your plots. This should reduce the number of figures and panels without losing (much) information (e.g. by using a Wilcoxon-Mann-Whitney U test and accounting for the large number of tests conducted by applying the approach of Benjamini and Hochberg (1990); see also Wilks (2016)).
(ii) If you keep your current approach, I strongly suggest making it clearer: add vertical bars in the figure titles to make explicit that it is the magnitude of the difference, and explain what values larger than, smaller than, and close to zero mean at around L210.
(iii) Instead of subtracting the standard deviation, could you divide by the inter-model standard deviation? That would seem more intuitive to me.
Minor Points
L131: Why is “pattern scaling more accurate than the timeshift method”? Wouldn’t the latter allow for non-linearities?
L143: The intercept will also depend on how the anomalies are calculated (and how ensemble members are treated).
L212: Explain that the pattern averages to 1 globally per design and only because of the investigated variable is tas.
L245: “pattern difference is not as robust between models” that is an interesting way to put it. Isn’t it good for pattern scaling if there are few regions with strong differences?
Figures
General: many of the color scales you show saturate on a large part of the maps. Consider widening the shown range to allow distinguishing the patterns better. Please write the labels and units as “Error (K)” instead of “Error / K”. Then it looks less like a division.
Figure 1: I appreciate that you showcase the different errors in an example. However, I think using a scenario that is symmetric in its global temperature makes it more difficult to understand than necessary. Consider showing a non-symmetric scenario, e.g. just increasing the temperatures from 1°C to 2°C until the end of the century.
You could also consider switching the first and second columns. If I understand this correctly the (current) middle column is the “forcing” for the emulator while the (current) first column is the “response”, so switching them could help clarify this relationship.
Figures 2 and 3: Panels a) and b) don't have a diverging scale and should therefore not feature a diverging colormap; please use a sequential one. (If you want to emphasize deviations from 1 you can keep a diverging colormap, but you should mention this and use a different colormap than for c) and d).) Depending on how you decide to show significance, also consider changing the colormap of d) to indicate that it shows something different from c).
Figure 4: I'd be interested to see how similar the patterns in b) and c) are; the saturation in b) makes this difficult.
Figure 7: I suggest you label the “Target” below the axes and to maybe not rotate them by 45° (they might just have enough room) - up to you. Please add % as units to d).
Figure 8: The black vertical lines described on L416 are missing.
Text
L7: delete “multiple”
L7: delete “a few”
L26: Expand “IPCC AR6 WG1”?
L38: Maybe delete “change”
L80: Expand “RCP” and explain what this is.
L85: “than the RCPs” -> “than any of the RCPs”?
L84: “remains to be done” consider rewriting
L100-L104: Make it clearer that these are your two assumptions (e.g. turn it into a list or add (i) and (ii)).
L102: timeseries -> temporal
L103: simply modified -> scaled
Section 2.1: I highly recommend to split this into two sections one on the data and one on MESMER.
L136: against the smoothed -> against smoothed
L137: parameter -> “slope” or “scaling factor”
L137-L138: This … SSP119: The sentence sounds off.
L205: You explain the pattern in panel b) first. Consider reordering.
L349: Upon even only -> Even for
L419: Consider rewriting the introductory sentence.
References
Hochberg, Y. and Benjamini, Y. (1990), More powerful procedures for multiple significance testing. Statist. Med., 9: 811-818. https://doi.org/10.1002/sim.4780090710
Beusch, L., Gudmundsson, L., and Seneviratne, S. I.: Emulating Earth system model temperatures with MESMER: from global mean temperature trajectories to grid-point-level realizations on land, Earth Syst. Dynam., 11, 139–159, https://doi.org/10.5194/esd-11-139-2020, 2020a.
Beusch, L., Gudmundsson, L., & Seneviratne, S. I. (2020b). Crossbreeding CMIP6 Earth System Models with an emulator for regionally optimized land temperature projections. Geophysical Research Letters, 47, e2019GL086812. https://doi.org/10.1029/2019GL086812
Beusch, L., Nicholls, Z., Gudmundsson, L., Hauser, M., Meinshausen, M., and Seneviratne, S. I.: From emission scenarios to spatially resolved projections with a chain of computationally efficient emulators: coupling of MAGICC (v7.5.1) and MESMER (v0.8.3), Geosci. Model Dev., 15, 2085–2103, https://doi.org/10.5194/gmd-15-2085-2022, 2022.
Wilks, D. S. (2016). “The Stippling Shows Statistically Significant Grid Points”: How Research Results are Routinely Overstated and Overinterpreted, and What to Do about It, Bulletin of the American Meteorological Society, 97(12), 2263-2273.
Citation: https://doi.org/10.5194/egusphere-2022-914-RC1
-
AC1: 'Author response to reviewers', Chris Wells, 30 Mar 2023
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2022/egusphere-2022-914/egusphere-2022-914-AC1-supplement.pdf
-
RC2: 'Comment on egusphere-2022-914', Raphael Hébert, 19 Jan 2023
General Comments:
The paper by Wells et al. presents an analysis of the errors arising when using the mean component of MESMER to emulate climate model simulations for different emission scenarios. The paper is well structured and the analysis appears sound, and therefore I find it eligible for publication after revision. I think it could generally be rewritten more concisely, as there are detailed descriptions of results that are sometimes trivial, and descriptions of figures that would be better placed in figure captions.
In addition, I think the motivation and usage of the model could be better explained. In particular, I understood from Beusch et al. (2020) that MESMER served to emulate a single model-scenario combination (i.e. self-emulation) to generate a large ensemble: e.g. we have model X with scenario SSPyyy and we emulate this single run to obtain a large ensemble with random internal variability. What, then, is the purpose of understanding the cross-scenario errors presented here? Is the goal to use MESMER for making regional projections? In that case, what is the point of using the 2020-2070 period to set up the emulator and then projecting the 2070-2100 period? Wouldn't we rather want to use historical and/or idealized simulations or observations to set up the emulator and analyse the errors induced by the historical pattern on emission scenarios (e.g. Geoffroy & St-Martin, 2014; Hébert & Lovejoy, 2018)? I initially expected that the aerosol and GHG patterns would be used to calibrate a two-pattern emulator, but instead we only get insights on the difference between the extrapolation of these two patterns, results which I thought were a bit trivial since we already know that aerosols have localized impacts. Wouldn't it be possible to use those two patterns to emulate future scenarios if we have a decomposition of global mean temperature into aerosol- and GHG-driven components?
I think this would be a more powerful framework since we could then use those patterns to emulate any scenarios given the global mean temperature along with the aerosol and GHG forcing timeseries. It is not necessary for the authors to do this in this paper, but I wanted to outline what I think would be useful to broaden the scope of the study.
Specific Comments:
Line 47: "forcer pattern" --- I'm unsure about the use of 'forcer' here and elsewhere, shouldn't it be 'forced pattern'?
Line 129: "This study utilises the mean response component of the MESMER model (Beusch et al., 2020), implementing pattern scaling to emulate the spatial annual mean temperature response in a scenario." --- Is it still the MESMER model if we use only the mean component? Then isn't it just a regression of the local temperature with respect to the global one?
Line 137: "This is performed to ensure the global average parameter is very close to 1 K/K, as it should be by definition, when predictor the model on an individual low-emission scenario such as SSP119." --- What global average parameter are we talking about? The global average of the local sensitivities? Why does it matter whether the local temperature is smoothed for the regression to obtain an average close to 1? Also, review the formulation of this sentence; 'when predictor the model' doesn't sound right. I would also explain, here or somewhere else, the units of K/K: at first one wonders why they don't just cancel (and they do), but I understand you wanted to make explicit that this is the sensitivity of the local temperature to the global one, right?
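The regression this comment refers to (the mean-response component) can be illustrated with a minimal sketch. This is an editorial illustration with made-up numbers, not the authors' MESMER setup: per grid cell, an ordinary-least-squares slope of local annual temperature against global-mean temperature, in K of local change per K of global change; emulation then multiplies the fitted slope by a global-mean trajectory.

```python
def ols_slope(x, y):
    """Least-squares slope of y against x (here: K local per K global)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    den = sum((xi - mx) ** 2 for xi in x)
    return num / den

# Hypothetical training data: global-mean warming, plus two grid cells that
# warm 1.4x (land-like) and 0.8x (ocean-like) the global mean.
g = [0.0, 0.5, 1.0, 1.5, 2.0]
cells = {"land": [1.4 * t for t in g], "ocean": [0.8 * t for t in g]}

# The "pattern" is the map of per-cell slopes (units K/K).
pattern = {cell: ols_slope(g, series) for cell, series in cells.items()}

# Emulation: scale a new global-mean value by the fitted pattern.
emulated = {cell: b * 3.0 for cell, b in pattern.items()}
```

The area-weighted global average of such slopes is close to 1 K/K by construction when the emulated variable is surface temperature itself, which is presumably the point of the quoted sentence.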
Line 145: "A given emulation consists of the predictor set – comprising one or many scenarios – and a target scenario." --- I think this could be better explained. Are we talking about a set of model simulations following certain emission scenarios that are used to estimate the pattern, and then one separate scenario with its own set of model simulation is used as target?
Line 156: "temperatures relative to pre-industrial times rise from 1 K in 2015 to 2 K" --- Maybe give the approximate year when the temperature reaches 2 K: 'from 1 K in 2015 to 2 K in ????'.
Line 208: "In hist-aer, the land-ocean distinction is still clear, but the northern hemisphere land is particularly sensitive, due to the historical concentration of aerosol emissions within this region." --- Sensitive isn't exactly the right word right? Are the land region really more sensitive, or is it purely because of the higher aerosol emissions there that the regression slopes are higher? I would consider rephrasing the paragraph to clarify this.
Line 217: "Parts of the NHMLs exhibit a significantly more sensitive response to hist-aer, including the USA, Europe, and east Asia, and the Southern Hemisphere oceans are significantly less sensitive." --- Again, this sounds like sensitivity to aerosols is a local property of the system, but really, the pattern of the response just corresponds to the sources of aerosol emissions. This would likely be outside the scope of this study as it might require more data about the spatial distribution and dispersion of aerosols, but it would be interesting to quantify the actual sensitivity to aerosols taking into account the pattern of emissions (and their dispersion).
Figure 2ab,3ab: I'm not sure the divergent colour palette is appropriate since there is no fundamental difference between values below 1 and above right?
Figure 2d,3d: Wouldn't it be more informative to look at the ratio of the absolute difference over the inter-model spread?
Line 228: "Figure 3 shows the same analysis for SSP119 and SSP585 in a similar way to Figure 2." --- I would complete this sentence with a restatement of what is calculated, something like, if I understand well: '...similar way to Figure 2, i.e. the local temperature series are regressed with the global mean temperature to extract a local sensitivity to global temperature changes.' (could be written more concisely).
Line 251: "Clear, significant differences are therefore found between the temperature response patterns attributable to different historical forcers, consistent with their different spatial patterns." --- I hate to be this guy, but if you say significant, the reader expects a p-value, and you should explain the statistical test used, the null hypothesis considered, etc.
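The statistical machinery asked for here (and in RC1's suggestion (i)) can be sketched briefly. This is an editorial illustration, not the manuscript's method: a pure-Python Benjamini-Hochberg step-up procedure that takes per-grid-point p-values (e.g. from per-cell Wilcoxon-Mann-Whitney tests) and returns which points remain significant at false discovery rate q; non-significant points would then be hatched on the maps.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Benjamini-Hochberg step-up control of the false discovery rate.

    p_values : one p-value per grid point, in any order
    q        : target false discovery rate
    Returns a list of booleans (same order): True = reject the null.
    """
    m = len(p_values)
    # Indices sorted by p-value, ascending.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k/m) * q.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    # Reject every hypothesis up to that rank.
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            reject[idx] = True
    return reject

# Step-up property: 0.04 alone fails the rank-1 threshold (0.025) but is
# rejected because the larger rank-2 threshold (0.05) is met by 0.04.
flags = benjamini_hochberg([0.04, 0.01], q=0.05)
```

In practice one would compute the per-cell p-values with `scipy.stats.mannwhitneyu` and could replace this sketch with `statsmodels.stats.multitest.multipletests(method='fdr_bh')`, which implements the same procedure.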
Figure 3: What period is used to train the pattern? In the methods it is said that the first 50 years are used, but I don't think it is said which period the SSP simulations cover. In any case, I think it would be helpful to explicitly state the time period used for training the model.
Line 271: "since the aerosol pattern is more sensitive here than the GHG response, and the Southern Ocean is conversely under-sensitive" --- Again, I'm really not convinced by the usage of sensitivity when it comes to the aerosol pattern. It's not about the sensitivity to aerosols, but rather the strength and spatial distribution of the aerosol forcing. The Southern Ocean is not less sensitive to aerosol forcing; far fewer aerosol emissions reach that region.
Line 305: "Errors are significant in the out-of-sample emulations" --- Again, if you say significant, we expect a p-value; if you don't want to give one, use 'larger' instead. Otherwise we would also like to know whether the smaller errors of the self-emulation are significant or not; just because they are smaller doesn't mean they are not significant.
Line 358: "Note the smaller scale on the timeseries error plot." --- Might be more useful to have this statement in the caption.
Line 365: "note the slight variations in the SSP119 column compared to the SSP585 one." --- Do you mean 'smaller' variations rather than 'slight'?
Line 387: "The patterns are similar between SSP119 and SSP126, indicating some consistency between scenarios in this effect." --- Why are only those two scenarios considered for this comparison? Wouldn't it also be interesting to see the pattern for SSP245 with a later peak and drop?
Figure 5: Unclear on what period the patterns used for emulation were calculated.
Line 390: A lot of this paragraph could belong in the caption instead. There were several such instances where the figure was described in the text rather than in the caption; I would consider improving the captions and shortening the text to the results only, avoiding figure descriptions there.
Line 390: I would motivate why those specific region-model-scenario combinations are used - I guess simply to explore problematic behaviours?
Figure 9: What period is used to train the predictor?
Citation: https://doi.org/10.5194/egusphere-2022-914-RC2
-
AC1: 'Author response to reviewers', Chris Wells, 30 Mar 2023
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2022/egusphere-2022-914/egusphere-2022-914-AC1-supplement.pdf
-
AC2: 'Comment on egusphere-2022-914', Chris Wells, 12 Apr 2023
We thank both reviewers for their detailed, useful comments on our manuscript. We have combined our responses to each review into a single PDF, which we have attached in a comment to each reviewer. Please note the responses to Reviewer 2 begin on Page 6 of the PDF.
Citation: https://doi.org/10.5194/egusphere-2022-914-AC2
Interactive discussion
Status: closed
-
RC1: 'Comment on egusphere-2022-914', Mathias Hauser, 15 Dec 2022
Review of “Understanding pattern scaling errors across a range of emissions pathways”
Wells et al. consider errors in pattern scaling for different emission scenarios. They decompose the error into timeseries and pattern errors to better understand their sources. This is highly relevant due to the emergence of climate emulators used to estimate local impacts of emissions - often for scenarios the emulators were not originally trained on. Overall the manuscript is well written and clear. However, there are some points I would like the authors to clarify.
Main points
In the data part I missed that you calculate anomalies of the predictor and target variables and I also see no mention of the reference period. Please also explain how you deal with ensemble members. Do you use one or many per model? How do you estimate the local slope for models with many ensemble members? How do you avoid giving more weight to models with more ensemble members?
Please emphasize more that you use only part of the full MESMER emulator.
You mention several times that using patterns to extrapolate are worse than to interpolate and cite a number of studies showing this. However, I miss a citation of Beusch et al. (2022) who also discuss this. Further, the MESMER emulator has been already extensively evaluated (Beusch et al., 2020a, 2020b) and I think your paper would benefit from discussing this and showing how your paper goes beyond the state of the art.
It’s interesting to see that scenarios with a peak in the global mean temperatures show local time lags and would profit from additional predictors. Can you speculate how much the missing MESMER components (i.e. the auto regression) would help alleviate this problem?
Consider changing the way you show significance. I was first confused why you would subtract the standard deviation from your difference signal - I overread the word “magnitude”. Therefore I suggest you do one of the following:
(i) switch to showing significance with a test statistic and hatch the non-significant areas in your plots. This should reduce the number of figures and plots without losing (much) information. (E.g. by using a Wilcoxon Mann-Whitney U test and accounting for the large number of conducted tests, by applying the approach of Benjamini and Hochberg (1990), see also Wilks (2016)).
(ii) If you keep your current approach I strongly suggest to make it more clear - add vertical bars in the title of the figures to make it clear that it is the magnitude of the difference and also explain what values larger, smaller and almost equal to zero mean at around L210.
(iii) Instead of subtracting the standard deviation could you divide by the inter model standard deviation. That would seem more intuitive to me.Minor Points
L131: Why is “pattern scaling more accurate than the timeshift method”? Wouldn’t the latter allow for non-linearities?
L143: The intercept will also depend on how the anomalies are calculated (and how ensemble members are treated).
L212: Explain that the pattern averages to 1 globally per design and only because of the investigated variable is tas.
L245: “pattern difference is not as robust between models” that is an interesting way to put it. Isn’t it good for pattern scaling if there are few regions with strong differences?
Figures
General: many of the color scales you show saturate on a large part of the maps. Consider widening the shown range to allow distinguishing the patterns better. Please write the labels and units as “Error (K)” instead of “Error / K”. Then it looks less like a division.
Figure 1: I appreciate that you showcase the different errors in an example. However, I think using a scenario that is symmetric in its global temperature makes it more difficult to understand than necessary. Consider showing a non-symmetric scenario, e.g. just increasing the temperatures from 1°C to 2°C until the end of the century.
You could also consider switching the first and second columns. If I understand this correctly the (current) middle column is the “forcing” for the emulator while the (current) first column is the “response”, so switching them could help clarify this relationship.
Figure 2 and 3: Panels a) and b) don’t have a diverging scale and should therefore not feature a diverging colormap. Please use one with a sequential color map. (If you want to emphasize deviations from 1 you can keep a diverging color map but you should mention this and also use another color map as for c) and d)) Depending on what you decide on showing significance, also consider changing the colormap of d) to indicate it shows something else than c).
Figure 4. I’d be interested to see how similar the pattern in b) and c) are, the saturation in b) makes this difficult.
Figure 7: I suggest you label the “Target” below the axes and to maybe not rotate them by 45° (they might just have enough room) - up to you. Please add % as units to d).
Figure 8: The black vertical lines described on L416 are missing.
Text
L7: delete “multiple”
L7: delete “a few”
L26: Expand “IPCC AR6 WG1”?
L38: Maybe delete “change”
L80: Expand “RCP” and explain what this is.
L85: “than the RCPs” -> “than any of the RCPs”?
L84: “remains to be done” consider rewriting
L100-L104: Make it clearer that these are your two assumptions (e.g. turn it into a list or add (i) and (ii)).
L102: timeseries -> temporal
L103: simply modified -> scaled
Section 2.1: I highly recommend to split this into two sections one on the data and one on MESMER.
L136: against the smoothed -> against smoothed
L137: parameter -> “slope” or “scaling factor”
L137-L138: This … SSP119: The sentence sounds off.
L205: You explain the pattern in panel b) first. Consider reordering.
L349: Upon even only -> Even for
L419: Consider rewriting the introductory sentence.
References
Hochberg, Y. and Benjamini, Y. (1990), More powerful procedures for multiple significance testing. Statist. Med., 9: 811-818. https://doi.org/10.1002/sim.4780090710
Beusch, L., Gudmundsson, L., and Seneviratne, S. I.: Emulating Earth system model temperatures with MESMER: from global mean temperature trajectories to grid-point-level realizations on land, Earth Syst. Dynam., 11, 139–159, https://doi.org/10.5194/esd-11-139-2020, 2020a.
Beusch, L., Gudmundsson, L., & Seneviratne, S. I. (2020b). Crossbreeding CMIP6 Earth System Models with an emulator for regionally optimized land temperature projections. Geophysical Research Letters, 47, e2019GL086812. https://doi.org/10.1029/2019GL086812
Beusch, L., Nicholls, Z., Gudmundsson, L., Hauser, M., Meinshausen, M., and Seneviratne, S. I.: From emission scenarios to spatially resolved projections with a chain of computationally efficient emulators: coupling of MAGICC (v7.5.1) and MESMER (v0.8.3), Geosci. Model Dev., 15, 2085–2103, https://doi.org/10.5194/gmd-15-2085-2022, 2022.
Wilks, D. S. (2016). “The Stippling Shows Statistically Significant Grid Points”: How Research Results are Routinely Overstated and Overinterpreted, and What to Do about It, Bulletin of the American Meteorological Society, 97(12), 2263-2273.
Citation: https://doi.org/10.5194/egusphere-2022-914-RC1 -
AC1: 'Author response to reviewers', Chris Wells, 30 Mar 2023
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2022/egusphere-2022-914/egusphere-2022-914-AC1-supplement.pdf
-
AC1: 'Author response to reviewers', Chris Wells, 30 Mar 2023
-
RC2: 'Comment on egusphere-2022-914', Raphael Hébert, 19 Jan 2023
General Comments:
The paper by Wells et al. presents an analysis of the errors arising when using the mean component of MESMER for emulating climate model simulations for different emission scenarios. The paper is well structured and the analysis appears sounds, and therefore I find it eligible for publications after revision.I think it could generally be rewritten more concisely as there are detailed descriptions of results that are sometimes trivial, and description of figures that would be better placed in figure captions. In addition, I think the motivation and usage of the model could be better explained. Particularly, I understood from Beusch et al. (2020) that MESMER served to emulate a single model-scenario (i.e. self-emulation) to general a large ensemble, e.g. we have model X with scenario SSPyyy and we emulate this single run to obtain a large ensemble with random internal variability. What is thus the purpose of understanding the cross scenario errors presented here? Is the goal to use MESMER for making regional projections? In which case, what is the point of using the 2020-2070 period to set up the emulator and then project the 2070-2100 period? Wouldn't we want to rather use historical and/or idealized simulations or observations to set up our emulator and analysed the errors induced by the historical pattern on emission scenarios (e.g. Geoffroy & St-Martin, 2014; Hébert & Lovejoy, 2018)? I initially expected that the aerosol and GHG patterns would be used to calibrate a two-pattern emulator, but rather we only get insights on the difference between the extrapolation of these two patterns, results which I thought were a bit trivial since we already know that aerosols have a localized impacts. Wouldn't it be possible to use those two patterns to emulate future scenarios if we have a decomposition of global mean temperature into aerosol and GHG driven components? 
I think this would be a more powerful framework since we could then use those patterns to emulate any scenarios given the global mean temperature along with the aerosol and GHG forcing timeseries. It is not necessary for the authors to do this in this paper, but I wanted to outline what I think would be useful to broaden the scope of the study.
Specific Comments:
Line 47: "forcer pattern" --- I'm unsure about the use of 'forcer' here and elsewhere, shouldn't it be 'forced pattern'?
Line 129: "This study utilises the mean response component of the MESMER model (Beusch et al., 2020), implementing pattern scaling to emulate the spatial annual mean temperature response in a scenario." --- Is it still the MESMER model if we use only the mean component? Then isn't it just a regression of the local temperature with respect to the global one?
Line 137: "This is performed to ensure the global average parameter is very close to 1 K/K, as it should be by definition, when predictor the model on an individual low-emission scenario such as SSP119." --- What global average parameter are we talking about? The global average of the local sensitivities? Why does it matter to smooth or not the local temperature for the regression to obtain an average close to 1? Also, review the formulation of sentence, 'when predictor the model' doesn't sound right. I would also explain here or somewhere else the units of K/K since at first one thinks why don't they just cancel, and well they do, but I understand you wanted to make explicit that this was a local sensitivity of the local temperature to the global one, right?
Line 145: "A given emulation consists of the predictor set – comprising one or many scenarios – and a target scenario." --- I think this could be better explained. Are we talking about a set of model simulations following certain emission scenarios that are used to estimate the pattern, and then one separate scenario with its own set of model simulation is used as target?
Line 156: "temperatures relative to pre-industrial times rise from 1 K in 2015 to 2 K" --- Maybe give the approximate year when the temperature reaches the 2k 'from 1k in 2015 to 2k in ????'
Line 208: "In hist-aer, the land-ocean distinction is still clear, but the northern hemisphere land is particularly sensitive, due to the historical concentration of aerosol emissions within this region." --- Sensitive isn't exactly the right word right? Are the land region really more sensitive, or is it purely because of the higher aerosol emissions there that the regression slopes are higher? I would consider rephrasing the paragraph to clarify this.
Line 217: "Parts of the NHMLs exhibit a significantly more sensitive response to hist-aer, including the USA, Europe, and east Asia, and the Southern Hemisphere oceans are significantly less sensitive." --- Again, this sounds like sensitivity to aerosols is a local property of the system, but really, the pattern of the response just corresponds to the sources of aerosol emissions. This would likely be outside the scope of this study as it might require more data about the spatial distribution and dispersion of aerosols, but it would be interesting to quantify the actual sensitivity to aerosols taking into account the pattern of emissions (and their dispersion).
Figure 2ab,3ab: I'm not sure the divergent colour palette is appropriate since there is no fundamental difference between values below 1 and above right?
Figure 2d,3d: Wouldn't it be more informative to look at the ratio of the absolute difference over the inter-model spread?
Line 228: "Figure 3 shows the same analysis for SSP119 and SSP585 in a similar way to Figure 2." --- I would complete this sentence with a restatement of what is calculated, something like, if I understand well: '...similar way to Figure 2, i.e. the local temperature series are regressed with the global mean temperature to extract a local sensitivity to global temperature changes.' (could be written more concisely).
Line 251: "Clear, significant differences are therefore found between the temperature response patterns attributable to different historical forcers, consistent with their different spatial patterns." --- I hate to be this guy, but if you say significant, the reader expects a p-value, and you should explain the statistical test used, the null hypothesis considered, etc.
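One way such a test could look, sketched here as a simple permutation test on synthetic per-member scaling slopes; the actual test statistic and null hypothesis are for the authors to state.

```python
import numpy as np

# Hypothetical per-ensemble-member local slopes for two forcing
# experiments (synthetic values, not taken from the manuscript).
rng = np.random.default_rng(2)
slopes_ghg = rng.normal(1.3, 0.05, size=12)
slopes_aer = rng.normal(1.6, 0.05, size=12)

# Permutation test: null hypothesis is that both samples come from
# the same distribution, so group labels are exchangeable.
observed = abs(slopes_ghg.mean() - slopes_aer.mean())
pooled = np.concatenate([slopes_ghg, slopes_aer])
n, n_perm, count = len(slopes_ghg), 5000, 0
for _ in range(n_perm):
    perm = rng.permutation(pooled)
    if abs(perm[:n].mean() - perm[n:].mean()) >= observed:
        count += 1
p_value = (count + 1) / (n_perm + 1)

print(p_value < 0.05)
```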
Figure 3: What period is used to train the pattern? In the methods it is said that the first 50 years are used, but I don't think it is said which period the SSP simulations cover. In any case, I think it would be helpful to explicitly state the time period used for training the model.
Line 271: "since the aerosol pattern is more sensitive here than the GHG response, and the Southern Ocean is conversely under-sensitive" --- Again, I'm really not convinced by the usage of sensitivity when it comes to the aerosol pattern. It's not about the sensitivity of aerosols, but rather the strength and spatial distribution of the aerosol forcing. The Southern Ocean is not less sensitive to aerosol forcing, there are just much less aerosol emissions reaching that region.
Line 305: "Errors are significant in the out-of-sample emulations" --- Again, if you say significant, we expect a p-value. If you don't want to give a p-value, use 'larger' instead; otherwise we would also like to know whether the smaller errors of the self-emulation are significant or not, since being smaller does not mean they are insignificant.
Line 358: "Note the smaller scale on the timeseries error plot." --- Might be more useful to have this statement in the caption.
Line 365: "note the slight variations in the SSP119 column compared to the SSP585 one." --- Do you mean 'smaller' variations rather than 'slight'?
Line 387: "The patterns are similar between SSP119 and SSP126, indicating some consistency between scenarios in this effect." --- Why are only those two scenarios considered for this comparison? Wouldn't it also be interesting to see the pattern for SSP245 with a later peak and drop?
Figure 5: It is unclear over what period the patterns used for emulation were calculated.
Line 390: A lot of this paragraph could belong to the caption instead. There were several such instances where the figure was described in the text rather than in the captions, I would consider improving the captions and shortening the text to the results only and avoiding the description of the figures there.
Line 390: I would motivate why those specific region-model-scenario are used, I guess simply to explore problematic behaviours?
Figure 9: What period is used to train the predictor?
Citation: https://doi.org/10.5194/egusphere-2022-914-RC2
AC1: 'Author response to reviewers', Chris Wells, 30 Mar 2023
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2022/egusphere-2022-914/egusphere-2022-914-AC1-supplement.pdf
-
AC2: 'Comment on egusphere-2022-914', Chris Wells, 12 Apr 2023
We thank both reviewers for their detailed, useful comments on our manuscript. We have combined our responses to each review into a single PDF, which we have attached in a comment to each reviewer. Please note the responses to Reviewer 2 begin on Page 6 of the PDF.
Citation: https://doi.org/10.5194/egusphere-2022-914-AC2
Christopher D. Wells
Lawrence S. Jackson
Amanda C. Maycock
Piers M. Forster