the Creative Commons Attribution 4.0 License.
Data-driven discovery of mechanisms underlying present and near-future precipitation changes and variability in Brazil
Abstract. Untangling the complex network of physical processes driving regional precipitation regimes in the present (1979–2014) and near-future climates (2020–2050) is fundamental to supporting a more robust scientific basis for decision making in the water-energy-food nexus. We propose a data-driven mechanistic approach to: (Goal 1) identify changes and variability of the regional precipitation mechanisms and (Goal 2) reduce the ensemble spread of future projections by weighting and filtering models that satisfactorily represent these drivers in the present climate. Goal 1 is achieved by applying the Partial Least Squares (PLS) technique, a two-sided variant of principal component analysis (PCA), to a reanalysis dataset and 30 simulations of the future climate submitted to CMIP6 to discover the links between global sea-surface temperature (SST) and precipitation in Brazil. Goal 2 is achieved by selecting and weighting the future climate simulations from climate models that better represent the dominant modes discovered by the PLS in the present climate; with this subset of climate simulations, we produce precipitation change maps following IPCC’s WG1 methodology. The main mechanistic link discovered by the technique is that the generalised warming of the oceans promotes a suppression of precipitation in Northeast and Southeast Brazil, possibly mediated by the intensification of the Hadley circulation. We show that this pattern of precipitation suppression is stronger in the near-future precipitation change maps produced using our methodology. This demonstrates that a reduction of epistemic uncertainty is achieved after we select models that skillfully represent these mechanisms in the present climate. Therefore, the approach is capable of supporting both a quantitative analysis of regional changes and the construction of storylines supported by mechanistic evidence.
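As a rough illustration of the kind of analysis the abstract describes, the sketch below runs an SVD-based PLS (often called maximum covariance analysis) on synthetic data. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `pls_svd`, the synthetic fields, and the choice of the SVD variant of PLS are all assumptions made here, and the manuscript may use a different PLS algorithm. The squared covariance fraction is one common way to rank the importance of modes in this variant.

```python
import numpy as np

def pls_svd(X, Y, n_modes=2):
    """SVD-based PLS of two fields stored as (time, space) arrays."""
    Xc = X - X.mean(axis=0)                # anomalies: remove the time mean per grid point
    Yc = Y - Y.mean(axis=0)
    C = Xc.T @ Yc / (X.shape[0] - 1)       # cross-covariance matrix, shape (p, q)
    U, s, Vt = np.linalg.svd(C, full_matrices=False)
    scf = s**2 / np.sum(s**2)              # squared covariance fraction of each mode
    u, v = U[:, :n_modes], Vt[:n_modes].T  # loading patterns (unit length)
    return u, v, Xc @ u, Yc @ v, scf[:n_modes]

# Synthetic stand-ins for SST (p = 8 points) and precipitation (q = 5 points)
rng = np.random.default_rng(0)
n_time, p, q = 120, 8, 5
driver = rng.standard_normal(n_time)       # one shared signal linking both fields
X = np.outer(driver, rng.standard_normal(p)) + 0.1 * rng.standard_normal((n_time, p))
Y = np.outer(driver, rng.standard_normal(q)) + 0.1 * rng.standard_normal((n_time, q))

u, v, x_scores, y_scores, scf = pls_svd(X, Y)
print("squared covariance fraction:", np.round(scf, 3))
```

Because the synthetic fields share a single driver, the first mode should carry almost all of the squared covariance; with real SST and precipitation fields the split between modes would have to be inspected case by case.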
Status: final response (author comments only)
RC1: 'Comment on egusphere-2024-48', Peter Pfleiderer, 21 Feb 2024
The authors present an interesting model-constraining application for precipitation changes in Brazil. The analysis is based on Partial Least Squares (PLS) regression. The scope and results of the study are highly relevant on a methodological as well as on a practical level. In its current state, the manuscript lacks some information on the method, as well as some details in the results, that is needed for a reasonable interpretation of the results. I would suggest major revisions before publication.
Although the method section is well written, there are a few points that are not fully clear or could be misinterpreted because details are missing:
1) How is the NRMSE calculated? You say that it is obtained by "comparing PLS scores and loadings between each model and those derived from the ERA5". How do you aggregate the comparison of scores and loadings? Do you weight score differences more than loading differences since the loadings have more features?
2) How many iterations of the PLS are done and how much of the variance is explained by the n(?) components? Is there a way to compute the variance explained by n components (similar to PCA)? For instance, is component 1 considerably more important than component 2?
3) Is the first component in climate models always related to ENSO, as shown in fig. 1 for ERA5, or are there some models where the first component more closely resembles the pattern in fig. 2? If that were the case, and if components 1 and 2 were similarly important, do you consider this when computing the NRMSE, or do you always compare component 1 in the model with component 1 in ERA5? It would be interesting to see figures comparable to fig. 3 and fig. 4 for individual climate models.
My main question concerns how PLS is used to weight models: Can we assume that comparing individual scores and loadings identified in ERA5 and in a climate model tells us how well the climate model reproduces the dynamics? Couldn't there be cases where different aspects/features of the SST forcing on precipitation (e.g. ENSO) are reproduced by the climate model, but where the assignment of these aspects/features to PLS components ends up differing from ERA5? It would be helpful if the authors could discuss the assumptions made for the evaluation of model performance using PLS in more detail.
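One way to make the concern about component correspondence concrete is to align model components with reference components before any error metric is computed. The sketch below is only an illustration under assumptions: the function `match_modes` and the synthetic patterns are hypothetical, and nothing here is taken from the manuscript. It reorders and sign-flips model loading patterns so that each is paired with the reference pattern it correlates with most strongly, using a brute-force search that is affordable for four modes (24 permutations).

```python
import itertools
import numpy as np

def match_modes(ref, mod):
    """Reorder and sign-flip model modes to best match reference modes.

    ref, mod: arrays of shape (n_modes, n_points), one loading pattern per row.
    """
    n = ref.shape[0]
    corr = np.corrcoef(ref, mod)[:n, n:]    # corr[i, j]: reference mode i vs model mode j
    best = max(itertools.permutations(range(n)),
               key=lambda perm: sum(abs(corr[i, j]) for i, j in enumerate(perm)))
    order = np.array(best)                  # order[i]: model mode paired with reference mode i
    signs = np.sign(corr[np.arange(n), order])
    return order, signs, signs[:, None] * mod[order]

# Hypothetical example: the model swaps modes 0 and 1 and flips the sign of one
rng = np.random.default_rng(1)
ref = rng.standard_normal((3, 50))
mod = np.stack([-ref[1], ref[0], ref[2]]) + 0.05 * rng.standard_normal((3, 50))

order, signs, aligned = match_modes(ref, mod)
print("model mode paired with each reference mode:", order)
print("sign applied to each paired model mode:", signs)
```

Comparing component 1 with component 1 without such a step would penalise a model that reproduces the same patterns in a different order or with the opposite sign, which is the situation the question above describes.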
What are components 3 and 4? Is there an interpretation for these components?
Table 2: How would you interpret the high weight of GFDL-ESM4? How skillful can a model be if it does not capture the ENSO dynamic (assuming that component 1 always represents some natural variability related to ENSO)? Or put differently, wouldn't we trust EC-Earth3-CC more as it robustly captures the ENSO and the climate change component? These points relate to my questions above concerning the comparability of components between models and ERA5.
Furthermore, I would find it interesting to have the NRMSE listed in the table. I would also find it interesting to see the NRMSE for models that are dropped due to lower skill.
l128-129: Is the trend statistically significant? What do you mean by "scores do not show a strong linear trend"? I agree that these trends are "weaker" than the trends of component 2, but I still see a trend in fig. 1c.
Fig. 5: Are the maps (a & b) ensemble medians? Or is it a mean considering the weights from table 2?
Minor comments:
l56: does the "t" in "XtY" stand for transpose?
equation 1: why is it max||u|| = ||v|| ?
l63-64: Check sentence. Is there something missing?
l162-163: Is this strong linear trend seen in most models, or could it be that the trend is mostly due to a subset of climate models (relating to question 3)?
l222-223: Do you have a reference supporting this hypothesis?
Citation: https://doi.org/10.5194/egusphere-2024-48-RC1
RC2: 'Comment on egusphere-2024-48', Elena Saggioro, 22 Apr 2024
General comments
This paper applies an interesting technique, Partial Least Squares (PLS), a variation of Principal Component Analysis for two temporally varying fields, to: 1) detect relationships between global SSTs and local precipitation over Brazil in the past observed record and 2) select and weight CMIP6 models based on their representation of this relationship, in order to investigate its change over the next 30 years and thereby constrain the spread of precipitation projections in the region.
This analysis provides interesting insights into the local character of precipitation change in Brazil. It also applies a novel technique for selecting regionally skilful climate models based on physical mechanisms, which is an area of research where new ideas are much needed to provide information that can be useful from a climate adaptation point of view. The paper is overall well written and provides conclusions that are of interest to the WCD community.
However, in the current form, there are several methodological aspects and elements of the interpretation that need clarification before publication. I therefore suggest major revision before publication.
Specific comments
Introduction:
There is a lack of reference to previous analyses of the link between SST and precipitation in the region (only mentioned in L45). Can you please provide a brief overview, to better locate the contribution of this study?
Method:
I would appreciate a more detailed introduction to the PLS method:
- Can you rephrase in physical terms what “maximize the information present in XtY” means? (Is it the correlation in time between SST and Precip at two different locations?)
- Could the authors expand on how the modes are identified (e.g. where can we see the “modes” in Eq 1?)
- Can you give, as an example, how the reader should interpret two “loading patterns” in relation to each other (e.g. for mode 1 in Fig1.a and Fig1.b)?
- I would find it helpful if the authors could clearly define each term (e.g. mode, loading, scores), associate it with a mathematical symbol and show its formula where relevant. Please then repeat the symbol each time the term is mentioned in the Methods section, to help the reader connect the terms and formulas more easily. Also, as noted in the technical corrections, the use of this terminology is at times inconsistent in the text and figure captions.
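For orientation on the terms raised in the points above, the textbook SVD-based formulation of PLS can be written as follows. This is the standard formulation and only an assumption about what the manuscript's Eq. 1 intends; it is not quoted from the manuscript, and the symbols n, p, q, t_k, s_k and σ_k are introduced here purely for illustration.

```latex
% X: n x p matrix of SST anomalies, Y: n x q matrix of precipitation anomalies
% (n time steps along the rows, grid points along the columns)
\begin{align}
  (u_k, v_k) &= \arg\max_{\|u\| = \|v\| = 1} \; u^{\top} X^{\top} Y \, v
     \quad \text{subject to } u^{\top} u_j = v^{\top} v_j = 0 \text{ for } j < k, \\
  X^{\top} Y &= \sum_{k} \sigma_k \, u_k v_k^{\top}, \\
  t_k &= X u_k \in \mathbb{R}^{n}, \qquad s_k = Y v_k \in \mathbb{R}^{n}.
\end{align}
```

Under this reading, mode k is the triplet (u_k, v_k, σ_k): the loadings u_k (length p) and v_k (length q) are spatial patterns, the scores t_k and s_k are the corresponding time series, "Xu" is a matrix-vector product, and the unit-norm condition on u and v is a constraint written under the max rather than part of the quantity being maximised.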
How do you combine the NRMSE from the scores and loadings into one value (for each mode)? (L93)
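To show what an explicit answer to this question could look like, the sketch below combines a score NRMSE and a loading NRMSE for one mode with a stated weight. Everything in it is an assumption made for illustration: the normalisation by the range of the reference, the weighted-average rule, the parameter `w_scores` and the toy numbers are hypothetical and are not the manuscript's method.

```python
import numpy as np

def nrmse(ref, mod):
    """RMSE normalised by the range of the reference (one common convention)."""
    ref, mod = np.asarray(ref, dtype=float), np.asarray(mod, dtype=float)
    rmse = np.sqrt(np.mean((mod - ref) ** 2))
    return rmse / (ref.max() - ref.min())

def combined_nrmse(ref_scores, mod_scores, ref_load, mod_load, w_scores=0.5):
    """Weighted average of the score NRMSE and the loading NRMSE for one mode."""
    return (w_scores * nrmse(ref_scores, mod_scores)
            + (1.0 - w_scores) * nrmse(ref_load, mod_load))

# Toy numbers: scores offset by 0.4 over a range of 4, loadings identical
ref_scores = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
mod_scores = ref_scores + 0.4
ref_load = np.array([-1.0, 0.0, 1.0])
mod_load = np.array([-1.0, 0.0, 1.0])

value = combined_nrmse(ref_scores, mod_scores, ref_load, mod_load)
print(value)
```

Since each NRMSE is a mean over its own elements, the larger number of grid points in the loadings does not by itself outweigh the scores; the relative importance is set only by the explicit weight, which is the kind of choice the question asks the authors to document.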
Present climate results:
To increase the readers’ trust in the selected models, it would be good to see components 1 and 2 of the models, to get a feeling of how well they perform compared to “observation” beyond the NRMSE. A selection could appear in the Supplementary Material and only be referenced in the text.
What do components 3 and 4 represent? Why use them if they are not linked to any physical mechanism?
Future climate results:
What is the implication of models that do not represent well some of the first 4 components selected? (See Table 2; some models do not even represent component 1, which seems to be crucial.) Does considering all 4 of them regardless not risk selecting models that actually behave very differently?
To allow for a clearer link between the components and the decrease in uncertainty in Fig 5.b, it would be interesting to see how Fig 5.b changes if:
- Only the models that match ERA5 for at least Component 1 are included (e.g. no GFDL-ESM4, as seen from Table 2): will Component 1 of the precipitation signal dominate the overall projected change from the models?
- Only the models that match ERA5 for at least Components 1 and 2 are included (e.g. no CNRM-ESM2-1, as seen from Table 2).
These tests are suggested because it seems that most of the drying in the north/wetting in the south is due to Components 1 and 2. Hence, I would imagine the models that represent them will be the ones that reduce the uncertainty and reveal that pattern.
Further, it would be interesting to see what happens to Fig5.b if no weighting is applied to the selected models (but just a simple average is taken): is the weighting very important, or are the models identified by the PLS method already “better” without the need for weighting?
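The comparison suggested here amounts to computing the same ensemble map twice, once with and once without weights. The sketch below illustrates this with made-up numbers; the function `ensemble_mean`, the three-model toy ensemble and the weights are hypothetical and are not values from Table 2 of the manuscript.

```python
import numpy as np

def ensemble_mean(changes, weights=None):
    """Average precipitation-change maps over models.

    changes: array (n_models, n_lat, n_lon); weights: array (n_models,) or None.
    """
    return np.average(changes, axis=0, weights=weights)

# Toy ensemble: 3 models on a 1 x 2 grid, change in percent (made-up values)
changes = np.array([[[-10.0, 0.0]],
                    [[-20.0, 10.0]],
                    [[0.0, 20.0]]])
weights = np.array([0.5, 0.5, 0.0])      # a zero weight filters the third model out

simple = ensemble_mean(changes)          # unweighted mean over all models
weighted = ensemble_mean(changes, weights)
print("simple:", simple, "weighted:", weighted)
```

Plotting the difference between the two maps for the selected models would show directly how much of the signal in Fig 5.b comes from the selection step and how much from the weights.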
Discussion:
While I do not think the following is the case, it still would be good for the authors to comment on how the reduction in the uncertainty in precipitation changes for the PLS ensemble versus the full one is not an “automatic” result deriving from the construction of the procedure itself. I think this is not the case, because the selection is done on the past climate, not on the future. But it would be good to elaborate on this as it is a question that often arises for filtering methods like this one.
Finally, I would suggest adding a comment on the assumptions of this method. If I understand correctly, this approach rests on the assumption that the features detected in the past for the CMIP6 models (via the PLS procedure) will identify models that also behave closer to reality in the future. More specifically, it seems that:
- the physical assumptions are
  - in the future, some of the dominant features relevant to Brazil precipitation will be linked to SSTs (and more specifically to ENSO and generalised warming of the oceans)
  - that the models that better represent this connection in the past will continue to be the most able to represent it under increasing forcing in the future
- while the methodological assumption is that the PLS method can reliably detect the models with the correct mechanisms representing the physical connection between SST/ENSO and precipitation.
These are justifiable assumptions, but I think a discussion of them and their limitations is missing.
Technical corrections
Data: are anomalies or full field used?
l45: is this a separate question (then better to phrase with question mark for consistency with the style of the first question) or a comment to the first question above?
L49: this question is phrased rather oddly; a rewrite would be useful (also adding a question mark at the end for consistency with the style of the first question).
L54: introduce lat-lon as “latitude – longitude (lat-lon)”
L56: Xt means transpose?
L56: seems there is an extra “and”? in “The PLS method identifies pairs of latent variable vectors and that maximises….”
L59-62: Please explain in simple terms why you get the modes from this equation. Does “Xu” stand for the matrix-vector product between X and u? What is the dimension of u?
L79: can you comment briefly on why ERA5 was chosen instead of GPCP directly for precipitation?
L85: can you comment briefly on why SST was taken from COBE and not ERA5?
L108-109: Why is the ratio between the 2020-2050 and historical ensemble mean climatology interpreted as “uncertainty”? Is this not the climate change signal instead? And do you perhaps mean that a change (future-past) is in the numerator?
L119: is the Amazon region removed before or after the PLS analysis?
L120: can you comment on why “the precipitation in the Amazon region presents significantly higher variability” : is this because of the precipitation induced by transpiration from the trees? Is this choice of cropping out the Amazon forest something done in other papers too?
L139: was it not between 1979 and 2014? (see L85)
L140: what is the anomaly? I have noticed a somewhat inconsistent (or unclear) use of the terms “saliences – correlation” (figure title), “anomaly” and “loading matrices” (caption) : please clarify.
L199: is the map showing a ratio between the CHANGE in precipitation and the past [Precip(future)- Precip(Past)]/ Precip(Past), or a ratio between the values Precip(future)/Precip(Past)? It is not clear from the text or the caption.
Fig 1.c: There is a negative trend in Figs 1.c (not commented on, only for Fig 1.d): could you elaborate on it? Is this linked to any observed trends in variability in the region? How does this relate to the interpretation of Lines 66-69?
Fig 5,6: I would find it more intuitive and consistent to use the brown-green colour scale here, since we are talking about precipitation change. Why not crop the Amazon area from these figures too?
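The two readings distinguished in the L108-109 and L199 comments differ only by a constant offset, which a small numerical example makes explicit. The function names and the precipitation values below are hypothetical and chosen purely for illustration.

```python
def ratio(future, past):
    """Plain ratio of future to past precipitation."""
    return future / past

def relative_change(future, past):
    """Change relative to the past: (future - past) / past."""
    return (future - past) / past

past, future = 100.0, 80.0               # made-up values, e.g. mm/month
r = ratio(future, past)                  # a value below 1 indicates drying
c = relative_change(future, past)        # a value below 0 indicates drying
print(r, c)
```

Because the relative change equals the ratio minus one, a map of one can be converted to the other, but the neutral value (1 versus 0) and hence the centre of the colour scale differ, so the caption needs to state which quantity is shown.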
Citation: https://doi.org/10.5194/egusphere-2024-48-RC2