This work is distributed under the Creative Commons Attribution 4.0 License.
A machine learning approach for evaluating Southern Ocean cloud-radiative biases in a global atmosphere model
Abstract. The evaluation and quantification of Southern Ocean cloud-radiation interactions simulated by climate models is essential in understanding the sources and magnitude of the radiative bias that persists in climate models for this region. To date, most evaluation methods focus on specific synoptic or cloud type conditions and are unable to quantitatively define the impact of cloud properties on the radiative bias whilst considering the system as a whole. In this study, we present a new method of model evaluation, using machine learning, that can at once identify complexities within a system and individual contributions.
To do this, we use an XGBoost model to predict the radiative bias within a nudged version of the Australian Community Climate and Earth System Simulator – Atmosphere-only Model, using cloud property biases as predictive features. We find that the XGBoost model can explain up to 55 % of the radiative bias from these cloud properties alone. We then apply SHapley Additive exPlanations feature importance analysis to quantify the role each cloud property bias plays in predicting the radiative bias. We find that biases in liquid water path are the largest contributor to the cloud radiative bias over the Southern Ocean, though important regional and cloud-type dependencies exist. We then test the usefulness of this method in evaluating model perturbations and find that it can clearly identify complex responses, including cloud property and cloud-type compensating errors.
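For concreteness, the workflow described above can be sketched in a few lines of Python. This is a minimal illustration only, with synthetic data and assumed feature names (LWP, IWP, CFL, CFI, TauL, TauI); the authors' actual code is archived on Zenodo (see Data sets below).

```python
# Minimal sketch (not the authors' code): fit a gradient-boosted regressor
# that predicts a radiative bias from cloud-property biases, then attribute
# each prediction to the features with SHAP values.
import numpy as np
import shap
import xgboost as xgb

rng = np.random.default_rng(0)
n = 5000
features = ["LWP", "IWP", "CFL", "CFI", "TauL", "TauI"]  # assumed names
X = rng.normal(size=(n, len(features)))  # stand-in cloud-property biases
# Stand-in target: shortwave cloud radiative effect bias at TOA.
y = -2.0 * X[:, 0] - 0.8 * X[:, 2] + 0.3 * rng.normal(size=n)

model = xgb.XGBRegressor(n_estimators=200, max_depth=4, learning_rate=0.1)
model.fit(X, y)

# TreeExplainer computes exact SHAP values for tree ensembles; together with
# the expected value they sum to each individual prediction, which is what
# makes per-feature contributions to the predicted bias additive.
shap_values = shap.TreeExplainer(model).shap_values(X)
for name, imp in zip(features, np.abs(shap_values).mean(axis=0)):
    print(f"{name}: mean |SHAP| = {imp:.3f}")
```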
Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Interactive discussion
Status: closed
RC1: 'Comment on egusphere-2023-531', Anonymous Referee #1, 02 Sep 2023
The authors present a novel approach to diagnose the contributions to biases in Earth System Models (ESMs) using a machine learning regression technique along with sensitivity metrics. They apply the technique to cloud radiative biases over the Southern Ocean in ACCESS-AM2 and discern useful relationships which are discussed in the context of a broader model evaluation presented elsewhere. They also examine the derived relationships as a function of cloud type and present a really cool plot in Figure 4! Overall, I find this an interesting and well written paper that brings a novel perspective to diagnosing biases in ESMs and am happy to recommend it for publication subject to one minor suggestion for improvement.
I'm not sure if it belongs in the introduction, or discussion, but I am struck by the relationship between this approach and calibration approaches which use an emulator to relate model parameters to (potentially multiple) model outputs (cf. Watson-Parris et al., 2022) in order to reduce model biases. Such emulators allow estimation of the sensitivity of particular outputs to given inputs (essentially feature importance, see e.g. Lee et al., 2013), and have recently been used with multiple observables (LWP, Nd, etc.) to constrain ERF_aci (Regayre et al., 2023). I feel the authors' approach adds a useful step in relating the biases to observables, and this might help alleviate the difficulties in that work in choosing the 'best' observations to use for bias (or uncertainty) reduction. Generally, since the authors perform a small perturbation study, I think it would be useful to more explicitly link the approach to model tuning and discuss how it can help improve the process.
Minor typos:
L257: 'but' -> 'by'
L404: understating -> understanding
Citation: https://doi.org/10.5194/egusphere-2023-531-RC1
AC1: 'Reply on RC1', Sonya Fiddes, 20 Dec 2023
RC2: 'Comment on egusphere-2023-531', Anonymous Referee #2, 19 Sep 2023
Summary:
In this manuscript, Fiddes and coauthors use a new approach, an XGBoost model, to examine shortwave radiative biases over the Southern Ocean in the Australian Community Climate and Earth System Simulator (ACCESS) – Atmosphere-only Model. They find the XGBoost model can explain up to 55% of the radiative bias as being due to biases in clouds, and using an associated SHapley Additive exPlanations (SHAP) feature importance analysis find that biases in liquid water path are the largest contributor to the cloud radiative bias.
Overview:
I am entirely in favor of developing new approaches to evaluate climate models, and in particular, I find the additive nature of the SHAP values interesting. That said, I seem to be missing (or not correctly understanding) some key aspects, such as what the dependence between the “cloud features” (what one might call collinearity in a multiple linear regression) means for interpreting the results. As is, I don’t perceive how this new approach is providing new insights beyond what existing approaches could provide. I explain in more detail in my comments that follow below. This doesn’t make the results incorrect, and in fact, clearly the primary scientific result called out in the abstract – that biases in liquid water path are the largest contributor to the cloud radiative biases – is correct. In my view this article would be much stronger if it directly compared results based on cloud regimes and multiple linear regression (both of which seem to have already been done at some level) and demonstrated in a concrete way the additional value of the XGBoost approach.
General Comments:
1) On the relative merits of the XGBoost/SHAP technique presented here.
At several points in the manuscript, you assert that the XGBoost technique presented here achieves results that current techniques do not. For example, you write in the abstract that “most evaluation methods focus on specific synoptic or cloud type conditions and are unable to quantitatively define the impact of cloud properties on the radiative bias whilst considering the system as a whole.” Or later on line 467, you write “No other analysis method is able to consider a system as a ’whole’ (within the confines of the data provided that is), and subsequently isolate the effect of individual components.”
Frankly, I do not see how these statements are correct. Nothing, for example, stops one from taking a “cloud regime” analysis, aggregating data by regimes, and calculating the mean SWCRE and LWP bias for each regime -- with the result that one ends up knowing the contribution of each regime to the overall biases in SWCRE and LWP. Or indeed running a multiple-linear regression to map the SW bias (with or without cloud regimes) to some set of cloud properties. In what way would such analyses be “not quantitative” and/or “not consider the system as a whole”?
Indeed, you appear to have already applied a multiple-linear regression (I think perhaps to the region as a whole rather than cloud regimes); on line 208/218 you indicate that a multiple linear regression was “only” able to predict between 42-43% of the variance in SWCRE as compared to 58% with XGBoost. Frankly, I am not sure that I see going from 42 to 58% of explained variance as a major improvement. And (if I understand) the multiple-linear regression also identifies bias in LWP as the principal predictor for bias in SWCRE (line 200-203).
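(For reference, the kind of comparison at issue can be made concrete in a few lines: on data with even a mild non-linearity, a tree ensemble will out-score a linear model in explained variance. A sketch on synthetic data, using scikit-learn and XGBoost as assumed tooling; this is not the manuscript's analysis.)

```python
# Sketch of the explained-variance comparison discussed above: multiple
# linear regression vs. XGBoost on the same features. Synthetic data with a
# mild non-linearity; purely illustrative.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from xgboost import XGBRegressor

rng = np.random.default_rng(1)
X = rng.normal(size=(10000, 6))
# Linear part plus interaction and quadratic terms that a linear model
# cannot capture but a tree ensemble can.
y = (-2.0 * X[:, 0] + 0.5 * X[:, 0] * X[:, 2] - 0.3 * X[:, 1] ** 2
     + 0.5 * rng.normal(size=10000))

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
r2_mlr = LinearRegression().fit(X_tr, y_tr).score(X_te, y_te)
r2_xgb = XGBRegressor(n_estimators=300, max_depth=4).fit(X_tr, y_tr).score(X_te, y_te)
print(f"MLR R^2 = {r2_mlr:.2f}; XGBoost R^2 = {r2_xgb:.2f}")
```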
As a paper where one of the principal goals is to argue that the XGBoost/SHAP analysis is superior (in some aspects), I think it would be better to compare directly the XGBoost/SHAP analysis to other approaches and demonstrate in a concrete way (that goes beyond commenting on the explained variance) what is gained.
2) Dependence between cloud “features”
I don't understand how the dependence between cloud variables / features is being taken into account in the XGBoost/SHAP analysis. Yes, I see you have a dendrogram and clustering index (Figure 2c), but these metrics don’t seem to be used in interpreting later results.
For example, TauL is obviously related to LWP and CFL. What are the panels f, m, and t in Figure 3 showing me? If I understand, the SHAP values in Figure 3f represent only the "additional" information in the tau-bias that is not covered by LWP and CFL? (In short, it is not a measure of how much tau errors contribute to the biases, but rather the contribution of TauL being used as an additional correction factor on top of LWP and CFL). If yes, does this mean factors such as effective radius OR the inherent non-linearity between LWP and albedo/TOA SW flux are being “modeled” by this term?
Somewhat similarly, I am not clear on what the “offset” in the CFL (Figure 3p) - or the offset in other variables - means. As-is, the situation appears no different to me than the interpretation of variables in a multiple-linear regression, except that it is superior in the sense that linearity is not assumed. (Indeed the bottom row suggests that the relationships are not linear, at least for larger biases in the cloud features, which is no doubt how the explained variance can go up, but this would, if anything, make the interpretation more difficult).
I must admit that I found myself wondering what would happen if you put SW cloud albedo into your set of cloud features.
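(As an aside on the dependence question raised above: the shap library provides a hierarchical-clustering utility that estimates feature redundancy directly from the data and can be drawn alongside the importance bars. A minimal sketch on synthetic data with illustrative feature names; not the authors' code.)

```python
# Sketch: quantify feature redundancy with shap's hierarchical clustering
# and display it next to the SHAP bar plot. Synthetic data; TauL is built
# to depend strongly on LWP and CFL, mimicking the dependence noted above.
import numpy as np
import pandas as pd
import shap
import xgboost as xgb

rng = np.random.default_rng(2)
n = 2000
names = ["LWP", "CFL", "TauL", "IWP", "CFI", "TauI"]  # assumed names
data = rng.normal(size=(n, len(names)))
data[:, 2] = 0.7 * data[:, 0] + 0.2 * data[:, 1] + 0.1 * rng.normal(size=n)
X = pd.DataFrame(data, columns=names)
y = -2.0 * data[:, 0] - 0.8 * data[:, 1] + 0.2 * rng.normal(size=n)

model = xgb.XGBRegressor(n_estimators=100, max_depth=3).fit(X, y)
sv = shap.TreeExplainer(model)(X)  # Explanation object with feature names

# Clustering distances near 0 flag redundant features; near 1, independent.
clust = shap.utils.hclust(X, y)
shap.plots.bar(sv, clustering=clust, clustering_cutoff=0.8)
```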
3) Early description, cloud “features” & data
I initially found much of the early text difficult to follow, in part because of the abstract language and use of the broad term “cloud features”. It is not until Figure 2 (on about page 9, well into the results section) that I understood that this meant LWP, CFL … . I think it would be helpful to introduce the “features” you are going to use in section 2.1.
Please also see a variety of specific comments as regards these data.
As is, I am still not clear as to whether LWP is referring to “in cloud” LWP or “domain mean” LWP (meaning a spatial average which includes clear sky / non cloudy “pixels” in a given spatial domain). And the same is true of IWP, TauL and TauI.
Specific Comments:
Line 27. You write “Currently, our climate and weather models do not take into account the pristine composition of the SO atmosphere …”. What constitutes “our” models? Perhaps “some” or “many” and provide references?
Line 97. COSP contains a variety of simulators. Do you mean just the ISCCP or MODIS simulator? Please clarify.
*Line 103. It is good that you are identifying the satellite products, but I think you should go a step further and identify which specific “variables” or “fields” are being used, and discuss implications / limitations (see also comment line 114). With MODIS, for example, not every pixel has a successful retrieval. (I know you are using the L3 summary product, but this product still represents a subset of observed pixels). In particular, you can find LWP in both the MODIS and CERES products you list. *What LWP variable did you use? *In principle, you should include some discussion on the quality of the observational data, and discuss any limitations that might affect your analysis.
Line 107. Please give the ACCESS-AM2 horizontal grid resolution.
*Line 114. You write “… the model’s COSP liquid water path (LWP) and ice water path (IWP) showed considerable biases when compared to the observed COSP products, and hence the raw model output was used for these fields instead.” I am not clear why this justifies using the raw model output! Is not the point of the simulator to make the model output comparable to satellite datasets? For example, you don't know from MODIS observations the cloud LWP when deep cloud systems are present with lots of ice condensate at the top. You only get an IWP retrieval from MODIS. But your model will have LWP present in the deep systems.
Also, as far as I know there is no simulator built specifically for the CERES-SYN product. I am guessing here you are using MODIS data for cloud “features” and CERES SYN only for radiation/SWCRE, yes?
Line 119. What does “fitted” mean here? Please expand this description.
Line 136. Using the word ‘folds’ to describe “4-Fold cross validation” is not very helpful. What does a “fold” entail? Please expand/rephrase this description.
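(For reference, a "fold" is one of the k equal partitions of the data used in k-fold cross-validation: each fold is held out once for validation while the model is trained on the other k-1 folds. A minimal illustration with scikit-learn; the tooling is an assumption, as the manuscript does not state which library was used.)

```python
# Illustration of 4-fold cross-validation: the samples are split into 4
# parts ("folds"); each fold is held out once for validation while the
# model trains on the remaining three.
import numpy as np
from sklearn.model_selection import KFold

X = np.arange(40).reshape(20, 2)  # 20 samples, 2 features
kf = KFold(n_splits=4, shuffle=True, random_state=0)
for i, (train_idx, val_idx) in enumerate(kf.split(X)):
    print(f"fold {i}: train {len(train_idx)} samples, validate {len(val_idx)}")
```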
Figure 1. Please provide area-weighted mean values for the three maps on the left. What is the spatial scale of the data (is this a fixed lat/lon grid)? What year (or years) of data is this, all 5 years?
Figure 1. The scatter plot does not seem very useful. What are the units on the colorbar? Perhaps replace it with a plot showing absolute bias vs percentage of data. That is, if you take the X% of points with the smallest absolute value of bias, what is the maximum value of the (absolute) bias of this subset? Plot the maximum value of the bias vs. X%. A non-linear scale might help here (for example, X = 100, 99, 95, 85, 75, 50, 25, 5, 1).
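(The suggested plot is straightforward to construct; a sketch with numpy/matplotlib on synthetic biases, illustrative only.)

```python
# Sketch of the suggested plot: for the X% of points with the smallest
# absolute bias, plot the maximum absolute bias within that subset.
import matplotlib.pyplot as plt
import numpy as np

bias = np.random.default_rng(3).normal(0.0, 20.0, size=100_000)  # synthetic
abs_sorted = np.sort(np.abs(bias))
pct = np.array([1, 5, 25, 50, 75, 85, 95, 99, 100])
idx = np.ceil(pct / 100 * len(abs_sorted)).astype(int) - 1
plt.plot(pct, abs_sorted[idx], marker="o")
plt.xscale("log")  # non-linear scale, as suggested
plt.xlabel("subset size X (% of points with smallest |bias|)")
plt.ylabel("maximum |bias| in subset (W m$^{-2}$)")
plt.show()
```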
*Figure 1 caption. (a) Are all biases given in this article (obs - model)? Please define and use consistently. I think perhaps you have this backwards in the caption. It is typical to use “model – obs” such that a negative bias means the model value is too small (as compared to the observations) -- and I think later in the article negative LWP bias does mean model LWP is too small (as compared to observations). (b) You should also define how you are calculating SWCREtoa in the text (I suspect this might explain the apparent sign reversal here).
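(For reference, a conventional definition of the TOA shortwave cloud radiative effect is given below; whether the manuscript uses this convention is an assumption.)

```latex
% Conventional definition, net fluxes positive downward:
\mathrm{SWCRE_{TOA}} = F^{\mathrm{net}}_{\mathrm{SW,\,all\text{-}sky}} - F^{\mathrm{net}}_{\mathrm{SW,\,clear\text{-}sky}}
```

Under this convention SWCRE_TOA is negative over reflective cloud, so a model that reflects too little shortwave produces a positive model-minus-obs SWCRE bias even though its clouds are "too thin", which may account for the apparent sign reversal.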
Line 187. You write “Examination of the model and satellite fields separately (not shown) shows that the asymmetrical bias appears to be due to ACCESS-AM2 failing to capture the observed spatial variability.” This seems like an interesting (and to me surprising) point. Perhaps show the individual fields in figure 1, not just the differences?
Line 184. “… reaching the surface …”? Here and at several other points in the text you refer to "too much reaching the surface". While I have no doubt this is true, you are using the CRE defined at TOA, which means “more SW absorbed in the system”, not just more SW reaching the surface. I know this is nitpicking, but I think it would be more technically correct to stick with "too little SW being reflected back to space” or “too much absorbed SW.”
Line 209. How does the bias and standard deviation in the prediction for bias from the multiple-linear regression compare with those in the next paragraph (lines 221-223) given for XGBoost?
Figure 2c. The dendrogram is very hard to read. The vertical grey lines are too close together.
Line 280. You write, “A distance of one would imply feature independence, while zero would imply complete redundancy.” So if I understand, the lines have very similar clustering distances and are relatively close to one. Does this mean each variable contains largely equal amounts of information? Please discuss what this means as regards the relationship between the specific “features” you have chosen. For example, what does this mean as regards understanding what TauL, LWP, and CFL mean? Please see general comments.
Line 309. I think “CFI” on this line should be “CFL”.
Line 314. I am struggling to understand what this “offset” DOES mean. All other cloud properties being fixed, less cloud DOES mean more sunlight would reach the surface. So is this telling me there are compensating errors (for example, having too much LWP is being compensated for by too little CFL), OR is this a consequence of non-linearity in the system (for example, on average LWP is larger when CFL is larger), OR simply that radiation is not a linear function of LWP? As written, you seem to be telling me what it does NOT mean, and I would rather understand what it DOES mean.
Line 317. I think the phrase at the end of the sentence which includes the word “compensating” is potentially confusing. I think you mean mid-latitude and high-latitude biases are compensating in the hemispheric mean, but maybe you mean CFI biases are compensating something else (CFL biases)? Perhaps rephrase for clarity.
Line 322. “not intuitive”? The CFI and IWP features aren't "ice in place of liquid"; it is simply too much or too little ice (regardless of what liquid is doing). The features make intuitive sense to me: a positive bias in CFI and IWP (inner two rings) gives a negative SHAP factor, just as too much LWP gives a negative SHAP factor – and vice versa, the negative bias in CFI and IWP in the outer ring gives a positive SHAP factor. If there is ice in place of liquid, then the liquid bias will necessarily be negative when the ice bias is positive – that is, you can't look at these terms in isolation to see the phase change / compensating errors – and indeed this is occurring and you do comment on this in the sentences that follow.
(The only thing that is not intuitive to me is that a negative SHAP factor – due to an increase in LWP or IWP – means a “negative radiation bias” (line 300). But if I understand correctly, this is due to the way SWCRE_TOA is being defined).
Line 378. Perhaps note that for subpolar ML, StC, and MS, the SHAP value for CFL is larger than for LWP, suggesting cloud amount rather than in-cloud LWP is the dominant source of bias for these cloud types? If yes, shouldn't this be reflected in the main conclusion? It's LWP overall, but for some cloud types CFL is the dominant bias?
Line 394. This seems like pretty weak speculation to me. Perhaps add "we speculate ...". Rather, to me it seems likely that CTP is to a large degree taking the role of providing a measure of cloud type. (Perhaps offer this as an alternative speculation?)
Line 422. The noise is obvious, but it is not obvious to me that the noise is “introduced by the nudging”. Do you mean the signal is small (because nudging limits the effect of the change in ice capacitance as compared to a free-running model) and so the "signal to noise" is small? Or do you mean that some of the "noise", that is, the variation in the fields examined, is actually larger than would be the case for a free-running model? Please clarify.
Line 438. You write “For CFL, higher clouds (cirrus, convective, frontal) are found to be associated with a reduction in CFL and an increase in SHAP values.” Any idea why?
Line 481. Perhaps change “like-for-like” to “coincident in time”? (To me, just using a satellite simulator means comparing like-for-like.)
Line 489. While I am sure it is true there isn't one "fix", the fact that the radiation bias is asymmetric does not demonstrate this. Rather, I would argue that different cloud regimes have different biases, and there are differences in the distribution of cloud regimes in different locations.
Line 491. Perhaps change “cannot be” to "the XGBoost model suggests that biases cannot be explained by … ".
Citation: https://doi.org/10.5194/egusphere-2023-531-RC2
AC2: 'Reply on RC2', Sonya Fiddes, 20 Dec 2023
Data sets
ACCESS-AM2 Southern Ocean cloud and radiation data and code for SHAP analysis, Sonya Fiddes, https://doi.org/10.5281/zenodo.7196622
Marc D. Mallet, Alain Protat, Matthew T. Woodhouse, Simon P. Alexander, and Kalli Furtado