the Creative Commons Attribution 4.0 License.
Finite domains cause bias in measured and modeled distributions of cloud sizes
Abstract. A significant uncertainty in assessments of the role of clouds in climate is characterization of the full distribution of their sizes. Order-of-magnitude disagreements exist among observations of such key distribution parameters as the power law exponent and the range over which a power law applies. A study by Savre and Craig (2023) proposed that this discrepancy owes in large part to inaccurate fitting methods. Rather than linear regression to a logarithmically transformed histogram of cloud sizes, an alternative method termed Maximum Likelihood Estimation was recommended. Here, we counter that Maximum Likelihood Estimation is ill-suited to measurements of physical objects like clouds, and that the accuracy of linear regression can be improved with the simple remedy that bins containing fewer than ~24 counts be omitted from the regression. Further, we argue that the unavoidably finite nature of measurement domains is a much more significant source of error than has previously been appreciated. Finite domain effects are sufficient to account for previously observed discrepancies among reported cloud size distributions. We provide a simple procedure to identify and correct finite domain effects that could be applied to any measurement of a geometric size distribution of objects, whether physical, ecological, social or mathematical.
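The fitting recipe summarized in the abstract (a least-squares regression in log-log space, omitting histogram bins with fewer than ~24 counts) can be sketched as follows. This is an illustrative reconstruction only, not the authors' code: the sampling setup, the number of bins, and the function name are assumptions.

```python
import numpy as np

def fit_power_law_lr(sizes, n_bins=30, min_count=24):
    """Least-squares fit of a power-law exponent in log-log space,
    omitting histogram bins with fewer than `min_count` samples
    (the ~24-count rule recommended in the abstract)."""
    sizes = np.asarray(sizes, dtype=float)
    edges = np.logspace(np.log10(sizes.min()), np.log10(sizes.max()), n_bins + 1)
    counts, _ = np.histogram(sizes, bins=edges)
    widths = np.diff(edges)
    centers = np.sqrt(edges[:-1] * edges[1:])   # geometric bin centers
    keep = counts >= min_count                  # drop sparsely populated bins
    density = counts[keep] / (widths[keep] * sizes.size)
    slope, _ = np.polyfit(np.log10(centers[keep]), np.log10(density), 1)
    return -slope  # exponent alpha, where n(s) ~ s^(-alpha)

# Synthetic check: Pareto samples with known exponent alpha = 2,
# drawn by inverse-transform sampling (CDF: 1 - s^-1, s >= 1).
rng = np.random.default_rng(0)
sizes = 1.0 / (1.0 - rng.random(100_000))
alpha = fit_power_law_lr(sizes)  # should recover a value near 2
```

For a clean, untruncated power law, the geometric-center binning used here makes the expected log-log slope exactly the true exponent, so the minimum-count filter mainly controls noise from the sparse tail bins.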

Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Interactive discussion
Status: closed

RC1: 'Comment on egusphere-2024-67', George Craig, 08 Mar 2024
This paper considers two issues:
First, it is questioned whether maximum likelihood estimation (MLE) is necessary or even appropriate for fitting power laws to cloud size distributions. It is shown that the argument of Clauset et al., that linear regression in log space is less appropriate based on the assumed distribution of errors, is not compelling: in cases where there is enough data to ensure that the fit is good, the central limit theorem implies both methods give the same result. This is a useful contribution since MLE is harder to implement than simple LR. However, the authors go on to say that MLE is inappropriate; this is not demonstrated, and seems no more likely to be true for this method than for LR.
Second, it is shown that finite size effects can affect attempts to identify and fit power laws, even on scales significantly smaller than the domain size. This can affect the apparent presence or absence of power laws, and can lead to the estimated slopes being too large or too small. A simple procedure of rejecting bins that contain more than a maximum fraction of truncated clouds is shown to be effective at removing the sensitivity to domain size for power law or exponential distributions, albeit at the cost of reducing the data available for the fit. In principle, the potential problems of finite domain effects are known, and some studies check for this, but many do not, so the demonstration of the problem and the suggestion of a simple procedure to avoid it is a useful contribution.
Overall, this is an interesting paper (I learned something), and the recommendation of a simple procedure to improve the quality of fitted distributions by removing bins with insufficient data or a large fraction of truncated clouds is valuable. The literature would be improved if everyone followed these rules. However, as detailed in the major comments below, some of the conclusions are unclear or even misleading, and I therefore recommend major revisions before publication.
Major Comments
1. Appendix A shows that the argument of Clauset et al., that applying linear regression in log-log space was inappropriate because of the implicit assumption about the error distribution, does not apply if there is a sufficient amount of data. The central limit theorem implies that the error distributions will converge to Gaussian in either space. So there is no reason to expect MLE to give better results than LR. The paper also argues that the assumptions underlying MLE are violated for cloud fields (l.116ff). This seems reasonable but also holds for LR; in particular the heteroscedasticity associated with finite domain size described here is also a violation of the assumptions for least squares fitting. As far as I can see, the only argument that actually prefers LR to MLE is in Appendix A (l.385). This is the suggestion by Lovejoy et al. that perhaps errors in cloud sizes are lognormally distributed and therefore normal in log space. But the Appendix goes on to conclude, correctly, that we don't really know what the error distributions are, so without examples of one method producing better results than the other, it seems inappropriate to draw conclusions about one method being more appropriate than another.
2. The paper cites the work of Savre and Craig (2023) (hereafter SC) as a motivation, but focuses exclusively on the use of MLE, ignoring other key aspects of the recommended methodology, namely the use of a goodness of fit test to identify the appropriate region to fit. In practice, I suspect that removing bins that do not contain a minimum number of points from the fit, as recommended here, and restricting the range of fitting using a goodness of fit test, as recommended by SC, will confine both methods to the regime where there is "enough" data, and the LR and MLE methods will give the same exponents. This seems to be the case for the examples presented here.
3. The paper apparently argues that fitting is not the main cause of variation in estimated exponents, but that finite domain effects could be. It is shown that the use of MLE vs LR is not relevant, but there is still a potential sensitivity to the fitting methodology in that bins with insufficient data must be rejected to obtain robust results. Both issues could contribute to the diversity of exponents found in the literature, but it seems difficult to say much about their relative importance without reanalyzing previous data sets. It may not even be possible to separate the two effects cleanly, since, as noted in the text (l.257), bins with truncated clouds can coincide with bins that have few clouds, and both will be eliminated together. And of course different studies examine different meteorological situations, and a diversity of distributions may be the correct answer. Given that some authors (e.g. Heus and Seifert 2013) have claimed to have checked for domain size effects, statements like "Finite domain effects are sufficient to account for previously observed discrepancies among reported cloud size distributions" (l.9) don't seem justified.
Minor Comments
l.6 See major comment. Also the phrase "physical objects like clouds" is odd: like clouds in what respect?
l.57 Would it be possible to come up with another example where assumptions of the fitting algorithm are not met? It was a bit confusing to have finite size effects introduced at this point in the paper.
l.60 SC do not simply argue that "the lack of consensus among prior measurements of cloud sizes owes to the use of inaccurate statistical methods to fit power law distributions." They also show that there can be real physical differences in the distributions, for example associated with the diurnal cycle.
l.124 It seems unlikely that any fitting procedure on real cloud data can be proven to be statistically optimal  see major comment 1.
Fig. 2: typo in x-axis label "30"
l.175ff The formulation of the problem in this paragraph seems to assume that there is a universal distribution of cloud sizes that would be seen in all the studies if it were not for methodological problems with fitting and domain size. One might argue that the hypothesis of a universal distribution has not yet been conclusively disproved by the diversity of observed distributions due to the potential methodological problems.
l.230 "as being a real characteristic of clouds" change to "as also being a real characteristic of clouds under certain conditions"
l.285 It's interesting that periodic BCs produce a peak in the size distribution near the domain size, similar to fits that include truncated clouds. Is there a reason for this?
George Craig
Citation: https://doi.org/10.5194/egusphere-2024-67-RC1
RC2: 'Comment on egusphere-2024-67', Theresa Mieslinger, 12 Mar 2024
Motivated by a recent study by Savre and Craig, 2023, the authors of the present paper investigate and discuss the derivation of power law exponents for describing cloud size distributions in observations and guided by a theoretical model based on percolation theory. The authors argue against the superiority of the MLE technique for estimating the exponent of distributions following a power law because, first, observations of cloud sizes are not statistically independent and, second, clouds are a visual imprint of their environment and likewise impact their environment such that the probability of further cloud development is changed, i.e. again not independent. The authors show instead that power law exponents can be well estimated from traditional least-squares fits in doubly logarithmic histogram plots with two main tweaks in the fitting procedure: 1) from an error analysis, a minimal bin count of 24 is estimated to yield a robust fit; 2) truncation effects at domain boundaries need to be treated appropriately. For the latter, the authors suggest including only bins where the fraction of truncated clouds is less than 50% in the derivation of a power law exponent.
The paper is very well-written and the reader is guided by a clear structure. It challenges recent literature on the topic and points out two simple tweaks to overcome shortcomings in previous literature, both presenting useful contributions to the scientific field. Some choices in the applied methods could be described and justified in more detail; my specific questions/comments are listed below in the order major, minor and formal:
Major comments:
 The truncation effect is quite obvious in Figures 4, 5, 6 and absolutely reasonable. However, cloud size distributions in previous literature rather show functional forms close to the ones presented in this paper when truncated clouds are excluded. Could you please add a few words discussing this thought?
 For the 4000 km x 4000 km GOES domain considered, truncation effects seem small and the authors use this case as a reference, while truncation effects increase for smaller (sub)domains. Also, the authors mention that not only domain size, but also the data resolution is an important factor (e.g. lines 262-264). Do the authors have a suggestion for a domain size / resolution combination or a minimal number of pixels needed to minimise truncation effects? This would make it easier to put previous literature into context where the authors argue that truncation is handled in a suboptimal way.
 A bin is taken into account for deriving a power law exponent only if less than 50% of all clouds in that size bin touch the domain boundaries. I was wondering how sensitive the results are to the 50% threshold and whether the authors tested lower/higher values. Could you add a sentence explaining this choice?
 The under- and overestimations stated in the Conclusion in line 333 seem to be strongly related to the size of the subdomains, which seem to be chosen rather arbitrarily. Could you explain why those are reasonable limits? Surely the over-/underestimation could be higher for even smaller domain sizes.
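The 50% bin-selection rule discussed in the major comments above can be sketched as follows. This is a hypothetical illustration with assumed names and data layout (`sizes`, `touches_boundary`), not the authors' code; rerunning it with a different `threshold` is one way to probe the sensitivity the reviewer asks about.

```python
import numpy as np

def select_bins(sizes, touches_boundary, edges, threshold=0.5):
    """Boolean mask over size bins: True where the bin is non-empty and
    the fraction of clouds touching the domain boundary is below `threshold`."""
    sizes = np.asarray(sizes, dtype=float)
    touches = np.asarray(touches_boundary, dtype=float)
    edges = np.asarray(edges, dtype=float)
    n_bins = len(edges) - 1
    # Assign each cloud to a bin (clipped so out-of-range values stay countable).
    idx = np.clip(np.digitize(sizes, edges) - 1, 0, n_bins - 1)
    total = np.bincount(idx, minlength=n_bins)
    truncated = np.bincount(idx, weights=touches, minlength=n_bins)
    frac = np.where(total > 0, truncated / np.maximum(total, 1), 1.0)
    return (total > 0) & (frac < threshold)

# Tiny worked example: two bins; the second bin has 2 of 3 clouds truncated.
edges = [1.0, 10.0, 100.0]
sizes = [2.0, 3.0, 50.0, 60.0, 70.0]
touches = [False, False, True, True, False]
mask_default = select_bins(sizes, touches, edges)                 # 50% rule
mask_loose = select_bins(sizes, touches, edges, threshold=0.7)    # relaxed rule
```

With the default 50% threshold the second bin (truncated fraction 2/3) is rejected, while a 70% threshold retains it, so the fitted range can shift with the threshold choice.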
Minor comments:
 Lines 145-147: could you justify the relaxation from two orders of magnitude to only one? I suppose you’d have fewer samples to base your statistics on, and going for only one order of magnitude is a compromise?
 Lines 223-228: could you add the reason for simulating a 10000 x 10000 percolation lattice instead of resampling the GOES lattice, i.e. 2000 x 2000 pixels? Intuitively I would have assumed that you would want to simulate the same pixel number. Also, how do the q-values for the percolation lattice and also the grid cells stated in line 227 fit to the pixel numbers stated in Figure 5? It seems unnecessarily complicated to not go for the same pixel numbers in theory and observations, but maybe there is a good reason for it.
 Line 247: what is meant by “hypothetical scenario for GOES cloud areas”? Is it simply one subset of the image or do ALL subdomains go into the curves? Related to that, are the numbers stated in the following lines 250-251 only for this example or representative error estimates? The authors use them later in the conclusion, and it reads as if they are upper/lower bounds for over- and underestimations due to truncation effects. If it is indeed only one image, it could also easily happen that you sample two very different cloud regimes, as your 4000 km x 4000 km domain includes large clusters as part of the ITCZ as well as small trade cumulus clouds. Related, the x-tick labels of Figure 7 seem odd as there are counts for negative cloud areas. Also the Figure caption together with the paragraph discussing that Figure (line 247ff) leaves it unclear to me how the subset is designed and whether it is representative. Please clarify.
 The title of the subchapter 3.4 seems a bit broad and could be sharpened to set the reader’s expectations. It seemed to me that you rather test your suggested truncation fix in an exponential distribution and show that it works there, too.
 In line 262-262 the authors state that errors could be further reduced. To what extent did you test other domain sizes / resolutions, and could you add further info or include “(not shown)” such that it becomes clear that this statement is based on an analysis rather than gut feeling?
 Comment on Appendix B: domain truncation effects are a major focus of this paper. I would suggest moving the first part of Appendix B (maybe the content of lines 400-406) to the main part of the paper, but I leave it up to the authors to decide whether that is appropriate. Also, it would make the paper even stronger if the suggested correction for truncation were applied. But I can also accept it if that goes beyond the scope of the present paper.
Formal comments / typos:
 Figure 2 seems to have a typo in the x-tick labels at the minimum bin count threshold 30
 Please add to the caption of Figure 3 some reference to the “left” and “right” plots for clarity
 Figures 4, 5, 6, and 7 have several missing superscript numbers in the x- and y-tick labels. Please correct.
 Fig 6 is mentioned before Fig 5 in the text. Please switch them to make the text easier to follow without jumping back and forth.
 Caption of Table 1: is there a word missing in the last sentence? “…between the two domain sizes and [methods?] is expressed in units…”
 Typo in line 260: “… for a series of of subdomains created…”
 Typo in caption to Table D1 in second last sentence: “coorespond” instead of “correspond”
Citation: https://doi.org/10.5194/egusphere-2024-67-RC2
AC1: 'Reply to reviewer comments', Thomas DeWitt, 29 Apr 2024
Thomas D. DeWitt