Cloud Fraction estimation using Random Forest classifier on Sky Images
Abstract. Cloud fraction (CF) is an integral aspect of weather and radiation forecasting, but real-time monitoring of CF is still inaccurate, expensive and exclusive to commercial sky imagers. Traditional cloud segmentation methods, which often rely on empirically determined threshold values, struggle under complex atmospheric and cloud conditions. This study investigates the use of a Random Forest (RF) classifier for pixel-wise cloud segmentation using a dataset of semantically annotated images from five geographically diverse locations. The RF model was trained on diverse sky conditions and atmospheric loads, ensuring robust performance across varied environments. The accuracy score was above 85 % for all locations, along with similarly high F1 and Receiver Operating Characteristic – Area Under the Curve (ROC-AUC) scores, establishing the efficacy of the model. Validation experiments conducted at three Atmospheric Radiation Measurement (ARM) sites and two Indian locations, Gadanki and Merak, demonstrated that the RF classifier outperformed conventional Total Sky Imager (TSI) methods, particularly in high-pollution areas. The model effectively captured long-term weather and cloud patterns, exhibiting strong location-agnostic performance. However, challenges in distinguishing sun glare and cirrus clouds due to annotation limitations were noted. Despite these minor issues, the RF classifier shows significant promise for accurate and adaptable cloud cover estimation, making it a valuable tool in climate studies.
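The abstract describes the method only in prose; as a rough, non-authoritative illustration of the kind of pixel-wise RF cloud/sky classifier it outlines, here is a minimal sketch assuming scikit-learn, raw per-pixel RGB features, and synthetic placeholder images and masks. Apart from the n_estimators=100 and random_state=42 settings mentioned in the review below, all features, data and parameters here are assumptions, not the authors' setup.

```python
# Minimal sketch (not the authors' code): per-pixel cloud/sky classification
# with a Random Forest, scored with accuracy, F1 and ROC-AUC, plus a
# scene-level cloud fraction (CF) estimate.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score
from sklearn.model_selection import train_test_split

def pixel_features(rgb_image):
    """Flatten an (H, W, 3) sky image into per-pixel RGB feature rows.
    A real feature set would likely add ratios such as R/B."""
    return rgb_image.reshape(-1, 3).astype(np.float32)

# Placeholder data: a small stack of sky images and binary masks
# (1 = cloud, 0 = clear sky) standing in for the annotated dataset.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(10, 64, 64, 3), dtype=np.uint8)
masks = rng.integers(0, 2, size=(10, 64, 64), dtype=np.uint8)

X = np.concatenate([pixel_features(im) for im in images])
y = np.concatenate([m.ravel() for m in masks])
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

clf = RandomForestClassifier(n_estimators=100, random_state=42, n_jobs=-1)
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)
y_prob = clf.predict_proba(X_test)[:, 1]
print("accuracy:", accuracy_score(y_test, y_pred))
print("F1:      ", f1_score(y_test, y_pred))
print("ROC-AUC: ", roc_auc_score(y_test, y_prob))

# Cloud fraction of one scene = fraction of its pixels predicted as cloud.
cf = clf.predict(pixel_features(images[0])).mean()
print("estimated cloud fraction:", cf)
```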
Status: open (until 07 Jul 2025)
RC1: 'Comment on egusphere-2024-3364', Anonymous Referee #2, 06 Jun 2025
General comments
================
The paper presents cloud fraction estimates using a random forest classifier on sky images. The authors compare the prediction results of the model against semantically annotated sky images originating from multiple observation sites around the world. Overall, the paper is well structured and well explained, making it easier for the reader to follow and understand. The authors emphasize the limitations of the study, which are (1) the white-balance / colour calibration of the datasets acquired with different sensors; (2) the air pollution levels that prevent more accurate predictions for the India and Australia data; (3) the artifacts in the images (sun glares, cirrus) that make it difficult to extract information on sky properties. Despite these, they demonstrate that the RF classifier outperforms traditional methods at high-pollution sites. Minor corrections need to be applied to improve readability, structure and syntax.

Specific comments
=================
Introduction:
The introduction lacks a paragraph presenting the structure of the remaining sections of the paper, e.g.: "The paper is structured as follows. In Section 2 we present..."

Section 2
Maybe merge Section 2 "Observing sites" and Section 3 "Data Generation and Preprocessing" into a single section called "Data" for an easier structure for the reader:
2. Data
2.1 Observing sites and datasets
2.2 Preprocessing
2.3 Selection
2.4 Ground truth masks

Section 3
Add a two-panel figure showing a raw image vs. the preprocessed version, to illustrate where the dead zones are located in the image and what the circular mask does.

Section 4
Figure 1 must be cited explicitly in the text and lacks a caption sentence.

Section 5
Any idea/hypothesis why the model performs worse in Canada and Australia? You stated some explanations for the Indian data, but not for these. Why does the German dataset give the best results?

Section 5.1
The first paragraph needs to be moved to an earlier position in the paper, as it explains what the cloud fraction is and what the goal of this study's model is; for example at the beginning of Section 4, or even earlier in the paper.

Section 5.2
- First paragraph : accurate white-balance or colour calibration is required then ? Is there some practical solution to this sensor uniformization issue ?
- Add a Figure or subfigure for which the predictions are really accurate, instead of only showing the shortcomings and outliers.

Technical corrections
=====================
Abstract:
- "vary" used too much in 3 consecutive sentences; use synonym
- efficiency instead of efficacy

Introduction:
- line 27-28: instead of we need, use "the scientific community requires specific devices"
- line 31 : capture data at high temporal resolutions
- line 36 : caveat with " Many researchers have adopted this (Chauvin et al., 2015; Chow et al., 2011; Ghonima et al., 2012; Kuhn et al., 2018);(Lothon et al., 2019)" and then comma and end sentence. Something is lacking. Replace by:
The clear sky (CSL) threshold method, as outlined by Shields et al. (2009), uses spectral information—particularly from the red and blue bands—to differentiate between cloudy and clear-sky conditions. This technique has been widely adopted by researchers (e.g., Chauvin et al., 2015; Chow et al., 2011; Ghonima et al., 2012; Kuhn et al., 2018; Lothon et al., 2019). However, a notable limitation is that the threshold value can vary across an image, influenced by the relative distance between the sun and each image pixel.
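As a concrete illustration of the red/blue threshold idea described in the suggested text above, here is a minimal sketch; the plain R/B ratio and the 0.75 threshold are illustrative assumptions, not the values used by Shields et al. (2009), and the fixed threshold is exactly what drifts with the sun-to-pixel distance.

```python
# Illustrative sketch only: fixed red/blue ratio threshold for cloud pixels.
# Clouds scatter red and blue almost equally (ratio near 1), while clear sky
# is strongly blue (ratio well below 1). The threshold value is an assumption.
import numpy as np

def red_blue_cloud_mask(rgb_image, threshold=0.75):
    """Return a boolean cloud mask for an (H, W, 3) RGB sky image."""
    rgb = rgb_image.astype(np.float32)
    ratio = rgb[..., 0] / np.clip(rgb[..., 2], 1.0, None)  # R / B
    return ratio > threshold

# Cloud fraction then follows as the mean of the mask over the valid sky region.
```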
- line 40 : use citet or citealt and not citep for in-sentence citation.
- line 46 and so on : same citation issue
- line 57 : distinguish clouds from clear sky

Section 2
- line 78 : reformulate better "All these sites have the common make of instrument - a Total Sky Imager (TSI) that takes the sky images."
- line 81 : syntax + repetition: "A shadow band is also placed that continuously rotates as it tracks the sun. This shadow band blocks the intense direct sun that can saturate the images."

Section 3.1
- line 108 : "this would be essential" and not instrumental
- line 110 : validation of the model's training.

Section 3.2
- MATLAB image labeller app: add a footnote link or citation

Section 4
- line 130 + 131 : replace no. with the number of
- line 131 : did not explain what Mfeat is
- line 137 : reformulate like for example:
"The primary limitation of Random Forest (RF) models lies in their interpretability. Because RF relies on an ensemble of numerous decision trees to generate predictions, it can be challenging to trace and explain the rationale behind a specific prediction."Section 5

Section 5
- line 161 : no comma before "were used to train a random forest classifier"
- line 162 : no need for (n_estimators=100) and (random_state=42)
- line 165 : let's call it the "test" or "validation" dataset ?
- line 170 : "Table 1 shows the metrics for each dataset location"
- line 175 : "Overall, ..."

Section 5.1
- Fig 2 : the last two sentences need to be moved into the main text after the Figure reference, or discarded to avoid repetition in the paper.
- Fig. 2 : To improve clarity, use different symbols for each model, e.g. dots, crosses and triangles. FYI the plots of Fig. 2 do not appear in color on the prepublished version.
- Fig 3 : line 205 : "first column are" (English grammar)
- Fig 3 : "nicely" is not to be used in the text; use a more objective synonym
- line 215 : do you have a citation that studies this phenomenon?
- line 224 : the horizontal axis, instead of "x axis"; idem for vertical axis; January to December in full words
- line 229 : "Germany and Canada" datasets, to precise
- Fig 4 : reformulate better like:
"Figure 4: Median Cloud Fraction (CF) heatmaps for four regions—Australia, Germany, Canada, and India—comparing CF estimates from TSI data, RF classifier output, and their percentage difference. The horizontal axis denotes the months (January to December), and the vertical axis indicates the local time of day (06:00–18:00). Distinct regional patterns emerge: TSI tends to overestimate CF in Australia (January–June) and in Germany and Canada, while underestimating CF in India."Conclusion
- line 274 : "it has numerous..." and not the data as we are talking about the CFCitation: https://doi.org/10.5194/egusphere-2024-3364-RC1 -
RC2: 'Comment on egusphere-2024-3364', Anonymous Referee #1, 16 Jun 2025
This paper applies an RF classifier to ground-based sky images from several locations for estimating cloud fraction. The authors prepare annotated datasets, train site-specific and merged RF models, and compare the model's CF output against TSI results. While the manuscript is well organised and the topic is of interest to the atmospheric observation and machine learning communities, I find the work lacking in methodological novelty, analysis depth, and evaluation rigor. The comments below aim to guide the authors toward a significantly strengthened version of this manuscript.
1. The model is explicitly trained at the pixel level (as shown in Fig. 3), yet the evaluation is based solely on cloud fraction (CF), a scene-level aggregate statistic. This disconnect is concerning. If the model is trained to perform per-pixel segmentation, why are there no pixel-wise metrics (e.g., accuracy, F1, precision/recall, IoU) reported? This omission makes it difficult to assess how well the model actually distinguishes cloud vs. sky on a per-pixel basis, and not just whether it approximates CF correctly. Especially given that annotated segmentation masks are available, this should be straightforward to add.
2. Though the authors mention performance degradation due to sun glare and cirrus clouds, this is illustrated only through a few hand-picked examples. There is no quantification of how prevalent these issues are in the dataset, nor an analysis of how performance varies across such conditions. Similarly, no confusion matrix or class-specific breakdown is presented to identify key failure modes. A more systematic error analysis would strengthen this part.
3. The authors highlight that RF is computationally efficient, but there is no measurement of runtime, memory usage, or inference speed. Even a simple runtime comparison on a CPU vs. a lightweight CNN would be informative.
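To make points 1-3 concrete, here is a minimal, self-contained sketch of the kind of evaluation being requested, assuming scikit-learn; the per-pixel features, labels and data sizes are synthetic placeholders rather than the authors' annotated masks.

```python
# Minimal sketch of the evaluation requested in points 1-3 above (assumptions:
# scikit-learn RF, synthetic per-pixel features and labels in place of the
# authors' annotated masks).
import time
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, confusion_matrix, jaccard_score,
                             precision_recall_fscore_support)

rng = np.random.default_rng(42)
X = rng.integers(0, 256, size=(50_000, 3)).astype(np.float32)  # per-pixel RGB
y = rng.integers(0, 2, size=50_000)                            # 1 = cloud
X_train, X_test = X[:40_000], X[40_000:]
y_train, y_test = y[:40_000], y[40_000:]

clf = RandomForestClassifier(n_estimators=100, random_state=42, n_jobs=-1)
clf.fit(X_train, y_train)

# Point 1: pixel-wise metrics rather than only scene-level cloud fraction.
y_pred = clf.predict(X_test)
prec, rec, f1, _ = precision_recall_fscore_support(y_test, y_pred, average="binary")
print(f"accuracy={accuracy_score(y_test, y_pred):.3f} precision={prec:.3f} "
      f"recall={rec:.3f} F1={f1:.3f} IoU={jaccard_score(y_test, y_pred):.3f}")

# Point 2: confusion matrix to expose class-specific failure modes
# (e.g. clear-sky pixels near the sun classified as cloud).
print("confusion matrix [[TN, FP], [FN, TP]]:")
print(confusion_matrix(y_test, y_pred, labels=[0, 1]))

# Point 3: simple CPU inference timing.
t0 = time.perf_counter()
clf.predict(X_test)
dt = time.perf_counter() - t0
print(f"inference: {dt:.3f} s for {len(X_test)} pixels ({len(X_test) / dt:.0f} px/s)")
```

Even this crude timing yields a pixels-per-second figure that could be contrasted against a lightweight CNN on the same CPU.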
Other minor comments:
Line 30: “satellite-based imagers have lower temporal resolutions”. The authors ignore the fact that geostationary satellites provide very high temporal resolution (10-minute or better) imagery. This should be acknowledged to give a more balanced view.
Line 95: “images captured during rain were also removed”. Please clarify how rain-contaminated images were identified. Was this done manually or through an automated threshold/filter?
Line 123: “Random Forest” -> Should be abbreviated as RF.
Line 137: The sentence stating that RF models are “difficult to interpret” is vague. Please be specific: are the authors referring to the difficulty of tracing individual pixel classifications back to specific trees or features? If so, mention this explicitly.
Line 159 (Figure 1 caption): The caption is too terse. I would expect a more informative caption that explains the key steps in the algorithm flowchart.
Citation: https://doi.org/10.5194/egusphere-2024-3364-RC2
Viewed
- HTML: 113
- PDF: 23
- XML: 7
- Total: 143
- BibTeX: 3
- EndNote: 3