the Creative Commons Attribution 4.0 License.
Algorithm for continual monitoring of fog life cycles based on geostationary satellite imagery as a basis for solar energy forecasting
Abstract. Detection and monitoring of fog and low stratus (FLS) is particularly important in the context of photovoltaic (PV) power production because FLS, unlike moving clouds, can persist longer and affect larger regions simultaneously, making regional power grid balancing harder. Although photovoltaic power production is limited to daytime hours, its short-term forecasting (especially during the early hours of the day) for grid operation in systems with high PV penetration benefits from complete knowledge of the FLS life cycle. As this life cycle usually begins at night and ends during the day, day-night consistency in the algorithms used for monitoring FLS is required. This study presents an algorithm for detection of FLS over Europe based on the infrared bands of the SEVIRI (Spinning Enhanced Visible and InfraRed Imager) instrument onboard the Meteosat Second Generation geostationary satellites. As the method operates on the SEVIRI infrared observations only, it is expected to be stationary in time and can thus provide a coherent and detailed view of FLS development over large areas throughout the 24 h day cycle. The algorithm is based on a gradient boosted trees machine learning model that is trained with ground truth observations from Meteorological Aviation Routine Weather Reports (METAR) stations and the SEVIRI observations at bands centered at 8.7, 10.8, 12.0 and 13.4 μm wavelengths. The METAR data used here comprise a total of 2,544,400 datapoints spread over the winters (i.e., 1st of September to 31st of May) of the years 2016–2022 and 356 locations across Europe. Among them, the datapoints corresponding to 276 stations and the winters of 2016–2018 and 2019–2021 (~45 % of all datapoints) were used to train the algorithm. 
The remaining datapoints comprise four independent datasets, which were used to validate the algorithm’s performance and its applicability to time spans and locations in the study area (i.e., Europe) beyond those covered by the training datapoints. Additionally, the algorithm’s accuracy at the locations of METAR stations was compared with that of the established state-of-the-art daytime FLS detection algorithm, the Satellite-based Operational Fog Observation Scheme (SOFOS). Validation of the algorithm against the METAR data showed that the algorithm is well suited for detection of FLS. Specifically, the algorithm is found to detect FLS with probabilities of detection (POD) ranging from 0.70 to 0.82 (for different inter-comparison approaches) and false alarm ratios (FAR) between 0.21 and 0.31. These numbers are very close to those achieved by SOFOS for discriminating FLS from other sky conditions at the tested locations and time spans. These results also showed that the technique’s applicability in the study region extends beyond the particular locations and time spans covered by the datapoints used for training the algorithm.
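As a reading aid for the POD and FAR values quoted above, the following is a minimal sketch of how these two scores are commonly computed from paired binary labels. It assumes the standard contingency-table definitions (POD = hits / (hits + misses), FAR = false alarms / (hits + false alarms)); the abstract does not state the exact formulas used in the study, and the function name `pod_far` and the toy labels are hypothetical.

```python
def pod_far(observed, predicted):
    """Return (POD, FAR) for paired binary FLS labels.

    Assumes the standard contingency-table definitions; this is an
    illustration, not the validation code used in the study.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")

    hits = misses = false_alarms = 0
    for obs, pred in zip(observed, predicted):
        if obs and pred:
            hits += 1            # FLS observed and detected
        elif obs and not pred:
            misses += 1          # FLS observed but not detected
        elif pred and not obs:
            false_alarms += 1    # FLS detected but not observed

    # Guard against empty denominators (no observed or no predicted FLS).
    pod = hits / (hits + misses) if (hits + misses) else float("nan")
    far = false_alarms / (hits + false_alarms) if (hits + false_alarms) else float("nan")
    return pod, far


# Hypothetical toy labels (1 = FLS, 0 = no FLS), not data from the study.
station_obs = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
satellite_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
pod, far = pod_far(station_obs, satellite_pred)
print(f"POD = {pod:.2f}, FAR = {far:.2f}")
```

Note that FAR as defined here is a ratio over predicted FLS cases, so it is sensitive to how rare FLS is in the evaluated dataset.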
Status: final response (author comments only)
RC1: 'Comment on egusphere-2023-2885', Anonymous Referee #1, 05 Jul 2024
Review of Jahani et al. "Algorithm for continual monitoring of fog life cycles based on geostationary satellite imagery as a basis for solar energy forecasting"
The authors try to develop an IR-only fog detection method using the SEVIRI channels at 8.7, 9.7, 10.8, 12, and 13.4 microns. The best combinations of these channels for fog detection are inferred by XGBoost. Before this work, the same group developed another fog detection algorithm, named SOFOS, based on multiple tests (cloud, ice, snow, droplets) using IR-vis channels at 0.6, 0.8, 1.6, 3.9, 8.7, 10.8, and 12 microns from the same satellite instrument. SOFOS is more physically based but requires more channels to operate. They show that the XGBoost-based method has a POD of ~75-80% and an FAR of ~20-30%.
The authors argue that their XGBoost-based method is expandable because while their validation/test datasets contain regimes that are not included in the training dataset, the POD and FAR performances on those regimes remain comparable to the training (except test4 which has fog frequencies < 1%).
Overall the presentation is clear. The motivation of developing a new method of IR-only fog detection is discussed. The steps creating the training dataset, test1, test2, test3, and test4 are outlined. These details make sure their work is reproducible.
My biggest concern is that the FAR is well above 20%. Statistically, a method would be deemed useful if it has an FAR of less than 5%-10%. My concern also applies to their previous method, SOFOS, which has an even greater FAR (as high as 40%). The XGBoost-based method may seem to be better, but note that its POD is 10% lower than that of SOFOS. So, based on the POD and FAR, in my opinion, neither method seems to be practical. A potential problem is that a part of the training dataset has been based on the SOFOS method to create the fog/low stratus labels. Therefore, errors in SOFOS would propagate into the training dataset and eventually upset the training of XGBoost.
There is a lack of physical explanation of why BT12.0, BT8.7 - BT12.0, BT10.8 - BT12.0, and BT12.0 - BT13.4 would have been "chosen" by XGBoost. Their searching process (randomizing the combinations of the channels and finding which minimal set of combinations gives a desirable result) is typical of modern machine learning approaches. But in applied sciences, the interpretation of the results is as important as the method itself.
Most of the discussions of PV in the abstract and the text are irrelevant to the study. At least the discussions of PV in the abstract should be removed. In addition, the term "life-cycle" in the title is misleading because the current manuscript does not study the life cycle of fog/low stratus.
A "train dataset" should be a "training dataset".
Citation: https://doi.org/10.5194/egusphere-2023-2885-RC1
AC1: 'Reply on RC1', Babak Jahani, 12 Dec 2024
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2024/egusphere-2023-2885/egusphere-2023-2885-AC1-supplement.pdf
RC2: 'Comment on egusphere-2023-2885', Anonymous Referee #2, 05 Sep 2024
This paper presents a method for day and night detection of fog and low stratus (FLS). The novelty is to train a machine learning algorithm (XGBoost) using open observations from airport weather stations (METAR) and Meteosat Second Generation IR observations. The general methodology is robust and the results are sufficiently interpreted.
However, the paper positions only "the well-established and well-validated SOFOS", which is a daytime technique, as the state of the art. Yet the SAF NWC products, operational since 2016, provide FLS detection day and night. The Optimal Cloud Analysis also proposes a day-night cloud top height product. We can also cite MSG-CPP from KNMI and APOLLO-NG, recently developed by the DLR. Even if SOFOS is certainly an excellent reference to test this new algorithm, this paper cannot ignore the state of the art. For a complete analysis, if day and night products exist, the authors must compare their algorithm with them, not only with SOFOS, which has been developed within the same team.
Reading this paper was quite difficult. An effort should be made to shorten and simplify the argumentation.
In particular, paragraph 2.4 is extremely long and should be shortened with the help of the very clear Table 1. Moreover, 2.4 should be divided into several subsections (e.g., FLS label characteristics can be separated from dataset building).
Paragraph 2.5 contains some needless repetition, such as the list of channels and channel combinations, which is written twice (and a third time in the conclusion paragraph).
Lines 304 to 320 could be summarized in a table (this is just a suggestion)
Lines 374-380 repeat arguments already in paragraph 2.4
Lines from 385 to 390 are a simple repetition of information visible in figure 2.
Detailed remarks :
Line 85 and 109 : centered instead of "cantered"
Line 114 please give the reference (website) to find METAR data
Line 145 : please specify in which EUMETSAT product you found the elevation above sea level
Line 194 : why only per winter year ?
Line 214 : why did you choose to do a quality check after dividing the datasets ? There is a risk of unbalanced dataset sizes
Line 244 : test3 (0.2%) is certainly test4
Line 260 What is the "previous model" ?
Line 444 ":Although" with lower case after ":"
Citation: https://doi.org/10.5194/egusphere-2023-2885-RC2
AC2: 'Reply on RC2', Babak Jahani, 12 Dec 2024
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2024/egusphere-2023-2885/egusphere-2023-2885-AC2-supplement.pdf