the Creative Commons Attribution 4.0 License.
Machine-learning-based approach for solar radiation model uncertainty identification, attribution and bias correction
Abstract. Accurate surface solar irradiance (SSI) is essential for climate monitoring and solar energy applications, yet operational radiation data products such as the Copernicus Atmosphere Monitoring Service (CAMS) solar radiation service (CRS) still exhibit systematic and situation-dependent errors. These arise from uncertainties in clouds, aerosols and surface properties as well as from the area-time mismatch between ground observations and pixel- or model-gridbox-averaged properties. In this study, we develop a data-driven XGBoost-based uncertainty model that predicts the instantaneous CRS irradiance uncertainty for global horizontal (GHI), diffuse horizontal (DHI) and beam normal irradiance (BNI) using only the operational CRS inputs. We apply SHapley Additive exPlanations (SHAP) to quantify the contribution of individual physical predictors and to diagnose how the observed deviations relate to the CRS input parameters. Across all components, cloud optical depth is identified as the primary driver of CRS irradiance uncertainty. Aerosol optical depths of different aerosol components and surface reflectance (albedo and BRDF parameters) have additional component-dependent influences, particularly for DHI and BNI. The SHAP analysis also reveals a solar zenith angle (SZA) dependence, with contributions increasing at high SZA. A single-case analysis under overcast conditions demonstrates how SHAP can attribute individual large errors to specific cloud, aerosol and surface processes. Finally, we apply the trained situation-dependent model as a post-processing bias correction to the CRS irradiances. The bias correction reduces the median bias from 5.0 to −0.6 W m⁻² for GHI and from 11.1 to 1.0 W m⁻² for DHI. It also improves the root-mean-squared errors and correlation coefficients for all three components (GHI, DHI and BNI).
The results demonstrate that physically interpretable machine-learning methods can both identify the dominant irradiance deviations based on operationally available CRS input parameters and provide an effective path for a post-processed bias correction.
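The post-processing step described in the abstract can be illustrated with a minimal sketch. It assumes the residual is defined as observation minus CRS estimate, so that a corrected value is the CRS estimate plus the predicted residual, and that bias is reported as estimate minus observation. The function names and all numbers below are hypothetical illustrations, not the authors' code or data; the predicted residuals stand in for the output of a trained model.

```python
import math
import statistics

def bias_correct(crs, predicted_residual):
    """Add the predicted residual (assumed: observation minus CRS) to each CRS value."""
    return [c + r for c, r in zip(crs, predicted_residual)]

def median_bias(estimate, observed):
    """Median of estimate minus observation."""
    return statistics.median(e - o for e, o in zip(estimate, observed))

def rmse(estimate, observed):
    """Root-mean-squared error of the estimate against the observations."""
    squared = sum((e - o) ** 2 for e, o in zip(estimate, observed))
    return math.sqrt(squared / len(observed))

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up irradiances in W m^-2, chosen only to exercise the functions.
observed = [100.0, 250.0, 400.0, 550.0, 700.0]
crs = [112.0, 258.0, 415.0, 548.0, 716.0]
# Stand-in for the residuals a trained error model would predict.
predicted_residual = [-10.0, -6.0, -12.0, 1.0, -13.0]

corrected = bias_correct(crs, predicted_residual)

for label, estimate in (("CRS", crs), ("corrected", corrected)):
    print(label,
          "median bias:", median_bias(estimate, observed),
          "RMSE:", round(rmse(estimate, observed), 2),
          "r:", round(pearson_r(estimate, observed), 4))
```

Evaluating the same three statistics before and after the correction, on data the model has not seen, is what allows a statement such as "the median bias is reduced" to be checked.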
Status: open (until 22 Jul 2026)
RC1: 'Comment on egusphere-2026-2743', Anonymous Referee #1, 03 Jun 2026
This paper presents an error-correction framework for the CAMS solar radiation service (CRS) using an XGBoost-based regression model, with SHAP used for feature attribution. While the attempt to improve operational irradiance products is relevant, the work faces significant limitations in terms of scientific novelty, rigorous benchmarking, and the conceptual definition of “uncertainty.” The presentation of the methodology in Section 2.3 is unnecessarily opaque, and the scope of the study lacks a broader, more robust framework suitable for a general audience or global applications. Below are the detailed comments:
1. The methodology relies on standard tools (XGBoost + SHAP) applied to existing data, which limits the novelty of the contribution.
2. The findings regarding the drivers of error (e.g., cloud and aerosol influences) are largely intuitive and established in the literature. The paper would benefit from a more critical discussion on why a data-driven approach is necessary here if the drivers are already known, and what specific new insights these models provide beyond confirming known physics.
3. The authors should clarify why they chose to model the error specifically, rather than attempting to predict the irradiance or clear-sky index directly, and whether a direct prediction model was considered or tested as a baseline.
4. There is a major concern regarding the use of the term "uncertainty." In the atmospheric sciences and remote sensing, uncertainty implies a probabilistic, quantified measure (e.g., variance, confidence intervals). This work presents a point-regression model, which does not provide a probabilistic view. The authors should either adopt more precise terminology (e.g., "systematic deviation" or "bias") or incorporate formal uncertainty quantification techniques (e.g., quantile regression, ensemble methods, or evaluating metrics like CRPS, PIT, or reliability diagrams).
5. The study lacks a comparison with alternative methods. To justify the use of XGBoost, it is essential to benchmark against other statistical or simpler ML models (e.g., linear regression, random forests) to demonstrate the specific value added by this approach.
6. The current approach of training a single model for the entire disk is likely inappropriate, as errors in solar radiation retrievals are highly dependent on climate, seasonal regimes, and sky conditions. The authors should consider a foundation model in their future work.
7. Section 2.3 is unnecessarily complex. The algebraic derivation provided in Eq. (2) adds little value and obscures a straightforward procedure. The section could be significantly condensed by simply stating that a regression model was developed on the retrieval error.
8. The paper is saturated with repetitive plots that do not offer proportional informational gains. A more concise synthesis of the findings, perhaps moving some supplemental diagnostic plots to an appendix, would strengthen the narrative and focus the reader's attention on the key results.
Citation: https://doi.org/10.5194/egusphere-2026-2743-RC1
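Comment 4 of RC1 contrasts a point regression with probabilistic techniques such as quantile regression. As a minimal sketch of what evaluating such a model might involve, the block below implements the pinball (quantile) loss and the empirical coverage of a prediction interval. The residuals and the 10 % and 90 % quantile predictions are invented placeholders; nothing here is taken from the manuscript or from any particular library.

```python
def pinball_loss(observed, predicted_quantile, q):
    """Mean pinball (quantile) loss for a predicted q-quantile, with 0 < q < 1."""
    total = 0.0
    for y, p in zip(observed, predicted_quantile):
        diff = y - p
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(observed)

def interval_coverage(observed, lower, upper):
    """Fraction of observations that fall inside [lower, upper]."""
    inside = sum(1 for y, lo, hi in zip(observed, lower, upper) if lo <= y <= hi)
    return inside / len(observed)

# Made-up residuals (W m^-2) and hypothetical 10 % / 90 % quantile predictions.
residuals = [-20.0, -5.0, 0.0, 10.0, 40.0]
q10 = [-25.0, -10.0, -8.0, -2.0, 5.0]
q90 = [-10.0, 5.0, 6.0, 15.0, 30.0]

print("pinball loss (q=0.1):", pinball_loss(residuals, q10, 0.1))
print("pinball loss (q=0.9):", pinball_loss(residuals, q90, 0.9))
print("coverage of 10-90 % interval:", interval_coverage(residuals, q10, q90))
```

For a well-calibrated model, the coverage of the 10 % to 90 % interval should be close to the nominal 80 % on a sufficiently large independent sample; the five values here are far too few to judge calibration and serve only to show the calculation.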
RC2: 'Comment on egusphere-2026-2743', Anonymous Referee #2, 22 Jun 2026
This manuscript presents a machine-learning-based framework for diagnosing and correcting irradiance residuals in the CAMS Solar Radiation Service (CRS) using XGBoost and SHAP-based feature attribution. The study addresses a relevant topic and provides an interesting combination of operational irradiance products, explainable machine learning techniques, and physical interpretation of model residuals. The manuscript is generally very well written and the proposed framework shows promising potential for improving solar radiation products and investigating the factors associated with their errors.
Some aspects of the methodology and interpretation may require further clarification. In particular, the distinction between predictor importance and physical error attribution should be discussed more carefully, as SHAP values quantify associations with the model residuals but do not uniquely identify or describe their underlying physical causes. Additionally, some conclusions regarding aerosol-related error sources appear somewhat stronger than directly supported by the presented analysis. I also believe that the manuscript would benefit from additional discussion of representativeness effects, uncertainties associated with the residual definition, and the robustness of some conclusions.
Furthermore, the presented SHAP framework offers substantial potential beyond the analysis shown here. A station-level and/or seasonal investigation of the SHAP contributions could provide valuable insight into the geographical and temporal variability of the dominant error drivers and would further strengthen the physical interpretation of the presented results. Such analyses may also represent an interesting direction for future work.
I think this is a very interesting work worth being published. Some comments below might be useful for improving the manuscript.
The detailed comments are provided below:
Line 65: "The CRS provides..." Please introduce the full name of CRS at its first occurrence.
Line 67: Please specify the temporal coverage of the CRS dataset used in this study, as well as its temporal and spatial (latitude/longitude) resolution.
Line 73: Please introduce the acronym "AOD" after the first occurrence of "aerosol optical depth".
Line 74: Please introduce the acronyms "TCWV" (total column water vapour) and "TCO or TCO3" (total column ozone).
Line 80: Please provide more details regarding the CRS v4.6 service, including a brief description and appropriate reference(s).
Line 86: Please introduce the acronym "SZA".
Figure 1: It would be useful to provide additional information on the temporal availability or number of observations per station of each network. This could be visualized, for example, using marker size or color shading while retaining the current network classification.
Lines 112-113: The ground-based measurements were averaged within ±15 min intervals centered on the actual pixel observation time. Given the strong temporal variability of solar irradiance, particularly during morning and afternoon periods when SZA changes rapidly, have the authors investigated the sensitivity of the results to shorter averaging windows (e.g. ±5 or ±10 min)? I believe that a brief sensitivity analysis or discussion would strengthen the methodology.
Section 2.3: The theoretical formulation presented in Eqs. (1)–(3) links the residual error to deviations between the ideal and approximate CRS radiative transfer framework. However, the target variable is defined as the difference between ground-based observations and CRS estimates and therefore also contains contributions from measurement uncertainties, retrieval errors, and representativeness errors associated with the comparison between point (or station) observations and grid-scale estimates. For example, under broken-cloud conditions, a point measurement may observe a cloud-free period while the corresponding CRS pixel is partially cloudy (or vice versa), resulting in substantial residuals that are not directly related to deficiencies in the radiative transfer model itself. Please discuss how these additional sources of uncertainty may affect the interpretation of the ML-predicted residuals and the attribution of errors to specific CRS input parameters.
Lines 165-166: The authors state that the XGBoost hyperparameters were "tuned through experimentation". Please provide additional details regarding the hyperparameter optimization procedure, including e.g. the evaluation metric used or the range of tested values.
Section 2.5: Several predictors used in the analysis are expected to be strongly correlated (e.g. Cloud Optical Depth, Cloud Coverage, and Cloud Type, as well as some of the surface reflectance parameters). Could this affect the SHAP-based feature attribution and the interpretation of the feature importance rankings?
Lines 200–201: Since the evaluation of the original CRS products is presented later in the manuscript, consider briefly summarizing the baseline CRS performance earlier in the Results section to provide context for the subsequent machine-learning-based corrections.
Line 209: I think it would be better to introduce the acronym “SZA” in Section 2.1.
Lines 222-224: The authors suggest that the importance of AODAM and AODNI may indicate inaccurate optical properties of ammonium and nitrate aerosols in the CRS model. However, SHAP values primarily quantify the contribution of predictors to the ML-estimated residuals and do not directly identify the underlying physical cause of the error. The observed importance of AODAM and AODNI may also reflect aerosol retrieval uncertainties, interactions with cloud properties, or representativeness errors associated with relatively “polluted” environments.
Lines 259-260: The conclusion that aerosol-related improvements are particularly important under low-SZA conditions should be interpreted with caution. As noted above, the lowest SZA bins contain substantially fewer observations and originate from a limited subset of stations.
Lines 265–266 “The lack of clear SHAP–feature dependencies for aerosols and surface parameters highlights the importance of accurately representing their interactions with cloud fields and scattering geometry.”: The conclusion that improvements in aerosol optical properties would provide substantial benefits for reducing DHI errors is not entirely clear from the presented SHAP analysis. Earlier in the section, the manuscript notes that AODNI and AODAM do not exhibit a consistent SHAP–feature relationship. If the observed importance of these variables primarily reflects interactions with cloud fields and scattering geometry, as suggested in the following sentence, please clarify why deficiencies in the aerosol optical properties themselves are considered the most likely explanation.
Lines 317–319: The importance of the BRDF-related parameters for the BNI residual is somewhat unexpected given the weak direct influence of surface reflectance on the direct beam component. Please further discuss the physical interpretation of this result.
Lines 323–325: Given the extremely low AODs reported for this case (e.g. AODAM = 0.006 and AODSS = 0.006), please discuss the physical significance of the aerosol-related SHAP contributions. In particular, it is unclear whether these aerosol loadings are sufficiently large to produce meaningful radiative effects, or whether the associated uncertainties may be of the same order as the reported values.
Section 3.4: The selected case study corresponds to a heavily overcast scene (Cloud Optical Depth = 77.6, Cloud Coverage = 100%) with very low AODs. While this example effectively demonstrates the dominant role of cloud-related predictors, it provides limited insight into the interpretation of aerosol-related SHAP contributions. Consider including an additional case characterized by relatively high aerosol loading and relatively low cloud influence, which would better illustrate the role of aerosol-related predictors and complement the current cloud-dominated example.
Lines 346–360: While substantial improvements are reported for DHI and BNI, the improvements for GHI appear relatively modest (RMSE reduction from 105.3 to 103.0 W m⁻² and R² increase from 0.873 to 0.879). Is there a physical explanation?
Lines 363–364: The statement that the XGBoost-based error model can "effectively learn and correct systematic deficiencies in the CRS radiation products" appears somewhat strong, since the residuals may also contain contributions from measurement uncertainties and representativeness errors.
Lines 373–374: The statement that the feature importance rankings are robust across solar elevations should be interpreted with caution, particularly for the lowest SZA bins, which contain substantially fewer observations and represent a geographically limited subset of the dataset.
Citation: https://doi.org/10.5194/egusphere-2026-2743-RC2
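The question raised in RC2 on Section 2.5, whether correlated predictors affect SHAP-based rankings, can be illustrated with a toy calculation. The sketch below computes exact Shapley values by brute force against a single all-zero baseline; it is not the SHAP library's tree algorithm and not the authors' setup. The two toy "residual models", their coefficients and the feature scaling are invented for illustration. Both models give the same prediction whenever cloud optical depth and cloud cover move together, but one of them spreads its weight over the two cloud predictors.

```python
from itertools import permutations

def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features outside a coalition are held at their baseline value. The cost
    grows factorially with the number of features, so this suits toys only.
    """
    phi = [0.0] * len(x)
    orders = list(permutations(range(len(x))))
    for order in orders:
        current = list(baseline)
        previous = model(current)
        for i in order:
            current[i] = x[i]            # add feature i to the coalition
            value = model(current)
            phi[i] += value - previous   # marginal contribution of feature i
            previous = value
    return [p / len(orders) for p in phi]

# Hypothetical toy models over (cloud optical depth, cloud cover, AOD),
# each feature scaled to 0..1. Coefficients are invented.
def model_single(features):
    cod, cover, aod = features
    return 40.0 * cod + 8.0 * aod

def model_split(features):
    cod, cover, aod = features
    return 20.0 * cod + 20.0 * cover + 8.0 * aod

x = [1.0, 1.0, 0.5]          # scene where optical depth and cover coincide
baseline = [0.0, 0.0, 0.0]

phi_single = shapley_values(model_single, x, baseline)
phi_split = shapley_values(model_split, x, baseline)
print("single:", phi_single)
print("split: ", phi_split)
```

In this toy case the attribution to cloud optical depth halves when the model shares its weight with cloud cover, even though the predicted residual is unchanged. This suggests that rankings of individual, strongly correlated predictors may be read more safely as grouped contributions (for example all cloud-related predictors together), although whether this applies to the manuscript's model would need to be checked on the actual data.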
Viewed
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 118 | 25 | 15 | 158 | 12 | 10 |