Probing the spuriousness of observed downwelling radiative fluxes over the global Tropical Oceans
Abstract. A statistical evaluation of daily downwelling radiation data from the Global Tropical Moored Buoy Array (GTMBA), collocated with Clouds and the Earth's Radiant Energy System (CERES) satellite data, was conducted from 2000 to 2023. This study addressed systematic biases and spurious data in downwelling shortwave (Qs) and longwave (QL) radiation, which are crucial for understanding the ocean's energy budget and climate system. Two filtering methods were applied to check the spuriousness: a fixed threshold (Qs > 350 W m⁻², QL > 450 W m⁻²) and a dynamic threshold (mean + 2 standard deviations per station). For Qs, both methods had a negligible effect or resulted in a slight increase in bias, with no significant improvement in root mean squared error (RMSE) or correlation across regions. However, for QL, where the buoy observations peak at 420–440 W m⁻² compared with 400–410 W m⁻² for the satellite data, both fixed and dynamic threshold filtering consistently improved correlation, bias, and RMSE. This greater effectiveness for QL is attributed to its strong dependence on atmospheric temperature and humidity profiles, which creates systematic biases that filtering effectively addresses. Overall, threshold filtering proved more effective for QL than for Qs, with fixed methods delivering consistently positive results for QL. The study highlights the need for customised threshold strategies to validate in situ data and advocates for further advancements in satellite retrieval algorithms and data assimilation to enhance the accuracy of radiative flux products and mitigate existing biases.
This manuscript investigates potentially spurious downwelling shortwave (Qs) and longwave (QL) radiation observations from the Global Tropical Moored Buoy Array by comparison with collocated CERES satellite data. The topic is relevant to ocean heat-budget studies and quality control of in situ observations. The analysis covers three major tropical mooring networks and provides potentially useful station-level information. However, the current methodology does not yet fully support the identification of “spurious” buoy observations.
Major comments
1. The identification of spurious observations requires independent evidence.
The manuscript largely treats CERES as a reference and interprets large buoy–satellite discrepancies as spurious buoy measurements. However, disagreement may also arise from CERES retrieval errors or from the representativeness difference between point-scale buoy measurements and gridded satellite products. The authors should validate suspected outliers using additional evidence, such as GTMBA quality-control flags, neighbouring buoy records, instrument metadata, temporal consistency checks, or an independent reanalysis product. The term “potentially spurious observations” would be more appropriate unless the errors can be independently confirmed.
2. The fixed and dynamic thresholds require stronger justification.
The fixed thresholds of 350 W m⁻² for Qs and 450 W m⁻² for QL may remove physically plausible daily values under certain atmospheric conditions. Similarly, the mean + 2 standard deviations approach does not fully account for seasonal cycles, geographic differences, or non-normal distributions. The authors should provide sensitivity tests using alternative thresholds and consider station- and season-specific criteria, robust statistics such as the median absolute deviation, or physically based clear-sky limits.
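To illustrate the kind of robust, station- and season-specific criterion suggested above, a minimal sketch follows. It is not the authors' method. The function names, the (station, month, value) record layout, the multiplier k = 3, and the example values are all assumptions made for illustration.

```python
from statistics import median

def mad_upper_limit(values, k=3.0):
    """Return median + k * scaled MAD for a sequence of finite values.

    k = 3 is an illustrative choice, not a recommended setting.
    """
    values = list(values)
    med = median(values)
    # 1.4826 makes the MAD comparable to a standard deviation for normal data
    scaled_mad = 1.4826 * median(abs(v - med) for v in values)
    return med + k * scaled_mad

def flag_by_group(records, k=3.0):
    """Flag values above a limit computed separately per (station, month).

    records: list of (station, month, value) tuples (hypothetical layout).
    Returns a list of booleans in the same order as records.
    """
    groups = {}
    for station, month, value in records:
        groups.setdefault((station, month), []).append(value)
    limits = {key: mad_upper_limit(vals, k) for key, vals in groups.items()}
    return [value > limits[(station, month)] for station, month, value in records]

# Hypothetical daily QL values (W m^-2) for one station and one month
example = [("A", 1, v) for v in (400, 402, 398, 401, 399, 480)]
flags = flag_by_group(example)
```

Unlike the mean and standard deviation, the median and MAD are barely shifted by the extreme values under test, so the limit does not inflate when a station has many suspect days.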
3. The construction of daily averages and the treatment of missing data need clarification.
The buoy records contain irregular temporal gaps, which may bias daily means. The authors should specify the minimum number of valid hourly observations required to calculate a daily value, explain how missing hours were handled, and ensure that the temporal definitions of the buoy and CERES daily data are fully consistent. The fraction of filtered observations should also be reported for each station and region.
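The requested completeness rule could be stated as simply as in the following sketch. The minimum of 18 valid hours, the function name, and the input layout are hypothetical, and days are defined here by the calendar date of each timestamp, which only matches the CERES daily product if both use the same time convention.

```python
import math
from collections import defaultdict
from datetime import datetime

def daily_means(hourly, min_valid=18):
    """Compute daily means only for days with enough valid hourly values.

    hourly: iterable of (datetime, value); value is None or NaN when missing.
    min_valid: minimum number of valid hours per day (illustrative choice).
    Returns {date: (mean or None, number of valid hours)}.
    """
    by_day = defaultdict(list)
    for timestamp, value in hourly:
        valid = by_day[timestamp.date()]  # registers the day even if value is missing
        if value is not None and not math.isnan(value):
            valid.append(value)
    result = {}
    for day, valid in by_day.items():
        mean = sum(valid) / len(valid) if len(valid) >= min_valid else None
        result[day] = (mean, len(valid))
    return result

# Hypothetical input: one complete day and one day with only 10 valid hours
hourly = [(datetime(2020, 1, 1, h), 100.0) for h in range(24)]
hourly += [(datetime(2020, 1, 2, h), 200.0 if h < 10 else float("nan")) for h in range(24)]
means = daily_means(hourly)
```

For Qs a simple count is probably insufficient, because missing daytime hours bias the daily mean far more than missing night-time hours. A criterion based on daytime coverage may be needed.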
4. Statistical significance and uncertainty should be quantified.
Some reported improvements in correlation, bias, and RMSE are relatively small. The authors should determine whether these changes are statistically meaningful using paired bootstrap confidence intervals or another appropriate method. Sample sizes should be reported for each threshold category, particularly for the extreme-value subsets, because correlations calculated from a small and restricted range of observations may be unstable.
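As an example of the paired bootstrap meant here, the sketch below resamples collocated buoy and satellite pairs and computes the change in RMSE caused by filtering within each resample. The names, the number of resamples, and the synthetic data are assumptions. Because daily fluxes are autocorrelated, resampling individual days likely understates the uncertainty, and a block bootstrap may be more appropriate.

```python
import math
import random

def rmse(pairs):
    """Root mean squared difference of (buoy, satellite) pairs."""
    return math.sqrt(sum((b - s) ** 2 for b, s in pairs) / len(pairs))

def bootstrap_rmse_change(pairs, keep, n_boot=2000, alpha=0.05, seed=0):
    """Percentile confidence interval for RMSE(filtered) - RMSE(unfiltered).

    pairs: list of (buoy, satellite) daily values.
    keep: list of booleans, True where the pair passes the threshold filter.
    The same resampled days are used for both RMSE values (paired design).
    """
    rng = random.Random(seed)
    n = len(pairs)
    changes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        kept = [pairs[i] for i in idx if keep[i]]
        if not kept:  # skip resamples in which every day was filtered out
            continue
        changes.append(rmse(kept) - rmse([pairs[i] for i in idx]))
    if not changes:
        raise ValueError("no resample retained any unfiltered pairs")
    changes.sort()
    lower = changes[int(alpha / 2 * len(changes))]
    upper = changes[min(len(changes) - 1, int((1 - alpha / 2) * len(changes)))]
    return lower, upper

# Synthetic example: 50 well-matched days and 10 days with a 30 W m^-2 offset
pairs = [(400 + i % 5, 401 + i % 5) for i in range(50)] + [(430, 400)] * 10
keep = [True] * 50 + [False] * 10
lower, upper = bootstrap_rmse_change(pairs, keep)
```

An interval that excludes zero would indicate that the RMSE change is distinguishable from sampling noise. The same procedure can be applied to bias and correlation.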
5. The statistical metrics are not consistently described.
The methods section refers to standard deviation, bias, correlation, and RMSE, whereas Equation (1) defines MAE and no explicit equation is provided for bias. The authors should revise the methodology section, define all metrics consistently, and clarify whether improvements refer to absolute bias, signed bias, MAE, or RMSE.
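For clarity, the four metrics could be defined together, as in the sketch below. The sign convention (buoy minus satellite) and the function name are assumptions, and the manuscript should state its own convention explicitly.

```python
import math

def comparison_metrics(buoy, sat):
    """Signed bias, MAE, RMSE, and Pearson correlation for paired series.

    Differences are taken as buoy minus satellite (assumed convention).
    """
    n = len(buoy)
    diffs = [b - s for b, s in zip(buoy, sat)]
    bias = sum(diffs) / n                            # signed mean difference
    mae = sum(abs(d) for d in diffs) / n             # mean absolute error
    rmse = math.sqrt(sum(d * d for d in diffs) / n)  # root mean squared error
    mean_b = sum(buoy) / n
    mean_s = sum(sat) / n
    cov = sum((b - mean_b) * (s - mean_s) for b, s in zip(buoy, sat))
    var_b = sum((b - mean_b) ** 2 for b in buoy)
    var_s = sum((s - mean_s) ** 2 for s in sat)
    r = cov / math.sqrt(var_b * var_s)               # Pearson correlation
    return {"bias": bias, "mae": mae, "rmse": rmse, "r": r}

# Hypothetical values, chosen only to show that the metrics differ
metrics = comparison_metrics([1, 2, 3, 4], [2, 2, 2, 6])
```

In this example the signed bias is -0.5 while the MAE is 1.0, which shows why the manuscript needs to state which quantity an "improvement in bias" refers to.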
6. The scope of the conclusions should be moderated.
The results show that threshold filtering is generally more effective for QL than for Qs. However, the manuscript does not directly quantify the effects of the identified observations on long-term climate trends, SST variability, or ocean heat-budget calculations. These implications should be presented more cautiously unless additional analyses are provided.
Minor comments
The manuscript would benefit from careful English editing and a more concise presentation. Several sections are repetitive. The CERES product name, version, spatial resolution, temporal resolution, and variable definitions should be stated explicitly. Figure labels and legends should also be checked carefully; for example, the legend in Figure 1 appears to contain a typographical error.