the Creative Commons Attribution 4.0 License.
A Self-Supervised Precipitation Forecast Verification Based on Contrastive Learning
Abstract. Accurate precipitation forecast verification (PFV) is essential for improving forecasting models and supporting disaster management. However, current PFV methods remain limited: point-to-point methods are overly sensitive to minor errors, while spatial verification methods commonly require extensive setting of parameters and rules, which constrains their applicability. To tackle these issues, and inspired by the success of deep learning in image verification through the extraction of high-level features, we propose a self-supervised contrastive learning-based PFV method (CLPFV). First, CLPFV uses precipitation augmentations (displacement, intensity, area size) to simulate actual forecast errors and construct positive and negative training sample pairs. Subsequently, with a novel loss function that penalizes forecast errors proportionally, a backbone network is trained in CLPFV to extract high-level precipitation features. Finally, the cosine similarity of the features is calculated as CLPFV’s verification score. Experiments demonstrate that CLPFV outperforms traditional (POD, FAR, TS) and spatial (FSS, SAL) verification methods across different degrees of forecast error and aligns better with expert assessments. In general, CLPFV offers an efficient deep learning solution for PFV tasks.
Status: final response (author comments only)
- RC1: 'Comment on egusphere-2025-5746', Anonymous Referee #1, 15 Mar 2026
- AC1: 'Reply on RC1', Yanwen Wang, 15 May 2026
The supplement PDF presents our replies more clearly; please refer to it.
This manuscript proposes a novel precipitation forecast verification (PFV) method, CLPFV, based on self-supervised contrastive learning. The study addresses a meaningful methodological issue, namely, how to develop a comprehensive verification method that is more tolerant of minor forecast errors, more sensitive to substantial errors, and better able to reflect different degrees of error. The basic idea of using data augmentations (displacement, intensity, and area size), together with an improved contrastive loss function, to train a neural network to learn the gradient of forecast errors is both scientifically sound and methodologically elegant. Overall, I find this manuscript valuable and potentially suitable for publication in GMD after minor revision.
Reply: Thank you very much for your comments; we are glad to receive your positive feedback on our work. We have carefully addressed the concerns raised in your review and revised the manuscript accordingly. Our responses are given in green, and revisions in the manuscript are highlighted in blue.
Major Comments
- In the Introduction, there is a logical gap between the discussion of the limitations of spatial methods and the introduction of deep-learning-based image verification. Please explain more explicitly how the extraction of “high-level abstract features” directly helps address the spatial “double penalty” issue.
Reply: Thank you very much for pointing out this logical gap between the limitations of spatial verification methods and deep-learning-based verification. We have revised the start of the paragraph that introduces deep-learning-based verification, in lines 66-72: The limitation of existing spatial verification methods essentially stems from their reliance on predefined parameters and rules, which prevents them from truly capturing the spatial distributions of observed and forecast precipitation fields. Consequently, conducting PFV from an overall structural perspective promises more reliable results. Inspired by this, we propose a deep-learning-based PFV method that evaluates forecast performance by comparing the high-dimensional features of observed and forecast precipitation. This approach leverages the exceptional capabilities of deep learning in simulating human cognitive processes and extracting complex features, as well as its remarkable success in image verification in recent years.
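The “double penalty” issue raised in this comment can be illustrated with a small sketch. It assumes a hypothetical 20x20 grid with a single square rain cell and a 1 mm threshold (neither is taken from the manuscript) and uses the standard contingency-table definitions of POD, FAR, and TS. A forecast with the correct shape and intensity, but displaced by a few grid cells, is counted once as misses and once again as false alarms.

```python
import numpy as np

def contingency_scores(obs, fcst, threshold=1.0):
    """Return (POD, FAR, TS) from the standard 2x2 contingency table.

    Assumes at least one observed and one forecast rain cell,
    so that no denominator is zero.
    """
    o = obs >= threshold
    f = fcst >= threshold
    hits = np.sum(o & f)
    misses = np.sum(o & ~f)
    false_alarms = np.sum(~o & f)
    pod = hits / (hits + misses)
    far = false_alarms / (hits + false_alarms)
    ts = hits / (hits + misses + false_alarms)
    return pod, far, ts

# Hypothetical observation: one 4x4 rain cell of 5 mm on a 20x20 grid.
obs = np.zeros((20, 20))
obs[8:12, 8:12] = 5.0

# Forecast with the right shape and intensity, displaced 4 cells east.
fcst = np.zeros_like(obs)
fcst[8:12, 12:16] = 5.0

perfect = contingency_scores(obs, obs)   # identical fields
shifted = contingency_scores(obs, fcst)  # displaced forecast
```

In this toy case the displaced forecast has no overlap with the observation, so the point-to-point scores cannot distinguish it from a forecast that predicted rain in a completely unrelated place.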
- I suggest adding a short subsection, for example, “2.1 Basic Idea,” to explicitly present the core logic behind the proposed solution to the scientific gap. Part of the second-to-last paragraph of the Introduction already seems to contain this basic idea.
Reply: Thank you for this insightful suggestion; a short introduction of the basic idea helps readers better understand the proposed verification method. We therefore added a paragraph at the beginning of Section 2 (Methodology), in lines 88-95: In this section, we present the proposed verification method, named CLPFV, in detail. The core idea of CLPFV is to shift verification from grid matching to a comparison of overall high-level structural similarity through self-supervised contrastive learning. More specifically, we first use multi-dimensional precipitation augmentations in CLPFV to create intrinsic supervisory signals from unlabeled data, addressing the scarcity of labeled samples. Subsequently, we design an improved contrastive loss function that applies proportional penalties to forecast errors when extracting high-level precipitation features, thereby reasonably reflecting the gradient of errors. Finally, the result is calculated directly by comparing the high-dimensional features of observed and forecast precipitation, achieving verification from an overall structural perspective.
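The final step described here, scoring a forecast by the cosine similarity of features, can be sketched as follows. This is only an illustration: the feature vectors are hypothetical hand-written stand-ins for the outputs of a trained backbone, and the function name `cosine_score` is not from the manuscript.

```python
import numpy as np

def cosine_score(feat_obs, feat_fcst, eps=1e-12):
    """Cosine similarity between two 1-D feature vectors, in [-1, 1]."""
    a = np.asarray(feat_obs, dtype=float)
    b = np.asarray(feat_fcst, dtype=float)
    # eps guards against division by zero for all-zero feature vectors.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

# Hypothetical feature vectors standing in for encoder outputs.
f_obs = np.array([0.2, 0.9, 0.1, 0.4])
f_good = np.array([0.25, 0.85, 0.1, 0.45])  # close to the observation
f_poor = np.array([0.9, 0.1, 0.6, 0.0])     # structurally different

score_good = cosine_score(f_obs, f_good)
score_poor = cosine_score(f_obs, f_poor)
```

Because the comparison is made in feature space rather than cell by cell, the score depends on whatever structure the encoder has learned to represent, which is what allows a small displacement to cost less than a large one.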
- In Section 2, the conceptual framework is somewhat mixed with specific technical implementations (e.g., ResNet-18). In my view, the proposed verification framework does not strictly depend on ResNet-18 or InfoNCE. A brief discussion of the portability of this framework in the Discussion section, such as its applicability to other spatial modeling tasks, would further strengthen the methodological contribution of the paper.
Reply: We fully agree with this comment, which highlights the contribution of CLPFV. As the reviewer suggested, we added a discussion to Section 4 (Conclusion), in lines 417-421: Notably, CLPFV also serves as a feasible framework for evaluating predictions of other meteorological variables or environmental phenomena. Depending on the specific requirements of the forecasting or prediction task, e.g., PM2.5 forecasting or soil mapping, the elements of CLPFV (such as the ResNet-18 deep learning model, the InfoNCE loss function, and the cosine similarity calculation) can be replaced to suit various architectures. This flexibility underscores CLPFV’s methodological contribution as a generalizable paradigm of forecast verification in the broader earth science field.
- The simplification of forecast errors into displacement, intensity, and area size is reasonable and useful. However, “area size” may not fully capture all structural errors in real precipitation forecasts. A brief acknowledgement of this limitation would improve the manuscript.
Reply: We appreciate the careful consideration behind this recommendation. In the paragraph that introduces displacement, intensity, and area size, we added a brief acknowledgement of this limitation and explained why we perform precipitation augmentation along these three aspects, in lines 111-113: While these three aspects fall short of providing a comprehensive depiction of the spatial morphology and structure of precipitation, they offer a straightforward and quantitative assessment of forecast errors and are commonly used in PFV studies (Ebert et al., 2009).
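The three augmentation types discussed here can be sketched on a gridded field as below. This is a minimal illustration under assumptions, not the manuscript's implementation: displacement is a zero-filled shift, intensity is a multiplicative factor, and area size is a nearest-neighbour rescaling about the grid centre. All function names and the toy field are hypothetical.

```python
import numpy as np

def displace(field, dy, dx):
    """Shift by (dy, dx) cells, zero-filling vacated cells (|shift| < grid size)."""
    out = np.zeros_like(field)
    h, w = field.shape
    ys, yd = (slice(0, h - dy), slice(dy, h)) if dy >= 0 else (slice(-dy, h), slice(0, h + dy))
    xs, xd = (slice(0, w - dx), slice(dx, w)) if dx >= 0 else (slice(-dx, w), slice(0, w + dx))
    out[yd, xd] = field[ys, xs]
    return out

def scale_intensity(field, factor):
    """Multiply rain amounts by a constant factor."""
    return field * factor

def scale_area(field, factor):
    """Enlarge (factor > 1) or shrink (factor < 1) the pattern about the grid centre."""
    h, w = field.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    yy, xx = np.indices((h, w))
    # Map each output cell back to its nearest source cell.
    sy = np.rint(cy + (yy - cy) / factor).astype(int)
    sx = np.rint(cx + (xx - cx) / factor).astype(int)
    inside = (sy >= 0) & (sy < h) & (sx >= 0) & (sx < w)
    out = np.zeros_like(field)
    out[inside] = field[sy[inside], sx[inside]]
    return out

# Toy observation: a 5x5 rain cell of 5 mm centred on a 21x21 grid.
obs = np.zeros((21, 21))
obs[8:13, 8:13] = 5.0

shifted = displace(obs, 0, 3)          # 3 cells to the east
stronger = scale_intensity(obs, 1.5)   # 50 % more intense
larger = scale_area(obs, 2.0)          # rain area enlarged
smaller = scale_area(obs, 0.5)         # rain area reduced
```

Each function takes a single magnitude-like argument, which is what makes it possible to generate sample pairs with a known, graded amount of error.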
- The rationale for the quadratic penalty in the improved loss function could be explained more clearly. The current explanation is understandable but somewhat brief. One or two additional sentences on why a quadratic penalty is appropriate here would make the design more transparent.
Reply: Thank you for pointing out this issue. We added a brief explanation of the quadratic penalty in lines 194-197: Specifically, the quadratic function provides a smooth and differentiable penalty for model training and feature extraction. It ensures that the forecast verification remains tolerant of minor errors while applying increasingly strict penalties to larger errors, effectively distinguishing different tiers of forecast quality.
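The tolerance argument can be made concrete with a short calculation. As an illustrative simplification, consider a single augmentation magnitude $m$ and assume a penalty of the form $p(m) = \lambda m^{2}$ with $\lambda > 0$; the penalty in the manuscript combines several augmentation types, so this is only the one-dimensional case.

```latex
\[
p(m) = \lambda m^{2}, \qquad
\frac{\mathrm{d}p}{\mathrm{d}m} = 2\lambda m, \qquad
\frac{p(2m)}{p(m)} = 4 .
\]
```

The slope $2\lambda m$ vanishes as $m \to 0$, so small errors change the penalty very little, while doubling the error quadruples the penalty. A linear penalty $\lambda \lvert m \rvert$, by comparison, would have a constant slope and a kink at $m = 0$, so it would neither favour small errors in this way nor be differentiable there.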
Minor Comments
- Several acronyms are used in the Abstract (POD, FAR, TS, FSS, SAL) without prior definition. Please ensure that all abbreviations are spelled out at their first occurrence.
Reply: Since it is not necessary to name the specific verification methods used for comparison in the Abstract, we deleted the abbreviations POD, FAR, TS, FSS, and SAL there. We then checked the manuscript thoroughly and ensured that all abbreviations are spelled out at their first occurrence.
- RC2: 'Comment on egusphere-2025-5746', Anonymous Referee #2, 22 Mar 2026
This research proposes a self-supervised contrastive learning-based precipitation forecast verification method. Through contrastive learning, it can effectively reflect the degree of forecast errors, thereby successfully addressing the “double penalty” issue of traditional point-to-point verification methods and the reliance on manually set parameters in spatial verification methods. Overall, the research holds considerable value and is recommended for publication after minor revisions.
Here are my suggestions and comments:
- In lines 65-66, the sentence “Therefore, a more comprehensive…still needed” is too long to read easily; please revise it for clarity.
- In the paragraph in lines 70–75, the rationale for using deep learning to verify precipitation forecasts is not convincing enough. Please provide a more detailed explanation.
- In lines 180-183, the punishment function is not correct: first, it should be p(f-); second, a specific equation should be provided.
- In lines 209-210, the concept of “alternative sampling” is not clear; please specify the steps of sample splitting in detail.
- In figure 7, to improve its self-explanatory ability, it is recommended that the specific deviation values used be labeled directly below each subplot (e.g., “Displacement: -10, -5, 0, +5, +10” under subplot (a)), rather than only described in the caption.
- In the discussion of figure 9, is it possible to provide a benchmark to compare all verification methods? This will make the experimental results more intuitive.
- The content in lines 296–335 may be appropriately simplified to prevent redundancy with the information presented in figure 9.
Citation: https://doi.org/10.5194/egusphere-2025-5746-RC2
- AC2: 'Reply on RC2', Yanwen Wang, 15 May 2026
The attached PDF presents our replies more clearly; please refer to it.
Reviewer 2
This research proposes a self-supervised contrastive learning-based precipitation forecast verification method. Through contrastive learning, it can effectively reflect the degree of forecast errors, thereby successfully addressing the “double penalty” issue of traditional point-to-point verification methods and the reliance on manually set parameters in spatial verification methods. Overall, the research holds considerable value and is recommended for publication after minor revisions.
Reply: Thank you very much for your thorough review of this manuscript. Your comments and suggestions are very helpful, and we have carefully revised the manuscript to address your concerns. Responses are given in green, and revisions in the revised manuscript are highlighted in blue.
Comments
- In lines 65-66, the sentence “Therefore, a more comprehensive…still needed” is too long to read easily; please revise it for clarity.
Reply: We completely agree with this comment. The sentence has been revised in lines 64-65: Therefore, a more comprehensive verification method is needed to tolerate minor errors yet penalize significant ones, thereby better reflecting the varying degrees of error.
- In the paragraph in lines 70–75, the rationale for using deep learning to verify precipitation forecasts is not convincing enough. Please provide a more detailed explanation.
Reply: Thank you for this thoughtful comment. We added an elaboration at the start of this paragraph, in lines 66-72: The limitation of existing spatial verification methods essentially stems from their reliance on predefined parameters and rules, which prevents them from truly capturing the spatial distributions of observed and forecast precipitation fields. Consequently, conducting PFV from an overall structural perspective promises more reliable results. Inspired by this, we propose a deep-learning-based PFV method that evaluates forecast performance by comparing the high-dimensional features of observed and forecast precipitation. This approach leverages the exceptional capabilities of deep learning in simulating human cognitive processes and extracting complex features, as well as its remarkable success in image verification in recent years.
- In lines 180-183, the punishment function is not correct: first, it should be p(f-); second, a specific equation should be provided.
Reply: Thank you for pointing out this issue. Since the quadratic penalty added to the CLPFV loss function is meant to reflect different degrees of precipitation augmentation (f+), the punishment function should be p(f+). We agree, however, that a specific equation is needed. We therefore added Eq. (9) and an explanatory sentence in lines 190-191: Eq. (9) gives the specific form of the punishment function p(f+), which combines the squared magnitudes m of all kinds of augmentation with a penalty coefficient λ.
(9)
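Since the body of Eq. (9) is not reproduced in this text, the following is only a sketch of the verbal description above. It assumes the penalty is the coefficient λ times the sum of the squared magnitudes of the applied augmentations; the exact form in the manuscript may differ. The function name, the dictionary keys, the normalisation of magnitudes, and the default value of `lam` are all hypothetical.

```python
def penalty(magnitudes, lam=0.5):
    """Quadratic penalty: lam * sum of squared augmentation magnitudes.

    `magnitudes` maps an augmentation type to its (normalised) magnitude.
    The keys and the default `lam` are illustrative assumptions.
    """
    return lam * sum(m * m for m in magnitudes.values())

# A lightly augmented sample versus a heavily augmented one.
minor = penalty({"displacement": 0.1, "intensity": 0.1, "area": 0.0})
major = penalty({"displacement": 0.8, "intensity": 0.5, "area": 0.4})
```

Under these assumptions the heavily augmented sample receives a penalty roughly fifty times that of the lightly augmented one, although its individual magnitudes are at most eight times larger.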
- In figure 7, to improve its self-explanatory ability, it is recommended that the specific deviation values used be labeled directly below each subplot (e.g., “Displacement: -10, -5, 0, +5, +10” under subplot (a)), rather than only described in the caption.
Reply: We have modified Figure 7 and added the specific deviation values directly under each subplot.
Figure 7. An example of simulated forecasted precipitations by applying gradient biases.
- In the discussion of figure 9, is it possible to provide a benchmark to compare all verification methods? This will make the experimental results more intuitive.
Reply: This suggestion is helpful, since a benchmark allows Fig. 9 to provide a clearer comparison of the different PFV methods. We therefore introduced a downward-opening parabola in Fig. 9 as a benchmark reference curve for the PFV experimental results. This parabola was selected because its shape characterizes the expected behavior of an ideal PFV method: it assigns the highest verification score in the absence of errors; permits a gradual decline for minor errors, ensuring fault tolerance and avoiding the “double penalty” problem; and enforces an accelerated descent for larger errors, significantly penalizing severe forecast failures.
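A reference curve of this kind can be sketched as follows. The normalisation is an assumption and is not taken from the manuscript: the score is 1 at zero bias and falls to 0 at a hypothetical maximum bias `max_bias`. The bias values reuse the example given in the referee's comment on Figure 7.

```python
import numpy as np

def benchmark_parabola(bias, max_bias):
    """Downward-opening parabola: 1 at zero bias, 0 at |bias| == max_bias."""
    bias = np.asarray(bias, dtype=float)
    return 1.0 - (bias / max_bias) ** 2

biases = np.array([-10, -5, 0, 5, 10])  # e.g. displacement in grid cells
reference = benchmark_parabola(biases, max_bias=10)
```

With these values, the first five cells of bias cost a quarter of the score and the next five cost the remaining three quarters, which is the gradual-then-accelerated decline described in the reply.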
The figure below is the revised Fig.9.
Figure 9. Results of displacement biases (top), intensity biases (middle), area size biases (bottom) experiments. The experimental results of each verification method are presented as average verification score curves with their 95% confidence intervals.
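Averaged score curves with 95% confidence intervals, as described in this caption, could be computed along the following lines. This is a sketch under assumptions: the scores are held in a hypothetical array of shape (n_cases, n_bias_levels), the data are synthetic, and the interval uses the normal approximation (mean ± 1.96 standard errors). The manuscript may use a different estimator.

```python
import numpy as np

def mean_with_ci(scores, z=1.96):
    """Per-column mean with a normal-approximation confidence interval."""
    scores = np.asarray(scores, dtype=float)
    n = scores.shape[0]
    mean = scores.mean(axis=0)
    sem = scores.std(axis=0, ddof=1) / np.sqrt(n)  # standard error of the mean
    return mean, mean - z * sem, mean + z * sem

rng = np.random.default_rng(0)
# Synthetic scores for 200 cases at 5 bias levels (illustration only).
true_curve = np.array([0.2, 0.8, 1.0, 0.8, 0.2])
scores = true_curve + rng.normal(0.0, 0.05, size=(200, 5))

mean, lower, upper = mean_with_ci(scores)
```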
- The content in lines 296–335 may be appropriately simplified to prevent redundancy with the information presented in figure 9.
Reply: We completely agree with this comment. We have condensed the paragraphs in lines 311-336, reducing the word count from 660 to 426 without changing the content.