GraphIDW: Incorporating spatial autocorrelation in satellite–gauge precipitation merging using graph neural networks over a tropical region

Peiris, Nadee; Perera, Chamal; Wijayaratna, Nimal; Rajapakse, Lalith; Wijemannage, Ajith

doi:10.5194/egusphere-2025-6551

Preprints

13 Feb 2026

Abstract. Ground-based rain gauges remain the benchmark for accurate precipitation measurement; however, their sparse spatial distribution limits the representation of rainfall heterogeneity. Satellite-based Precipitation Products (SPPs) provide consistent spatial coverage but are often affected by retrieval errors and regional biases, restricting their direct use in local-scale hydrological applications. To overcome these limitations, Precipitation Data Merging (PDM) techniques integrating gauge and satellite observations have gained prominence. This study introduces a novel Machine Learning (ML) framework, GraphIDW, which combines Graph Neural Networks (GNNs) with Inverse Distance Weighting (IDW) interpolation to explicitly incorporate spatial autocorrelation into the merging process, addressing a major limitation of traditional ML-based PDM approaches. The framework was evaluated across the Wet Zone of Sri Lanka from 2001 to 2015 using two state-of-the-art SPPs (IMERG and CHIRPS) together with ground observations. IMERG data (0.1°) were first downscaled to 0.05° using CHIRPS, after which the downscaled product was merged with gauge observations through GraphIDW. A total of 60 gauges (70 %) were used for training and 28 (30 %) for validation. Results show that GraphIDW outperforms conventional ML algorithms, including Random Forest, Artificial Neural Network, Support Vector Regression, and XGBoost. It achieved the highest probability of detection (0.97) and reduced root mean square error (RMSE) and mean absolute error (MAE) by 13 %–41 % and 9 %–36 %, respectively, compared with the original SPPs. The results demonstrate that explicitly accounting for spatial dependence through graph-based learning significantly improves precipitation estimation, particularly in regions characterized by strong spatial heterogeneity. 
By embedding spatial autocorrelation directly into the merging process, GraphIDW provides a robust and computationally efficient framework for generating high-resolution rainfall datasets that are better suited for hydrological analysis in complex climatic and topographic settings.

Received: 30 Dec 2025 – Discussion started: 13 Feb 2026

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: final response (author comments only)

RC1:
'Comment on egusphere-2025-6551', Anonymous Referee #1, 11 Mar 2026
This manuscript presents a novel study applying graph-based machine learning methods to precipitation estimation. The use of Graph Neural Networks (GNNs) is becoming increasingly popular in the Earth sciences, particularly for problems involving non-Euclidean data structures. In this regard, the study addresses an important topic and has the potential to contribute to the growing research exploring graph-based approaches in the spatial mapping of precipitation.
Overall, the paper is well structured and generally easy to follow, with a clear presentation of the study objectives and methodology. However, several major issues related to the methodology, evaluation framework, and clarity of some sections should be addressed before the manuscript can be considered for further review. These comments are outlined in the section below. Additional minor comments, including grammar, typographical corrections, and reference-related issues (like missing references), will be provided in a subsequent review round after the major concerns have been addressed.
Minor Comments
The introduction would benefit from incorporating several important recent studies that have applied innovative deep learning approaches and explicitly accounted for spatial autocorrelation in precipitation estimation frameworks. Including these studies would better position the current work within the existing literature and highlight methodological differences and contributions. Some relevant examples include (there are also several papers missing related to GNN methods and their recent applications):
https://doi.org/10.3390/rs15174160

https://doi.org/10.1016/j.rse.2023.113723

https://doi.org/10.1016/j.atmosres.2022.106159

Although the study primarily uses IMERG V6, the Data Availability section mentions the use of IMERG V7. This inconsistency should be clarified.

It is recommended to include the Kling–Gupta Efficiency (KGE) metric in the evaluation. KGE has become a widely accepted performance metric in hydrological studies because it simultaneously accounts for correlation, bias, and variability, providing a more balanced assessment of model performance.
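As an illustration of the metric suggested above, the following minimal sketch computes KGE in its commonly used form, 1 − sqrt((r − 1)² + (α − 1)² + (β − 1)²), where r is the linear correlation, α the ratio of standard deviations, and β the ratio of means between estimates and observations. The function name `kge` and the sample values are hypothetical and are not taken from the manuscript.

```python
import numpy as np

def kge(sim, obs):
    """Kling-Gupta Efficiency; 1.0 indicates a perfect match."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    r = np.corrcoef(sim, obs)[0, 1]   # linear correlation
    alpha = sim.std() / obs.std()     # variability ratio
    beta = sim.mean() / obs.mean()    # bias ratio
    return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

# Hypothetical daily rainfall at one gauge (mm)
obs = np.array([0.0, 5.0, 12.0, 3.0, 20.0])
kge_perfect = kge(obs, obs)        # identical series
kge_biased = kge(1.5 * obs, obs)   # same pattern, 50 % overestimate
print(kge_perfect, kge_biased)
```

In this toy case the overestimated series keeps a perfect correlation but is penalised through both the bias and variability terms, which is the balance the comment refers to. The sketch does not guard against an all-zero or constant observation series, where the ratios are undefined.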

A more comprehensive statistical analysis of the gauge observations is needed. For example, it would be helpful to present seasonal variability of precipitation, mean precipitation distribution across stations, and the elevation-precipitation correlation.

In the manuscript, it is stated in the table that only monthly CHIRPS data were used. However, the methodology section and Figure 3 indicate that daily CHIRPS data were also used for downscaling. This discrepancy should be clarified.

Additionally, the manuscript should include a comparison of IMERG before and after downscaling. Downscaling should ideally lead to at least some improvement in accuracy; otherwise, simple interpolation techniques such as bilinear interpolation or nearest neighbor resampling might produce similar results.

Latitude and longitude were used as input features in the model. However, these variables are static spatial attributes, while satellite observations are dynamic temporal features. It can be concluded that the location is already encoded in the graph structures through adjacency and edge weight matrices.

Since the grid structure appears to be regular, the edge weights between nodes are likely identical. In such cases, a binary adjacency matrix may be sufficient. The manuscript should clarify whether weighted edges provide additional benefits in this context.
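The point about identical edge weights can be checked with a small sketch. It assumes a 4-neighbour graph on a regular grid with inverse-distance edge weights in degree units; the manuscript's actual graph construction is not described on this page, so the connectivity rule, the weighting, and the helper names `grid_adjacency` and `row_normalise` are all assumptions made for illustration.

```python
import numpy as np

def grid_adjacency(n_rows, n_cols, spacing):
    """Binary and inverse-distance adjacency for a 4-neighbour regular grid."""
    n = n_rows * n_cols
    binary = np.zeros((n, n))
    weighted = np.zeros((n, n))
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            for dr, dc in ((0, 1), (1, 0)):  # right and lower neighbours
                rr, cc = r + dr, c + dc
                if rr < n_rows and cc < n_cols:
                    j = rr * n_cols + cc
                    binary[i, j] = binary[j, i] = 1.0
                    weighted[i, j] = weighted[j, i] = 1.0 / spacing
    return binary, weighted

def row_normalise(a):
    """Scale each row to sum to one, as in mean-style neighbour aggregation."""
    return a / a.sum(axis=1, keepdims=True)

binary, weighted = grid_adjacency(3, 4, spacing=0.05)
same_after_norm = np.allclose(row_normalise(binary), row_normalise(weighted))
print(same_after_norm)
```

Under these assumptions the weighted matrix is a constant multiple of the binary one, so any normalised aggregation gives the same result for both. The equivalence would not hold for 8-neighbour or k-nearest-neighbour graphs, or for geodesic distances that vary with latitude.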

Please clarify which software packages or libraries were used to implement the GraphIDW model and the other machine learning methods. Providing implementation details improves reproducibility.

Major Comments
The post-processing residual correction was applied only to the GraphIDW approach, while the other machine learning methods were evaluated without this correction. This introduces an inconsistency in the comparison. It is recommended to apply the correction method to all machine learning models, which would allow for a more direct (apples-to-apples) comparison between GraphIDW and the traditional ML approaches.

Inverse Distance Weighting (IDW) is a simple yet effective spatial interpolation method and is commonly used as a benchmark method. However, its spatial patterns are often strongly influenced by the uniform weighting scheme and bull’s-eye effect. The manuscript should clearly position IDW as a baseline and discuss its limitations relative to more advanced methods.

The manuscript mentions the Single Mass Curve method, but it is not sufficiently explained. Please provide a brief description of the method and explain how it contributes to assessing the reliability of the products.

It is strongly recommended to include maps showing the spatial distribution of mean precipitation (or representative high-intensity events) across the study region. Such visual comparisons are important because realistic spatial precipitation patterns are a critical indicator of model performance.

The manuscript states that the proposed approach follows the methodology of Baez-Villanueva et al. (2020) and Zhang et al. (2021). However, based on the description provided in Section 3.3, it appears that the implementation corresponds only to the method proposed by Baez-Villanueva et al. (2020). The approach introduced by Zhang et al. (2021) differs slightly. Therefore, the statement that the study follows both approaches may need clarification. Incorporating the method proposed by Zhang et al. (2021) could improve model accuracy. It would therefore be helpful if the authors could clarify this point and explain which method they exactly implemented.

It is recommended to reconsider the inclusion of Figure 13. The comparison of computational speed may not be fully informative without providing details about the computational hardware. For example, methods such as ANN and GNNs can be significantly accelerated when implemented on GPUs, whereas Random Forest (RF) models typically depend heavily on CPU-based multithreading. It is also unclear whether multithreaded training was used for the RF model and how many CPU cores were available. If the authors intend to keep this analysis, it is strongly recommended to report the hardware configuration used for training, including CPU specifications, number of cores, GPU usage (if any), and relevant software settings.
Citation: https://doi.org/10.5194/egusphere-2025-6551-RC1
- AC1: 'Reply on RC1', Nadee Peiris, 17 May 2026
  
  The response to Reviewer 01 is uploaded in the form of a supplement
  
  Citation: https://doi.org/10.5194/egusphere-2025-6551-AC1
RC2:
'Comment on egusphere-2025-6551', Anonymous Referee #2, 15 Apr 2026

This paper compares a few different machine learning (ML) approaches to merging available station information with satellite precipitation estimates to better capture the spatial and temporal patterns of rainfall over a region of Sri Lanka. Overall, I think the paper is well-written and the study is well-done. The findings support that blending the stations using any of the ML-based approaches improves the rainfall estimates when compared to stations that were withheld from the blending routine, and that GraphIDW slightly outperforms the other techniques for this region. With that in mind, I'm suggesting minor revisions to the manuscript, which I think will improve the overall value of the paper.
One thing I think is important when employing machine learning techniques is to justify why these techniques are needed when studying/addressing the issue identified here. I think the paragraph starting at line 51 does a nice job of addressing some of these improvements and different techniques that have been imposed. My guess is that with the relatively dense station network available for this paper’s study region, an intelligent IDW approach might be all that is required to achieve very similar results as the ML techniques used in the paper. Some of this seems to be borne out later in the paper when it is shown that performance is about the same when using only 40% of the available stations compared to 70% that was used in the study.
My biggest criticism of the paper might be in the formulation of the “IMERG-daily,0.05” product put forth in equation 1. If I’m understanding this equation correctly, the fraction on the right side of the equation gives each day’s proportion of the monthly total, calculated independently for every 0.1-degree pixel. This daily proportion is then multiplied by the monthly total at the corresponding CHIRPS 0.05-degree pixel. This results in more of a CHIRPS-daily product than an IMERG-daily product, because the daily values will sum to the CHIRPS monthly total. If there are big differences between IMERG and CHIRPS (imagine a monthly IMERG total of 50 mm while CHIRPS is 250 mm), equation 1 is going to produce a data product that is more like the 250 mm rainfall total.
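The concern can be reproduced numerically. The sketch below follows the referee's reading of equation 1 (daily IMERG fraction of the IMERG monthly total, multiplied by the CHIRPS monthly total); the equation itself is not reproduced on this page, so this is an interpretation rather than the authors' code. The five-day "month" and all variable names are hypothetical, with totals chosen to match the 50 mm and 250 mm example.

```python
import numpy as np

# Hypothetical values for one 0.1-degree IMERG pixel (mm per day)
imerg_daily = np.array([0.0, 10.0, 25.0, 5.0, 10.0])
imerg_monthly = imerg_daily.sum()   # 50 mm
chirps_monthly = 250.0              # mm, one 0.05-degree CHIRPS pixel inside it

# Equation 1 as described by the referee
daily_fraction = imerg_daily / imerg_monthly
downscaled_daily = daily_fraction * chirps_monthly

print(downscaled_daily)
print(downscaled_daily.sum())       # equals the CHIRPS total, not the IMERG total
```

Only the timing of rainfall within the month comes from IMERG here; the magnitude is set entirely by CHIRPS, which is why the result behaves like a CHIRPS-daily product.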
As I was thinking about a better formulation of equation 1, I couldn’t figure out how best to use the higher spatial resolution of CHIRPS combined with the better temporal resolution of IMERG. The thing to do may be to look at the fraction of each 0.05-degree CHIRPS monthly pixel relative to a 0.1-degree version of CHIRPS, and use that proportion to downscale the IMERG monthly totals to the 0.05-degree value, and then use that in the equation 1. While I’m not sure the paper needs to be changed, please address this in the response to the reviewer.
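One possible reading of this alternative is sketched below. It assumes that the "0.1-degree version of CHIRPS" is the plain mean of the 2×2 block of 0.05-degree pixels under each IMERG pixel (area weighting is ignored), and all values and variable names are hypothetical.

```python
import numpy as np

# Hypothetical values for one 0.1-degree IMERG pixel (mm per day)
imerg_daily = np.array([0.0, 10.0, 25.0, 5.0, 10.0])
imerg_monthly = imerg_daily.sum()                        # 50 mm

# CHIRPS monthly totals for the 2x2 block of 0.05-degree pixels (mm)
chirps_monthly_005 = np.array([[200.0, 250.0],
                               [300.0, 250.0]])
chirps_monthly_01 = chirps_monthly_005.mean()            # assumed 0.1-degree CHIRPS
spatial_ratio = chirps_monthly_005 / chirps_monthly_01   # sub-pixel pattern, block mean of 1

# Spread the IMERG monthly total over the block, then split it by day
imerg_monthly_005 = imerg_monthly * spatial_ratio
daily_fraction = imerg_daily / imerg_monthly
imerg_daily_005 = daily_fraction[:, None, None] * imerg_monthly_005  # shape (days, 2, 2)

print(imerg_monthly_005)
print(imerg_daily_005.mean(axis=(1, 2)))   # block mean reproduces the IMERG daily values
```

In this variant CHIRPS contributes only the spatial pattern within each 0.1-degree pixel, while the magnitude stays tied to IMERG. Months in which the CHIRPS or IMERG total is zero would need separate handling to avoid division by zero.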
The other downside of the current formulation is that CHIRPS has station data already included in the product, especially before 2007 (if using CHIRPS version 3) or so. As a result you may be normalizing your rainfall estimate to a product that has station data included, and then using that as the “baseline” product in your analysis. Another alternative would be to use CHIRP (no “S”) which has the same spatial resolution and a very similar mean, but is made only using satellite data and without the addition of stations. However, none of this changes the subsequent analysis or methods, but it might mean that your “IMERG” baseline would have some different values in the skill assessments.
I really like the residual interpolation step that is included in GraphIDW. I wonder how much improved the satellite-based estimate would be with the residual IDW added at the end? Similarly, how would the other ML techniques fare with that extra “corrective” step? Given the slim differences in skill metrics across the different ML techniques, I could imagine the residual IDW improving some of the other approaches such that their skill metrics were better than GraphIDW. I would suggest exploring this, and potentially including it in the results.
Conversely, I think you could look at the estimate from just using the GNN technique, without the IDW, to look at the improvement over the IMERG alone. Then, compare that with the final GraphIDW output, and it would give you an estimate of the improvement of each of these components to the overall estimate. If you wanted, that GNN-only estimate could be compared to the other ML techniques to get an idea for how the IDW might help other estimates as well.
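The residual step discussed in the two paragraphs above is model-agnostic, which is what makes these comparisons feasible. The following minimal sketch applies an IDW-interpolated residual to the output of any estimator. It assumes a power of 2, use of all training gauges, and planar distances in degrees; the manuscript's actual IDW settings are not given on this page, and the function names and sample values are hypothetical.

```python
import numpy as np

def idw(xy_known, values, xy_target, power=2.0, eps=1e-12):
    """Inverse Distance Weighting of `values` from known points to target points."""
    d = np.linalg.norm(xy_target[:, None, :] - xy_known[None, :, :], axis=2)
    w = 1.0 / np.maximum(d, eps) ** power   # eps avoids division by zero at gauges
    return (w * values).sum(axis=1) / w.sum(axis=1)

def residual_correct(model_at_gauges, gauge_obs, gauge_xy, model_at_targets, target_xy):
    """Add IDW-interpolated gauge residuals to any model's estimates."""
    residuals = gauge_obs - model_at_gauges
    return model_at_targets + idw(gauge_xy, residuals, target_xy)

# Hypothetical training gauges (lon, lat) with observed and modelled rainfall (mm)
gauge_xy = np.array([[80.0, 6.5], [80.5, 7.0], [80.2, 6.8]])
gauge_obs = np.array([12.0, 30.0, 18.0])
model_at_gauges = np.array([10.0, 25.0, 20.0])   # e.g. RF, XGBoost, or GNN output

# Hypothetical withheld location and the model estimate there
target_xy = np.array([[80.3, 6.7]])
model_at_targets = np.array([15.0])

corrected = residual_correct(model_at_gauges, gauge_obs, gauge_xy,
                             model_at_targets, target_xy)
print(corrected)
```

Passing the uncorrected output of each ML model, or the satellite estimate itself, through the same function would give the like-for-like comparison both referees ask for, and comparing the result against the uncorrected output isolates the contribution of the IDW step.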
This study is using a relatively large number of stations compared to some regions that are more reliant on satellite rainfall estimates. That is very useful for producing such an accurate result, but it may be that the results are different in a significantly more data sparse region. You touch on this a little bit in the discussion, and I think it would be interesting to see how few stations are needed before there is a notable drop in estimate performance.
If you are looking to cut any of the material to make the paper more concise, I might suggest section 4.4. The evaluation included here is brief and the graphics are not particularly compelling. Certainly it is important to show that the technique can capture extreme events, but maybe a little more explanation of the value would be beneficial, because as it is there isn’t quite enough to fully appreciate what is being shown in the maps/barplots.
Overall, I thought the paper is very well-written. The description of the GraphIDW technique, other ML techniques, and the evaluation tests are very clear and easy to follow. I think the results of the different evaluation tests were clearly presented, and the graphics were helpful in displaying the relative merits of each of the techniques. I do think revisiting the formulation of the satellite estimate, and showing the value of the IDW step will dramatically improve the value of the paper and the persuasiveness of the GraphIDW technique.

Citation: https://doi.org/10.5194/egusphere-2025-6551-RC2
- AC2: 'Reply on RC2', Nadee Peiris, 17 May 2026
  
  The response to Reviewer 02 is uploaded in the form of a supplement
  
  Citation: https://doi.org/10.5194/egusphere-2025-6551-AC2
- AC3: 'Reply on RC2', Nadee Peiris, 17 May 2026
  
  The figure [Boxplots of the three quantitative metrics: MAE (a), RMSE (b), and PCC (c), for the GNN, GNN_IDW, and original IM_CH products] is attached again separately here due to its low clarity in the uploaded document.
  
  Citation: https://doi.org/10.5194/egusphere-2025-6551-AC3

Viewed

Total article views: 1,856 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
1,201	558	97	1,856	189	177

Views and downloads (calculated since 13 Feb 2026)

Month	HTML	PDF	XML	Total
Feb 2026	693	293	51	1,037
Mar 2026	350	170	31	551
Apr 2026	94	40	6	140
May 2026	64	55	9	128

Viewed (geographical distribution)

Total article views: 1,828 (including HTML, PDF, and XML). Thereof 1,828 with geography defined and 0 with unknown origin.

Latest update: 31 May 2026

Short summary

Rain gauges give very accurate rainfall estimates, but they are too widely spaced to capture local rainfall variability. Satellites cover large regions but often contain local errors. Our study introduces GraphIDW, a new method that smartly combines satellite data and ground observations, considering spatial rainfall patterns. Applied across Sri Lanka, the method produced more accurate rainfall estimates, offering clear benefits for flood forecasting and climate analysis in complex environments.

