Temporal Inhomogeneities in High-Resolution Gridded Precipitation Products for the Southeastern United States
Abstract. High-resolution gridded precipitation products are widely used in hydroclimatic analyses, although their long-term stability has not been thoroughly evaluated. This study investigates temporal inhomogeneities in five widely used precipitation datasets – Daymet, gridMET, nClimGrid, PRISM, and TerraClimate – across the southeastern United States during 1980–2024. Annual precipitation totals were derived from both monthly and daily data and compared with a reference time series constructed from 120 U.S. Cooperative Observer Program (COOP) gauges. Residual-mass curves and Mann–Whitney U tests were applied to identify temporal inhomogeneities, and trend magnitudes were estimated using the Kendall–Theil robust line. Significant inhomogeneities were detected in most datasets, with nearly 80 % of discontinuities concentrated between 2002 and 2018. These shifts corresponded closely to changes in gauge-network composition and data-processing procedures. Daymet and PRISM exhibited wetting biases linked to the expansion of the Community Collaborative Rain, Hail, and Snow (CoCoRaHS) network and the concurrent decline of COOP gauges, whereas nClimGrid showed a drying bias resulting from increased reliance on Automated Surface Observing System tipping-bucket gauges, which underestimate rainfall. Step increases in TerraClimate and gridMET totals reflected transitions in input data and reprocessing of precipitation forcing fields. These inhomogeneities produced disparate multi-decadal trends ranging from 19 to 48 mm dec⁻¹ compared with a non-significant reference trend of 30 mm dec⁻¹. Among all datasets and combinations tested, the Daymet–nClimGrid pair was the only one without detectable discontinuities and reproduced the reference trend most accurately. This combination provides a homogeneous, temporally consistent dataset for multi-decadal precipitation analyses across the Southeast. Overall, the results demonstrate that unrecognized inhomogeneities in gridded precipitation products can substantially bias regional trend assessments and underscore the need to evaluate and, when necessary, combine datasets to ensure temporal stability in long-term hydroclimatic studies.
General comments
This manuscript compares annual precipitation trends from a static set of long-record COOP stations in the southeastern US with those of several gridded precipitation datasets. The author uses statistical analyses to determine how well the gridded datasets match trends from the set of COOP stations and to identify break points in the time series. The study also averages datasets together and recommends a combined dataset as showing the best match in trends. This is a cleanly written manuscript that is easy to read.
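For readers less familiar with these statistics, the tests named in the abstract (residual-mass curves, the Mann–Whitney U test, and the Kendall–Theil robust line) can be reproduced with standard SciPy routines. The following is a minimal sketch on a synthetic series, not the study's data; the residual-mass curve is taken as the cumulative deviation from the long-term mean, and the 2002 split year is purely illustrative:

    # Minimal sketch (synthetic data, not the study's): residual-mass curve,
    # Mann-Whitney U break test, and Kendall-Theil (Theil-Sen) trend.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    years = np.arange(1980, 2025)
    precip = 1300 + 30 * rng.standard_normal(years.size)  # synthetic annual totals, mm

    # Residual-mass curve: cumulative deviation from the long-term mean;
    # a persistent change in slope suggests a discontinuity.
    rmc = np.cumsum(precip - precip.mean())

    # Mann-Whitney U test across a candidate break year (2002 is illustrative).
    split = np.searchsorted(years, 2002)
    u, p = stats.mannwhitneyu(precip[:split], precip[split:], alternative="two-sided")

    # Robust (Theil-Sen) trend, expressed in mm per decade.
    slope, intercept, lo, hi = stats.theilslopes(precip, years)
    print(f"break p-value = {p:.3f}, trend = {10 * slope:.1f} mm/decade")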
The first thing I noticed was that the PRISM dataset used is the All-Network (AN) dataset, which is not the best one to use in this kind of study. There is an LT (long-term) PRISM dataset at 800 m resolution and a monthly time step, extending back to 1895. It is less well known to some users; it has existed for many years but has been available without cost for only about a year. The data can be found here:
https://data.prism.oregonstate.edu/time_series/us/lt/800m/ppt/monthly/
LT was developed to deal with the fact that dataset accuracy and temporal consistency are at cross purposes, so the best approach is to produce two datasets, each focusing on one of these goals. AN was designed to incorporate many station networks and includes weather radar (starting in 2002) for precipitation so as to be as accurate as possible on any given day or month. But, as the author points out, doing so inevitably results in temporal discontinuities as stations come and go, new networks expand (e.g., CoCoRaHS), and weather radar products come online. In the southeast, the PRISM LT dataset uses mostly COOP and WBAN stations and so should be very close to what the author is looking for. I respectfully ask that the LT dataset be used as the PRISM dataset in this study.
I was concerned about the finding that the PRISM AN dataset showed an unusually high average annual precipitation in 2002 compared to the COOP average. My first thought was that 2002 was the first year that the National Weather Service released their Stage IV radar-rain gauge product, which we have used since then to add radar-based precipitation features between weather stations (it has made a huge improvement over station-only analyses). Perhaps there was something off about their product in the first year? I dug into the data and was able to confirm the 2002 spike of about 100 mm by comparing AN with LT (which should have averages similar to your COOP averages) in a region similar to yours. I then calculated monthly differences in 2002 to see whether one or two months were anomalous during that year. Here is a table showing the differences.
Month in 2002    AN-LT (mm)    AN (mm)    LT (mm)
January               1           116        115
February              1            64         63
March                 4           146        142
April                 7            71         64
May                   7           107        100
June                 17           129        112
July                 29           161        132
August               22           142        120
September            11           184        173
October               2           139        137
November             -1           109        110
December              0           146        146
Annual              100          1514       1414
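As a quick consistency check, the tabulated monthly values reproduce the difference column and the roughly 100 mm annual spike (a trivial check in Python):

    # Consistency check on the table above (values in mm, from the table).
    an = [116, 64, 146, 71, 107, 129, 161, 142, 184, 139, 109, 146]
    lt = [115, 63, 142, 64, 100, 112, 132, 120, 173, 137, 110, 146]
    diff = [a - b for a, b in zip(an, lt)]
    print(diff)                          # [1, 1, 4, 7, 7, 17, 29, 22, 11, 2, -1, 0]
    print(sum(an), sum(lt), sum(diff))   # 1514 1414 100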
The largest differences were in the summer months, when convective precipitation is common. Weather radar does an excellent job of resolving complex convective patterns that station data alone do not sample, but one would expect roughly equal numbers of winners and losers: better spatial resolution produces higher highs and lower lows, with a relatively small net effect on regional averages.
The AN dataset is constrained by the station data, so radar is only a factor between stations. I was not able to find anything amiss, even running the model for different locations around the area. It may be just a case of the COOP data missing an unusually large number of high-precipitation areas that year. I wish I had a better explanation, but so far, I have not found one. As an additional PDF I have included a figure showing maps of precipitation across the region for July 2002, where the differences between LT and AN are greatest and easiest to see. The stations used in each dataset are plotted as well (large circles indicate clusters of two or more stations).
A comprehensive document covering all of the PRISM datasets, with the most up-to-date information on new developments, is available online. Table 4 of that document describes the focus of each dataset.
https://prism.oregonstate.edu/documents/PRISM_datasets.pdf
There is also a fairly recent paper from 2021 which covers some important issues about PRISM precipitation time series modeling and can be referenced if desired:
https://journals.ametsoc.org/view/journals/atot/38/11/JTECH-D-21-0054.1.xml
Detailed comments with line numbers
110: with a minimum of eight years required in each group
Why was eight years chosen?
131: I am not sure what is meant by weather-bureau gauges. Do you mean ASOS (Automated Surface Observing System), WBAN (Weather Bureau Army-Navy), or something else? Based on the rest of the paper, I think you mean ASOS.
134: PRISM, which used gauges from 15 networks, showed a similar pattern, with increasing CoCoRaHS and decreasing COOP coverage.
This suggests that the author used the PRISM AN (All Networks) dataset rather than the LT (Long Term) dataset. AN is not the best PRISM dataset to use for this analysis; in contrast, the LT dataset was developed precisely for this purpose. See the general comments for more information.
137: TerraClimate had much less gauge coverage overall, with a maximum of 25% from cooperative gauges and an abrupt decline from 22% to <1% between 2010 and 2011.
According to the text, TerraClimate used WorldClim climatologies and anomalies from the CRU time series and JRA reanalyses (but I think JRA is not used in the CONUS). Is the use of COOP data until 2010 shown in the figure derived from what is used in CRU? And why did it suddenly stop?
138-139: Information on gauge coverage for gridMET was unavailable.
And yet Figure 3 shows “PRISM & gridMET” station usage. I believe gridMET uses PRISM grids, so perhaps Figure 3 is mostly correct?
Figure 4. Percent coverage of the southeastern United States over time by gauge networks used in the five precipitation products.
I do not quite understand this figure. The text refers to Figure 4 when describing a comparison of daily and monthly versions of the datasets, but that is not what the caption describes. Could it be that the caption should read something like: “Difference between monthly and daily precipitation totals (monthly minus daily?) for the five precipitation products”? Even so, I would not expect gridMET and TerraClimate to have the same results, since I don’t think those products are related. Please clarify the figure and the text.
149-151: The overall best products were combinations of products, and those products were nClimGrid-PRISM, Daymet-nClimGrid-TerraClimate, Daymet-nClimGrid-PRISM, Daymet-gridMET-nClimGrid, and Daymet-nClimGrid.
Taking the mean of these datasets at the daily or monthly time step could produce some pretty strange results, even if the combination results in an overall unbiased dataset. The combinations could also be a result of happenstance. For example, does the combination of PRISM and nClimGrid happen to cancel out PRISM’s wetting trend from CoCoRaHS with a drying of nClimGrid caused by its increasing proportion of ASOS stations? I suggest using the term “least biased” rather than “best”, since “best” implies that the combined dataset is superior in every way, rather than for this study’s narrowly defined purpose.
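To make the cancellation concern concrete, here is a minimal synthetic sketch; the step size, sign, and 2005 break date are arbitrary assumptions, not values taken from the study:

    # Synthetic illustration: opposite step biases of equal size cancel in the
    # mean, so the combined series looks homogeneous even though neither input is.
    import numpy as np

    rng = np.random.default_rng(1)
    years = np.arange(1980, 2025)
    base = 1300 + 30 * rng.standard_normal(years.size)  # mm; arbitrary values

    wet = base + np.where(years >= 2005, 40.0, 0.0)  # a CoCoRaHS-like wetting step
    dry = base - np.where(years >= 2005, 40.0, 0.0)  # an ASOS-like drying step
    combo = 0.5 * (wet + dry)

    print(np.allclose(combo, base))  # True: the average hides both discontinuities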
Figure 5: I find it interesting that the PRISM AN dataset showed a discontinuity in 2002. This is the year when weather radar-aided interpolation was introduced into AN. Radar was not used in the LT dataset. See general comments for more on this.
201-202: CoCoRaHS gauges generally record slightly higher precipitation totals than COOP gauges, with increases of about 1–5% (CoCoRaHS, 2019; Goble et al., 2019).
I don’t see that information in either of the two references cited, but you are correct. There is a conference paper by Nolan Doesken (2005) worth citing that reports results from a 10-year comparison of the 8” standard rain gauge (SRG) with the 4” gauge used by CoCoRaHS. He found that, overall, the 4” gauge caught 3% more precipitation than the 8” gauge.
Doesken, N., 2005. A ten-year comparison of daily precipitation from the 4” diameter clear plastic rain gauge versus the 8” diameter metal standard rain gauge. Preprints, 13th Symp. on Meteorological Observations and Instrumentation, Savannah, GA, Amer. Meteor. Soc., https://media.cocorahs.org/docs/AMS_NJD_GaugeComparison_AppldClimate_2-2.pdf
Is this 3% increase sufficient to explain the increasing precipitation over the study period? There is also the possibility that COOP observers show low biases compared to CoCoRaHS because of the difficulty in measuring light precipitation amounts with a measuring stick and an opaque gauge. See this paper for more on that:
Daly, C., W.P. Gibson, G.H. Taylor, M.K. Doggett, and J.I. Smith. 2007. Observer bias in daily precipitation measurements at United States Cooperative Network stations. Bulletin of the American Meteorological Society 88(6): 899-912. https://journals.ametsoc.org/view/journals/bams/88/6/bams-88-6-899.xml
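Returning to the 3% question, a back-of-envelope sketch may help. The regional mean is taken from the 2002 LT annual total in the table above, while the CoCoRaHS share trajectory and time span are entirely hypothetical:

    # Back-of-envelope: spurious trend from a 3% catch difference as CoCoRaHS
    # gauges replace COOP gauges. The regional mean comes from the 2002 LT
    # annual total above; the share trajectory and time span are ASSUMED.
    mean_precip = 1414.0   # mm/yr (2002 LT annual total, used as a stand-in)
    catch_ratio = 1.03     # 4" gauge vs. 8" SRG (Doesken, 2005)
    share_rise = 0.5       # hypothetical: CoCoRaHS share grows from 0% to 50%
    decades = 2.0          # hypothetical: over two decades

    shift = mean_precip * (catch_ratio - 1.0) * share_rise  # ~21 mm network-mean shift
    print(f"shift ~= {shift:.0f} mm, trend ~= {shift / decades:.0f} mm/decade")

Under these assumptions the gauge-type effect alone is on the order of 10 mm per decade, smaller than but of the same order as the 19-48 mm dec⁻¹ spread of trends quoted in the abstract, so it could contribute to, though probably not fully explain, the wetting bias.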
One other (maybe silly!) possibility is that CoCoRaHS observers live in areas that are wetter than the regional average, such as Florida, and are skewing the results in that manner. One would have to control for climatological precipitation conditions to see if wetter areas are now being oversampled compared to dry areas.
228-230: Although using this dataset reduces the spatial resolution inherent to Daymet, the resulting gain in temporal homogeneity makes the Daymet–nClimGrid product the most robust dataset for regional, multi-decadal precipitation assessments.
Again, robust does not seem like the proper term; I would again suggest “to have the least biased trends” instead. The author would also be remiss not to make some qualifying statements here. First, the assessment was made using annual precipitation only and did not dig into the monthly or daily data. Second, the conclusion says nothing about the overall quality or accuracy of the dataset, only that temporal trends matched up well. Lastly, the evaluation was made at one spatial scale, that of the entire SEUS, with no subregions within it. It is important that the conclusions not extend beyond what the methods and results of the study can support. For example, I believe that the PRISM LT dataset has stable temporal characteristics and is likely ideal for your purposes, but it is likely not as accurate as PRISM AN on any given day, month, or year. Each dataset has been developed with specific goals in mind.
I look forward to reading a revision of this manuscript and would be happy to correspond with the author and answer any questions. Feel free to contact me.
Best wishes,
Chris Daly
Chris.daly@oregonstate.edu