This work is distributed under the Creative Commons Attribution 4.0 License.
Evaluating the consistency of forest disturbance datasets in continental USA
Abstract. Forests play a crucial role in the Earth System, providing essential ecosystem services and sustaining biological diversity. However, forest ecosystems are increasingly impacted by disturbances, which are often integral to their dynamics but have been exacerbated by climate change. Despite the growing concern, there is currently a lack of globally consistent and temporally continuous data on forest disturbances to characterize changes in disturbance regimes. This gap hinders our ability to accurately assess and respond to these changes.
In this study, we focus on the continental United States and compare four datasets on forest disturbances to evaluate their consistency and reliability regarding their spatial and temporal characteristics and, where available, driving agents. Our analysis reveals moderate agreement across the datasets, with inventory-based comparisons demonstrating the highest level of consistency. In contrast, comparisons involving remote sensing data show lower alignment and a delayed detection of disturbances by satellite observations compared to ground-based inventories. Additionally, discrepancies were observed in the identification of disturbance agents in overlapping areas. Our findings underscore the importance of careful data quality assessment and consideration of the datasets' inherent uncertainty when utilizing them for further applications. This study highlights the need for improved data integration and accuracy to advance the understanding of forest disturbances.
Status: final response (author comments only)
RC1: 'Comment on egusphere-2024-3534', Anonymous Referee #1, 03 Feb 2025
The article "Evaluating the consistency of forest disturbance datasets in continental USA" aims to compare different forest disturbance recording products to identify consistencies and mismatches between them. It provides an interesting first exploration of differences between ground survey- and remote sensing-based datasets and how they align in some key disturbance characteristics, such as the timing of the disturbance and the disturbance agent. The authors put effort into data cleaning and subsetting to generate a comparable dataset. I found the language to be clear, but the manuscript is very lengthy and would benefit from shortening, removing duplicated information, and moving many details to the appendix to focus the story of the study.
There are major changes/issues that I would recommend the authors consider. I will highlight the key aspects first, and then provide line-by-line comments in a classic review style.
1/ Aim of the study: My impression is that this study is not properly focused on its aim. You do not demonstrate why one product is more suitable than another for a certain application, and this depends highly on the (research) questions asked. It would be greatly beneficial if you demonstrated one or two use cases and evaluated the performance of the different products (singly and in combination). In some cases a remote sensing product might be more useful; in other cases (e.g. when I want to analyse non-stand-replacing disturbances) a field-survey-based dataset might be better. I am missing the overall guiding question here.
You could for example do a case study: How have disturbance dynamics developed in the U.S. over the past 20 years, compared between Western U.S. (Rocky Mountains) and Eastern U.S., or something similar. In such a setting you could explore if e.g. the Hansen dataset says disturbances have accelerated in one region, but FIA would say they have remained stable. This would also directly link to the climate change context you mentioned in the introduction.
Another avenue could be to test different ground-based surveys to generate disturbance maps and to explore the impact that different ground-based survey designs and information have on the mapping (and the combination of different surveys). This is again a different approach, but could be a point in the discussion.
I ask myself a bit what you expected, as the different foci of the survey methods will result in different disturbance characteristic information and information granularity. The different methods do not necessarily measure the same process.
2/ Data product choice to compare: It would be interesting to compare at least one other remote sensing-based product with the Hansen map; in the line-by-line comments I list some products I came across which are more specific to the US. Also, separate analysis approaches for comparing ground-survey products and remote sensing products seem more useful to me. The paper should not be published without including at least one other U.S.-specific remote sensing-based forest disturbance product.
3/ Analysis approach: Even though there are some things that are not comparable, some products must be more useful/accurate than others, or useful in a wider variety of settings. As a reader I would be interested in precise recommendations on when to use which product, and when to avoid which product. And if there is a clearly superior product (but only if), then this should also be made clear. But it needs backing up with numbers. You did a first exploration of how the datasets align/misalign in some characteristics, but I am missing an analysis that quantifies how data divergence might be affected by ownership, state, surveyor, topography or other environmental drivers (remote sensing products tend to be more uncertain in mountainous regions and areas with higher cloud cover, and field plot visits are also more difficult and often rarer in certain locations). You only looked at some aspects separately, but there can be interactions which are important to consider (e.g. Is the year offset for remote sensing products higher in mountainous regions? Are offsets very pronounced in some states, which would point to surveyors evaluating disturbance years differently?).
Furthermore, I am still not sure how you account for the swapping in the FIA dataset. Disturbances are very location-specific and random; even when the USDA claims this does not impact the ability to analyse the data, this is probably only the case for larger-area or strata representation, but difficult for location-based comparisons. I know that we cannot change the product, but if it consistently misses disturbances, then the authors should reflect precisely on the swapping and make this a core point in the discussion.
Line-by-line comments:
Abstract:
L.7: How do you define consistency and reliability? What is your benchmark?
L9: lower alignment as in spatial overlap?
L10: more stylistic, but I prefer when authors stay in the active form (also the text jumps between active and passive)
Main text:
L.16: "These services are sustainable to society …", I do not really understand what you mean by this
L.20: true, but the paper also found that disturbances have a positive impact on biodiversity, which I would also mention in this context.
L.21: This impact definition of disturbances leads me to ask how you define disturbances in your study. I would have defined it as a mortality event in the first place, but depending on the context, this might be different. A clear disturbance definition would in general be helpful for the comparison of disturbance events.
L.29: I agree that quantifying the climate change impact on disturbances is an important field of research, but I do not see the clear connection between your study and approaching this question. This points in general to one of my main comments, that I would appreciate a comparison between different (combinations) of disturbance products to address different ecological research questions to evaluate their “performance”.
L.34: You speak about evidence being based on sparse data, but point to continental-scale datasets in the following sentence. Depending on what kind of research question you ask, the current datasets, at least for Europe, are already quite impressive. So, when you make this statement, I would refine it a bit more: for what kind of research questions do we need more and refined data? (Also, for which spatial extent? As you compare a US-based product, this might not help for questions in East Asia – so how do you address certain aspects of sparse datasets here?)
L. 38: IDA is not the abbreviation for the Forest Service, and I assume it is the dataset you use from that agency. Please introduce the full dataset name before you use the abbreviation (same with FIA in the next lines; though this one is well known, it is helpful to write it out for readers outside the field). As a general (stylistic) hint, I always prefer to avoid the use of abbreviations where possible, as it disturbs the reading flow when the reader is not familiar with the abbreviations beforehand. But this is a style choice, I guess.
L.45: and also depends on disturbance size and underlying forest structure
L.57: Very vague statement. What kind of analysis? Do we need better quality cameras, so the image quality to classify disturbances is better? Or do we need better training for surveyors? Or do we need more refined survey designs, to gain more consistent and high-quality datasets?
L.66: I am unsure how good the choice of datasets to compare is. As not all datasets exhibit the same features you want to compare (disturbance extent or disturbance agent), the pure comparison of the products always leans towards being unsatisfactory. I do agree, though, that it is interesting to look at where they agree and disagree – is this what you mean by robustness? – but I am missing in the introduction the guiding thread as to why you chose these different datasets (which, to my knowledge, partially focus on different disturbance types and agents).
L.70: I really like the effort to look into different disturbance products to explore their potential, but feel that identifying advantages and shortcomings needs an application example to actually test the performance of the different products in answering a (specific) question. Different products can inform differently well depending on the question asked.
L.73 ff.: Ok, but do you test the newly compiled dataset and whether it improves analysis applications? I assume you need to reduce every product to some extent to create a harmonized combined dataset, which might not be advantageous in some applications?
L.77 Figure 1: I would appreciate some more map insets to get a better idea of the data sets overlay, also with the remote sensing data.
L.80: Table 1: I appreciate the structured display of the different datasets compared in the study! A row indicating the disturbance agents surveyed would be helpful, as well as information on the resolution of the remote sensing product (for readers not familiar with the dataset) and a clearer description of the data format in the data row (e.g. that the Hansen dataset is a continuous raster and that ITMN and FIA are point-based estimates). You do this a bit in the information row, but not extensively (e.g. you describe the IDS product as describing disturbances from insects and diseases, but in the following text in L.83 you state that it includes fire and wind disturbances).
L.90: You name mortality as an example for a damage type, which again leads me to the question on how you define a disturbance in your study? (I would have set tree mortality as a characteristic of a disturbance, but this depends on your overall definition).
L.91: Is there any information on how surveyors deal with anthropogenic disturbances such as harvest and salvage logging?
L.94: I see the advantage of utilizing the polygon data when comparing the dataset with the remote sensing-based map, but why exclude the point data when you also compare this dataset with plot-level data?
L.111: The swapping is a major problem when comparing the datasets with each other. If you would do a regional or some kind of sufficient area-based comparison, this might hold. But how do you plan to disentangle the different uncertainties from the disturbance detection itself and the additional swapping?
L.136: I was not aware that the ITMN focuses on drought- and heat induced tree mortality or is this the dominant mortality driver recorded specifically in the US? What is the idea behind comparing the consistency in extent and agent of disturbances, when the extent is not recorded in some instances and the data products specifically focus on different agents?
L.140: I think including the Hansen disturbance map is fine, but why did you not include other remote sensing-based forest disturbance products, which are more specific to the USA? Remote sensing products on a large scale often come with the cost of reduced accuracy when zooming into specific areas. Also, US-specific disturbance maps can better account for the specifics of the region and use, e.g., country-level auxiliary data for mapping improvement. Some other RS products explicitly map disturbances in the US only, e.g.: Masek et al. (2013). Ecosystems. https://link.springer.com/article/10.1007/s10021-013-9669-9 and Zhao et al. (2018). Remote Sensing of Environment. https://www.sciencedirect.com/science/article/abs/pii/S0034425718300476 (or Goward / Dungan 2020)
I am not sure if the products are publicly available, but for such a comparison study it could be worth it to contact the authors. Else there are other available datasets which could be included in a further comparison: Schleeweis, K., G.G. Moisen, C. Toney, T.A. Schroeder, C. Huang, E.A. Freeman, S.N. Goward, and J.L. Dungan. 2020. NAFD-ATT Forest Canopy Cover Loss from Landsat, CONUS, 1986-2010. ORNL DAAC, Oak Ridge, Tennessee, USA. https://daac.ornl.gov/cgi-bin/dsviewer.pl?ds_id=1799
I would argue you need to include at least one other U.S. specific disturbance map from Landsat to make this a valid comparison.
L.159: I like the flowchart overview, but there are a lot of details in the graph which might be overwhelming and not necessary with a good explanation in the text. I would keep a reduced version and put this one in the attachment, but this is more of a style choice.
L.160: The Method section is very detailed. I welcome the transparency and decent documentation, but would recommend shortening it for the sake of readability and flow of the text and keeping the more detailed version for the appendix. For example, the exact column codes for the agent categories are not helpful for my understanding at this point, but great for reproducibility, so for those interested this information is great for the attachment.
L.176: Could you find out why there are these really big polygons? And did you exclude polygons before or after dissolving polygons with the same disturbance year and agent?
L.186: I find it difficult to exclude all disturbance events without an assigned disturbance agent, especially when you compare the extent or just the recording of a disturbance event between data products. Also, when checking the disturbance period/time comparison, you lose quite some data here. What is the rationale for excluding all this information?
L.195: I am not really convinced that due to the swapping a proper comparison of disturbance event recording and agent is possible here. The buffer only addresses the fuzzing and also here I see a lot of potential bias when comparing it with the remote sensing-based map when e.g. the FIA plot is at the edge of a disturbance and moved a mile away from the disturbance edge. Also, as mentioned earlier, you can try to assess the effect of the swapping on the disturbance detection consistency between products, but how do you account for uncertainties in the other product and the different sources of uncertainty in the disturbance recording in this dataset itself vs. the effect of the swapping? A regional comparison as I proposed in my main comments could help here.
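To make this concrete, here is a minimal sketch of the buffered presence/absence check I am picturing (the GeoDataFrame names and the exact buffer radius are my assumptions, not the authors' implementation):

```python
# Minimal sketch of a buffered presence/absence check, assuming hypothetical
# GeoDataFrames `fia_plots` (fuzzed FIA point locations) and `ids_polys`
# (IDS damage polygons), both projected to a metric equal-area CRS.
import geopandas as gpd

FUZZ_RADIUS_M = 1609  # ~1 mile: assumed maximum fuzzing distance

# Buffer each fuzzed plot location by the maximum fuzzing distance.
fia_buffered = fia_plots.copy()
fia_buffered["geometry"] = fia_plots.geometry.buffer(FUZZ_RADIUS_M)

# A plot "matches" if any IDS polygon intersects its buffer. This establishes
# presence/absence only; it says nothing about area overlap, and a swapped
# plot can still sit far outside any plausible buffer.
joined = gpd.sjoin(fia_buffered, ids_polys, how="left", predicate="intersects")
detected = joined["index_right"].notna().groupby(level=0).any()
print(f"Plots with an intersecting IDS polygon: {detected.mean():.1%}")
```

Note that even such a check cannot separate swapping error from genuine non-detection, which is exactly the confound I am worried about.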
L.215: I find this whole section a bit unstructured and got lost which product you are comparing now with which one. And do you want to compare simply the disturbance occurrence or also the disturbance extent/size when possible?
L.239: So, you did not incorporate the information from the FIA and ITMN databases into a fused data product? I can think of various reasons why this is tricky, but could you elaborate on that choice of excluding the other datasets?
L.248: Why reporting only the values for the smaller buffers, when the fuzzing can go up to 1 mile?
L.156 ff: Next to the spatial overlap of patches, it would be interesting to compare the overall disturbed area mapped by GFC versus IDS (so in the area where we know that IDS mapping took place, how much area in total was mapped by the surveyors versus identified by the Hansen Landsat based map?).
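A rough sketch of what such an area comparison could look like (the file name and inputs are hypothetical, and both layers are assumed to be clipped and reprojected to a common equal-area CRS covering the IDS survey footprint):

```python
# Rough sketch of a total-area comparison, assuming hypothetical inputs:
# `ids_polys` (a GeoDataFrame of IDS polygons in an equal-area CRS, metres)
# and a GFC loss-year raster clipped to the same IDS survey footprint.
import numpy as np
import rasterio

ids_area_ha = ids_polys.geometry.area.sum() / 1e4  # m^2 -> ha

with rasterio.open("gfc_lossyear_clipped.tif") as src:  # hypothetical file
    loss = src.read(1)                      # 0 = no loss, 1..n = loss year
    pixel_area_m2 = abs(src.transform.a * src.transform.e)
    gfc_area_ha = np.count_nonzero(loss) * pixel_area_m2 / 1e4

print(f"IDS-mapped: {ids_area_ha:,.0f} ha; GFC-mapped: {gfc_area_ha:,.0f} ha")
```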
L.284: This whole section can be shortened significantly by only highlighting those comparisons which stand out. The rest of the information can be retrieved by the reader from the graphic and table.
L.288: Why do you not compare all datasets in one graphic?
L.307: In figure 4, please write out the disturbance agents or give the full agent name in the figure description.
L.294: In general, I would propose using a confusion matrix instead of a bar chart comparison, so we can see which agents have been labelled as which other agents (is it always the same agent misalignment? – then it is likely the labelling – or is the confusion of agents more diffuse?). Also, I would urge you to report much clearer summary statistics.
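A minimal sketch of what I mean (the DataFrame and column names are hypothetical):

```python
# Confusion matrix of agent labels, assuming a hypothetical DataFrame `pairs`
# with one row per co-located disturbance record and the agent label assigned
# by each product (column names are illustrative).
import pandas as pd

conf = pd.crosstab(pairs["agent_fia"], pairs["agent_ids"],
                   rownames=["FIA agent"], colnames=["IDS agent"],
                   normalize="index")  # each row: shares of a given FIA agent
print(conf.round(2))
```

A single strong off-diagonal cell would point to a systematic labelling difference; a spread across many cells would point to diffuse confusion.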
L.300: Would be also interesting, to see the agent agreement/disagreement in space (with a map, points indicating by colour the agreement). Maybe there is a spatial pattern? Also, it would be interesting to know if the agent agreement depends on disturbance size or if there is a clustering in time or if this simply depends on the surveyor.
L.307: This section is actually a part of the discussion and not of the result section. When you want to lead over looking into effects of ownership and the recorded disturbance year versus measurement year, then I would give this structure in the methodology section (and prime it in the introduction) and simply report your results here.
L.308: What about the spatial extent/ overlap?
L.315: So the agent attribution in the FIA and IDA does not happen every year (but only the disturbance mortality)? This was not clear until this point and should be stated explicitly when describing the dataset. In this case, simply comparing the disturbance years might not be a valid approach anymore. How do the surveyors decide on a disturbance year at this point? And which year do you compare in Figure 3?
L.316: Yes, but that comes again down to what you define as a disturbance and without knowing the product too much in detail, there is probably a higher uncertainty due to image quality and training data quality.
L.321: Here I would right away be interested in the number of points which you actually compare (so the information from line 329 ff.). Also, it is not clear to me which numbers you compare here until you go into more detail from L 329 ff.
L.325: What is meant by "all data"?
L.326: This doesn't seem like a big difference to me looking at the distribution spread (it is just a few months), so based on these results I would claim it is the same between private and public. But did you run a test on the differences which also accounts for the different sample sizes?
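For example, Welch's t-test handles unequal variances and sample sizes; a minimal sketch, assuming hypothetical arrays of per-plot year offsets split by ownership:

```python
# Sketch of the test I mean, assuming hypothetical arrays `offsets_private`
# and `offsets_public` of per-plot year offsets. Welch's t-test does not
# assume equal variances or equal sample sizes between the two groups.
from scipy import stats

t, p = stats.ttest_ind(offsets_private, offsets_public, equal_var=False)
print(f"Welch's t = {t:.2f}, p = {p:.3f}")

# Given skewed offset distributions, a rank-based test is worth reporting too.
u, p_mw = stats.mannwhitneyu(offsets_private, offsets_public)
```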
L.327: Can you elaborate on that? The SDs for all comparisons range between 6.7-7.4 years (if I did not misread it), that’s pretty much the same isn’t it?
L.334: Again, did you test that in any way? That does not seem like a big difference (between public and private) to me, especially when the agents are not recorded annually. In general, it would have been helpful here to test for variable impact with a modelling approach: how much of the lag in disturbance detection is explained by the dataset combination compared, the ownership, maybe even topography/landscape/tree species information, and the state and year?
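A minimal sketch of the kind of model I have in mind (the DataFrame and covariate names are illustrative, not from the manuscript):

```python
# Regression of detection lag on candidate drivers, assuming a hypothetical
# DataFrame `lags` with one row per matched disturbance record; covariate
# names (dataset_pair, ownership, state, slope_deg) are illustrative.
import statsmodels.formula.api as smf

model = smf.ols(
    "lag_years ~ C(dataset_pair) + C(ownership) + C(state) + slope_deg"
    " + C(ownership):C(dataset_pair)",  # interaction: ownership effect per pair
    data=lags,
).fit()
print(model.summary())
```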
L.335: This information belongs in the dataset description, as I assumed up to here (or line 315) that the agent was recorded annually as well. This makes a big difference and introduces immense uncertainty, to the point that I am not sure whether you are measuring surveyor differences rather than anything else.
L.338: How is a plot measured across different years?
L.343: Again, how do the surveyors decide whether a tree or tree cohort died 15 or 17 years ago? There should be some information in a recording/survey protocol about that; otherwise this seems rather arbitrary.
L.365: In this case you need to correct table 1 as you state annual recordings there (or it is easy to misunderstand).
L.385: But this seems to be similar for the IDS dataset: mapping only happens when disturbances reach a certain extent and severity.
L.388: Exactly, and it depends on the process(-scale) you want to research. Hence a demonstration of the datasets on a simple question would be helpful to evaluate the suitability of the datasets (and dataset combinations) for answering it.
L.394: The buffer helps to identify the presence or absence of a disturbance event registration, but does not enable you to compare the disturbance area (overlap), which is extremely important information when analysing disturbances and their impact on the environment. You cannot do that with all datasets, as the information is missing in two of them, but as you also compared IDS and GFC, I would discuss that here a bit more.
L.396: I am surprised by that! Do they include the condition of a plot being disturbed as a criterion for the swapping (so only disturbed plots get swapped)?
L.410: You need to elaborate/discuss more what drives the difference in survey and disturbance recording year, as this is such a big uncertainty and driver of this pattern. If there are no survey manuals or guidelines by the USDA for the recording of the actual disturbance year, I would not use that information at all as it seems rather being a guess.
L.446: You should have reduced the agent categories to the one with the lowest granularity (so the broadest agent definition).
L.446: Yes, but this is also because surveys have a hard time differentiating drought stress and drought-induced mortality in the field or on the screen. You only have a chance to tease apart stressors or the initial disturbance from the actual or secondary disturbance leading to mortality with a more detailed research setup.
L.472: That is not true, salvage logging and human interventions in general increase the chances for a disturbance detection, but disturbance detection does not rely on human intervention.
L.481: From which finding do you derive that inventory-based datasets show the highest reliability? Which dataset do you define as the "trusted" benchmark dataset, i.e. the most reliable one?
L.482: Most medium-resolution RS disturbance products have a hard time or no chance at all to identify lower severity disturbances. So, if you want to research/include those in an analysis, you have to turn to other products anyways (this is no new information). I would have been interested to see in which way you can combine remote sensing and ground-based surveys, to gain more insights about disturbance dynamics. This leads back to the request of demonstrating the use of the datasets on different research questions and to evaluate their performance.
L.485: I am not sure how much your findings support this statement. As you compared different details in agent recordings, you say more about the differences in the record method, but not about the reliability in the agent detection. You would have to compare the least detailed agent classes and see how those align.
L.491: A demonstration on using the different ground-based surveys (and their combinations) to generate a remote sensing-based disturbance map in order to understand the implications of the survey uncertainties would have been helpful. I am not sure what to draw from the current analysis.
L.495: This is not really helpful to design methods for disturbance mapping per se. The question on how to set up a disturbance mapping design depends so much on the use case of the data and the research questions which we aim to answer. I am not sure what should follow from this. Should the agencies align their surveys and bundle their efforts, to create one more detailed disturbance data product? In what information dimension (year of disturbance, agent, disturbed area) is which dataset more preferable? Or which dataset combination?
Citation: https://doi.org/10.5194/egusphere-2024-3534-RC1
RC2: 'Comment on egusphere-2024-3534', Anonymous Referee #2, 10 Feb 2025
In the article "Evaluating the consistency of forest disturbance datasets in continental USA", the authors compare datasets characterizing forest damage in the continental United States and Alaska. They compare two regional field and aerial inventory datasets against two global remote sensing datasets. The authors process information in each dataset to create a single comparable dataset, and examine agreement between the datasets.
I found the manuscript to be generally well written and clear. However, the description of each dataset was unnecessarily long, as detailed methods for each are in the original papers. Removing some of the extraneous details from Section 2 would greatly reduce the size of the text.
The methods used were also generally well executed. Using Gaussian distributions to track the lag in disturbance event detection between datasets was a clever method that is simple, intuitive, and effective.
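For readers unfamiliar with the approach, its core can be illustrated in a few lines; this is my own sketch, assuming a hypothetical array `year_offsets` of detection-year differences, not the authors' code:

```python
# Illustration only: fitting a normal distribution to the year offsets
# between two datasets (detection year in dataset A minus dataset B),
# where `year_offsets` is a hypothetical array of per-record differences.
from scipy import stats

mu, sigma = stats.norm.fit(year_offsets)
print(f"Mean detection lag: {mu:.1f} years (sd {sigma:.1f})")
```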
However, I also have a few major issues and suggestions for the authors:
Paper Framing/Motivation)
First, I have concerns about the framing. As a reader, I found myself wondering what the "so what?" of the paper was. The current framing suggests that there is a need to compare ground vs remotely sensed datasets for consistency and accuracy. However, the selection of datasets for comparison seems arbitrary. The ground datasets are part of a regional network of concentrated sites with a goal of intensively sampling the entire continental United States, while the remote sensing datasets included are at a global scale. To me, this creates a mismatch in scale that is not rectified within the current framing. Is there a need for better regional assessments of global products? If this is not the motivation, then I think a better comparison would be with remote sensing datasets that have been calibrated or developed regionally, such as the North American Forest Dynamics (NAFD) Forest Loss Attribution dataset (Schleeweis et al. 2020), or the Monitoring Trends in Burn Severity (MTBS) product. So I am left wondering what this analysis is hoping to accomplish. I could see a reasonable argument for the following:
- Providing a regional assessment of global remote sensing products using field inventories as a benchmark.
- Evaluating dataset performance or appropriate use in different contexts. For instance, determining the scenarios in which using field vs remote sensing datasets (or a combination) might be more useful.
In both cases, it seems the paper needs reconfiguring.
Major issues with Methods)
In addition to a reframing and proper motivation, a clear explanation for the decision to compare these 4 datasets in particular is needed.
If disturbance agent is a focus of the paper, I do not understand why IDS is the only remotely sensed disturbance dataset being used, as it most heavily represents one disturbance agent. Is insect damage a focus of the paper or are the authors interested in all disturbances? If all, then it would be more appropriate to compare other regional disturbance datasets derived from remote sensing instead of the GFC (see my comment above).
Why include Alaska in this assessment? FIA collection across Alaska is relatively new and to my knowledge, there has not been an Alaska-wide data release yet. The datasets for Alaska in Figure 1 look to be primarily GFC with very little data from other datasets.
General comments
L65-70 It is difficult to tell which datasets are inventory-based vs remote sensing-based. Suggest editing for clarity.
L73 remotely-sensed rather than remote-sensing
L84 IDS is defined multiple times. Define it once, the first time it is mentioned in the main text. Also, a reference and link to the specific dataset documentation or web page would be preferred.
L85-90 This section needs rephrasing. Currently, large portions of sentences seem borrowed directly from the web pages describing the dataset and applications.
L137-138 sentence needs rephrasing for clarity: 78% of global forested area is north of the Equator
L262. Throughout the writing, the text would benefit from a consistent and clear grouping of the datasets. Are the datasets ground vs remote sensing, point-based vs spatially explicit, etc.? Once a standard language is determined, continuing to group the datasets accordingly across methods, results, and discussion (including figures and tables) would help the reader more easily interpret results.
L267: Should a word be added here? “Indicating that FIA records disturbance”
L353 Fire is a large disturbance component in the United States. It is heavily emphasized in the introduction and discussion, but not reflected in the selected datasets.
L369-372 Uncertainty in the IDS is well documented in this paper as well: (Cohen et al., 2016)
L410-415 Again, as disturbance detection is emphasized in the study, I do not understand why a global dataset composed of estimated stand replacing disturbances (GFC) was compared with other datasets focused on much more nuanced disturbance patterns.
Figure 1. The IDS polygons in Panel B seem to be plotted over the ITMN and FIA values. I would suggest having the IDS polygons mapped first, followed by the GFC raster data, and the other vector data last.
Figure 4. The figure needs a description of the disturbance agents – e.g. what does BB stand for? Either spell out the disturbance agent on the figure or describe the acronyms in the legend.
Table 1. FIA coordinates are fuzzed across all land ownership types, not just those on privately owned land. A portion of those on private lands are swapped.
Table 3. Additional description of this table is needed to discern what the rows and columns represent. Also, here and elsewhere, grouping the datasets based on whether they are remote sensing or ground datasets would be helpful (i.e. placing FIA and ITMN next to one another rather than split between IDS and GFC).
References from comments:
Cohen, W.B., Yang, Z., Stehman, S.V., Schroeder, T.A., Bell, D.M., Masek, J.G., et al. (2016). Forest disturbance across the conterminous United States from 1985–2012: The emerging dominance of forest decline. Forest Ecology and Management, 360, 242–252. https://doi.org/10.1016/j.foreco.2015.10.042
Schleeweis, K.G., G.G. Moisen, T.A. Schroeder, C. Toney, E.A. Freeman, S.N. Goward, C. Huang, and J.L. Dungan. 2020. US National Maps Attributing Forest Change: 1986–2010. Forests, 11(6), p.653. https://doi.org/10.3390/f11060653
Citation: https://doi.org/10.5194/egusphere-2024-3534-RC2