the Creative Commons Attribution 4.0 License.
FZStats v1.0: a raster statistics toolbox for simultaneous management of spatial stratified heterogeneity and positional dependence in Python
Abstract. Building on the traditional Focal Statistics and Zonal Statistics tools of mainstream GIS software, we developed a raster statistics toolbox named FZStats v1.0 using Python 3 and Qt5. The main contributions of this study are as follows. First, we developed a specialized spatial analysis toolset that addresses stratified heterogeneity, positional dependence, and their combination, filling a gap left by existing Focal and Zonal methods, which tackle stratified heterogeneity and positional dependence only individually. Second, the toolset offers a user-friendly interface and structure, integrates both existing and enhanced spatial statistical methods, supports multi-processing and batch processing, and lets users select calculation methods suited to their computer configurations and application requirements. Third, the newly proposed Focal-Zonal Mixed Statistics method achieves higher predictive accuracy than the traditional Focal Statistics and Zonal Statistics methods in geothermal detection, which preliminarily demonstrates the advantages of the new approach. Additionally, we discuss the advantages, robustness, and advancements of the Focal-Zonal Mixed Statistics method and conclude that developing this new method and toolset is necessary and holds substantial potential for applications across diverse fields.
Status: final response (author comments only)
RC1: 'Comment on egusphere-2024-2461', Anonymous Referee #1, 24 Oct 2024
Using Python, this study developed a new spatial statistics toolbox named FZStats v1.0. It provides details on the development process, the source code, and a user-friendly software product. The toolbox not only includes two categories of traditional general-purpose spatial statistical tools but also integrates the newly developed Focal-Zonal Mixed Statistics method, which I believe is the core contribution of this research. The manuscript is well structured and demonstrates the necessity and advantages of the proposed Focal-Zonal Mixed Statistics method through a comprehensive review of existing research, methodology, model development, and applications, with thorough discussion. It is a well-written article that makes a clear contribution. To further enhance the quality of the manuscript and better serve its potential readers, I offer the following suggestions for the authors' consideration:
- As described by the authors, the new model requires two input layers: a value raster and a zonal raster. In lines 214-219, the zonal raster is evidently obtained by reclassification, in other words by discretizing a spatial variable. How should the discretization scheme, including the number of classes and the classification method, be determined? Could the authors provide some recommendations, since different reclassification parameters may significantly influence the results?
- Line 220, the configuration of model input parameters is crucial for practical applications, but the guidance on how to choose the window size, window shape, and statistic is insufficient, and readers may need more help with this aspect.
- Line 248, the coding method proposed by the authors is concise and clear, but there is an issue: if it is used over a larger area with many factor classifications, it requires more placeholders. Does coding with this algorithm risk exceeding the limits of what the computer can read?
- Line 392, Figure 4 lacks units.
- Line 394, the authors only mention the spatial coupling of LST with slope aspect but do not empirically test whether slope and aspect are major environmental factors influencing LST. Additionally, within a 7.2 km radius, can the effect of elevation on LST be ignored, especially in relatively complex mountainous terrain?
- Line 423, while the maps in the manuscript are well standardized and carefully produced, the fonts in some figures are too small, for example in Figure 8.
- Line 458, I am unclear about the reason for setting different representative areas for the mines.
- Finally, after testing the software toolbox provided by the authors, I noticed that there seems to be no popup notification when a run finishes; I suggest adding one.
Citation: https://doi.org/10.5194/egusphere-2024-2461-RC1
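The placeholder concern raised for Line 248 can be made concrete with a small sketch. The manuscript's actual coding scheme is not reproduced on this page, so the code below assumes a simple decimal positional encoding in which each classified layer occupies a fixed number of digits in one combined integer zone code. The function name `combine_zone_layers` and its parameters are hypothetical.

```python
import numpy as np

def combine_zone_layers(layers, dtype=np.int64):
    """Pack several classified rasters into one integer zone code.

    Assumed scheme (not taken from the manuscript): each layer gets as many
    decimal digits as its largest class label needs, and the layers are
    concatenated digit-wise from left to right.
    """
    layers = [np.asarray(layer) for layer in layers]
    if any(layer.min() < 0 for layer in layers):
        raise ValueError("class labels must be non-negative integers")

    # Digits (placeholders) required by each layer.
    widths = [len(str(int(layer.max()))) for layer in layers]

    # Number of decimal digits that always fits in the target integer type.
    safe_digits = len(str(np.iinfo(dtype).max)) - 1
    if sum(widths) > safe_digits:
        raise OverflowError(
            f"{sum(widths)} digits needed, but only {safe_digits} fit safely"
        )

    code = np.zeros(layers[0].shape, dtype=dtype)
    for layer, width in zip(layers, widths):
        # Shift the existing code left by `width` digits, then append the layer.
        code = code * 10**width + layer.astype(dtype)
    return code
```

Under this assumed scheme, a 64-bit integer safely holds 18 decimal digits, so nine two-digit layers fit and ten do not. Relabelling the combined codes to consecutive integers (for example with `np.unique(..., return_inverse=True)`) is one way to stay well within that limit.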
RC2: 'Comment on egusphere-2024-2461', Anonymous Referee #2, 18 Mar 2025
This manuscript presents a Python-based tool with a graphical user interface that integrates Focal and Zonal methodologies, which are commonly used in GIS. These two approaches aggregate finer-resolution raster pixels based on different principles—Zonal methods rely on predefined rules (e.g., different topographies), while Focal methods use distance-based criteria (e.g., proximity to the center of a segment). Essentially, this tool smooths the original raster using statistical measures.
However, the manuscript lacks a clear motivation for why these specific statistics are important. Simply listing potential applications is insufficient to justify their significance. Additionally, FZStats appears to operate only on the intersection of focal and zonal zones (i.e., \( F == \text{True} \land Z == \text{True} \)), without addressing other conditions such as \( F == \text{True} \land Z == \text{False} \) or \( F == \text{False} \land Z == \text{True} \). This oversight raises concerns about how corner cases are handled. Section 2.3 does not clarify these aspects.
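To make the intersection condition concrete, here is a minimal sketch of a focal-zonal mean. It is an illustration under stated assumptions, not the FZStats implementation: the window is assumed to be square, the statistic is the mean, and the function name `focal_zonal_mean` and its `radius` parameter are hypothetical. Only pixels that lie inside the window and share the center pixel's zone label contribute to the output.

```python
import numpy as np

def focal_zonal_mean(values, zones, radius=1):
    """Mean over pixels that are in the window AND in the center pixel's zone."""
    values = np.asarray(values, dtype=float)
    zones = np.asarray(zones)
    rows, cols = values.shape
    out = np.full(values.shape, np.nan)

    for r in range(rows):
        for c in range(cols):
            # Square window clipped at the raster edges (the focal condition F).
            r0, r1 = max(r - radius, 0), min(r + radius + 1, rows)
            c0, c1 = max(c - radius, 0), min(c + radius + 1, cols)
            win_vals = values[r0:r1, c0:c1]
            win_zones = zones[r0:r1, c0:c1]

            # Same zone as the center pixel (the zonal condition Z), skipping NaN.
            mask = (win_zones == zones[r, c]) & ~np.isnan(win_vals)
            if mask.any():
                out[r, c] = win_vals[mask].mean()
    return out
```

In this sketch, pixels with F true and Z false, or F false and Z true, are simply excluded from the statistic. With a single zone the result reduces to an ordinary focal mean, and with a window that covers the whole raster it reduces to a zonal mean.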
The manuscript’s writing style is unconventional and contains informal expressions. For instance, "Some scholars" (Line 77) lacks citations, and "Professor Zhu and his group" (Line 79) is too informal for an academic paper. Similarly, phrases like "We believe" (Line 102) introduce unnecessary subjectivity. Figure 1 also appears redundant. Moreover, the mathematical expressions resemble descriptions of implemented Python functions rather than standard equations. It would be more appropriate to include code snippets instead. Equations 15 and 16 are not expressed in a conventional mathematical form. I recommend revising this section by representing each pixel as an indexed high-dimensional array \([CE_1, CE_2, ..., CE_p]\) and grouping neighboring raster pixels based on shared label patterns.
Another major concern is the lack of a systematic performance evaluation. The methodology is benchmarked against a single real-world dataset with cross-sectional environmental variables, disregarding the fact that such variables often form time series. To strengthen the evaluation, the authors should incorporate multiple benchmarking datasets. Aside from the AUC metrics, Figure 7 suggests that LST outperforms FZStats within the FPR range of approximately 0.1–0.3, raising questions about the legitimacy of the FZStats approach, because methods with FPR beyond this range may not be desired. The manuscript entirely neglects the trade-off between TPR and FPR, which is critical for assessing the method’s robustness.
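One way to examine the TPR/FPR trade-off over a restricted range is a partial AUC. The sketch below is a generic NumPy illustration, not part of the manuscript or of FZStats, and the function names `roc_points` and `partial_auc` are hypothetical. It builds the ROC curve from labels and scores, then integrates TPR over a chosen FPR interval such as 0.1 to 0.3.

```python
import numpy as np

def roc_points(labels, scores):
    """Return (fpr, tpr) arrays of the ROC curve, starting at (0, 0)."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="stable")  # highest score first
    labels, scores = labels[order], scores[order]

    tps = np.cumsum(labels)    # true positives at each cut-off
    fps = np.cumsum(~labels)   # false positives at each cut-off

    # Keep one point per distinct score so tied scores share a threshold.
    last = np.r_[np.nonzero(np.diff(scores))[0], labels.size - 1]
    tpr = np.r_[0.0, tps[last] / labels.sum()]
    fpr = np.r_[0.0, fps[last] / (~labels).sum()]
    return fpr, tpr

def partial_auc(fpr, tpr, lo, hi):
    """Area under the piecewise-linear ROC curve for lo <= FPR <= hi."""
    area = 0.0
    for x0, y0, x1, y1 in zip(fpr[:-1], tpr[:-1], fpr[1:], tpr[1:]):
        a, b = max(x0, lo), min(x1, hi)
        if x1 <= x0 or b <= a:
            continue  # vertical segment, or segment outside [lo, hi]
        slope = (y1 - y0) / (x1 - x0)
        ya = y0 + slope * (a - x0)
        yb = y0 + slope * (b - x0)
        area += (b - a) * (ya + yb) / 2  # trapezoid on the clipped segment
    return float(area)
```

Computing `partial_auc(fpr, tpr, 0.1, 0.3)` for each competing predictor would put the comparison in that range on a numerical footing. A perfect classifier scores 0.2 on this interval and a chance-level one scores 0.04.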
Overall, I am not convinced that this paper meets the publication standards of GMD in its current form. Substantial revisions are necessary to improve the motivation, methodology, and evaluation.
Citation: https://doi.org/10.5194/egusphere-2024-2461-RC2
Viewed
HTML | PDF | XML | Total | BibTeX | EndNote |
---|---|---|---|---|---|
180 | 39 | 150 | 369 | 16 | 14 |
Viewed (geographical distribution)
Country | # | Views | % |
---|---|---|---|
United States of America | 1 | 99 | 26 |
China | 2 | 37 | 9 |
Netherlands | 3 | 21 | 5 |
Romania | 4 | 20 | 5 |
Russia | 5 | 20 | 5 |