the Creative Commons Attribution 4.0 License.
The Spatio-Temporal Visualization Tool HMMLVis in Renewable Energy Applications
Abstract. In this work, we present HMMLVis, an original visualization tool for multivariate Granger causal inference, more precisely for heterogeneous Granger causality, which infers causal relationships in time series following an exponential distribution. HMMLVis is easy to use and can be applied in any scientific discipline that explores time series and their relationships. In this paper, we focus on climatological and meteorological applications. The visualization tool is demonstrated on different types of applications related to meteorological events on the upper/lower tails of the respective distributions, using renewable energy (wind, PV) and air pollution data sets and the EUMETNET postprocessing benchmark data set (EUPPBench), over different temporal horizons. We demonstrate that the HMMLVis method and visualization depict the known causal relations and detect causal relations in the temporal dependencies, which provide additional important information for the respective cases. We believe that HMMLVis, as an interpretable visualization tool, will serve climatologists and meteorologists and in this way contribute to knowledge discovery in these scientific fields.
Status: open (extended)
RC1: 'Comment on egusphere-2024-3126', Anonymous Referee #1, 02 May 2025
General Comments
This manuscript presents HMMLVis, a novel and thoughtfully designed visualization tool for heterogeneous Granger causal inference in multivariate time-series data. Built upon the heterogeneous graphical Granger model (HGGM) within a generalized linear model (GLM) framework and employing Minimum Message Length (MML) principles, the tool aims to support the discovery and interpretation of causal relationships in complex time-dependent datasets. This is a timely and well-motivated contribution, particularly as interest in data-driven causal inference grows within Earth system sciences.
The tool is demonstrated across several domains—ranging from renewable energy and air pollution to meteorological benchmark datasets such as EUPPBench—and is accompanied by a PyQt-based graphical interface that lowers the barrier for non-expert users. The interdisciplinary nature of this work is commendable, and the software addresses a real need for interpretable and accessible causal analysis in environmental applications.
That said, while this study reflects a strong technical effort, the focus of the manuscript—primarily on software development and visualization—appears misaligned with the scientific scope and editorial aims of Geoscientific Model Development (GMD). GMD is dedicated to the development, evaluation, and application of models in the geosciences. A large portion of this manuscript, especially Sections 5 and 6, centers on user interface features, visual options, and layout design rather than the advancement of geophysical modeling. In its current form, the paper may be better suited for a journal dedicated to scientific software development or computational tools.
Moreover, the clarity, organization, and presentation of the manuscript require significant improvement to meet publication standards. Several formatting inconsistencies, unclear figures, and missing definitions reduce overall readability.
Specific Comments
1. Scientific Scope and Fit for GMD:
While causal inference in environmental sciences is a relevant topic, the primary focus of this work is software visualization, and its main contributions lie in user-interface design and graphical rendering of beta coefficients. The paper does not deeply engage with geophysical model development or novel methodological contributions to causal modeling itself. For this reason, the manuscript may be better suited to venues such as EGUsphere preprints or an open-source software journal.
2. Visualization and Readability:
Many of the figures (e.g., GUI screenshots, wind rose plots) have awkward scaling or inconsistent proportions, which affects readability. It is recommended to resize and standardize image layouts, especially in Figure 6 and others, to improve visual clarity.
3. Sections 5 and 6 – Placement:
These two sections are heavily focused on user-interface walkthroughs and technical instructions. While informative, they resemble a user guide rather than scientific content and would be more appropriate as supplementary material. The main paper should focus on the scientific rationale, methodological innovations, and evaluation results.
4. Terminology and Acronyms:
The manuscript contains several instances where acronyms (e.g., EUMETNET) are introduced without first spelling out the full name. This is not compliant with standard academic writing practices. Please ensure that all acronyms are introduced in full upon first use, followed by the abbreviation in parentheses.
5. Motivation for HMML and MML:
The rationale for using heterogeneous Granger models and MML-based feature selection should be articulated more clearly for readers unfamiliar with these frameworks. What concrete limitations of classical Granger models does HMML overcome, especially in environmental data contexts?
6. Model Evaluation:
While the tool is applied across several datasets, the evaluation lacks clear performance metrics or validation benchmarks. For example:
How does the tool’s output compare with known or simulated ground-truth causal structures?
Are the inferred relationships stable across time windows and locations?
Could precision/recall, consistency, or information gain be reported?
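As an illustration of the precision/recall reporting asked about above, here is a minimal sketch. It assumes that both the ground-truth and the inferred causal structures are available as binary adjacency matrices (entry `[i, j] = 1` meaning "series i Granger-causes series j") and that self-loops are ignored. The function name `edge_precision_recall` and the matrix convention are hypothetical and are not taken from HMMLVis.

```python
import numpy as np

def edge_precision_recall(true_adj, inferred_adj):
    """Precision and recall of inferred directed edges against a known graph.

    Assumption: both inputs are square binary adjacency matrices of the same
    shape; diagonal entries (self-dependence) are excluded from the counts.
    """
    true_adj = np.asarray(true_adj, dtype=bool)
    inferred_adj = np.asarray(inferred_adj, dtype=bool)
    if true_adj.shape != inferred_adj.shape or true_adj.shape[0] != true_adj.shape[1]:
        raise ValueError("adjacency matrices must be square and of equal shape")

    # Mask out the diagonal so only edges between distinct series are scored.
    off_diag = ~np.eye(true_adj.shape[0], dtype=bool)
    truth = true_adj & off_diag
    pred = inferred_adj & off_diag

    tp = int(np.sum(truth & pred))    # edges correctly recovered
    fp = int(np.sum(~truth & pred))   # spurious edges
    fn = int(np.sum(truth & ~pred))   # missed edges

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Toy example: the true chain is 0 -> 1 -> 2; the inferred graph has 0 -> 1 and 0 -> 2.
true_adj = [[0, 1, 0],
            [0, 0, 1],
            [0, 0, 0]]
inferred_adj = [[0, 1, 1],
                [0, 0, 0],
                [0, 0, 0]]
precision, recall = edge_precision_recall(true_adj, inferred_adj)
print(precision, recall)
```

Applying the same function to the graphs inferred in successive sliding windows would also give a simple view of how stable the recovered edges are over time.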
7. Synthetic Data Use:
The paper uses semi-synthetic datasets (e.g., for PV and wind), but the construction process needs to be described in more detail. How realistic are these datasets? What modeling assumptions underlie their creation, and what uncertainties are introduced?
8. Link Functions and GLMs:
More explanation is needed regarding the choice of link functions and distributions within the exponential family. Were they selected empirically, or based on expert input? Was model fit compared across different options?
9. Scalability and Runtime:
Please provide information on the computational cost of using HMMLVis, particularly with sliding windows and larger variable sets. How long does one window take to process?
Minor Comments
1. Throughout the manuscript, some notations (e.g., β, η, indices i, j, t) are inconsistently formatted, sometimes in plain text and sometimes in math mode. Ensuring typographic and notational consistency across all equations and text would improve professionalism.
2. While most figures are labeled, some captions could be more descriptive to help readers interpret them without referring back to the main text. For example, indicate clearly what variables or locations the time series refer to, what color scales represent, or whether the visualizations correspond to real or synthetic datasets.
3. The abstract currently blends methodology and application without clearly delineating the main contribution. Consider restructuring it into: (1) motivation, (2) method, (3) key results, (4) broader implications—to improve clarity and impact for readers scanning the abstract alone.
4. Some sentences are overly long or have ambiguous phrasing, particularly in Sections 2 and 4. For instance, compound sentences mixing mathematical definitions and explanatory text can be split for better readability. A careful language edit would improve clarity.
5. Check reference formatting for consistency (e.g., Behzadi et al. (2019) vs. Behzadi et al., 2019). Ensure all references are cited in a consistent style and match the GMD citation standards.
Citation: https://doi.org/10.5194/egusphere-2024-3126-RC1