The Spatio-Temporal Visualization Tool HMMLVis in Renewable Energy Applications

Wöß, Rainer; Hlavácková-Schindler, Katerina; Schicker, Irene; Papazek, Petrina; Plant, Claudia

doi:10.5194/egusphere-2024-3126

Preprints

https://doi.org/10.5194/egusphere-2024-3126

16 Oct 2024

Abstract. In this work, we present HMMLVis, an original visualization tool for multivariate Granger causal inference, more precisely for heterogeneous Granger causality, which infers causal relationships in time series following an exponential distribution. HMMLVis is easy to use and can be applied in any scientific discipline that explores time series and their relationships. In this paper, we focus on climatological and meteorological applications. The visualization tool is demonstrated on different types of applications related to meteorological events on the upper/lower tails of the respective distributions, using renewable energy (wind, PV) data, air pollution data, and the EUMETNET postprocessing benchmark data set (EUPPBench), over different temporal horizons. We demonstrate that the HMMLVis method and visualization depict the known causal relations and detect causal relations in the temporal dependencies, which provide additional important information for the respective cases. We believe that HMMLVis, as an interpretable visualization tool, will serve climatologists and meteorologists and in this way contribute to knowledge discovery in these scientific fields.

Received: 07 Oct 2024 – Discussion started: 16 Oct 2024

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: final response (author comments only)

RC1:
'Comment on egusphere-2024-3126', Anonymous Referee #1, 02 May 2025

General Comments

This manuscript presents HMMLVis, a novel and thoughtfully designed visualization tool for heterogeneous Granger causal inference in multivariate time-series data. Built upon the heterogeneous graphical Granger model (HGGM) within a generalized linear model (GLM) framework and employing Minimum Message Length (MML) principles, the tool aims to support the discovery and interpretation of causal relationships in complex time-dependent datasets. This is a timely and well-motivated contribution, particularly as interest in data-driven causal inference grows within Earth system sciences.

The tool is demonstrated across several domains—ranging from renewable energy and air pollution to meteorological benchmark datasets such as EUPPBench—and is accompanied by a PyQt-based graphical interface that lowers the barrier for non-expert users. The interdisciplinary nature of this work is commendable, and the software addresses a real need for interpretable and accessible causal analysis in environmental applications.

That said, while this study reflects a strong technical effort, the focus of the manuscript—primarily on software development and visualization—appears misaligned with the scientific scope and editorial aims of Geoscientific Model Development (GMD). GMD is dedicated to the development, evaluation, and application of models in the geosciences. A large portion of this manuscript, especially Sections 5 and 6, centers on user interface features, visual options, and layout design rather than the advancement of geophysical modeling. In its current form, the paper may be better suited for a journal dedicated to scientific software development or computational tools.

Moreover, the clarity, organization, and presentation of the manuscript require significant improvement to meet publication standards. Several formatting inconsistencies, unclear figures, and missing definitions reduce overall readability.

Specific Comments

1. Scientific Scope and Fit for GMD:

While causal inference in environmental sciences is a relevant topic, the primary focus of this work is software visualization, and its main contributions lie in user-interface design and graphical rendering of beta coefficients. The paper does not deeply engage with geophysical model development or novel methodological contributions to causal modeling itself. For this reason, the manuscript may be better suited to outlets such as EGUsphere preprints or another open-source software journal.

2. Visualization and Readability:

Many of the figures (e.g., GUI screenshots, wind rose plots) have awkward scaling or inconsistent proportions, which affects readability. It is recommended to resize and standardize image layouts, especially in Figure 6 and others, to improve visual clarity.

3. Sections 5 and 6 – Placement:

These two sections are heavily focused on user-interface walkthroughs and technical instructions. While informative, they resemble a user guide rather than scientific content and would be more appropriate as supplementary material. The main paper should focus on the scientific rationale, methodological innovations, and evaluation results.

4. Terminology and Acronyms:

The manuscript contains several instances where acronyms (e.g., EUMETNET) are introduced without first spelling out the full name. This is not compliant with standard academic writing practices. Please ensure that all acronyms are introduced in full upon first use, followed by the abbreviation in parentheses.

5. Motivation for HMML and MML:

The rationale for using heterogeneous Granger models and MML-based feature selection should be articulated more clearly for readers unfamiliar with these frameworks. What concrete limitations of classical Granger models does HMML overcome, especially in environmental data contexts?
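
For orientation, the contrast this question refers to can be written schematically. The block below is a generic sketch of a classical versus a GLM-based graphical Granger model, not the manuscript's own notation; the symbols $x_i^t$, $\mathbf{X}_{j,t}^{d}$, $\boldsymbol{\beta}_{ij}$, the link function $g_i$, the number of series $p$ and the lag $d$ are assumptions introduced here for illustration.

```latex
% Lagged window of series j (lag d), used by both models:
\mathbf{X}_{j,t}^{d} = \left(x_j^{t-d}, \dots, x_j^{t-1}\right)

% Classical graphical Granger model: linear, Gaussian noise, identity link
x_i^t = \sum_{j=1}^{p} \mathbf{X}_{j,t}^{d}\,\boldsymbol{\beta}_{ij}^{\top} + \varepsilon_i^t

% GLM-based (heterogeneous) variant: each target series i has its own
% exponential-family distribution and link function g_i
\mathbb{E}\!\left[x_i^t\right] = g_i^{-1}\!\left(\sum_{j=1}^{p} \mathbf{X}_{j,t}^{d}\,\boldsymbol{\beta}_{ij}^{\top}\right)

% In both cases, series j is taken to Granger-cause series i when
\boldsymbol{\beta}_{ij} \neq \mathbf{0}
```

Under this reading, the classical model is the special case with an identity link and Gaussian response, which is the assumption that skewed or bounded environmental variables may violate.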

6. Model Evaluation:

While the tool is applied across several datasets, the evaluation lacks clear performance metrics or validation benchmarks. For example:

How does the tool’s output compare with known or simulated ground-truth causal structures?

Are the inferred relationships stable across time windows and locations?

Could precision/recall, consistency, or information gain be reported?
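
To make the requested metrics concrete, the following is a minimal sketch of edge-level precision and recall for an inferred causal graph against a known (for example simulated) ground truth. It is not part of HMMLVis; the function name `edge_precision_recall` and the convention that entry `[i, j]` marks an edge from variable `i` to variable `j` are assumptions made for this illustration.

```python
import numpy as np


def edge_precision_recall(inferred, truth):
    """Compare two binary adjacency matrices edge by edge.

    Entry [i, j] == 1 is read as "variable i Granger-causes variable j"
    (a convention assumed for this sketch).
    """
    inferred = np.asarray(inferred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    if inferred.shape != truth.shape:
        raise ValueError("adjacency matrices must have the same shape")

    tp = int(np.sum(inferred & truth))    # edges found and really present
    fp = int(np.sum(inferred & ~truth))   # edges found but not present
    fn = int(np.sum(~inferred & truth))   # edges present but missed

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy example with three variables (hypothetical graphs).
truth = np.array([[0, 1, 0],
                  [0, 0, 1],
                  [0, 0, 0]])
inferred = np.array([[0, 1, 1],
                     [0, 0, 0],
                     [0, 0, 0]])
precision, recall = edge_precision_recall(inferred, truth)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

Applying the same function to graphs inferred from successive time windows, with one window's graph standing in for `truth`, would give a simple stability indicator of the kind asked for above.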

7. Synthetic Data Use:

The paper uses semi-synthetic datasets (e.g., for PV and wind), but the construction process needs to be described in more detail. How realistic are these datasets? What modeling assumptions underlie their creation, and what uncertainties are introduced?

8. Link Functions and GLMs:

More explanation is needed regarding the choice of link functions and distributions within the exponential family. Were they selected empirically, or based on expert input? Was model fit compared across different options?

9. Scalability and Runtime:

Please provide information on the computational cost of using HMMLVis, particularly with sliding windows and larger variable sets. How long does one window take to process?

Minor Comments

1. Throughout the manuscript, some notations (e.g., β, η, indices i, j, t) are inconsistently formatted, sometimes in plain text and sometimes in math mode. Ensuring typographic and notational consistency across all equations and text would improve professionalism.

2. While most figures are labeled, some captions could be more descriptive to help readers interpret them without referring back to the main text. For example, indicate clearly what variables or locations the time series refer to, what color scales represent, or whether the visualizations correspond to real or synthetic datasets.

3. The abstract currently blends methodology and application without clearly delineating the main contribution. Consider restructuring it into: (1) motivation, (2) method, (3) key results, (4) broader implications—to improve clarity and impact for readers scanning the abstract alone.

4. Some sentences are overly long or have ambiguous phrasing, particularly in Sections 2 and 4. For instance, compound sentences mixing mathematical definitions and explanatory text can be split for better readability. A careful language edit would improve clarity.

5. Check reference formatting for consistency (e.g., Behzadi et al. (2019) vs. Behzadi et al., 2019). Ensure all references are cited in a consistent style and match the GMD citation standards.

Citation: https://doi.org/10.5194/egusphere-2024-3126-RC1
- CC1:
  'Reply on RC1', Irene Schicker, 17 Nov 2025
  We thank the reviewer for the careful reading of our manuscript and the constructive suggestions. Here we address three points in particular: we (i) clarified all acronyms and notation at first occurrence, (ii) expanded the description of the semi-synthetic datasets used in the case studies, including their construction, validation, and limitations, and (iii) added a new figure that directly compares semi-synthetic and operational data for wind power, PV power, and global horizontal irradiance (GHI).
  Reviewer comment 4: “The manuscript contains several instances where acronyms (e.g., EUMETNET) are introduced without first spelling out the full name. This is not compliant with standard academic writing practices. Please ensure that all acronyms are introduced in full upon first use, followed by the abbreviation in parentheses.”
  
  Response:
  We agree and have revised the manuscript to consistently introduce all acronyms at first occurrence by spelling out the full name followed by the abbreviation in parentheses. This includes, but is not limited to, EUMETNET (European Meteorological Network), GHI (global horizontal irradiance), PV (photovoltaic), ERA5 (ECMWF Reanalysis v5), CAMS (Copernicus Atmosphere Monitoring Service), HMML (Heterogeneous Minimum-Message-Length) and HMMLVis. We also checked the notation for all symbols and indices and ensured that they are defined where they first appear in the text.
  
  Reviewer comment 7: “The paper uses semi-synthetic datasets (e.g., for PV and wind), but the construction process needs to be described in more detail. How realistic are these datasets? What modeling assumptions underlie their creation, and what uncertainties are introduced?”
  Response:
  We thank the reviewer for pointing out that our description of the semi-synthetic datasets was too brief. We have substantially expanded the corresponding subsection in the Data and Methods section and added a validation figure (Fig. X for now, see attached PDF) to document the realism and limitations of these datasets.
  
  In the revised manuscript, we now distinguish clearly between:
  
  Wind power case (onshore wind farm):
  We start from ERA5 reanalysis wind fields and downscale them to hub height at the turbine locations using a standard vertical interpolation and site-specific adjustment.
  
  These wind speeds are converted to power using the manufacturer’s power curve and the installed nominal capacity for each turbine.
  
  The resulting “ERA5-synthetic” daily power time series is then compared against anonymized operational turbine data from the same wind farm for the period 2016–2020. We aggregate to daily values, remove days with obvious curtailment plateaus, and normalize both series to [0–1] for anonymization.
  
  The new Fig. X (left column) shows normalized monthly means, a density-coloured scatter plot, and the probability density functions. The correlation between daily measured and ERA5-synthetic power is r ≈ 0.91, and the distributions agree well over most of the range. This demonstrates that, for this particular site and period, ERA5-based semi-synthetic power captures the observed daily-to-seasonal variability sufficiently well for our methodological demonstration.
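
  The wind power steps above (power-curve conversion, normalization to [0–1], and correlation with measurements) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the power curve values, the 2000 kW capacity, the Weibull-distributed hub-height speeds and the noisy "measured" series are all hypothetical stand-ins, and the function names are invented for this sketch.

```python
import numpy as np


def power_from_wind(speed, curve_speed, curve_power, capacity):
    """Convert hub-height wind speed (m/s) to power via a tabulated power curve.

    Speeds outside the tabulated range (e.g. above cut-out) yield zero power.
    """
    power = np.interp(speed, curve_speed, curve_power, left=0.0, right=0.0)
    return np.clip(power, 0.0, capacity)


def normalize01(x):
    """Min-max scale a series to [0, 1]; a constant series maps to zeros."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    return (x - x.min()) / span if span > 0 else np.zeros_like(x)


def pearson_r(a, b):
    """Pearson correlation coefficient between two equally long series."""
    return float(np.corrcoef(a, b)[0, 1])


# Hypothetical power curve: cut-in 3 m/s, rated 12 m/s, cut-out 25 m/s.
curve_speed = np.array([0.0, 3.0, 12.0, 25.0])        # m/s
curve_power = np.array([0.0, 0.0, 2000.0, 2000.0])    # kW
capacity = 2000.0                                     # kW, assumed

rng = np.random.default_rng(0)
hub_speed = rng.weibull(2.0, size=365) * 8.0          # stand-in for downscaled ERA5
synthetic = power_from_wind(hub_speed, curve_speed, curve_power, capacity)

# Stand-in for operational data: synthetic power plus measurement noise.
measured = np.clip(synthetic + rng.normal(0.0, 150.0, size=365), 0.0, capacity)

r = pearson_r(normalize01(measured), normalize01(synthetic))
print(f"daily correlation r = {r:.2f}")
```

  Because the "measured" series here is generated from the synthetic one, the printed correlation says nothing about the real site; it only shows where the reported r value would come from in such a workflow.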
  
  PV power case (utility-scale PV plant):
  We construct semi-synthetic PV production from ERA5 plus CAMS radiation and atmospheric composition fields, combined with a simple PV performance model and the known installed DC capacity and orientation of the plant.
  
  Again, we compare daily values against measured plant output, normalize both series to [0–1], and show monthly averages, daily scatter and distributions in Fig. X (right column).
  
  The daily correlation between measured and ERA5+CAMS-synthetic PV power is r ≈ 0.98, and the distribution of normalized daily power is very similar, indicating that the semi-synthetic PV dataset realistically reproduces both the seasonal cycle and day-to-day variability.
  
  GHI case (radiation station):
  
  For GHI we use ERA5+CAMS semi-synthetic irradiance at a reference radiation station and compare it to long-term measurements.
  
  Here the correlation of daily values is r ≈ 0.99 and the distributions are almost indistinguishable (Fig. X, middle column), reinforcing that the semi-synthetic series are representative of realistic surface radiation conditions at this site.
  
  We include a more explicit statement in the introduction and discussion that:
  The semi-synthetic datasets are constructed to be realistic but not perfect representations of operational data;
  
  Their synthetic nature implies that our conclusions are primarily about the behaviour and usefulness of HMMLVis for exploring causal relations in realistic multi-variable time series, rather than about quantifying exact performance of specific power plants;
  
  Uncertainties arise from reanalysis biases, the simplicity of the power-conversion models, and remaining curtailment and data-quality effects, and these are briefly discussed in the revised text.
  
  Overall, the new description and validation figure clarify how the semi-synthetic datasets are built, document their realism with respect to the available operational data, and delimit the scope of the conclusions drawn from these case studies.
  
  Caption for the additional figure:
  Figure X. Comparison of measured (black) and semi-synthetic (red) energy-relevant time series used in the HMMLVis case studies. Left column: normalized daily wind power for the reference wind farm, derived from ERA5 downscaled wind speed at hub height and converted using the turbine power curve (“ERA5-synthetic”). Middle column: daily global horizontal irradiance (GHI) at a radiation station, based on ERA5+CAMS (“ERA5+CAMS-synthetic”). Right column: normalized daily PV power for a utility-scale PV plant, constructed from ERA5+CAMS inputs and a simple PV performance model. For each case, the top row shows normalized monthly means, the middle row shows daily scatter plots with density shading and Pearson correlation coefficients, and the bottom row compares the probability density functions of normalized daily values. Overall, the semi-synthetic datasets reproduce the observed daily-to-seasonal variability well and are therefore suitable for demonstrating HMMLVis on realistic yet anonymized time series.
  
  Citation: https://doi.org/10.5194/egusphere-2024-3126-CC1
  - AC2: 'Reply on RC1', Katerina Schindlerova, 23 Jan 2026
    
    Answers to Reviewer 1:
    
    We enclose a document containing our detailed responses to each comment and question raised by Reviewer 1. All corresponding changes have been incorporated into the revised manuscript.
    
    We thank you for taking the time to consider our responses.
    
    Citation: https://doi.org/10.5194/egusphere-2024-3126-AC2
- AC1: 'Reply on RC1 -Point 3.', Katerina Schindlerova, 20 Jan 2026
  
  Answer to RC1.3 (Sections 5 and 6 - Placement): We added text addressing these comments to Sections 2.1 and 2.2.
  
  Citation: https://doi.org/10.5194/egusphere-2024-3126-AC1
RC2:
'Comment on egusphere-2024-3126', Anonymous Referee #2, 15 Dec 2025

In the present study, the authors propose HMMLVis, which I found to be a novel instrument for the identification and graphical representation of causal relationships. It uses the concept of heterogeneous Granger causality to analyse time-series data, facilitating a comprehensive exploration of the underlying causal dynamics. It is a code, arguably a package, in a git repository, with the standard presentation of a working visualisation code.
The manuscript would benefit from enhanced clarity, organisation, and presentation to align with publication standards; specifically, an Introduction, Methods, Results, Discussion, Conclusion structure would help readability. Also, the manuscript introduces several acronyms without providing their full names at first mention.
Could you check whether the Digital Object Identifier (DOI) 10.5281/zenodo.13885371 in the Code and data availability section corresponds to a private Zenodo record that is not publicly available?

A list of keywords is missing after the abstract.

Figure 1 does not convey relevant information, its numbers are too small to read comfortably, and the empty white space between the grey tables could be reduced.

Figures 2-7 need to be presented in a more organised way for a publication; at present they are snapshots of the images.
Figure 2: the caption needs to be expanded with more description.
Figures 3 and 4: label the subplots (a), (b), and (c) and introduce each subplot in the caption.
Figure 6 looks rather messy; try arranging the sub-images as a mosaic, without overlapping one another, to illustrate what you want to achieve with this visualisation.
Line 325: make a table or put this list information in a descriptive paragraph.
Line 340: "We list the available parameters in Table 1 below." --> "We list the available parameters in Table 1."
Section 6.2.1 (Data Processing) appears in what should be the concluding part of the paper, after the results, but it describes material that belongs in the methodology.
Table 1 --> create a new column for the identifiers (ECMWF, id=167*) that are currently part of the parameter name column. Also, in the units, some J appear in bold and others in italic.
Section 6.5 (Urban Air Quality) is missing result plots or additional information; alternatively, use cases that are only presented, without discussion, could be mentioned briefly in the introduction of the manuscript.
References: add the DOI for the papers where it is missing.

Citation: https://doi.org/10.5194/egusphere-2024-3126-RC2
- AC3: 'Reply on RC2', Katerina Schindlerova, 23 Jan 2026
  
  We enclose a document containing our detailed responses to each comment and question raised by Reviewer 2. All corresponding changes have been incorporated into the revised manuscript.
  
  We thank you for taking the time to consider our responses.
  
  Citation: https://doi.org/10.5194/egusphere-2024-3126-AC3

Viewed

Total article views: 1,143 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
853	204	86	1,143	30	60

Views and downloads (calculated since 16 Oct 2024)

Month	HTML	PDF	XML	Total
Oct 2024	63	14	6	83
Nov 2024	34	13	2	49
Dec 2024	18	8	0	26
Jan 2025	16	7	41	64
Feb 2025	11	3	6	20
Mar 2025	8	4	1	13
Apr 2025	20	7	0	27
May 2025	22	6	3	31
Jun 2025	62	4	1	67
Jul 2025	7	6	0	13
Aug 2025	61	11	0	72
Sep 2025	347	6	0	353
Oct 2025	18	10	1	29
Nov 2025	25	28	6	59
Dec 2025	43	35	8	86
Jan 2026	68	24	10	102
Feb 2026	25	18	1	44
Mar 2026	5	0	5

Viewed (geographical distribution)

Total article views: 1,097 (including HTML, PDF, and XML) Thereof 1,097 with geography defined and 0 with unknown origin.

Latest update: 03 Mar 2026

Short summary

HMMLVis is an easy-to-use visualization software for causal inference. It can be applied in any scientific discipline that explores time series and their relationships. The tool uses heterogeneous Granger causality. It is demonstrated on different types of applications related to meteorological events, using renewable energy, air pollution, and the EUMETNET postprocessing benchmark data. We believe HMMLVis will serve climatologists and meteorologists as an interpretable causal visualization tool.
