the Creative Commons Attribution 4.0 License.
Technical note: Expanding ensemble forecasts with generative AI: application to volcanic clouds
Abstract. Ensemble-based modelling of the atmospheric dispersal of volcanic clouds enables more realistic forecasts by explicitly accounting for uncertainties in eruption source parameters, meteorological data, and systematic errors in transport models. Many ensemble applications, including quantification of forecast uncertainties, data assimilation, or probabilistic hazard assessments, require a large number of members to mitigate sampling errors and to properly capture probability distributions. However, running large ensembles with Volcanic Ash Transport and Dispersal (VATD) models can be computationally demanding, even for high-performance computing clusters. As a result, operational forecasting is typically restricted to smaller ensembles in order to fit time-to-solution requirements. In contrast, generative AI models can produce large volumes of physically-consistent data samples with minimal computational cost. In this work, a convolutional Variational AutoEncoder (VAE) is trained on an ensemble of 256 forecasts simulated with the FALL3D model and subsequently used to generate larger ensembles, effectively augmenting physics-based ensemble modelling capacity. Ensembles with up to 8192 members were generated nearly instantaneously using the trained neural network, with no reliance on HPC resources. The statistical properties of the expanded ensembles are characterised in detail, and the VAE performance is evaluated against a test dataset composed of 2048 numerical simulations. The VAE-generated ensembles closely approximate the actual (target) probability distribution as well as key sample statistics, such as ensemble mean and spread, with minimal degradation in the evaluation metrics. Finally, we discuss possible future applications of this work, including latent space data assimilation via deep learning.
Status: final response (author comments only)
RC1: 'Comment on egusphere-2025-6097', Anonymous Referee #1, 25 Feb 2026
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2025/egusphere-2025-6097/egusphere-2025-6097-RC1-supplement.pdf
Citation: https://doi.org/10.5194/egusphere-2025-6097-RC1
RC2: 'Comment on egusphere-2025-6097', Anonymous Referee #2, 01 May 2026
This technical note uses a Variational Autoencoder to expand ensemble forecasts of volcanic ash clouds generated by the FALL3D model. I think the topic is interesting and potentially useful for operational volcanic-cloud forecasting, especially because large physics-based ensembles are computationally expensive. The manuscript is generally clear, and the authors have made a reasonable first attempt to evaluate the generated ensemble against a larger numerical ensemble. In my view, the weakest point is that it does not yet clearly demonstrate whether the proposed VAE method provides real added value beyond a statistical interpolation or smoothing of one specific numerical ensemble. The study is based on one hypothetical eruption case, one model setup, and one forecast time. Therefore, I think the current manuscript is more like a proof-of-concept than a method that has been convincingly demonstrated for broader volcanic-cloud forecasting. I suggest a major revision before it can be considered for publication in ACP.
Major:
- I understand that this is a technical note, and a simple case study is acceptable as a first step. However, the current manuscript sometimes gives the impression that the method is already generally useful for operational ensemble forecasting. I do not think this has been demonstrated. A natural question is: if the volcano, eruption height, emission rate, wind situation, or forecast time changes, can the same trained VAE still be used? Or does a new VAE need to be trained for each new forecast case? If the latter is true, the operational value of the method is more limited. I suggest that the authors either add another case or clearly state that the current work only demonstrates feasibility for a single controlled case.
- The authors show that the VAE-generated ensemble can approximate the 2048-member numerical ensemble, but it is unclear whether this improvement really comes from the VAE. For example, would a bootstrap method, kernel density estimation, PCA-based sampling, or another simple statistical resampling method produce similar results? This is a basic but important comparison. Without such a baseline, it is difficult to know whether the VAE is actually necessary, or whether it mainly smooths and interpolates the information already contained in the 256-member training ensemble. I suggest adding at least one simple baseline method.
- The small-scale error shown in the spectral analysis should be discussed more seriously. Figure 10 shows that the VAE can reproduce large-scale structures, but it fails at smaller spatial scales and introduces artificial high-wavenumber noise. This may not be a small technical detail. In volcanic ash forecasting, plume edges, local gradients, and threshold exceedance areas can be important for aviation warnings. If the VAE changes or smooths these small-scale structures, the exceedance probabilities and hazard boundaries may also be affected. I suggest that the authors discuss whether this limitation matters for practical forecasting.
- The computational-cost argument needs to be clearer. The authors emphasize that VAE samples can be generated very cheaply, but the full workflow also requires producing the 256-member training ensemble and training the neural network. If this has to be done for every forecast case, the computational advantage may be smaller than suggested. I suggest that the authors provide a clearer cost comparison: running 256 FALL3D members plus VAE training and generation versus directly running a larger FALL3D ensemble. The authors should also clarify whether the trained VAE is expected to be reused across different forecasts.
- The 2048-member FALL3D ensemble is treated as the target distribution, but it is still a model-generated reference, not the true atmospheric distribution. I think the manuscript should be more careful when using terms such as “true distribution” or “actual distribution”. The VAE is evaluated against a larger FALL3D ensemble, not against real observations. This distinction should be made clearer, especially when discussing possible operational applications.
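As an illustration of the kind of simple baseline mentioned in the second major comment, PCA-based sampling could be sketched roughly as follows. This is a minimal sketch and not taken from the manuscript: the function name `pca_expand`, the number of retained modes, the Gaussian assumption on the principal-component scores, and the clipping to non-negative values (assuming the field is ash column mass) are all hypothetical choices made for illustration.

```python
import numpy as np

def pca_expand(train, n_new, n_modes=16, seed=0):
    """Draw n_new synthetic fields from a Gaussian fitted in PCA space.

    train: array of shape (n_members, ny, nx), e.g. 256 column-mass fields.
    Returns an array of shape (n_new, ny, nx).
    """
    rng = np.random.default_rng(seed)
    n, ny, nx = train.shape
    X = train.reshape(n, -1)
    mean = X.mean(axis=0)
    # Principal directions of the centred training ensemble.
    _, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    k = min(n_modes, n - 1, s.size)
    # Standard deviation of the ensemble along each retained mode.
    scale = s[:k] / np.sqrt(n - 1)
    # Assumption: independent Gaussian scores along each mode.
    z = rng.standard_normal((n_new, k))
    samples = mean + (z * scale) @ Vt[:k]
    # Assumption: the field is non-negative, so negative values are clipped.
    return np.clip(samples, 0.0, None).reshape(n_new, ny, nx)
```

Applying the same evaluation metrics to such a baseline and to the VAE output, against the same 2048-member reference, would indicate how much of the reported skill is specific to the VAE.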
Minor:
- The manuscript repeatedly uses expressions such as “physically consistent” or “physics-grounded samples”. I understand what the authors mean: the VAE is trained using outputs from a physical model. However, the generated samples themselves are not produced by solving the governing equations. Therefore, I think this wording is a little too strong. The generated fields may be statistically similar to FALL3D outputs, but this does not necessarily mean that they satisfy physical constraints such as mass conservation or transport consistency. I suggest either adding some basic physical checks, or using more cautious wording, such as “statistically consistent with the FALL3D ensemble”.
- The title and abstract should make it clearer that this is a proof-of-concept study based on one volcanic-cloud simulation setup.
- Some claims in the conclusion are too strong. For example, the statement that the method can be extremely useful for ensemble expansion and operational frameworks should be softened unless more cases are tested.
- The authors mention future applications in latent-space data assimilation. This is interesting, but it is not tested in this manuscript. I suggest keeping this part shorter and more cautious.
- It would be useful to explain more clearly why the authors selected the thresholds used in the exceedance probability analysis. Are these thresholds operationally meaningful, or just chosen for demonstration?
- The manuscript should state more explicitly whether the VAE was trained only on ash column mass at +24 h. If so, the limitation of ignoring temporal evolution should be emphasized.
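The basic physical checks and the exceedance-probability comparison raised in the comments above could be sketched along the following lines. This is only an illustrative sketch: the function names, the array layout `(n_members, ny, nx)`, and the use of a uniform cell area are assumptions, not details from the manuscript.

```python
import numpy as np

def exceedance_probability(ensemble, threshold):
    """Fraction of members exceeding `threshold` at each grid cell.

    ensemble: array of shape (n_members, ny, nx).
    """
    return (ensemble > threshold).mean(axis=0)

def total_mass(ensemble, cell_area):
    """Total mass per member: column mass times cell area, summed over the grid.

    cell_area: scalar or (ny, nx) array (assumed units consistent with the field).
    """
    return (ensemble * cell_area).sum(axis=(1, 2))
```

Because the eruption source parameters vary across members, total mass is not expected to be identical between members. A reasonable check would compare the distribution of `total_mass` in the generated ensemble with that of the FALL3D reference ensemble, alongside the difference between the two exceedance-probability maps at each threshold.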
Citation: https://doi.org/10.5194/egusphere-2025-6097-RC2
Viewed
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 460 | 161 | 36 | 657 | 112 | 107 |