the Creative Commons Attribution 4.0 License.
Prediction of magnetic activities for Solar Cycle 25 using neural networks
Abstract. Recent advancements in artificial intelligence research have shown promising results in addressing scientific and operational challenges related to Space Weather. The abundance of historical solar wind data collected near Earth presents an opportunity to leverage modern scientific methodologies that integrate large datasets and computational modeling. In this study, we analyzed multivariate solar wind data spanning the twenty-third to twenty-fifth solar cycles to develop a predictive model for geomagnetic storms. Our improved long short-term memory recurrent neural network model with an attention mechanism demonstrated accurate predictions of moderate events between 2023 and 2025, outperforming international reference models. We also evaluated the model's performance in predicting the intense geomagnetic storm of May 2024, during which the Dst index varied by more than 400 nT. This research contributes to the advancement of early warning systems and risk mitigation strategies, and offers a new approach to analyzing geomagnetic storm morphology.
Status: final response (author comments only)
RC1: 'Comment on egusphere-2025-2569', Anonymous Referee #1, 15 Aug 2025
AC1: 'Reply on RC1', Thiago Sant Anna, 20 Aug 2025
Dear Referee #1,
Thank you for your comments. I will implement the changes and get back to you soon.
Sincerely,
Thiago Moeda
Citation: https://doi.org/10.5194/egusphere-2025-2569-AC1
RC2: 'Comment on egusphere-2025-2569', Anonymous Referee #2, 28 Sep 2025
In this manuscript, an LSTM-based neural network with attention is used to predict the Dst geomagnetic index at a 48-hour horizon using OMNI hourly solar-wind and spacecraft position data (training set: 1996–2022). The authors report experiments on the last solar cycle and compare their model to a Dst-CNN baseline and to the Kyoto (observational) Dst. The network uses two bidirectional LSTM layers, attention layers, dropout, and layer normalization; normalization is standard (z-score on the training set); missing values are filled using a monotone cubic spline. Performance is reported with RMSE and correlation coefficients and illustrated with time-series plots of selected storm intervals. The authors report a modest improvement compared to the state of the art. Generally, the manuscript addresses relevant scientific questions, but the results yield an incremental improvement, which would be acceptable if the authors provided uncertainty quantification for the predicted Dst. It should be noted that this architecture, including the bidirectional LSTM, has already been applied to Dst forecasting (M.A. Jahin, M.F. Mridha, Z. Aung, N. Dey, R.S. Sherratt, 2024, arXiv:2407.06658v2), and bidirectional LSTMs have been used extensively in space weather in recent years. I would recommend that the authors perform a more extensive literature review and expand the bibliography accordingly.
Recommendation: Reject and resubmit.
Major concerns (must be addressed before publication)
A. Reproducibility, data, and code availability
The manuscript states, “Code and data availability. Please contact the authors directly.” This is insufficient for a methods-heavy, ML-based manuscript. The authors must provide a public repository (e.g., GitHub/Zenodo) with the code, model architecture, training scripts, random seeds, and pre- and post-processing pipelines. Reproducibility is essential for readers to validate and build upon the work; without it, the value of a machine-learning paper is substantially diminished.
B. Unclear model causality and the use of bidirectional LSTMs
In time-series forecasting, bidirectional architectures can inadvertently use “future” information if not carefully constrained; this may leak information and artificially improve performance. The authors must explicitly clarify how the bidirectional LSTMs were used. If the model really is causal (only uses past input), explain this explicitly and justify the choice of bidirectional vs. causal (unidirectional) LSTM. Preferably, show an ablation: unidirectional vs bidirectional performance.
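To make the leakage risk concrete, here is a minimal toy sketch (hypothetical scalar weights and data, not the authors' architecture): perturbing only the last, "future" sample leaves the forward (causal) state at t=0 unchanged, while the backward direction of a bidirectional scan does change, i.e., it reads future inputs.

```python
import numpy as np

def rnn_scan(x, w_in=0.5, w_rec=0.3):
    """Minimal tanh RNN: the hidden state at step t depends only on x[0..t]."""
    h, states = 0.0, []
    for v in x:
        h = np.tanh(w_in * v + w_rec * h)
        states.append(h)
    return np.array(states)

def bidirectional(x):
    """Forward pass plus a time-reversed backward pass (re-reversed to align)."""
    fwd = rnn_scan(x)
    bwd = rnn_scan(x[::-1])[::-1]
    return fwd, bwd

x = np.array([0.1, 0.2, 0.3, 0.4])
y = x.copy()
y[-1] = 5.0  # perturb ONLY the last ("future") input

fwd_x, bwd_x = bidirectional(x)
fwd_y, bwd_y = bidirectional(y)

causal_ok = np.allclose(fwd_x[0], fwd_y[0])  # forward state at t=0 is unchanged
leaks = not np.allclose(bwd_x[0], bwd_y[0])  # backward state at t=0 DID change
```

If the authors window the bidirectional layers so that both directions only see past inputs relative to the forecast time, the architecture is still causal; this is exactly the clarification requested above.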
C. Evaluation protocol and baseline fairness.
The authors must confirm that the Dst-CNN and the proposed model are evaluated on the same test set and with the same pre-processing (or explain any differences). They should also provide simple baselines (persistence, climatology, and the classic Burton empirical model) in addition to the Dst-CNN; many forecasting papers include persistence and an empirical model as low bars. The issue is that the RMSE scores are quite similar, yet no measure of statistical significance (confidence intervals or p-values) is provided, which makes it difficult to conclude whether any improvement has been achieved. A simple solution would be to split the training set into subsets and train a model on each, since the test set is essentially chosen a priori.
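The persistence baseline in particular is only a few lines; the sketch below uses a synthetic Dst-like random walk (hypothetical data; `persistence_pairs` is an illustrative helper, not a named method from the manuscript):

```python
import numpy as np

def persistence_pairs(dst, horizon=48):
    """48-hour persistence: predict Dst(t + horizon) = Dst(t)."""
    return dst[:-horizon], dst[horizon:]

def rmse(pred, truth):
    return float(np.sqrt(np.mean((pred - truth) ** 2)))

rng = np.random.default_rng(0)
dst = -20.0 + np.cumsum(rng.normal(0.0, 3.0, 500))  # toy hourly Dst-like series
pred, truth = persistence_pairs(dst)
baseline_rmse = rmse(pred, truth)  # any proposed model should beat this number
```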
D. Metrics and interpretability for extreme events
The manuscript focuses on moderate to intense storms, which are operationally important. Reporting only RMSE and correlation is insufficient because these aggregate metrics can hide poor performance at the extremes. The authors should provide event-based metrics: hit/miss/false-alarm rates for thresholds (e.g., Dst < −50 nT, < −100 nT), detection lead time, and amplitude error on events.
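Such event-based scores are straightforward to compute. A minimal sketch (hypothetical toy values; `storm_contingency` is an illustrative name, not the authors' code):

```python
import numpy as np

def storm_contingency(obs, pred, threshold=-50.0):
    """Hit/miss/false-alarm counts for exceedances of a Dst threshold."""
    obs_event = obs <= threshold
    pred_event = pred <= threshold
    hits = int(np.sum(obs_event & pred_event))
    misses = int(np.sum(obs_event & ~pred_event))
    false_alarms = int(np.sum(~obs_event & pred_event))
    pod = hits / max(hits + misses, 1)                 # probability of detection
    far = false_alarms / max(hits + false_alarms, 1)   # false-alarm ratio
    return hits, misses, false_alarms, pod, far

obs  = np.array([-10.0, -60.0, -120.0, -40.0, -80.0])  # toy observed Dst (nT)
pred = np.array([-20.0, -55.0,  -45.0, -60.0, -90.0])  # toy predicted Dst (nT)
h, m, fa, pod, far = storm_contingency(obs, pred)
```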
E. Missing-data interpolation strategy and its impact
The manuscript uses monotone cubic spline interpolation to fill missing solar-wind and Dst values. This choice could smooth peaks and troughs (affecting storm amplitudes) and introduce artificial continuity across gaps. The authors must report the fraction of missing data, the distribution of gap lengths, and whether gaps are concentrated in particular years or near storm periods. Was this operation also applied to the test set?
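The requested gap statistics are easy to extract from the hourly series; a minimal numpy sketch on a hypothetical toy series:

```python
import numpy as np

def gap_lengths(series):
    """Return the lengths of consecutive-NaN runs in an hourly series."""
    isnan = np.isnan(series)
    # pad with zeros so runs touching the ends are closed, then find run edges
    edges = np.diff(np.concatenate(([0], isnan.astype(int), [0])))
    starts = np.where(edges == 1)[0]
    ends = np.where(edges == -1)[0]
    return ends - starts

x = np.array([1.0, np.nan, np.nan, 2.0, np.nan, 3.0, np.nan, np.nan, np.nan])
lengths = gap_lengths(x)                      # run lengths of missing data
missing_fraction = float(np.isnan(x).mean())  # overall missing fraction
```

A histogram of `lengths` per year, and separately within ±48 h of storm onsets, would answer the concentration question directly.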
F. Presentation, units, and interpretability of reported errors
The manuscript reports an RMSE “below 0.1”, which appears inconsistent with Figure 5; this discrepancy must be fixed or explained. Another unaddressed point is the difference between the training and testing losses. Some gap is expected, but it is unclear from the manuscript whether the authors performed hyperparameter tuning to obtain the optimal model. Moreover, the manuscript does not state whether the RMSE is in normalized units or physical units (nT). Readers need physical-unit errors (nT) for operational assessment. Please report RMSE/MAE both in normalized units and in nT. Ensure all plots have axis labels with units and readable legends, as many plots are currently unreadable. The correlation coefficient is mentioned on page 11 but never defined.
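On the normalized-versus-physical-units point: if the z-scoring used the training-set statistics, converting a normalized RMSE back to nT is a one-line multiplication by the training standard deviation. A sketch with hypothetical numbers (the training values and the 0.1 figure are placeholders, not the manuscript's data):

```python
import numpy as np

# z-scoring uses the training-set mean and standard deviation
train_dst = np.array([-5.0, -20.0, -80.0, -150.0, -30.0, -10.0])  # toy values
sigma = train_dst.std()

# an RMSE computed on z-scored targets converts back to physical units (nT)
rmse_normalized = 0.1           # placeholder, echoing the "below 0.1" claim
rmse_nt = rmse_normalized * sigma  # the same error expressed in nT
```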
G. Minor/editorial issues (line-level / presentation)
- Language & typos: There are several grammar issues and typos (e.g., the repeated “Figure Figure 8-I” on line 240). Sentences like “Regarding the modeling project, unlike us who use recurrent neural networks…” on line 250 lead me to suggest that the manuscript needs careful English editing. The conclusion is very short. Note strange sentences such as: “reasonable that the comparison in relation to what Nature itself produces in-situ is sufficient, being characterized by measurements of observational data. Thus, adopting the conception of an AI project capable of achieving its main objective, it is defined by systematic experiments based on the foundation of the Data Science methodology.” I also do not like the statement: “Semi-supervised learning is used when training sets need to be more insightful and reliable.” Personally, I do not think it is necessary to define semi-supervised learning since it is not applied in this manuscript, but if the authors insist, they may simply look up the definition online.
- Figure captions: Many captions lack units and a sufficient description. Figures 6–8: indicate which line corresponds to which model and show error bands where possible.
- Hyperparameters: Provide a full hyperparameter table (units, dropout, learning-rate schedule, optimizer settings, weight initialization strategy, regularization).
- References: A few references and DOIs appear inconsistent in format. Ensure standard formatting.
Rationale: The manuscript presents a potentially useful application of LSTM+attention models to Dst forecasting. However, several crucial methodological details are ambiguous or missing (use of bidirectional LSTMs, treatment of missing data, and reproducibility/code availability), and the evaluation is incomplete for operational/extreme-event claims. These problems require substantial additional analysis, clearer presentation, and public release of code/data (or at least scripts to reproduce the results) before the manuscript is suitable for publication.
Citation: https://doi.org/10.5194/egusphere-2025-2569-RC2
AC2: 'Reply on RC2', Thiago Sant Anna, 29 Sep 2025
Referee #2,
I'm finishing my response to Referee #1 and will then begin addressing your review. In fact, since Referee #1 coherently suggested the inference and analysis of more storms, I have news and new results. Regarding the manuscript presenting a potentially useful application of LSTM+attention models to Dst forecasting, you're right. I've already resolved it.
With best regards,
Moeda, T.
Bright be the blue dot.
Citation: https://doi.org/10.5194/egusphere-2025-2569-AC2
AC3: 'Reply on RC2', Thiago Sant Anna, 29 Sep 2025
Dear Referee, I would like to thank you for the opportunity, but unfortunately we will have to publish in one of the various journals that kindly offered to publish our work in full, as I realize that in today's world, bureaucracy is a way of stopping time. This is an old trick that dates back to the beginning of the last century, when a great scientist realized, through one of his peers, that the universe was not what he thought, because truth is only accepted when seen with one's own eyes. After this humble introduction, I realized the central issue is the availability of the model/algorithm.
If necessary, I can point to references where the authors do not make their code open-source before and/or after the manuscript's publication. In my case, I am not a scientist at my institution; the use of the code needs to be documented in the institute's archives so that I can continue to serve in this role. I hope you understand. Brazil is not just soccer; it is bureaucracy too.
Citation: https://doi.org/10.5194/egusphere-2025-2569-AC3
AC4: 'Reply on RC2', Thiago Sant Anna, 02 Oct 2025
Dear Referee,
I would like to apologize for the previous responses. I understand your concerns, and they are all positive.
I am preparing a revision of the article for your consideration.
Kind regards,
Thiago Moeda
Citation: https://doi.org/10.5194/egusphere-2025-2569-AC4
-
Viewed
HTML | PDF | XML | Total | BibTeX | EndNote
---|---|---|---|---|---
817 | 49 | 19 | 885 | 39 | 44
A long short-term memory (LSTM) recurrent neural network (RNN) model is developed using the Dst index and solar wind plasma parameters between 1996 and 2022 as the training dataset. The model is validated against the Dst index between 2023 and 2024 and against the Dst convolutional neural network (Dst CNN) model. The comparative evaluation of the RNN model, using the Dst index, shows better performance than the Dst CNN model for moderate storms, judging by the correlation coefficients of both models. The authors intend, in the future, to enhance and broaden the model's capability to predict other indices such as Kp and AE.
Major comments:
The validation of the model lacks a sufficient number of moderate and major storms to ascertain its performance relative to the existing Dst CNN model. I suggest choosing between 6 and 10 moderate and major storms outside the period of the training dataset and calculating correlation coefficients and RMSEs between the Dst index and the models' predictions. A table listing the geomagnetic storms, correlation coefficients, and RMSEs for the RNN and Dst CNN would make the comparative performance of the models easy to see.
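Each row of the suggested table reduces to two one-liners per storm; a minimal sketch with hypothetical storm traces (the values and the `storm_scores` helper are illustrative, not the authors' data):

```python
import numpy as np

def storm_scores(obs, pred):
    """Pearson correlation and RMSE (nT) for one storm interval."""
    r = float(np.corrcoef(obs, pred)[0, 1])
    err = float(np.sqrt(np.mean((obs - pred) ** 2)))
    return r, err

# hypothetical Dst traces for a single storm (observed vs. model, in nT)
obs  = np.array([-20.0, -80.0, -140.0, -90.0, -50.0])
pred = np.array([-25.0, -70.0, -120.0, -95.0, -60.0])
r, err = storm_scores(obs, pred)
rows = [("storm-1", round(r, 3), round(err, 1))]  # one row of the table
```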
Some figures have labels that are hardly legible without zooming in. It is advisable to increase the font size of the labels. I would also suggest that, in the second panel of Figures 6 and 8, the RNN prediction curve alone be replaced with the difference between the model predictions and the Dst index, so that we can see where the large and small differences occur during the three geomagnetic storm phases.
Minor comments:
Line 45: The citation may be replaced with “Nakano and Kataoka (2022)”.
Line 52: The citation may be replaced with “Sierra Porta et al. (2024)”.
Line 66: The citation may be replaced with “Yan et al. (2024)”.
Line 69: The citation may be replaced with “Yasser et al. (2022)”.
The font size of numbers and labels in Figure 4 may be increased for readability.
Line 221: h(xi).
Line 226: “In Figure 5, an RMSE value below 0.1”. It is not presented or clear in the figure.
Line 227: “in 14 epochs”. Which years are being referred to? I suggest replacing X-axis labels (0…14) with identifiable (real) epochs.
I think the unit for RMSE values in Figures 6-8 is nT. These RMSE values appear to be very small, not representing the visual differences between the curves. It is advisable to check their calculation.