the Creative Commons Attribution 4.0 License.
Empirical evidence of spurious correlations among space weather variables
Abstract. This paper investigates the prevalence and identification of spurious correlations within space weather datasets, a critical concern given the complex, interdependent nature of geophysical phenomena. The analysis uses daily-averaged galactic cosmic ray (GCR) data from the MOSC and OULU neutron monitor (NM) stations, analyzed separately for large Forbush Decreases (FDs; FD > 3 %) and small FDs (FD ≤ 3 %) at each station, to account for the effects of 11-year solar cycle oscillations. For the first time, a statistical analytical method was employed to test the link between FD amplitudes and solar-geomagnetic variables in each dataset after the effects of 11-year solar cycle oscillations were filtered out. We demonstrate that, while significant correlations between various space-weather indices and FD events are empirically observable, a subset of these relationships may not reflect true physical causality but may instead arise from statistical artifacts or confounding factors inherent in the data. Specifically, analyses of FDs often reveal correlation coefficients with geomagnetic and solar wind parameters that fluctuate significantly across time periods and cosmic-ray stations. For instance, correlations between FD amplitudes and interplanetary magnetic field strength, solar wind speed, and geomagnetic indices such as Kp and Dst have been observed to exhibit both negative and positive trends, depending on the specific dataset and analytical approach employed. The results show inconsistencies in the datasets for both the MOSC and OULU stations for the large and small FDs: strong correlations were found in the regression analyses after the effects of 11-year solar cycle oscillations were removed, for both large and small FDs.
These inconsistencies strongly suggest that 11-year solar cycle oscillations influence the FDs recorded at both stations, thereby affecting the relationships between the FDs and the tested geomagnetic variables, echoing concerns about "spurious regression" in non-stationary time series. Most of the results are statistically significant at the 95 % confidence level. The results obtained here imply that 11-year solar cycle oscillations affect the GCR flux intensity.
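The general mechanism behind such concerns can be illustrated with a minimal sketch. It is not the authors' filtering procedure, which the preprint page does not describe. It assumes two synthetic daily series that share one slow sinusoidal cycle (a stand-in for the 11-year modulation) but have otherwise independent noise. The function name, amplitudes, and the least-squares sinusoid removal are all illustrative assumptions.

```python
import numpy as np


def shared_cycle_demo(n_days=4018, period_days=4018.0, seed=0):
    """Correlate two synthetic series before and after removing a shared cycle.

    Assumption: both series are a common sinusoid plus independent Gaussian
    noise. These are not neutron monitor or solar wind data.
    """
    rng = np.random.default_rng(seed)
    t = np.arange(n_days)
    phase = 2.0 * np.pi * t / period_days
    cycle = np.sin(phase)

    # The only link between x and y is the shared slow cycle.
    x = 2.0 * cycle + rng.normal(size=n_days)
    y = -1.5 * cycle + rng.normal(size=n_days)
    r_raw = np.corrcoef(x, y)[0, 1]

    # Remove the cycle by least-squares fitting a constant, sine, and cosine
    # at the (assumed known) period, then correlate the residuals.
    design = np.column_stack([np.ones(n_days), np.sin(phase), np.cos(phase)])
    x_res = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    y_res = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    r_filtered = np.corrcoef(x_res, y_res)[0, 1]
    return r_raw, r_filtered


r_raw, r_filtered = shared_cycle_demo()
print(f"raw r = {r_raw:.2f}, after cycle removal r = {r_filtered:.2f}")
```

In this synthetic setting the raw correlation is sizeable although the series are causally unrelated, and it largely vanishes once the common cycle is removed. Real data would also need checks for autocorrelation and non-stationarity, as the referees note.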
Status: final response (author comments only)
RC1: 'Comment on egusphere-2025-6494', Anonymous Referee #1, 18 Feb 2026
AC2: 'Reply on RC1', Costecia Ifeoma Onah, 10 Apr 2026
Dear Reviewer
Thank you very much for your invaluable time.
Manuscript Title: Empirical Evidence of Spurious Correlations Among Space Weather Variables
Recommendation: Major Revision
We are quite willing to undertake a comprehensive revision of the manuscript. Many thanks for giving us the opportunity to do so.
Major Comments
•The methodology for removing solar-cycle oscillations is insufficiently described. A clear mathematical description of the filtering procedure is required for reproducibility.
You are right. We will pay adequate attention to it when revising the manuscript.
•Autocorrelation and non-stationarity in time-series data are not adequately addressed. This may itself produce spurious regression results.
We shall also take care of this during the revision.
•The dramatic change in FD counts before and after filtering requires deeper justification and sensitivity analysis.
Detailed attention will be given to this before submitting the revised manuscript.
•The manuscript relies heavily on p-values without sufficient discussion of effect sizes and physical interpretation.
Thanks for the observation. We will take this into account in the revision.
•The claim of identifying spurious correlations should be moderated unless robustness tests (e.g., bootstrapping, cross-validation) are provided.
We will review the manuscript in view of the above comments.
•Only Solar Cycle 23 is analyzed. Extending to additional cycles would strengthen generalizability.
Thank you. The result of this will be presented in the revised manuscript.
•There is an inconsistency between the discussion of advanced methods (e.g., mutual information, AI) and the actual application of linear regression only.
We shall correct this in the revision.
Minor Comments
•Several grammatical and typographical errors require correction.
We will correct the grammar and typos.
•Table numbering should be carefully checked for consistency. Some references appear duplicated, and formatting should be standardized.
We have noted this and will take care of it in the revision.
•Definitions of acronyms (e.g., SI, SSN) should be standardized at first use. Figures would benefit from additional quantitative statistical descriptors.
We agree to address the comments above in the revision.
Overall Assessment
The manuscript addresses an important problem in space weather analysis, namely, the potential inflation of correlations due to long-term solar-cycle modulation. The study presents interesting findings and may have a substantial impact after methodological strengthening. However, significant revisions are required to improve statistical rigor, reproducibility, and clarity of interpretation before the work can be considered for publication.
We are indebted to you for your pointers. We will do a comprehensive revision of the manuscript.
Citation: https://doi.org/10.5194/egusphere-2025-6494-RC1
RC2: 'Comment on egusphere-2025-6494', Anonymous Referee #2, 24 Feb 2026
I am not a native speaker of the language, but I have a reasonable command of it; however, I found it difficult to comprehend the manuscript's text. The article is written in a hard-to-understand language, with sentences constructed in a very complex manner. It gives the impression that AI assistance was used to make the phrases sound "more sophisticated."
I assumed that a team of authors, who have published dozens of articles, does not need to be taught how to properly insert references into the text. References should be made to those articles that directly investigate the topic at hand, not to those that only briefly mention it. Understanding the text is greatly hindered precisely by the fact that the reviewer has to check the references in the text, or more precisely, their relevance to the specific part of the text (e.g., it is unclear why references to the article by Chakraborty et al., 2023 on line 33, Papailiou et al., 2024 on line 42, or Okike et al., 2025 on line 38 and further in many places throughout the text have been inserted). It gives the impression that the references were inserted thoughtlessly or to articles that have no direct relation to the issue under discussion. Please treat the matter of references in the text more seriously.
Table 1 contains incorrect data.
Section 3 does not describe the techniques used in the work. It is absolutely unclear how the FD lists presented in Tables 2-9 were obtained. There are no explanations of how the influence of the 11-year solar cycle on the data was removed. For my own understanding, I plotted the graph (data from http://cr0.izmiran.ru/common/links.htm were used for the period from 20 Feb 1998 to 2 Mar 1998), on which the most significant decrease in CR is observed on August 26, 1998, but this event is not even present in any of your lists. I cannot further evaluate the results of the work, as I believe that the FD lists you provided do not correspond to reality.
The article requires significant revision or rejection at the editor's decision. It is necessary to describe how the initial FD lists were obtained, how the influence of the 11-year solar activity cycle was removed, and to carefully verify the resulting data. Only after that should one proceed to consider various correlations.
Citation: https://doi.org/10.5194/egusphere-2025-6494-RC2
AC1: 'Reply on RC2', Costecia Ifeoma Onah, 10 Apr 2026
Dear Reviewer
Thank you so much for your time.
REVIEW 2
I am not a native speaker of the language, but I have a reasonable command of it; however, I found it difficult to comprehend the manuscript's text. The article is written in a hard-to-understand language, with sentences constructed in a very complex manner. It gives the impression that AI assistance was used to make the phrases sound "more sophisticated."
Thank you very much for your kind observation. We promise to undertake a comprehensive review of the manuscript. If you have the patience to look at the work again, you will not find the lapses mentioned here. Thank you.
I assumed that a team of authors, who have published dozens of articles, does not need to be taught how to properly insert references into the text. References should be made to those articles that directly investigate the topic at hand, not to those that only briefly mention it. Understanding the text is greatly hindered precisely by the fact that the reviewer has to check the references in the text, or more precisely, their relevance to the specific part of the text (e.g., it is unclear why references to the article by Chakraborty et al., 2023 on line 33, Papailiou et al., 2024 on line 42, or Okike et al., 2025 on line 38 and further in many places throughout the text have been inserted). It gives the impression that the references were inserted thoughtlessly or to articles that have no direct relation to the issue under discussion. Please treat the matter of references in the text more seriously.
Thanks again for your advice here. We promise to handle the matter of references very carefully in the next round.
Table 1 contains incorrect data.
You are right. Thanks for the observation. We will correct the data in the next round.
Section 3 does not describe the techniques used in the work. It is absolutely unclear how the FD lists presented in Tables 2-9 were obtained. There are no explanations of how the influence of the 11-year solar cycle on the data was removed.
This will be given adequate attention in the next round.
For my own understanding, I plotted the graph (data from http://cr0.izmiran.ru/common/links.htm were used for the period from 20 Feb 1998 to 2 Mar 1998), on which the most significant decrease in CR is observed on August 26, 1998, but this event is not even present in any of your lists. I cannot further evaluate the results of the work, as I believe that the FD lists you provided do not correspond to reality.
This event is in Table 2. However, the IZMIRAN FEID list (http://spaceweather.izmiran.ru/eng/fds1998.html) times the onset of FDs, whereas we time the FD minimum. The onset of this event occurred on 26/08/1998 and the FD minimum on 27/08/1998, which is why the event appears on 27/08/1998 in Table 2.
Nevertheless, following the comments of Reviewer #1, we have extended the analyses to other solar cycles and presented the events more carefully.
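The difference between onset timing and minimum timing described in this reply can be illustrated with a small sketch. The values below are made-up percentage deviations chosen only to show the one-day offset. They are not MOSC or OULU measurements. The function name and the 1 % onset threshold are likewise hypothetical and do not reproduce either catalogue's selection criteria.

```python
from datetime import date, timedelta


def onset_and_minimum(dates, intensity_pct, onset_drop_pct=1.0):
    """Return (onset_date, minimum_date) for one decrease in a daily series.

    Assumed definitions, for illustration only:
    - onset: first day at least `onset_drop_pct` below the first day's value;
    - minimum: day with the lowest value in the window.
    """
    baseline = intensity_pct[0]
    onset = next(
        (d for d, v in zip(dates, intensity_pct) if baseline - v >= onset_drop_pct),
        None,  # no day crosses the threshold
    )
    minimum = min(zip(intensity_pct, dates))[1]
    return onset, minimum


# Illustrative (synthetic) daily deviations in percent, 24-29 August 1998.
start = date(1998, 8, 24)
dates = [start + timedelta(days=i) for i in range(6)]
values = [0.0, -0.2, -2.5, -6.0, -4.0, -2.0]

onset, minimum = onset_and_minimum(dates, values)
print("onset:", onset, "minimum:", minimum)
```

With these illustrative numbers an onset-timed list would record the event on 26 August and a minimum-timed list on 27 August, so the same event can carry different dates in the two catalogues.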
The article requires significant revision or rejection at the editor's decision. It is necessary to describe how the initial FD lists were obtained, how the influence of the 11-year solar activity cycle was removed, and to carefully verify the resulting data. Only after that should one proceed to consider various correlations.
Thank you very much. We will take care of all these if we are given the opportunity to revise the work.
Viewed
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 138 | 37 | 19 | 194 | 14 | 22 |
Manuscript Title: Empirical Evidence of Spurious Correlations Among Space Weather Variables
Recommendation: Major Revision
Major Comments
•The methodology for removing solar-cycle oscillations is insufficiently described. A clear mathematical description of the filtering procedure is required for reproducibility.
•Autocorrelation and non-stationarity in time-series data are not adequately addressed. This may itself produce spurious regression results.
•The dramatic change in FD counts before and after filtering requires deeper justification and sensitivity analysis.
•The manuscript relies heavily on p-values without sufficient discussion of effect sizes and physical interpretation.
•The claim of identifying spurious correlations should be moderated unless robustness tests (e.g., bootstrapping, cross-validation) are provided.
•Only Solar Cycle 23 is analyzed. Extending to additional cycles would strengthen generalizability.
•There is an inconsistency between the discussion of advanced methods (e.g., mutual information, AI) and the actual application of linear regression only.
Minor Comments
•Several grammatical and typographical errors require correction.
•Table numbering should be carefully checked for consistency. Some references appear duplicated, and formatting should be standardized.
•Definitions of acronyms (e.g., SI, SSN) should be standardized at first use. Figures would benefit from additional quantitative statistical descriptors.
Overall Assessment
The manuscript addresses an important problem in space weather analysis, namely, the potential inflation of correlations due to long-term solar-cycle modulation. The study presents interesting findings and may have a substantial impact after methodological strengthening. However, significant revisions are required to improve statistical rigor, reproducibility, and clarity of interpretation before the work can be considered for publication.
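One of the robustness tests suggested in the major comments, bootstrapping, could be sketched as follows. This is a minimal illustration on synthetic data, not an analysis from the manuscript. The function name `bootstrap_corr_ci`, the sample size, and the coefficients are assumptions, and the pairs bootstrap used here treats the (x, y) pairs as independent events.

```python
import numpy as np


def bootstrap_corr_ci(x, y, n_boot=2000, alpha=0.05, seed=0):
    """Pearson r with a percentile bootstrap confidence interval.

    Resamples (x, y) pairs with replacement. This assumes the pairs are
    independent, which may not hold for autocorrelated daily series.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    n = x.size
    r_obs = np.corrcoef(x, y)[0, 1]

    # Each row of `idx` is one bootstrap resample of the pair indices.
    idx = rng.integers(0, n, size=(n_boot, n))
    r_boot = np.array([np.corrcoef(x[i], y[i])[0, 1] for i in idx])
    lo, hi = np.quantile(r_boot, [alpha / 2.0, 1.0 - alpha / 2.0])
    return r_obs, lo, hi


# Synthetic example: y depends linearly on x plus independent noise.
demo_rng = np.random.default_rng(1)
x_demo = demo_rng.normal(size=100)
y_demo = 0.8 * x_demo + 0.6 * demo_rng.normal(size=100)

r, lo, hi = bootstrap_corr_ci(x_demo, y_demo)
print(f"r = {r:.2f}, 95% CI = [{lo:.2f}, {hi:.2f}]")
```

Reporting the interval alongside r speaks to the comment about effect sizes versus p-values: a wide interval, or one that straddles zero, signals a fragile correlation. For serially correlated daily data, a block bootstrap that resamples contiguous segments would be a more appropriate variant.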