Empirical evidence of spurious correlations among space weather variables
Abstract. This paper investigates the prevalence and identification of spurious correlations in space weather datasets, a critical concern given the complex, interdependent nature of geophysical phenomena. The analysis uses daily-averaged galactic cosmic ray (GCR) data from the MOSC and OULU neutron monitor (NM) stations, with large Forbush Decreases (FDs; FD > 3 %) and small FDs (FD ≤ 3 %) analyzed separately at each station to account for the effects of 11-year solar cycle oscillations. For the first time, a statistical method is employed to test the link between FD amplitudes and solar-geomagnetic variables in each dataset after the effects of the 11-year solar cycle oscillations have been filtered out. We demonstrate that, while significant correlations between various space weather indices and FD events are empirically observable, closer analysis reveals that a subset of these relationships may not reflect true physical causality but may instead arise from statistical artifacts or confounding factors inherent in the data. Specifically, analyses of FDs often yield correlation coefficients with geomagnetic and solar wind parameters that fluctuate significantly across time periods and cosmic ray stations. For instance, correlations between FD amplitudes and interplanetary magnetic field strength, solar wind speed, and geomagnetic indices such as Kp and Dst have been observed to be either negative or positive, depending on the specific dataset and analytical approach employed. The results show inconsistencies in the datasets from both the MOSC and OULU stations, for large and small FDs alike: specifically, strong correlations were found in the regression analyses of the parameters after the effects of 11-year solar cycle oscillations were removed, for both large and small FDs.
These inconsistencies strongly suggest that 11-year solar cycle oscillations influence the FDs recorded at both stations, thereby affecting the relationships between the FDs and the tested geomagnetic variables and echoing concerns about "spurious regression" in non-stationary time series. Most of the results are statistically significant at the 95 % confidence level. The results obtained here imply that 11-year solar cycle oscillations affect the GCR flux intensity.
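The mechanism behind such a spurious correlation can be illustrated with a minimal sketch on purely synthetic data. It does not reproduce the filtering method or the results of this paper. It assumes two otherwise independent daily series that share only an 11-year sinusoid, and it removes that component with a least-squares sinusoidal fit. The fit is an assumption chosen for simplicity, and the names `remove_cycle`, `pearson_r`, `series_a` and `series_b` are hypothetical.

```python
import numpy as np


def remove_cycle(series, t, period):
    """Subtract a least-squares fit of (mean + sinusoid) at a known period."""
    omega = 2.0 * np.pi / period
    design = np.column_stack(
        [np.ones_like(t), np.sin(omega * t), np.cos(omega * t)]
    )
    coeffs, *_ = np.linalg.lstsq(design, series, rcond=None)
    return series - design @ coeffs


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])


# Synthetic setup (assumption): 22 years of daily values, i.e. about two cycles.
rng = np.random.default_rng(0)
days = np.arange(22 * 365, dtype=float)
period = 11 * 365.25  # assumed cycle length in days
cycle = np.sin(2.0 * np.pi * days / period)

# The two series share only the cycle; their noise terms are independent.
series_a = 2.0 * cycle + rng.normal(0.0, 1.0, days.size)
series_b = -1.5 * cycle + rng.normal(0.0, 1.0, days.size)

r_raw = pearson_r(series_a, series_b)
r_filtered = pearson_r(
    remove_cycle(series_a, days, period),
    remove_cycle(series_b, days, period),
)

print(f"correlation before filtering: {r_raw:+.2f}")
print(f"correlation after filtering:  {r_filtered:+.2f}")
```

In this toy case the raw series are clearly anticorrelated only because of the shared cycle, and the correlation falls to near zero once the cycle is removed. Real FD and solar-geomagnetic series need not behave this way, which is why the filtered and unfiltered relationships are worth comparing.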