the Creative Commons Attribution 4.0 License.
An Innovative Hybrid SG-CEEMDAN-ARIMA-LSTM Model for Forecasting Meteorological Drought: Trends and Forecasting
Abstract. Droughts are extended periods of below-average rainfall that cause water shortages and have significant impacts on ecosystems, agriculture, and water supplies. Two of the most challenging aspects of addressing drought are identifying trend patterns and developing accurate prediction models, both of which are crucial for efficient mitigation and resource management. Drought analysis is inherently uncertain and complex because of the dynamic and evolving character of climate trends. This study applied the Mann-Kendall (MK) test, the modified Mann-Kendall (MMK) test, and innovative trend analysis (ITA) to detect trends in the Standardized Precipitation Index (SPI), and introduced a hybrid forecasting model that uses the Savitzky-Golay filter and Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (SG-CEEMDAN) for preprocessing, followed by Autoregressive Integrated Moving Average (ARIMA) and Long Short-Term Memory (LSTM) techniques. In the trend analysis of the SPIs, the MK and MMK tests revealed predominantly statistically significant decreasing trends. For example, Pongolapoort Dam showed negative Z-scores (p-values) for SPI-6, SPI-9, and SPI-12: −7.19 (6.12e−13), −8.74 (< 0.00), and −9.83 (< 0.00) in the MK test, and −8.22 (2.22e−16), −5.44 (5.40e−8), and −6.51 (7.41e−11) in the MMK test, respectively. Additionally, the ITA confirmed a significant downward trend across all time scales of the SPI. The SPI forecasting results show that the hybrid SG-CEEMDAN-ARIMA-LSTM model had the best prediction accuracy of all models compared, for every SPI time scale. The coefficient of determination (R²) values of the proposed hybrid model were notably high: 0.9839 for SPI-6, 0.9892 for SPI-9, and 0.9990 for SPI-12. This demonstrates that the hybrid model offers the best fit to the data and is the most suitable choice for forecasting short-to-long-term drought conditions in the uMkhanyakude district.
Furthermore, the inclusion of decomposition techniques, such as SG, CEEMDAN, and SG-CEEMDAN, significantly enhances the performance of the model.
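For readers unfamiliar with how the Z-scores and p-values quoted in the abstract arise, the following is a minimal sketch of the classical Mann-Kendall test with the usual tie correction. It is not the authors' code, and it does not include the autocorrelation-based variance correction that distinguishes the MMK test. The function names `mann_kendall` and `two_sided_p` are illustrative, and only the Python standard library is assumed.

```python
import math
from collections import Counter
from typing import Sequence, Tuple


def two_sided_p(z: float) -> float:
    """Two-sided p-value of a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def mann_kendall(x: Sequence[float]) -> Tuple[int, float, float]:
    """Classical Mann-Kendall trend test (no autocorrelation correction).

    Returns (S, Z, p): the S statistic, the continuity-corrected Z-score,
    and the two-sided p-value under the normal approximation.
    """
    n = len(x)
    # S counts concordant minus discordant pairs over all i < j.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = x[j] - x[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S, reduced for groups of tied values.
    tie_term = sum(t * (t - 1) * (2 * t + 5) for t in Counter(x).values())
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, z, two_sided_p(z)


if __name__ == "__main__":
    # Illustrative data only: a steadily declining index.
    toy_spi = [1.2, 0.9, 0.7, 0.4, 0.1, -0.2, -0.5, -0.9, -1.1, -1.4]
    print(mann_kendall(toy_spi))
    # Converting a reported Z-score to its two-sided p-value.
    print(two_sided_p(-7.19))
```

Under the normal approximation, a Z-score of −7.19 corresponds to a two-sided p-value on the order of 1e−13, the same order of magnitude as the MK value reported for SPI-6.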
Status: final response (author comments only)
-
RC1: 'Comment on egusphere-2025-2733', Anonymous Referee #1, 18 Jul 2025
The manuscript presents an interesting and relevant study on forecasting SPI using a hybrid model that combines existing methodologies in a novel way. The approach, which integrates signal decomposition (SG, CEEMDAN) with traditional (ARIMA) and deep learning (LSTM) techniques, addresses a crucial topic with significant potential impact, particularly for data-scarce regions like uMkhanyakude, South Africa. Although the study does not introduce entirely new methods, the unique combination and application of established techniques offer valuable insights and could help advance drought forecasting.
Major Comments:
- Terminological Precision and Focus in Introduction: The manuscript frequently uses the term "drought" without clearly specifying its type until late in the introduction. The initial sections should explicitly state that the study focuses on meteorological drought, as defined by the Standardized Precipitation Index (SPI), which only reflects precipitation. This clarification is crucial to avoid confusion and help set the stage for the study’s objectives, contributions, and context within the broader field of drought research. Additionally, the introduction should discuss the scope and limitations, particularly noting that the study is a new method for SPI forecasting.
- Methodological Framing and Clarity: The Methods section should provide a clearer and more focused explanation of the hybrid modeling framework. This includes: (i) the rationale behind combining SG filtering with CEEMDAN decomposition prior to modeling; (ii) how the decomposition into intrinsic mode functions (IMFs) enhances forecasting accuracy, as indicated by the improved RMSE and R² values shown in Table 4; (iii) the stepwise integration of the ARIMA and LSTM models on the decomposed components, and how these components are recombined for final predictions; and (iv) the comparative advantages of this hybrid method over standalone models or simpler combinations, evidenced by the superior performance of the SG-CEEMDAN-ARIMA-LSTM model across all SPI timescales, as shown in Figures 11-16. While a diagram is present, these aspects should be emphasized to highlight the novel integration strategy, rather than detailing standard approaches like ARIMA or LSTM. These well-known methods can be briefly summarized, with detailed descriptions moved to the appendix to lighten the paper and assist the reader.
- Streamlining Content: To improve readability, consider moving detailed descriptions of well-known methods to an appendix. This will allow the main text to focus more on the innovative aspects of the study and its implications.
- Justification for Methodological Choices: While the manuscript acknowledges the limitations of SPI, it should provide a more robust justification for its selection over SPEI, particularly under climate change conditions. Addressing this could strengthen the methodological rationale by discussing factors such as data availability or regional relevance.
- Literature Review Organization: The literature review should be reorganized to group studies thematically, highlighting insights that motivate the proposed model and clarifying the research gap that this study aims to address. This will provide a clearer context for the study's contributions.
- Abstract and Title Refinement: The abstract should be concise and precise, clearly outlining the study’s objectives and methods. Similarly, consider revising the title to avoid redundancy and focus on the paper’s core contributions.
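To make concrete the kind of stepwise description requested under Methodological Framing and Clarity, the sketch below shows the general smooth, decompose, forecast-per-component, recombine pattern. It is a hedged illustration, not the manuscript's method: only the 5-point quadratic Savitzky-Golay step is a real instance of a technique named in the paper. The trailing-mean split is a simple stand-in for CEEMDAN, and the persistence and AR(1) forecasters are stand-ins for the LSTM and ARIMA models. All function names, the window sizes, and the assignment of forecasters to components are assumptions made for illustration.

```python
from typing import List, Sequence

# Standard 5-point quadratic Savitzky-Golay smoothing weights.
SG5 = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def sg_smooth(x: Sequence[float]) -> List[float]:
    """Savitzky-Golay smoothing (window 5, order 2); edge points are left as-is."""
    out = [float(v) for v in x]
    for i in range(2, len(x) - 2):
        out[i] = sum(w * x[i + k - 2] for k, w in enumerate(SG5))
    return out


def decompose(x: Sequence[float], window: int = 12) -> List[List[float]]:
    """Stand-in for CEEMDAN: additive split into slow and fast components."""
    slow = []
    for i in range(len(x)):
        lo = max(0, i - window + 1)
        slow.append(sum(x[lo:i + 1]) / (i + 1 - lo))  # trailing mean
    fast = [a - b for a, b in zip(x, slow)]  # remainder, so slow + fast == x
    return [slow, fast]


def persistence_forecast(c: Sequence[float], steps: int) -> List[float]:
    """Stand-in forecaster: repeat the last observed value."""
    return [c[-1]] * steps


def ar1_forecast(c: Sequence[float], steps: int) -> List[float]:
    """Stand-in forecaster: least-squares AR(1) without intercept."""
    num = sum(a * b for a, b in zip(c[1:], c[:-1]))
    den = sum(b * b for b in c[:-1])
    phi = num / den if den else 0.0
    preds, last = [], c[-1]
    for _ in range(steps):
        last = phi * last
        preds.append(last)
    return preds


def hybrid_forecast(x: Sequence[float], steps: int) -> List[float]:
    """Smooth, decompose, forecast each component, then sum the forecasts."""
    smoothed = sg_smooth(x)
    slow, fast = decompose(smoothed)
    parts = [persistence_forecast(slow, steps), ar1_forecast(fast, steps)]
    return [sum(vals) for vals in zip(*parts)]


if __name__ == "__main__":
    # Illustrative series only, not SPI data.
    toy = [((i * 7) % 11) / 5.0 + 0.01 * i for i in range(60)]
    print(hybrid_forecast(toy, 6))
```

The point of the sketch is the recombination step: because the decomposition is additive, the final forecast is the sum of the per-component forecasts. A revised Methods section could state explicitly whether the manuscript's framework follows this pattern and which model is applied to which component.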
In summary, the manuscript is potentially interesting and relevant, offering valuable insights through its novel combination of established methodologies. However, it would benefit from a rewrite to clarify key sections in the Introduction and Methods, and from streamlining redundant content to enhance readability and focus.
Citation: https://doi.org/10.5194/egusphere-2025-2733-RC1
-
RC2: 'Comment on egusphere-2025-2733', Anonymous Referee #2, 26 Jul 2025
-
RC3: 'Comment on egusphere-2025-2733', Anonymous Referee #3, 31 Jul 2025
The manuscript addresses the topic related to drought trend analysis and forecasting, and it appears that the authors have invested considerable effort in applying a combination of statistical and machine learning techniques. However, the manuscript suffers from several critical issues that need to be addressed before it can be considered for publication. The methodology section lacks clarity, the figures are not adequately formatted for readability, and the introduction is poorly structured. I recommend major revisions to improve the overall clarity. Here are some potentially helpful suggestions:
Introduction:
The first paragraph could benefit from improved focus and clearer logic. While it introduces the general impacts of drought, the core message is somewhat diluted. The second paragraph seems only loosely connected to the main theme of the study. I suggest the authors focus more specifically on summarizing the strengths and limitations of various drought prediction methods, rather than listing a large number of references without clear synthesis. Additionally, the third and fourth paragraphs appear closely related and might be more effective if combined.
Method:
The authors spend a significant amount of time explaining the algorithms or working principles of SG, CEEMDAN, ARIMA, and LSTM models, which are well-known techniques. What I would like to see is how these models are integrated together, whether they form a framework or are coupled in some way. I would also like to know how the parameters for these models were set.
Results:
1. Figure 6 is not properly aligned and appears better suited to a report format. I also do not consider this figure to be a result of this paper.
2. Lines 490–520: It appears that in-situ data from 1980–2014 were used for training and data from 2015–2023 for testing. This setup raises concerns about potential overfitting. To further demonstrate the model's generalizability, I suggest the authors consider adding a transfer prediction experiment.
3. Given that parameter selection can significantly affect model performance, a more detailed explanation of the tuning procedures for each model would strengthen the methodological transparency.
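The chronological split described in point 2 and the accuracy measures cited in the abstract can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: the helper names are hypothetical, the data are assumed to be monthly with one value per calendar month, and R² is computed as 1 − SS_res/SS_tot, which may differ from the definition used in the manuscript (for example, a squared correlation coefficient).

```python
import math
from typing import List, Sequence, Tuple


def chronological_split(
    years: Sequence[int], values: Sequence[float], last_train_year: int
) -> Tuple[List[float], List[float]]:
    """Split a time series by year without shuffling, so the test period follows training."""
    train = [v for y, v in zip(years, values) if y <= last_train_year]
    test = [v for y, v in zip(years, values) if y > last_train_year]
    return train, test


def rmse(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Root mean square error between observations and predictions."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def r2(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Coefficient of determination, defined here as 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Placeholder monthly index for 1980-2023; the values are illustrative only.
    years = [y for y in range(1980, 2024) for _ in range(12)]
    values = [math.sin(i / 6.0) for i in range(len(years))]
    train, test = chronological_split(years, values, last_train_year=2014)
    print(len(train), len(test))
```

A transfer experiment of the kind suggested in point 2 would reuse the same metrics but evaluate a model trained on one station against observations from another.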
Citation: https://doi.org/10.5194/egusphere-2025-2733-RC3