the Creative Commons Attribution 4.0 License.
Multi-depth soil moisture dynamics to rainfall events: An automated machine learning approach
Abstract. This study proposes an integrated, event-based framework for quantifying soil moisture dynamics at multiple depths (10, 20, 30, and 40 cm) in response to rainfall events using an automated machine learning (AutoML) approach. At the observatory, we record hydrometeorological and soil moisture data at different depths below the ground surface at 10-minute intervals. We use these datasets to capture both rapid single-peak and gradual multiple-peak soil moisture responses during diverse rainfall events.
Manual model selection and hyperparameter tuning are labour intensive and may not fully capture the complex, non-linear interactions among hydrometeorological variables. We therefore propose an AutoML framework that leverages Bayesian optimisation to predict subsurface soil moisture at different depths. The model was evaluated under four temporal scenarios: S1 (March–May), S2 (March–June), S3 (March–July), and S4 (March–August), separately for the full dataset and for rainfall-only instances. This automatic selection and tuning of various regression models results in superior predictive performance compared to benchmark algorithms, with coefficients of determination ranging from 0.88 to 0.98 and low root mean squared errors (1.6 %–3.4 %). Further, the global sensitivity analysis indicates that atmospheric humidity and dew point strongly influence near-surface moisture, that solar radiance and evaporation drive moisture depletion, and that soil temperature gradients play a critical role in the vertical profile of the soil column. These findings highlight the value of integrating advanced AutoML techniques with event-based hydrological analysis to enhance our understanding of soil moisture variability, which has significant implications for water resource management, agricultural planning, and hazard mitigation in variable climatic regimes.
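The evaluation set-up described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the record layout (`month`, `rain_mm`), the helper names, and the rule that a "rainfall-only" instance is any record with rain above zero are all assumptions made for illustration. It shows how records might be filtered into the S1 to S4 periods and how the coefficient of determination and root mean squared error are conventionally computed.

```python
import math

# Hypothetical scenario table: calendar months covered by each scenario,
# following the S1-S4 periods named in the abstract.
SCENARIOS = {
    "S1": range(3, 6),  # March-May
    "S2": range(3, 7),  # March-June
    "S3": range(3, 8),  # March-July
    "S4": range(3, 9),  # March-August
}


def select_scenario(records, scenario, rain_only=False):
    """Keep records inside the scenario months.

    Assumption: a "rainfall-only" instance is a record with rain_mm > 0.
    """
    months = SCENARIOS[scenario]
    return [
        r for r in records
        if r["month"] in months and (not rain_only or r["rain_mm"] > 0)
    ]


def rmse(observed, predicted):
    """Root mean squared error, in the units of the observations."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

Under these assumptions, `select_scenario(records, "S2", rain_only=True)` would return the March to June records with non-zero rainfall, and the two metric functions would then be applied to the observed and predicted soil moisture at each depth.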
Status: open (until 01 Jul 2025)
RC1: 'Comment on egusphere-2025-961', Anonymous Referee #1, 19 Jun 2025
- The overall introduction and literature review is very poor and unacceptable. There is no problem discussion or research gap that justifies the novelty and importance of the present study. How has this kind of investigation been studied in the literature? Is there any similar published work? We cannot find any of these points in the manuscript.
- The description of the dataset is vague, basic in nature, and incomplete.
- The AutoML-SM block is unclear and poorly presented. Readers cannot understand how this kind of modelling framework can be an automated ML.
- The problem formulation seems to be standard and well known: linking a set of inputs (relative humidity (RH), wind speed (WS), wind direction (WD), solar radiance (SR), evaporation (EV), rainfall (Rain), dew point (DP) and soil temperature at 10, 20, 30, and 40 cm depth (ST1, ST2, ST3 & ST4)) to a target variable (soil moisture values at different depths). The physical phenomena behind soil moisture variation are closely related to the weather variables, especially rainfall, and any regression model developed on this assumption seems trivial and is in fact a standard problem formulation.
- The results section is poorly formulated and unclear. The model evaluation and comparison seem confusing. Necessary figures are missing, and the numerical model performances are incomplete.
- Model interpretability and explainability are completely missing. At this level of publication, the use of such techniques, i.e., SHAP and LIME, is mandatory.
- There is no discussion or comparison of the results with previously published papers.
Citation: https://doi.org/10.5194/egusphere-2025-961-RC1
RC2: 'Comment on egusphere-2025-961', Anonymous Referee #2, 20 Jun 2025
The manuscript presents an interesting approach using AutoML for soil moisture prediction. However, the abstract lacks an introductory statement and would be clearer in one paragraph. The introduction should better review existing studies and explain how this work improves on previous methods. The use of only one year of data and black soil limits the study’s generalizability, and the term "scalable" should be reconsidered without validation across multiple years and soil types. The formatting of subsections and capitalization consistency should be improved. Finally, the paper lacks references to related studies, which weakens its connection to existing literature. Addressing these points would strengthen the manuscript.
Comments:
- The abstract lacks an introductory statement that introduces the importance of the research topic. It begins with the methodology. As per the journal's abstract guidelines, a brief, general introduction is required in the abstract.
- Also, it would be better to have the abstract in a single paragraph rather than two.
- The introduction could benefit from a clearer review of existing studies to highlight the gaps your research is addressing. It would also be helpful to improve the flow of ideas by better connecting traditional methods, machine learning, and your proposed approach. Finally, emphasize more clearly why your study is needed and what unique contributions it makes.
- Line 135: The introduction mentions using an AutoML framework for soil moisture prediction at multiple depths but fails to discuss existing studies that have applied machine learning or deep learning for similar tasks. It would be helpful to clearly state how this study differs from or improves upon these approaches to highlight its contribution.
- In Section 3.1.1, it would be clearer to use "a)", "b)", etc., instead of the hyphen ("-") when starting subsections for better readability and consistency.
- The study focuses on black soil, which has unique characteristics that may not apply to other soil types, but this is not addressed when discussing infiltration dynamics (line 175). The influence of black soil's physical properties, such as clay content, on infiltration and moisture retention should be considered, as these factors are crucial for understanding the results and assessing their generalizability.
- The writing style of the subsections and sections should be consistent throughout the manuscript. If capitalizing section and subsection titles, it is important to maintain this style consistently across all sections. (Section 3.1.2)
- Results and Discussion:
- The study uses data from only one year (March to August 2024) to train and validate the model. This limits the model's ability to generalize and capture the variability in soil moisture dynamics that could arise in different years, especially under varying climatic conditions and extreme weather events such as droughts or unusually heavy rains. This needs to be mentioned.
- The study focuses exclusively on black soil, which has unique characteristics, including high moisture retention capacity and shrinkage behavior in alternate wetting and drying cycles. However, this is not adequately discussed when analyzing infiltration or moisture dynamics.
- While the study mentions the impact of small perturbations in input variables on soil moisture predictions, it lacks a detailed discussion or analysis of how these small variations affect the model’s accuracy.
- The model demonstrates high performance with high-intensity rainfall events, but may face challenges with lower-intensity, long-duration events. The discrepancy in prediction accuracy between these events suggests the model may be overfitting to extreme rainfall events.
- The manuscript lacks adequate references to related studies in the introduction, methodology, and discussion sections. It introduces concepts such as the AutoML-SM framework for soil moisture prediction without contextualizing them within existing literature. This makes it challenging for readers to understand how the proposed approach aligns with or diverges from previous research in the field.
- Line 555: The study uses data from only one year and one soil type (black soil). Given this limited scope, it may not be entirely appropriate to refer to the approach as a "scalable" one. For a more accurate claim of scalability, it would be beneficial to include validation across multiple years, soil types, and diverse environmental conditions to ensure the model's broader applicability.
Citation: https://doi.org/10.5194/egusphere-2025-961-RC2
Viewed
HTML | PDF | XML | Total | BibTeX | EndNote |
---|---|---|---|---|---|
216 | 52 | 9 | 277 | 8 | 8 |