the Creative Commons Attribution 4.0 License.
Synergistic identification of hydrogeological parameters and pollution source information for groundwater point and areal source contamination based on machine learning surrogate-artificial hummingbird algorithm
Abstract. Effective remediation of groundwater contamination depends on accurately determining its sources. In recent years, research has increasingly focused on estimating hydrogeological parameters and locating pollutant sources at the same time. However, the synergistic identification of hydrogeological parameters together with point and areal groundwater contamination sources has not yet been solved effectively. This study developed an inversion framework that integrates machine learning surrogates with the artificial hummingbird algorithm (AHA). Surrogate models approximating the simulation model were constructed using both backpropagation neural networks (BPNN) and Kriging. The AHA was then employed to solve the optimization model, and its performance was benchmarked against particle swarm optimization (PSO) and the sparrow search algorithm (SSA). The applicability of the framework was assessed on point source contamination (PSC) and areal source contamination (ASC) cases, and its robustness was verified in scenarios with different noise levels. The results showed that the BPNN surrogate reproduced the simulation model outputs more closely than the Kriging surrogate: the coefficient of determination (R²) was 0.9994 and the mean relative error (MRE) was 3.70 % for PSC, and R² was 0.9989 and MRE was 4.48 % for ASC. The AHA outperformed both PSO and SSA. The MRE of the identification results was 1.58 % for PSC and 2.03 % for ASC, and the AHA identified the global optimum rapidly and accurately, improving inversion efficiency. The proposed framework was shown to be applicable to both groundwater PSC and ASC problems with strong robustness, providing a reliable basis for groundwater pollution remediation and management.
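The two accuracy measures quoted in the abstract can be illustrated with a minimal sketch. The exact formulas used in the manuscript are not reproduced on this page, so the definitions below are assumptions: R² is taken as one minus the ratio of residual to total sum of squares, and MRE as the mean of absolute relative deviations expressed in percent. The function names `r_squared` and `mean_relative_error` and the sample values are hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mean_relative_error(observed, predicted):
    """Mean of |predicted - observed| / |observed|, in percent (assumed definition)."""
    ratios = [abs(p - o) / abs(o) for o, p in zip(observed, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


if __name__ == "__main__":
    # Hypothetical simulation-model outputs and surrogate predictions.
    simulated = [10.0, 20.0, 40.0]
    surrogate = [11.0, 18.0, 40.0]
    print(f"R2  = {r_squared(simulated, surrogate):.4f}")
    print(f"MRE = {mean_relative_error(simulated, surrogate):.2f} %")
```

Under these assumed definitions, a surrogate that matches the simulation model exactly gives R² = 1 and MRE = 0 %, which is the sense in which values such as 0.9994 and 3.70 % indicate a close fit.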
Status: open (until 01 Jul 2025)
CC1: 'Comment on egusphere-2025-2083', Nima Zafarmomen, 07 Jun 2025
The paper presents a novel and well-structured inversion framework combining BPNN surrogate modeling with the AHA optimization algorithm for groundwater contamination source identification. The methodology is sound and the results are promising. The paper is generally well-written, but could benefit from some improvements in organization, clarity, and depth of discussion in certain sections.
The introduction provides good background but could better highlight the novelty of the work compared to previous studies. What specific gaps does this study address that haven't been adequately covered before?
For the surrogate modeling section, it would be helpful to provide more details about the architecture of the BPNN (number of layers, nodes, etc.) and how these were determined.
The robustness analysis is good, but could be strengthened by showing how the errors distribute across different parameter types (e.g., are some parameters more sensitive to noise than others?).
The discussion of limitations is good but could be expanded. For example, how might the method perform with more complex, heterogeneous aquifers? What are the computational limits?
The practical implications section could be expanded. How would this method be implemented in real-world remediation projects?
While the proposed BPNN-AHA framework presents a robust approach, the authors may wish to consider and discuss alternative methodologies such as data assimilation techniques, which have shown promise in similar environmental modeling applications. For instance, they could cite a paper such as "Assimilation of sentinel-based leaf area index for modeling surface-ground water interactions in irrigation districts".
Citation: https://doi.org/10.5194/egusphere-2025-2083-CC1
RC1: 'Comment on egusphere-2025-2083', Anonymous Referee #1, 09 Jun 2025
Comment on “Synergistic identification of hydrogeological parameters and pollution source information for groundwater point and areal source contamination based on machine learning surrogate-artificial hummingbird algorithms” by Luo et al.
Luo et al. present an inversion framework that combines BPNN surrogate modeling with the AHA optimization algorithm for groundwater contamination source identification, and they comprehensively evaluate the performance of different surrogate models. The work is generally well written. However, several significant issues must be addressed to improve the clarity of the paper. The most critical concern lies in the structure of the Introduction. Although the authors provide an extensive literature review, the research gap and the novelty of this study in relation to previous work are not clearly emphasized. Secondly, the Discussion section lacks depth, which substantially weakens the novelty and the implications of this study. Finally, the language throughout the manuscript should be thoroughly revised and polished before publication.
Specific comments:
Lines 127-135 The authors are advised to reorganize the research objectives. As currently stated, the objectives obscure the novelty of the paper, and this stems from an unclear summary of the research gap.
Line 151 MODFLOW and MT3DMS are not packages.
Line 305 Replace “inhomogeneous” with “heterogeneous”.
Lines 387-389 The authors are advised to combine this sentence with the previous paragraph to create a clearer contrast, which would make the comparison more striking. Additionally, I am skeptical about the reported runtime for the 1000 iterations. Considering that the model in this study is at the field scale, consists of only a single model layer, and uses a rather coarse grid discretization, a runtime of 500 hours seems excessively long.
Lines 420-424 This section reads more like a repetition of the Introduction. It is recommended that the authors first present their own findings in the Discussion before comparing them with other studies. Additionally, emphasizing the implications of this study would greatly enhance the value of the paper.
Lines 438-440 Please specify the advantages more clearly.
Line 483 Including the limitations is good. The authors are encouraged to present them in a separate section.
Line 501 The authors are encouraged to include more quantitative findings rather than only qualitative statements.
Citation: https://doi.org/10.5194/egusphere-2025-2083-RC1
RC2: 'Comment on egusphere-2025-2083', Anonymous Referee #2, 12 Jun 2025
While the manuscript addresses an important challenge in groundwater contamination source identification, its novelty is limited. The core contribution lies in introducing the Artificial Hummingbird Algorithm (AHA) into a simulation-optimization framework, which is not a fundamentally new algorithm nor specifically tailored to groundwater inverse problems. Furthermore, many techniques used—BPNN, Kriging, PSO, SSA—are already well-established in the literature.
Moreover, the reported simulation results show extremely high precision (e.g., R² > 0.999, MRE < 2%), which may suggest possible overfitting or idealized experimental setups. The study lacks rigorous testing of generalization under realistic uncertainty scenarios, such as sparse observations, complex geological heterogeneity, or parameter noise. Without such assessments, the practical robustness and transferability of the proposed framework remain questionable.
The paper would benefit from a deeper methodological insight into why AHA performs better in this specific problem context, rather than merely benchmarking its numerical results. The current framing gives the impression of "algorithm replacement" without substantive theoretical or application-driven innovation.
Citation: https://doi.org/10.5194/egusphere-2025-2083-RC2
CC2: 'Comment on egusphere-2025-2083', Giacomo Medici, 17 Jun 2025
General comments
Good modelling research in the field of subsurface hydrology. Please see my comments below, which address the remaining minor issues.
Specific comments
Line 64. “Hydrogeological conditions”. Insert recent papers on high-resolution datasets for the determination of hydrogeological conditions at contaminated sites.
- Maliva, R. G., Herrmann, R., Coulibaly, K., & Guo, W. (2015). Advanced aquifer characterization for optimization of managed aquifer recharge. Environmental Earth Sciences, 73, 7759-7767.
- Medici, G., Munn, J. D., & Parker, B. L. (2024). Delineating aquitard characteristics within a Silurian dolostone aquifer using high-density hydraulic head and fracture datasets. Hydrogeology Journal, 32, 1663-1691.
Line 151. MODFLOW, which version?
Line 282. Specify the type of aquifer in terms of lithology.
Line 302. Same here, specify the type of aquifer in terms of lithology.
Lines 340-341. “Mean relative error”. I suggest Mean Absolute Relative Error because there is the modulus.
Line 521. Add a “take home message” for the researchers working in the field.
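The naming point raised for lines 340-341 can be made concrete with a short worked formula. This is a hedged illustration only: the manuscript's own equation is not reproduced on this page, so the symbols $y_i$ (reference value), $\hat{y}_i$ (estimate), and $n$ (number of values) are assumed notation. A signed mean relative error lets positive and negative deviations cancel, whereas the form with the modulus does not:

```latex
\mathrm{MRE}_{\text{signed}} = \frac{100\,\%}{n}\sum_{i=1}^{n}\frac{\hat{y}_i - y_i}{y_i},
\qquad
\mathrm{MARE} = \frac{100\,\%}{n}\sum_{i=1}^{n}\left|\frac{\hat{y}_i - y_i}{y_i}\right|
```

For example, two relative deviations of +10 % and -10 % average to 0 % in the signed form but to 10 % with the modulus, which is why the second name describes the quantity more precisely.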
Figures and tables
Figure 5. Add the general flow direction with an arrow.
Figure 5. Alternatively, divide the figure into two parts (A and B) and add the piezometric surfaces.
Figure 6. I would add a spatial scale using a bar.
Nine tables are too many. Could some of them go in the Supplementary Material?
Citation: https://doi.org/10.5194/egusphere-2025-2083-CC2
Viewed
HTML | PDF | XML | Total | BibTeX | EndNote |
---|---|---|---|---|---|
126 | 31 | 12 | 169 | 9 | 13 |