the Creative Commons Attribution 4.0 License.
Real-time Monitoring of Petroleum Hydrocarbons in Groundwater using Hybrid Machine Learning Architectures
Abstract. Monitoring petroleum hydrocarbon (PHC) plumes in groundwater is essential for managing oil contamination but is often hindered by high costs. We evaluated machine learning (ML) frameworks that estimate concentrations of benzene, ethylbenzene, and xylenes (BEX), using affordable, in situ water quality parameters (iWQPs) as inputs: pH, dissolved oxygen, electrical conductivity, and oxidation-reduction potential. Due to a scarcity of field data, we trained and tested models on high-resolution virtual data generated by a reactive transport model. We compared a long short-term memory (LSTM) network against classical algorithms (multiple linear regression, random forest, support vector regression, XGBoost) and an LSTM-XGBoost hybrid. Model performance depended on the underlying geochemical relationship between iWQPs and BEX. Accurate predictions (R² ≥ 0.80, MAPE < 2.3 %) were achieved when iWQPs were strongly correlated with BEX degradation (e.g., as a primary electron donor); the LSTM model yielded predictions within a 5 % error margin for 70 % of the test cases. Performance declined sharply (R² < 0) during periods where iWQPs were correlated with non-volatile dissolved organic carbon, another component of dissolved PHC. Incorporating hydraulic head data improved accuracy by informing the model of groundwater flow dynamics. While the LSTM model struggled to extrapolate beyond its training data (e.g., during extreme flow events), it reliably detected the direction of concentration trends, providing a valuable trigger for adaptive monitoring. We also demonstrated how a hybrid Kalman filter could successfully capture concentration trends after source removal through recursive updating. Our proposed ML framework provides BEX level estimation for improved groundwater monitoring.
Status: final response (author comments only)
- CC1: 'Comment on egusphere-2025-5842', Giacomo Medici, 13 Jan 2026
- RC1: 'Comment on egusphere-2025-5842', Massimiliano Schiavo, 18 Feb 2026
- RC2: 'Comment on egusphere-2025-5842', Anonymous Referee #2, 31 May 2026
================
General comments
================
This manuscript describes a numerical test to assess the applicability of ML algorithms for estimating hydrocarbon concentrations in a thin aquifer from measurements of pH, dissolved oxygen, electrical conductivity, and oxidation-reduction potential.
The work presents some flaws, which are discussed in detail in the specific comments below. These flaws prevent the results from being considered generally valid; in other words, they are strictly tied to the specific test cases.
In conclusion, I think that the manuscript cannot be considered for publication in its present version and requires a thorough major revision.
================
Specific comments
================
- I found the title quite confusing: ML architectures are “modeling” tools; they cannot be considered “monitoring” tools. I suggest changing the title, possibly to “Hybrid Machine Learning Architectures to Estimate Petroleum Hydrocarbons Concentrations in Groundwater from Proxy Data”. Moreover, the whole text should be modified accordingly.
- Numerical tests are conducted with a 2D model. In other words, this assumes that the horizontal component of groundwater and contaminant flow is directed along a straight line and that all physical quantities are homogeneous along the perpendicular direction. What if a more realistic 3D flow and transport setup is considered? In particular, the scenario with a single injection well creates a 3D flow, unless it is assumed that an array of wells is installed perpendicular to the flow direction. The basic physical assumptions supporting the numerical flow and transport model should be presented more clearly, and the limitations should be discussed.
- The application of the Kalman filter refers to a single specific scenario. Therefore it is overall of minor relevance for the work.
- Line 205. Assuming a Gaussian distribution for measurement errors is common and generally acceptable. However, assuming a Gaussian distribution for process noise or modeling errors is more debatable: it is common practice in the scientific literature but lacks proper physical support.
- Line 222. A percentage error has a very different practical relevance for low and high concentrations. For high concentrations, a small percentage error may correspond to a large absolute error, which could also affect whether the thresholds for declaring water drinkable are exceeded. For low concentrations, higher percentage errors could be acceptable, because their practical impact could be less significant. Therefore, I am not sure that the use of MAPE is the best choice.
- Figure 3. The “90 days” sequence length seems to provide the worst results, doesn’t it? I could not find any comment about this.
- Figure 5. The behavior of the LSTM predictions for the simulation time from year 35 to year 41 is very strange. During the learning time (from year 35 to year 39) the behavior is quite regular: a clean signal with annual period and small amplitude around a constant average. The prediction, however, shows a counterphase signal with a much higher amplitude. The explanation given at lines 284ff is quite confusing. The poor fit is for the first examined period, not the third one, and it refers to years 35 to 41, not 40 to 60. Maybe I misunderstood the text. Similar comments could apply to Figure 8.
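The concern about MAPE raised in the Line 222 comment can be illustrated with a minimal sketch. The concentration values below are hypothetical (they are not taken from the manuscript), and the helper names `mape` and `mae` are introduced here only for illustration. The sketch shows that the same percentage error corresponds to very different absolute errors at low and high concentrations.

```python
import numpy as np


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 100.0 * np.mean(np.abs((y_pred - y_true) / y_true))


def mae(y_true, y_pred):
    """Mean absolute error, in the units of the input."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean(np.abs(y_pred - y_true))


# Hypothetical concentrations (arbitrary units), chosen only for illustration.
low_true = np.array([1.0, 2.0, 4.0])
high_true = np.array([100.0, 200.0, 400.0])

# Assume every prediction overestimates the true value by 10 %.
low_pred = 1.10 * low_true
high_pred = 1.10 * high_true

mape_low, mape_high = mape(low_true, low_pred), mape(high_true, high_pred)
mae_low, mae_high = mae(low_true, low_pred), mae(high_true, high_pred)

print(f"low range:  MAPE = {mape_low:.1f} %, MAE = {mae_low:.3f}")
print(f"high range: MAPE = {mape_high:.1f} %, MAE = {mae_high:.3f}")
```

Under these assumptions both ranges have the same MAPE, while the absolute error in the high range is one hundred times larger. Reporting an absolute metric next to MAPE, or evaluating errors relative to a regulatory threshold, would make this difference visible.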
=================
Technical comments
=================
- Line 67. Read as a subtraction, “30-360 days” is equal to -330 days. I strongly suggest following the instructions in Section “7.7 Clarity in writing values of quantities” of the NIST “Guide for the Use of the International System of Units (SI)”, downloadable at https://physics.nist.gov/cuu/pdf/sp811.pdf.
- Lines 82, 200, 445. Is Simon (2006) the best reference? Why not cite the seminal papers by Kalman, or some of the textbooks and papers on the use of the KF in (non-linear) hydrology?
- Figure 2. I think it would be useful to add a label, e.g., “Time (years)”, for the x axis. Why use two squares for each year? Wouldn’t it be simpler and clearer to draw one cell for each year?
- Figure 3. I understand that the numerical simulation of flow and transport covered 100 years. I couldn’t find this information in the text: did I miss it?
Citation: https://doi.org/10.5194/egusphere-2025-5842-RC2
Interactive computing environment
Hybrid Machine Learning Models for Estimating Petroleum Hydrocarbon Concentration in Groundwater Chen Lester R. Wu et al. https://doi.org/10.4121/0a23147e-ba85-4ba2-a058-ba199c65d711
Virtual Experiments with Reactive Transport Modelling using FloPy: Transport and Degradation of Dissolved Petroleum Hydrocarbons in Groundwater Chen Lester R. Wu et al. https://doi.org/10.4121/f7742f02-ee3a-4a84-adf1-625b4a9fd703
Viewed
| HTML | PDF | XML | Total | Supplement | BibTeX | EndNote |
|---|---|---|---|---|---|---|
| 1,475 | 775 | 113 | 2,363 | 247 | 102 | 106 |
General comments
Good and robust research on contaminant transport in groundwater. The authors need to provide more detail before publication. See my specific comments to fix the issues.
Specific comments
Lines 26-27. “Groundwater contamination by petroleum hydrocarbons (PHCs) remains an environmental challenge, particularly in areas affected by historical spills or leaks”. This general statement is not backed up by references. Please add general literature on the topic, for example:
- Agbotui, P. Y., Firouzbehi, F., Medici, G. 2025. Review of effective porosity in sandstone aquifers: insights for representation of contaminant transport. Sustainability, 17(14), 6469.
- Li, G., Huang, W., Lerner, D. N., Zhang, X. 2000. Enrichment of degrading microbes and bioremediation of petrochemical contaminants in polluted soil. Water Research, 34(15), 3845-3853.
Line 88. The aim of the research is clear. But what about the 3 to 4 specific objectives? Please list them using numbering (e.g., i, ii, and iii).
Line 90. If you use MODFLOW-2005 you need much more detail on the boundary conditions.
Line 93. Provide more detail on the boundary conditions also for MT3DMS.
Line 220. Why only MAE? What about Mean Error and Root Mean Squared Error?
Lines 490-510. I can see 5 bullet points in your conclusions. Therefore, the number of specific objectives (see comment above) should match.
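The difference between the metrics named in the Line 220 comment can be sketched as follows. The observed and predicted values are hypothetical, and the function name `error_metrics` is an assumption made for illustration. The three prediction sets are constructed to share the same MAE so that the extra information carried by Mean Error (ME) and Root Mean Squared Error (RMSE) stands out.

```python
import numpy as np


def error_metrics(y_true, y_pred):
    """Return ME, MAE, and RMSE for paired observations and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_pred - y_true
    return {
        "ME": float(np.mean(residuals)),                # signed: reveals bias
        "MAE": float(np.mean(np.abs(residuals))),       # average error size
        "RMSE": float(np.sqrt(np.mean(residuals**2))),  # penalizes large errors
    }


# Hypothetical data, chosen only for illustration.
observed = [10.0, 10.0, 10.0, 10.0]
scattered = [9.0, 11.0, 9.0, 11.0]   # errors of both signs that cancel out
biased = [11.0, 11.0, 11.0, 11.0]    # systematic overestimate
outlier = [10.0, 10.0, 10.0, 14.0]   # one large miss

m_scattered = error_metrics(observed, scattered)
m_biased = error_metrics(observed, biased)
m_outlier = error_metrics(observed, outlier)

for name, m in [("scattered", m_scattered), ("biased", m_biased), ("outlier", m_outlier)]:
    print(f"{name:>9}: ME = {m['ME']:.2f}, MAE = {m['MAE']:.2f}, RMSE = {m['RMSE']:.2f}")
```

In this constructed case MAE alone cannot tell the three situations apart. ME separates the scattered predictions from the systematically biased ones, and RMSE flags the single large miss.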
Figures and tables
Would you like to add the flow field output of MODFLOW?
What about horizontal slices for the contaminant transport? I can see only a vertical one.
Figure 1. Important figure, make it larger.
Figure 1. What about a vertical scale in meters above sea level?
Figure 3. Unclear. Please increase the graphic resolution.
Figure 5. Make the legend larger. You can use 3 lines.