Preprints
https://doi.org/10.5194/egusphere-2025-4049
01 Sep 2025

A machine learning approach to driver attribution of dissolved organic matter dynamics in two contrasting freshwater systems

Daniel Mercado-Bettín, Ricardo Paíz, Valerie McCarthy, Eleanor Jennings, Elvira de Eyto, Angeles M. Gallegos, Mary Dillane, Juan C. Garcia, José J. Rodríguez, and Rafael Marcé

Abstract. Predicting water quality variables in lakes is critical for effective ecosystem management under climatic and human pressures. Dissolved organic matter (DOM) serves as an energy source for aquatic ecosystems and plays a key role in their biogeochemical cycles. However, predicting DOM is challenging due to complex interactions between multiple potential drivers in the aquatic environment and its surrounding terrestrial landscape. This study establishes an open and scalable workflow to identify potential drivers and predict fluorescent DOM (fDOM) in the surface layer of lakes by exploring the use of supervised machine learning models, including random forest, extreme gradient boosting, light gradient boosting, CatBoost, k-nearest neighbors, support vector regression, and a linear model. The workflow was validated in two contrasting systems: one natural lake in Ireland with a relatively undisturbed catchment, and one reservoir in Spain with a more human-influenced catchment. A total of 24 potential drivers were obtained from global reanalysis data and from lake and river process-based modelling. Partial dependence and SHapley Additive exPlanations (SHAP) analyses were conducted for the most influential drivers identified, with soil moisture, soil temperature, and Julian day being common to both study sites. The best prediction was obtained with the CatBoost model (during the hold-out testing period, Irish site: KGE > 0.69, r² > 0.51; Spanish site: KGE > 0.66, r² > 0.54). Interestingly, when using only drivers from globally accessible climate and soil reanalysis data, the prediction capacity was maintained at both sites, showcasing potential for scalability. Our findings highlight the complex interplay of environmental drivers and processes that govern DOM dynamics in lakes, and contribute to the modelling of carbon cycling in aquatic ecosystems.
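The skill scores quoted in the abstract can be illustrated with a minimal sketch. It assumes the original Kling-Gupta efficiency formulation, KGE = 1 - sqrt((r - 1)² + (α - 1)² + (β - 1)²), where r is the Pearson correlation, α the ratio of simulated to observed standard deviations, and β the ratio of simulated to observed means. It also assumes r² is the squared Pearson correlation. The abstract does not state which KGE variant or r² definition the authors used, and the function names and sample values below are hypothetical.

```python
import math
from typing import Sequence


def _mean(x: Sequence[float]) -> float:
    return sum(x) / len(x)


def _std(x: Sequence[float]) -> float:
    # Population standard deviation; the choice cancels out in the ratio alpha.
    m = _mean(x)
    return math.sqrt(sum((v - m) ** 2 for v in x) / len(x))


def pearson_r(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Pearson correlation between observed and simulated series."""
    mo, ms = _mean(obs), _mean(sim)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim)) / len(obs)
    return cov / (_std(obs) * _std(sim))


def kge(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Kling-Gupta efficiency (original formulation, assumed here)."""
    r = pearson_r(obs, sim)
    alpha = _std(sim) / _std(obs)    # variability ratio
    beta = _mean(sim) / _mean(obs)   # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Hypothetical observed and predicted fDOM values, for illustration only.
observed = [10.0, 12.0, 15.0, 11.0, 9.0]
predicted = [9.5, 12.5, 14.0, 11.5, 9.5]
print("KGE:", kge(observed, predicted))
print("r2 :", pearson_r(observed, predicted) ** 2)
```

A perfect prediction gives KGE = 1. Because KGE penalises errors in correlation, variability, and bias separately, a model can have a high r² and still score a low KGE if its predictions are systematically offset or too smooth.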

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Journal article(s) based on this preprint

20 Apr 2026
A machine learning approach to driver attribution of dissolved organic matter dynamics in two contrasting freshwater systems
Daniel Mercado-Bettín, Ricardo Paíz, Valerie McCarthy, Eleanor Jennings, Elvira de Eyto, Angeles M. Gallegos, Mary Dillane, Juan C. Garcia, José J. Rodríguez, and Rafael Marcé
Biogeosciences, 23, 2661–2685, https://doi.org/10.5194/bg-23-2661-2026, 2026

Interactive discussion

Status: closed

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • RC1: 'Comment on egusphere-2025-4049', Thelma Panaïotis, 12 Dec 2025
    • AC1: 'Reply on RC1: Thelma Panaïotis', Daniel Mercado-Bettín, 01 Feb 2026
  • RC2: 'Comment on egusphere-2025-4049', Anonymous Referee #2, 12 Jan 2026
    • AC2: 'Reply on RC2: Anonymous Referee #2', Daniel Mercado-Bettín, 01 Feb 2026

Peer review completion

AR – Author's response | RR – Referee report | ED – Editor decision | EF – Editorial file upload
ED: Reconsider after major revisions (02 Feb 2026) by Bertrand Guenet
AR by Daniel Mercado-Bettín on behalf of the Authors (10 Mar 2026): Author's response, Author's tracked changes, Manuscript
ED: Reconsider after major revisions (13 Mar 2026) by Bertrand Guenet
ED: Referee Nomination & Report Request started (13 Mar 2026) by Bertrand Guenet
RR by Shuo Chen (25 Mar 2026)
ED: Publish as is (31 Mar 2026) by Bertrand Guenet
AR by Daniel Mercado-Bettín on behalf of the Authors (07 Apr 2026): Author's response, Manuscript

Data sets

Data used in the manuscript for the first study site Daniel Mercado-Bettín https://github.com/danielmerbet/driver_attribution_fdom/tree/main/feeagh/data

Data used in the manuscript for the second study site Daniel Mercado-Bettín https://github.com/danielmerbet/driver_attribution_fdom/tree/main/sau/data

Model code and software

Codes used to obtain the results shown in the manuscript Daniel Mercado-Bettín https://github.com/danielmerbet/driver_attribution_fdom/tree/main

Viewed

Total article views: 908 (including HTML, PDF, and XML)
  • HTML: 485
  • PDF: 383
  • XML: 40
  • Total: 908
  • BibTeX: 32
  • EndNote: 42
Views and downloads, including cumulative totals, are calculated since 01 Sep 2025.

Viewed (geographical distribution)

Total article views: 880 (including HTML, PDF, and XML), of which 880 have a defined geography and 0 are of unknown origin.
Latest update: 01 May 2026

The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Short summary
Understanding what shapes lake water quality is vital in a changing world. We studied dissolved organic matter, a key part of lake water quality and the carbon cycle, using machine learning to analyse its environmental drivers and make predictions. Tested at a lake in Ireland and a reservoir in Spain, the approach showed predictive potential, even when relying only on global climate and soil data. This helps explain how land and climate conditions influence freshwater resources. The approach can be reproduced worldwide.