the Creative Commons Attribution 4.0 License.
Groundwater-CO2 Emissions Relationship in Dutch Peatlands Derived by Machine Learning Using Airborne and Ground-Based Eddy Covariance Data
Abstract. Peatlands worldwide have been transformed from carbon sinks to carbon sources due to years of intensive agriculture requiring low water tables. In the Netherlands, carbon dioxide (CO2) emissions from drained peatlands mount up to 5.6 Mton annually and, according the Dutch climate agreement, should be reduced by 1 Mton in 2030. It is generally accepted that mitigation measures should include raising the water level, and the exact influence of water table depth has been increasingly studied in recent years. Most studies do this by comparing annual Eddy Covariance (EC) site-specific CO2 budgets to mean annual effective water table depths (WTDe). However, here we apply a different approach: we integrate measurements from 16 EC towers with EC measurements from 141 flights by a low-flying research aircraft, in an interpretable machine learning framework. We make use of the different strengths of tower and airborne data, temporal continuity and spatial heterogeneity, respectively. We apply time frequency wavelet analysis and a footprint model to relate the measured fluxes to the underlying surface. Using spatio-temporal data, we train and optimize a boosted regression tree (BRT) machine learning algorithm and use Shapley values and various simulations to interpret the model’s outputs. We find that emissions increase with 4.6 tonnes CO2 ha-1 yr-1 (90 % CI: 4.0–5.4) for every 10 cm WTDe up to a WTDe of 0.8 meter. For more drained conditions, emissions decrease again, following an optimum-based curve. Furthermore, we find that this effect is stronger in winter than in summer and that it varies between sites. This study shows the added value of using ML with different types of instantaneous data, and holds potential for future applications.
Status: final response (author comments only)
RC1: 'Comment on egusphere-2025-431', Stefan Metzger, 23 Mar 2025
General Comments
This manuscript presents a compelling and well-structured study that integrates ground-based and airborne eddy covariance (EC) measurements using machine learning to investigate the relationship between water table depth and CO2 fluxes in Dutch peatlands. The research question is posed clearly, and the methodology is meticulously applied. By challenging the traditional assumption that EC measurements are only applicable in uniform landscapes, the authors take a bold and necessary step toward making direct greenhouse gas (GHG) flux measurements accessible for real-world, heterogeneous environments.
The paper is highly relevant to the biogeosciences community, especially in the context of natural resource management, climate-smart agriculture, and ecosystem-scale monitoring. It combines methodological rigor with policy-relevant insights, particularly regarding the quantification of CO2 flux changes from groundwater management practices. The integration of explainable machine learning via Shapley values offers transparency into the model’s predictions, which is an important step for stakeholder trust and regulatory applications for effective policy implementation.
I find the manuscript to have good scientific significance, excellent scientific quality, and good presentation quality. The figures and tables are generally well constructed and the narrative is engaging and logically structured. I recommend publication in Biogeosciences following minor revisions.
Specific Comments
1) Figure 2: Please add a scale bar to clarify the geographic extent of the EC tower network and flight tracks (i.e., are we looking at 10s, 100s, or 1,000s of km?).
2) Figure 3: The ZEG_PT tower is hard to locate even in the inset. Consider increasing font size or adding an arrow pointing to it.
3) Figure 4: Airborne fluxes seem to show lower amplitude compared to tower fluxes. Have the authors considered vertical flux divergence or other systematic differences in data capture and processing? A cluster analysis could help evaluate whether the two platforms yield comparable flux regimes after controlling for diurnal cycle and differing surface heterogeneity in the footprints. Please also see comment 13 below.
4) Table 2: Please include EC tower measurement heights in the text to facilitate comparisons with remote sensing data resolutions. For example, a 250 m MODIS resolution would match a 250 m tall tower (e.g., Xu et al., 2017) [https://doi.org/10.1016/j.agrformet.2016.07.019], whereas shorter towers may be better represented by higher-resolution products such as Sentinel.
5) Table 3: The regional model explains ~60% of the variance, which is not dramatically higher than global models (e.g., Jung et al., 2020 [https://www.biogeosciences.net/17/1343/2020/]). Please discuss this apparent incongruity, given the expected higher information density at the regional scale.
6) Figure 5: The comparison between regional and local models is insightful but largely confirms expectations. Consider moving the detailed discussion and figures to the appendix and condensing the main text.
7) Figure 6: Clarify the sign convention for water table depth (e.g., lower water tables as positive depth vs. negative height). The current color coding and SHAP interpretation may confuse readers without a consistent definition.
8) Figure 7: This figure helps resolve the confusion in Figure 6 by explicitly stating water table depth in cm below ground. Consider using negative values (e.g., -150 to 0 cm) throughout, which would be more intuitive, also for non-subject matter experts.
9) Figure 10: This simulation result is the centerpiece of the manuscript. Please revise the phrase "increased with 10 cm" to "increased by 10 cm" for clarity.
10) Figure 11: The SHAP-derived functional relationships are particularly powerful for integrating direct measurements and nonlinear interactions. Consider extending the outlook to include the potential for assessing CH4 fluxes and albedo effects using the same EC framework, enabling full net radiative forcing (NRF) or CO2e tradeoff evaluations through a consistent methodology.
11) Line 130: Clarify whether the "2 km windows" were applied in a moving window fashion, and if so, specify the step size, degree of overlap, and the impact of those choices on downstream data integration and results.
12) Line 138: Vaughan et al. (2021) and Serafimovich et al. (2018) used the Metzger et al. (2012) [https://doi.org/10.5194/amt-5-1699-2012] footprint model, not Kljun et al. (2015). Please correct.
13) Line 178: Please describe how EC tower data processing (e.g., block averaging, de-spiking, spectral correction, density correction, quality flags) compares with airborne EC data processing (e.g., Wavelet decomposition). This will help readers assess the interoperability of the datasets. Please also see comment 3 above.
14) Line 365 & 528: To my knowledge, Metzger et al. (2022) [https://doi.org/10.5194/amt-14-6929-2021] were the first to use combined airborne and tower EC heat and water flux data in ML applications to optimize flight track placement. Clarify how the current work extends this to in-situ CO2 fluxes.
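The question in the Line 130 comment (moving windows, step size, overlap) can be made concrete with a minimal sketch. It assumes a straight flight leg measured in metres along track; the 500 m step and the function names `moving_windows` and `overlap_fraction` are illustrative assumptions, not values or code from the manuscript.

```python
def moving_windows(leg_length_m, window_m=2000.0, step_m=500.0):
    """Return (start, end) positions in metres of windows along one straight leg.

    window_m = 2000 mirrors the "2 km windows" quoted in the review;
    step_m = 500 is a hypothetical choice used only for illustration.
    """
    if window_m <= 0 or step_m <= 0:
        raise ValueError("window_m and step_m must be positive")
    windows = []
    start = 0.0
    # Keep only windows that fit entirely within the leg.
    while start + window_m <= leg_length_m:
        windows.append((start, start + window_m))
        start += step_m
    return windows


def overlap_fraction(window_m, step_m):
    """Fraction of a window shared with its successor (0 means no overlap)."""
    return max(0.0, 1.0 - step_m / window_m)


# A 4 km leg with a 500 m step versus non-overlapping 2 km blocks.
print(len(moving_windows(4000.0, 2000.0, 500.0)), overlap_fraction(2000.0, 500.0))
print(len(moving_windows(4000.0, 2000.0, 2000.0)), overlap_fraction(2000.0, 2000.0))
```

With overlapping windows, neighbouring flux estimates share most of their raw data and are therefore not independent samples, which is presumably why the step size matters for downstream model training and validation.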
Citation: https://doi.org/10.5194/egusphere-2025-431-RC1
RC2: 'Comment on egusphere-2025-431', Inge Wiekenkamp, 04 Apr 2025
I have read the manuscript entitled “Groundwater–CO2 Emissions Relationship in Dutch Peatlands Derived by Machine Learning Using Airborne and Ground-Based Eddy Covariance Data”, written by Laura M. van der Poel and co-authors, considered for publication in Biogeosciences. The manuscript describes how a very large dataset of eddy covariance data (16 EC towers together with EC measurements from 141 flights) is used in combination with machine learning (ML) models to understand groundwater-CO2 interactions. The study also uses this approach to identify other drivers that can explain the observed patterns in NEE (CO2) fluxes at regional scale.
I think that this study is very well written, uses a very impressive dataset, and applies quite advanced machine learning techniques to study these groundwater-CO2 interactions for larger regions in the Netherlands (regional scale). I think that the research fits the journal Biogeosciences very well, has societal relevance, is timely, and nicely puts its results in perspective relative to other studies (Figure 11).
I strongly believe that this manuscript has high scientific quality and can be accepted for publication after minor revisions based on the following suggestions:
General Comments:
* Captions: I find that some of the figure captions do not cover all relevant information (for example Figure 6) and sometimes miss info on the items that are described and/or on the abbreviations (see also Biogeosciences info on figure captions – “The abbreviations used in the figure must be defined, unless they are common abbreviations or have already been defined in the text”). However, I do want to stress that other captions in the manuscript provide very good and detailed information about the content of the figure (for example Figure 11). I would suggest making sure that all figures have captions with similar content and quality.
* Title of Results Sections: The titles of the results sections are more focused on the results of a particular method and are in general pretty generic. In order to guide the readers to a particular section of the results, I suggest using section titles that better reflect the content of each section. For example, instead of using the title "Simulations" I suggest using something like "CO₂ Flux Simulations" or "CO₂ Flux Simulations and Groundwater Dynamics". Mind that these are just suggestions to illustrate the change in content of such headers.
* Strengthen Key findings: The most important key finding (the groundwater effect on fluxes) is well positioned in the abstract and in the conclusion. Other key findings could be stated more prominently as the main take-aways of their respective results sections, for example the transferability of the models, the process understanding obtained from the ML SHAP analysis (hinting at emissions driven by heterotrophic respiration, and effects of e.g. PAR in Fig. 7), the importance of accurate groundwater information, and the spatial and temporal variability observed between stations. Some unexpected findings could also be discussed more in terms of their implications (e.g. why PeatDepth was ranked as a very important feature but did not show clear effects on fluxes in the SHAP analysis), although I can see that quite a lot of discussion is already provided in the current manuscript.
* Code availability: Information on the code availability is missing in this manuscript. Similar to a data statement, a code availability statement should probably be part of the manuscript. At least to mention what packages are generally available and what information could be accessed on request.
* Conclusion: I think that the abstract clearly states the relevance of the study, also specifically related to governmental decisions: see text – “In the Netherlands, carbon dioxide (CO2) emissions from drained peatlands mount up to 5.6 Mton annually and, according the Dutch climate agreement, should be reduced by 1 Mton in 2030.” I think the best way to end the story is to return in the conclusion to a statement about the governmental plans and the effects of changing groundwater levels. This would really make the story very “round” and would probably be an excellent ending of your manuscript. In the end, I was wondering how much a measure such as changing groundwater levels would help to achieve the planned reductions.
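The closing question of the Conclusion comment can be approximated with back-of-envelope arithmetic that uses only numbers quoted from the abstract: the 1 Mton reduction target and the slope of 4.6 tonnes CO2 ha-1 yr-1 per 10 cm WTDe (90 % CI: 4.0–5.4). The sketch below assumes that the slope applies linearly and uniformly to every hectare, that all affected land stays within the range below 0.8 m WTDe, and that 1 Mton means 10^6 tonnes; the function name and the 10 cm raise are hypothetical. It is an illustration of the reasoning, not a result from the manuscript.

```python
TONNES_PER_MTON = 1_000_000


def area_needed_ha(target_mton, slope_t_per_ha_per_10cm, raise_cm):
    """Area (ha) over which the water table must be raised by raise_cm
    to reach the target reduction, assuming a purely linear response."""
    reduction_per_ha = slope_t_per_ha_per_10cm * raise_cm / 10.0
    return target_mton * TONNES_PER_MTON / reduction_per_ha


# Central estimate and the two CI bounds quoted in the abstract.
for label, slope in [("central", 4.6), ("CI low", 4.0), ("CI high", 5.4)]:
    print(label, round(area_needed_ha(1.0, slope, raise_cm=10.0)), "ha")
```

Under these simplifying assumptions, a 10 cm raise would be needed on roughly 185,000 to 250,000 ha; a larger raise on a smaller area would scale proportionally only as long as the linear range holds.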
Detailed Comments:
1) Abstract, Line 10: “Using spatio-temporal data, we train and optimize a boosted regression tree (BRT) machine learning algorithm …” In this particular sentence it is not clearly mentioned what aim that training has. Please describe (short) what the BRT was trained to simulate.
2) Abstract, Lines 11-12: “We find that emissions increase with 4.6 tonnes CO2 ha-1 yr-1 (90% CI: 4.0-5.4) for every 10 cm WTDe up to a WTDe of 0.8 meter.” Here, the authors formulate the water table depth as a positive number (both for the indicated depth of 0.8 m and for the 10 cm increment). I assume that you are talking about the water table depth below the surface (e.g., -0.8 m) and about an increase in emissions with every 10 cm (negative) increase in water table depth. I suggest making sure that this is clear to all readers when reading the abstract.
3) Abstract, Lines 12-13: “For more drained conditions, emissions decrease again, following an optimum-based curves”. I am not really sure what the authors mean by “optimum-based curves” and would suggest using another wording here, to make sure that all readers understand what this refers to (forgive me if I am wrong, but I think it’s not a widely recognized or standard term). Alternatively, the authors could refer to a paper that describes this concept/wording clearly.
4) Abstract, Lines 13-14: “Furthermore, we find that this effect is stronger in winter than in summer and that it varies between sites.” Does this refer to an increase/decrease in fluxes with an increasing/decreasing water table depth? This was not fully clear to me.
5) Abstract, Line 14: “This study shows the added value of using ML…” The abbreviation ML was not introduced before; please define it here.
6) Reference(s) to Klimaatakkoord, 2019 (several locations in the document, starting with the introduction). In the manuscript, the authors refer to a Dutch website where information is provided about the Dutch climate agreement (klimaatakkoord). I had difficulties finding the information here that was mentioned in the paper and would therefore suggest to directly cite the relevant document (https://www.klimaatakkoord.nl/binaries/klimaatakkoord/documenten/publicaties/2019/06/28/klimaatakkoord/klimaatakkoord.pdf). Moreover, I would suggest to cite the English version of the document, as this is probably more relevant for the international community (https://www.government.nl/binaries/government/documenten/reports/2019/06/28/climate-agreement/Climate+Agreement.pdf). Finally, I assume it would be good to cite using the standard format (authors/organization - Government of the Netherlands) here.
7) Methods, comparison of airborne NEE and tower NEE fluxes for “Friesland”: I generally really like this comparison and like this approach, but I think the following points are important to also mention in the manuscript. (a) As far as I understand, the tower fluxes and airborne fluxes are not calculated in the same way. Airborne fluxes were calculated using the wavelet approach and tower data were calculated probably using a Reynolds decomposition, correct? If this is the case, this probably should be stressed that this is another reason why direct intercomparison is not fully possible. (b) You showed an example for Friesland where the fluxes have similar magnitudes. It would be great if the authors could mention that this was also the case for the other regions. (c) The authors use a circular footprint for the footprints of the towers, and do not use the Kljun et al., (2015) model here. I understand the reason for doing so, but are these circular footprints homogeneous (which is what I would assume and what would “justify” the use of such footprints)?
8) Methods, Line 91-93: “A preliminary analysis we conducted based on a subset of the data with this method showed promising results for combining airborne and tower data, corresponding to existing estimates”. Here, I was not 100% sure which existing estimates this sentence refers to. Do the authors refer to existing estimates of changes in CO2 emissions related to an X cm change in groundwater levels in peatlands (existing literature)? I suggest specifying which estimates are meant here, and perhaps also including references to the existing literature.
9) Figure 2 – I am assuming that you are using the straight lines of the flight tracks for airborne EC flux calculations, and not the turns, correct? I suggest that the figure show only the tracks that were used for calculating fluxes (other tracks could also be shown, but with an indication that these were not used for the flux calculation). I would also suggest adding a scale bar to the figure, to show the extent of the regions more clearly. Additionally, it would be very nice to see how the peatlands are distributed over the regions. Could the authors add peatland information to the (same) map or add another panel that shows this?
10) In the description of the airborne measurements (2.1.2 Airborne Flux Measurements), no explicit information about the sensors is given directly in this part of the section, which is important for comparability/ quality assessment etc. Only at the end of the section, after providing information about the airborne EC processing, a reference to Vellinga et al., (2013) is made: “For more detail on the aircraft and its equipment, see Vellinga et al. (2013).” I propose to refer to this publication directly after the full description of the aircraft and its equipment is given.
11) Methods, Line 131: “Further processing was done by following the framework of Foken et al. (2004).” Here, it is very unclear what further processing is done, and one would need to consult Foken et al. (2004) to find out which specific processing step(s) the authors refer to. I assume this refers to QAQC analyses and would suggest that the authors add this information in one or more sentences to make this clear.
12) Figure 3: I like the idea of the image to show a typical flight leg over a region, and it demonstrates how you use five overlapping footprints for a 2 km region. I also like the relationship to a tower and its footprint. I would, however, improve the text in the caption to make the connection to the manuscript text clearer. I also find the figure in its current state quite busy (a lot of lines and circles) and therefore more difficult to “read”. I am wondering if one could simplify the image and for example leave out the orange lines with the distances between the footprint’s maximum extent and the flight path. One could probably further simplify the footprints by only showing the 80% (?) footprint for each 2 km section (the 5 overlapping footprints) in general, and only show more detail when zooming in.
Other suggestions for improvements to the figure include to (1) again add a scale bar, (2) put either a legend with the meaning of the symbols and color (footprint) next to the figure or add information about the elements that are incorporated and their meaning in more detail to the caption, (3) add a wind rose to the image with average wind direction information for the flight.
I assume that the footprint for the tower (ZEG_PT) is more of a yearly footprint and is not related to the particular wind direction of the “typical flight day” that you are showing here. Please either adjust the figure so that all footprints (including the tower footprint) are specific to that particular example flight, or mention this in a note in the caption.
13) Section 2.1.3. EC Towers: I was wondering if the towers that are used in this study are very similar or different in terms of sensor setup and if the processing of the data was all done in the same way (would be at least good to mention here). I assume they are not fully processed in the same way (one is done via wavelets, other probably using Reynolds decomposition?). Additionally, I think it is important to also mention here how the fluxes are calculated (at least mention how the half-hourly tower data was derived from the raw data). This does not need to be very lengthy, but it would at least be good to mention the software and perhaps some QAQC (e.g. EddyPro, REddyProc, TK3, your own processing pipeline), so one can compare the processing of tower and aircraft data more easily without having to look at the referred report.
14) Figure 4: I personally find the figure a little bit small and not that easy to read. I would therefore propose to increase the size of all the text in the figure. I already think the figure clearly shows the point you are trying to make (that the airborne fluxes and tower fluxes have similar ranges). However, I would suggest to use distinct coloring between tower (one group of colors) and airborne data (a quite distinct and different color), so that it is even easier to see in the graph what the airborne data is showing and what the tower data is showing. One could potentially also add an n below each month showing how much data is available for that particular month.
15) Table 2: This table nicely shows the products that were used for the ML algorithm. I have two small remarks related to this table. (1) Can you provide information about the uncertainty of the variables described with these products (e.g. uncertainty in provided water table depth, height, peat depth, etc.)? This would be something that could be worthwhile to mention as I assume that these supplementary data sources also have info sheets where they provide information about the quality and uncertainty of their products. This would be good to add here. (2) Can you provide official references to the products that you have used and described in this table (often products come with a clear data citation)? Partly, these are provided in the text, but also not completely.
16) Appendix A2 Reclassification soil classes – the classes are provided here in Dutch and are probably not readable for an international audience. I would definitely add an English nomenclature connected to the Dutch names in the “New class” column. Both tables have a caption with the text “Figure A1 and Figure A2”. I propose to change these captions to Table A1 and Table A2.
17) Methods, Lines 188 – 189: “Using the collected information, some additional covariates were calculated, such as effective water table depth (WTDe) based on groundwater level and elevation, the percentage of all peat classes together present in the footprint (AllPeat), peat on sand, and peat on peat. Combining peat depth with WTDe, the peat exposed to air (‘exposed peat depth’) in cm was calculated.” This is very descriptive and does not provide the equations used in this particular case. Either provide a reference to the used equations or write down the equations in this part of the manuscript, so that it’s fully clear how these covariates are calculated.
18) Methods, Line 198: “and by selecting one every four weeks of tower data,” Please consider rephrasing this segment of the sentence.
19) Methods, 2.3, Line 225: “As we assume the underlying processes are the same, …” I think it is important that the authors explain here what processes they refer to (I assume to the physical processes in the peatland regions that steer the NEE fluxes?).
20) Methods, 2.3, Line 226: “However, we expect the Overijssel model to be different, because while the aircraft covers agricultural land, we do not have any agricultural tower sites in that area, such as in the Groene Hart and in Friesland.” Please consider rephrasing the segment “while the aircraft covers agricultural land” and refer to the aircraft flux data instead. I was also wondering why the model would be different if it still includes agricultural fluxes from aircraft data. Can the authors elaborate why this would be different from having similar fluxes from towers with small footprints? Are the aircraft footprints too large to capture a purely “agricultural sites” signal? Do these fluxes provide a less specific groundwater signal?
21) Methods, Sections 2.2 – 2.3: I was a little bit lost as to which features were used in the ML model, especially since section 2.3 already talks about the gapfilling of some of the features (see below), but the text does not go into detail on the features that were considered/obtained after the tuning. While reading further, I understand that this information is provided rather in the results, but a short reference to the considered/used features in this part of the text would also be helpful for the reader (what features were considered in the beginning, before the tuning).
22) Methods, Lines 241 – 243: “As our sites contained gaps in meteorological data, we used publicly available hourly data from Dutch Meteorological Institution (KNMI) and interpolated in time to obtain half-hourly values. We used station Cabauw for the Groene Hart area and station Hoogeveen for the Overijssel and Friesland areas.” Here, it would be important to mention which features you were gapfilling for your ML model (which meteo features did you consider). Please specify. Plus, do you talk here about the use of one “average” temperature, humidity etc. for each of the three regions (and no detailed spatial product, which is what I understood from your text)? Why did you not gapfill the meteorological data from the specific sites? In the segment “interpolated in time to obtain” I assume you refer to a linear interpolation, correct? Please add the information about the interpolation method. On top of that, as mentioned in the previous comment, this segment comes a little bit “out of the blue”, as the features that you are using in your model are not fully described in methods section 2.2 beforehand, but appear more prominently in the results.
23) Methods, Section 2.3 and last sentence: “We let the model predict every half-hour flux and constructed annual NEE balances based on the actual WTDe level, as well as for hypothetical situations where the WTDe is altered by ±10 cm.” In this section and in this segment, it is not clear where you are predicting your half-hourly NEE fluxes for. I assume you predict an average/ regional flux for all 3 regions and for all areas together? It would be very important to clearly specify this here in this part of the manuscript, so that the readers directly get what you are doing.
24) Results, 3.1: The section starts with “In this section we present the model optimization results.” Perhaps this sentence could be elaborated a little (what model? ML-based peatland NEE model - to do what?), so that people that might not read the whole manuscript can jump to this section and directly understand what the manuscript is about.
25) Results, Line 254: “For some soil classes such as pV and hV” Please directly mention in the text what these classes are, so that people can directly read further, without having to look at the Appendix of your paper.
26) Abbreviations ML features: In the manuscript, the features that were used in the ML model to predict the fluxes are often only mentioned as abbreviations. This is totally fine if all abbreviations are explained in the text, which is the case for a large part of the features (such as EVI, NDVI, WTDe), but not for others (PAR, RH etc.). Please be sure to have all of these explained in the manuscript. On top of that, I would suggest giving full names in the captions of relevant figures (for example Figure 6) and in your Appendix (for example Appendix C, Figure C.1). Units (if relevant) for these features would also be good to clearly mention here.
27) Appendix C, Figure C.1: I suggest to make this figure larger in the Appendix to make it easier to read. Also, as mentioned before, I would explain the features in the caption, so that it’s clear to the reader what features they are looking at, without having to search for their meaning elsewhere. I also noticed that some features do not show up for a particular case (airborne/ tower/ merged) and assume this is because of their low correlation score in that particular case. Is that true? If that’s the case, it would be good to mention in the caption.
28) Table Appendix D1: there is a comma (“,”) at the end of the caption. Please replace it with a period (“.”).
29) Table 3: Based on this table’s R² values (bef. vs. aft.), the hyperparameter tuning improved the model only a little bit for most regions. However, the improvement for the Overijssel region is really large. I think this was not really discussed in the manuscript, but it would actually be interesting to see why the hyperparameter tuning improved the model that much for this particular region.
30) Figure 10 and Results Section 3.3 (“Simulations”). First of all, it would be good to mention here, where the “Range in literature” comes from (one or multiple citations). This range is also potentially not always very visible. Probably one could work with a transparency to make the boxplots in the front and the “range in literature” both visible. Perhaps in this case violin plots, instead of boxplots would also give a clearer indication of the distribution of the data.
31) Figure 11: The comparison with other studies is interesting. I think it is also important to consider that these studies all identify different functions based on different datasets, using partly different measurement methods and data originating from different areas (locations). This should probably also be briefly addressed in the discussion.
32) 4.4 Implications for mitigation strategies: I suggest to make a link here to the plans of the Dutch government and what role your study results play here.
Citation: https://doi.org/10.5194/egusphere-2025-431-RC2
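Comment 23 concerns how half-hourly predictions become annual NEE balances for the actual WTDe and for WTDe altered by ±10 cm. A minimal sketch of that aggregation follows. It assumes fluxes are expressed in µmol CO2 m-2 s-1 and WTDe in metres, positive downward; `toy_predict`, the `wtde_m` key and the constant driver series are hypothetical stand-ins and do not represent the authors' trained BRT or their data.

```python
MOLAR_MASS_CO2_G = 44.01       # g per mol
SECONDS_PER_HALF_HOUR = 1800
M2_PER_HA = 10_000
G_PER_TONNE = 1_000_000


def annual_balance_t_per_ha(fluxes_umol_m2_s):
    """Sum half-hourly fluxes (umol CO2 m-2 s-1) into t CO2 ha-1 over the series."""
    grams_per_m2 = sum(
        f * 1e-6 * MOLAR_MASS_CO2_G * SECONDS_PER_HALF_HOUR
        for f in fluxes_umol_m2_s
    )
    return grams_per_m2 * M2_PER_HA / G_PER_TONNE


def scenario_balances(predict, drivers, delta_m=0.10):
    """Balance for the actual WTDe and for WTDe shifted by -delta_m and +delta_m."""
    balances = {}
    for name, shift in (("actual", 0.0), ("shallower", -delta_m), ("deeper", delta_m)):
        # Re-run the predictor with only the water table depth altered.
        fluxes = [predict({**row, "wtde_m": row["wtde_m"] + shift}) for row in drivers]
        balances[name] = annual_balance_t_per_ha(fluxes)
    return balances


def toy_predict(row):
    # Hypothetical linear stand-in for a trained model; NOT the authors' BRT.
    return 2.0 * row["wtde_m"]


# One non-leap year of half-hours (17520 records) at a constant 0.5 m WTDe.
drivers = [{"wtde_m": 0.5} for _ in range(17520)]
print(scenario_balances(toy_predict, drivers))
```

The difference between the "deeper" and "actual" balances is the quantity that would be compared with a slope per 10 cm; whether it is computed per site, per region, or for all areas together is exactly what the comment asks the authors to specify.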