the Creative Commons Attribution 4.0 License.
Detection and Tracking of Carbon Biomes via Integrated Machine Learning
Abstract. In the framework of a changing climate, it is useful to devise methods capable of effectively assessing and monitoring the changing landscape of air-sea CO2 fluxes. In this study, we developed an integrated machine learning tool to objectively classify and track marine carbon biomes under seasonally and interannually changing environmental conditions. The tool was applied to the monthly output of a global ocean biogeochemistry model at 0.25° resolution run under atmospheric forcing for the period 1958–2018. Carbon biomes are defined as regions having consistent relations between surface CO2 fugacity (fCO2) and its main drivers (temperature, dissolved inorganic carbon, alkalinity). We detected carbon biomes by using an agglomerative hierarchical clustering (HC) methodology applied to spatial target-driver relationships, whereby a novel adaptive approach to cut the HC dendrogram based on the compactness and similarity of the clusters was employed. Based only on the spatial variability of the target-driver relationships and with no prior knowledge on the cluster location, we were able to detect well-defined and geographically meaningful carbon biomes. A deep learning model was constructed to track the seasonal and interannual evolution of the carbon biomes, wherein a feed-forward neural network was trained to assign labels to detected biomes. We find that the area covered by the carbon biomes responds robustly to seasonal variations in environmental conditions. A seasonal alternation between different biomes is observed over the North Atlantic and Southern Ocean. Long-term trends in biome coverage over the 1958–2018 period, namely a 10 % expansion of the subtropical biome in the North Atlantic and a 10 % expansion of the subpolar biome in the Southern Ocean, are suggestive of long-term climate shifts. 
Our approach thus provides a framework that can facilitate the monitoring of the impacts of climate change on the ocean carbon cycle and the evaluation of carbon cycle projections across Earth System Models.
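The detection step described in the abstract can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the authors' implementation: the window layout, the helper `fit_driver_coefficients`, the three synthetic regimes, and the fixed cut at three clusters are all assumptions made for the example. In particular, the adaptive dendrogram cut mentioned in the abstract is replaced here by a plain fixed-count cut.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)


def fit_driver_coefficients(drivers, target):
    """Least-squares fit of target ~ drivers + intercept; returns the slopes only."""
    design = np.column_stack([drivers, np.ones(len(drivers))])
    coeffs, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coeffs[:-1]


# Hypothetical regimes: each row holds the fCO2 sensitivities to three
# standardized drivers (temperature, DIC, alkalinity) in one synthetic "biome".
true_slopes = np.array([[1.0, 0.2, -0.1],
                        [0.1, 1.5, -1.2],
                        [-0.5, 0.8, 0.3]])

n_windows, n_samples = 60, 40
regime = np.repeat(np.arange(3), n_windows // 3)  # ground truth, used for checking only

# One regression per spatial window: the fitted slopes are the clustering features.
features = np.empty((n_windows, 3))
for i in range(n_windows):
    drivers = rng.normal(size=(n_samples, 3))
    fco2 = drivers @ true_slopes[regime[i]] + 0.05 * rng.normal(size=n_samples)
    features[i] = fit_driver_coefficients(drivers, fco2)

# Agglomerative clustering with Ward linkage (Euclidean distances between slope vectors).
tree = linkage(features, method="ward")

# Fixed cut into three clusters; a stand-in for the adaptive cut in the abstract.
labels = fcluster(tree, t=3, criterion="maxclust")

print(np.unique(labels, return_counts=True))
```

Because the clustering sees only regression slopes and no coordinates, any geographic coherence of the resulting clusters comes from the target-driver relationships themselves, which is the property the abstract emphasizes.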
Status: final response (author comments only)
RC1: 'Comment on egusphere-2024-1369', Anonymous Referee #1, 25 Jul 2024
The authors employ machine learning techniques to identify and analyze marine carbon biomes over space and time. They apply the tool to the output of a global biogeochemistry model and identify 7 unique biomes globally. Analysis of these biomes and the drivers of each allows for conclusions about seasonal variation at regional scales and how these climatic patterns are shifting over time. This tool will be publicly accessible, providing an important resource for future research to improve analysis of ocean carbon and carbon cycle projections. The paper provides an important scientific tool, but I recommend some edits before it is ready for publication.
My largest comment is that in the abstract, the authors mention observation of a 10% expansion of the subtropical biome in the North Atlantic over time, and a 10% expansion of the subpolar biome in the Southern Ocean. These are very interesting results, worthy of highlighting in the abstract, but I felt they weren’t actually expounded upon enough in the results section. We have one paragraph at the end discussing it, but I was curious about putting it in context a bit more with the impacts of climate change and what these changing biomes could imply for the future. Additionally, I felt there was no real visual representation of these shifts. Is there a way to emphasize or include it more clearly in figure 6, or perhaps even mention it in the figure caption, to allow the reader to absorb this information better?
My second comment concerns clarification of the input data: this was all done using a single model and its output, correct? I think some supplementary discussion of the model's own strengths and weaknesses could be included. I know, for example, that some models have unrealistic mixed layer depths when compared with observations. How would something like this impact these biome patterns? Could there even be a supplementary figure comparing some of this with observations? For example, Figure 7 shows the SST, SSS, and MLD for each biome: how well does this match observations for roughly the same geographical regions as defined by the machine learning biomes? I believe the paper could benefit from a little added discussion about how this method is employed within a model, and how that applies to future research. Does it need to be regenerated with selected observations (if so, what are the base requirements for the observations) or with someone's own model to usefully apply the biomes, or can they use the biomes defined here, and how does that affect research decisions?
Overall, I do recommend this paper for publication, once these edits have been addressed. I believe it is of scientific importance and a useful contribution to the ocean carbon research community, with potential for serving as a baseline tool for future research, and therefore is an important contribution to the field.
Specific notes:
Line 146+: The authors mention that for both fCO2 and DIC, they use natural components rather than contemporary ones. How are these separated? Also, regarding the note “they are substantially similar when using contemporary DIC/fCO2”: does this imply that anthropogenic carbon is not impacting the biomes? I feel this could be explored with a sentence or two here.
Line 151: The authors note they decided to build biomes on target-driver relationships rather than drivers themselves, because it’s better for the methodology. Did they test this, or how do they know this is better?
Line 249+: The authors select January 2009 as the training date for the FNN. They do address the sensitivity of this month selection, and acknowledge the caveats in the discussion, which is both good and necessary. However, they don’t really explain why January 2009 is chosen. What about the year 2009—it’s not in the middle of the analyzed time range, in fact it’s near the end. In addition, why the month of January? I think in the methods, this could be explained with more detail and justification.
Line 370: “Only a couple of years were found to be inconsistent with overall pattern”. Looking at the figures, those years were pretty significantly outside the expected pattern. Any theories on why that might be? What was going on in those years? How did it bounce back so quickly, with no longer-term shifts in the biomes?
Figures 6 & 7: While I know the white box was labeled in a figure, I’d appreciate latitudinal/longitudinal boundaries for the NA and SO regions in both these figure captions.
Line 438: “instead of directly environmental parameters,” I believe a word might be missing in this line.
Line 484: Should be an extra line space between paragraphs
Citation: https://doi.org/10.5194/egusphere-2024-1369-RC1
AC1: 'Reply on RC1', Sweety Mohanty, 14 Sep 2024
Dear Reviewer,
Thank you for your time and effort in reviewing our work. We have attached two documents in the author_response_1.zip: i) author_response_1.pdf contains our detailed response to your comments, and ii) OceanScience_2024_CarbonBiomes_figures_tables.pdf includes a list of new/revised figures and tables.
Yours sincerely and on behalf of all co-authors,
Sweety Mohanty
RC2: 'Comment on egusphere-2024-1369', Anonymous Referee #2, 26 Jul 2024
Mohanty and coauthors present a novel approach using an ocean biogeochemistry model and machine learning algorithms to detect ocean biogeochemical provinces (or biomes) based on the relationship between the sea surface fugacity of CO2 and its environmental drivers. The authors further investigate the temporal evolution of the biomes to detect changes in the fCO2.
I very much enjoyed reading the manuscript and I believe it provides a clever way to simplify a non-trivial question: How does the air-sea CO2 flux (represented here by the fCO2) change over time and what controls this change? I believe there are many applications for this approach; thus, I recommend publication.
I do have, however, a couple of questions and comments that I believe would strengthen the manuscript:
1) The methods section, and in particular Section 2.3 onward, is difficult to read, especially for readers who are not familiar with machine learning. This is the result of the many specific terms used (e.g. "merging at a higher height", "Ward variance", "Euclidean distance", "Ward linkage", ...). To make the methods section more accessible to the wider audience of the journal, I suggest providing less technical text in the main section and moving the required detail and terminology to the appendix.
2) I don't find the argument for the choice of an MLR that convincing. Figure A1 clearly shows that the relationships are not "perfectly linear". Furthermore, the arguments provided on lines 178-180, that the MLR is faster and more easily interpretable, are true only to a certain extent. Using e.g. a simple single-layer FFN instead of the MLR could account for the slight divergence from linearity without compromising on speed. For me, the main argument is interpretability. The single weights of the MLR are easier to interpret and process than the more complex weight matrices of an FFN.
3) This may be a misunderstanding on my end, but I am still puzzled why you need an FFN for the time variation in the biomes. I fully understand the approach and I endorse it, but would you not also get changing biomes by doing the MLR followed by the hierarchical clustering for each month/year separately? Through the changing HC relationships, you would also get changes in the biomes, no?
4) A more general question I had that was not answered in the paper: is your approach, which was designed from a single model, easily adaptable to other models?
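For readers less familiar with the tracking step raised in point 3, the train-once workflow described in the abstract can be sketched as below. This is an illustration on synthetic data only: the cluster centers, the helper `synthetic_month`, the single hidden layer of 16 units, and the use of scikit-learn's `MLPClassifier` are assumptions for the example and are not taken from the manuscript.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(1)

# Hypothetical biome "signatures": mean regression slopes for three biomes.
centers = np.array([[1.0, 0.2, -0.1],
                    [0.1, 1.5, -1.2],
                    [-0.5, 0.8, 0.3]])


def synthetic_month(counts):
    """Return (features, labels) with counts[k] grid cells scattered around centers[k]."""
    labels = np.repeat(np.arange(len(counts)), counts)
    features = centers[labels] + 0.05 * rng.normal(size=(labels.size, 3))
    return features, labels


# Reference month: features plus the biome labels obtained from clustering.
ref_features, ref_labels = synthetic_month([100, 100, 100])

# Small feed-forward network trained once on the reference month.
clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
clf.fit(ref_features, ref_labels)

# Another month in which biome 0 covers more cells and biome 2 fewer.
other_features, _ = synthetic_month([150, 100, 50])
predicted = clf.predict(other_features)

# Fraction of grid cells assigned to each biome label in that month.
coverage = np.bincount(predicted, minlength=3) / predicted.size
print(coverage)
```

One practical difference from re-running the clustering for every month is that a classifier trained once always returns the same label set, whereas independent clustering runs number their clusters arbitrarily and would need an extra matching step before coverage could be compared across months.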
And a couple of smaller things:
.) line 20: please add "annual" to the 25% (the number refers to the present-day uptake rate; historically, over the industrial period, the ocean uptake was larger)
.) line 130: Please provide more detail on how the outlier removal was done.
.) lines 255-270: The architecture of the NN is provided, but with no justification as to why. Have you done some optimization testing (e.g. on the optimal number of neurons), or are these subjective choices?
.) line 471: "personality" is an odd choice of wording
.) lines 485-490 are a repeat from the introduction and can be removed in my view
Citation: https://doi.org/10.5194/egusphere-2024-1369-RC2
AC2: 'Reply on RC2', Sweety Mohanty, 14 Sep 2024
Dear Reviewer,
Thank you for your time and effort in reviewing our work. We have attached two documents in the author_response_2.zip - i) author_response_2.pdf contains our detailed response to your comments and ii) OceanScience_2024_CarbonBiomes_figures_tables.pdf includes a list of new/revised figures and tables.
Yours sincerely and on behalf of all co-authors,
Sweety Mohanty
Viewed
- HTML: 284
- PDF: 113
- XML: 26
- Total: 423
- BibTeX: 12
- EndNote: 19