Unsupervised classification identifies coherent thermohaline structures in the Weddell Gyre region
Abstract. The Weddell Gyre is a major feature of the Southern Ocean and an important component of the planetary climate system; it regulates air-sea exchanges, controls the formation of deep and bottom waters, and hosts upwelling of relatively warm subsurface waters. It is characterized by extremely low sea surface temperatures, ubiquitous sea ice formation, and widespread salt stratification that stabilises the water column. Observing the Weddell Gyre is challenging, as it is extremely remote and largely covered with sea ice. At present, it is one of the most poorly-sampled regions of the global ocean, highlighting the need to extract as much value as possible from existing observations. Here, we apply a profile classification model (PCM), which is an unsupervised classification technique, to a Weddell Gyre profile dataset to identify coherent regimes in temperature and salinity. We find that, despite not being given any positional information, the PCM identifies four spatially coherent thermohaline domains that can be described as follows: (1) a circumpolar class, (2) a transition region between the circumpolar waters and the Weddell Gyre, (3) a gyre edge class with northern and southern branches, and (4) a gyre core class. PCM highlights, in an objective and interpretable way, both expected and under-appreciated structures in the Weddell Gyre dataset. For instance, PCM identifies the inflow of Circumpolar Deep Water (CDW) across the eastern boundary, the presence of the Weddell-Scotia Confluence waters, and structured spatial variability in mixing between Winter Water and CDW. PCM offers a useful complement to existing expertise-driven approaches for characterising the physical configuration and variability of the Weddell Gyre and surrounding regions.
Dan(i) Jones et al.
SO-WISE South Atlantic Ocean and Indian Ocean Observational Constraints https://doi.org/10.5281/zenodo.7468655
South Atlantic Ocean profile dataset: identification of near-Antarctic profiles using unsupervised classification https://doi.org/10.5281/zenodo.7465132
Model code and software
so-wise/weddell_gyre_clusters: First release https://doi.org/10.5281/zenodo.7465388
This manuscript applies an unsupervised machine learning method called a profile classification model (PCM) to ocean profile observations in the Weddell Gyre region, with the goal of identifying and classifying areas, or sub-regions, within the Weddell Gyre that share similar temperature and salinity characteristics.
This manuscript clearly highlights how unsupervised classification schemes such as PCM can be powerful tools when applied to poorly sampled regions, as they are able to identify patterns within highly complex data with no user input. Importantly, the manuscript stresses that PCM complements other analysis techniques: it confirms previously known thermohaline structures and sheds new light on more subtle thermohaline patterns within the Weddell Gyre. The authors use PCM to identify and analyze four categories of ocean profiles within the Weddell Gyre: i) a circumpolar class, ii) a transition class, iii) a gyre edge class, and iv) a gyre core class.
This manuscript is clearly written and highlights a powerful yet under-utilized technique in oceanography and climate science. I think this work is very interesting, and I recommend that it be accepted for publication after the following concerns are addressed and several modifications are made.
- The authors stress that one of the benefits of the PCM method is that it identifies “both expected and underappreciated structures” (line 499). However, the results do not make clear which structures are novel, underappreciated, or previously unknown. The manuscript would benefit from a short discussion clarifying which individual PCM results are most important to the research community and which simply confirm already known patterns.
- The description of the training process for the PCM would benefit from more detail. The spatial bias is carefully considered in the training process; however, the data contain significant temporal biases as well. Summer months are more heavily observed, and the number of observations generally increases through time (with spikes in recent years and around 2010). How are the seasonal and interannual sampling biases accounted for in the training process? What impact might they have on the results? Additionally, the manuscript does not state how large the training data set is or what the ‘training’ process looks like for the PCM. Are the final PCM conclusions sensitive to how the model is trained?
- Sections 3.2-3.4, Figures 6/7/8: Are these figures showing the mean of all individual profiles assigned to each class in every spatial grid box? Why do you show the mean for these metrics, yet describe the profile classifications with the median? How sensitive are these metrics or the profile classifications to outliers?
Furthermore, did you look at how the mixed layer depth and the depths of the temperature minimum and maximum vary in space by season? You suggest the patterns represent deep winter convection in the shelf waters, yet the data are averaged over time in each grid box, and the observations are weighted toward summer months. Figure 10 is helpful for understanding the seasonal variability of the classes, but it would be interesting to also analyze the spatial distribution of the depth metrics by season. For example, are the MLD and min/max temperature depths distinct in winter versus summer over the near-coast shelf? Do we even have wintertime observations in those regions with which to identify known wintertime signatures using this method?
- Line 351-351/Figure 12, Part A: The strongest upwelling does appear to be co-located with the circumpolar and transition classes in most of the domain. However, there is some strong upwelling in the far western part of the domain (between 40-60W, south of the SBDY) that does not seem to overlap with the circumpolar or transition class profiles; it seems to overlap more with gyre edge profiles, yet the upwelling here is stronger than the general large-scale gyre class upwelling. Do you have an explanation for why this region seems to be unique?
- Line 351-351/Figure 12, Part B: If more observations existed in the near-coast downwelling region, would you expect to identify an additional ‘near-coastal’ class in this region? Would the exceptionally large seasonal cycle in vertical mixing in the near-coast shelf region impact the results?
- Section 4.6: Figure 14 shows several profiles that lie separately from both the transition class and the gyre core class (in PC space); this grouping is composed primarily of both transition class and gyre core profiles. Is this separation in PC space meaningful? Do the profiles in this grouping share traits that place them together in PC space? For example, are they co-located in space or time within the Weddell Gyre region? Or do they share temperature/salinity traits attributable to specific PCs, so that they cluster in isolation in this PC space yet are assigned to different classes? Does increasing the number of classes used in the PCM change how these ‘isolated’ profiles are categorized?
- Line 584: Please clarify. What process is applied 20 times for each value of K? It seems that applying PCM can produce multiple realistic results (a different answer for any given iteration). Why were 20 iterations chosen? Are the results sensitive to the number of iterations?
- The grey profiles on the figures (for example: Fig 3; Fig 10) are very difficult to see. Recommend a darker shade of grey.
- Line 89: define ENSO
- line 135 – either define PF or spell it out.
- Figure 2a: add units/label to color bar. Figure 2b: There are only 11 bars plotted in the monthly chart.
- Line 189: define WW
- line 245-255- Figure 5 is never referenced
- line 251: the text specifies the ‘mean’, yet the metric given is the median.
- line 290: typo – oC?
- Figure 11: make color bar labels+unit text larger
- line 428 – change ‘lighter’ to ‘less dense’
- Figure A1 caption: typo. quantity shown "is"
- The contents of Appendix A are very difficult to interpret since the methods are not presented until Appendix B. Add references to Eqns B1 and B2 for the AIC and BIC, and refer reader to Appendix B.
- Line 567 and Eqns B1, B2: K is never defined
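To illustrate the kind of sensitivity check asked about in the comments on Line 584 and Appendix A, here is a minimal Python sketch. It is not the authors' code and rests on several assumptions: the data are synthetic, a one-dimensional Gaussian mixture fitted by expectation-maximization stands in for the PCM classifier, K denotes the number of mixture components (classes), and the function names (`log_likelihood`, `fit_gmm_1d`, `bic`, `bic_spread`) are hypothetical. It uses the standard form BIC = p ln(n) - 2 ln(L), with p free parameters, n samples, and maximized likelihood L, which may differ in notation from the manuscript's Eqn B2. For each K, the fit is repeated 20 times from different random initializations and the spread of BIC across repeats is summarized.

```python
import math
import random
import statistics


def log_likelihood(x, weights, means, variances):
    """Return the total log-likelihood and per-point responsibilities."""
    total_ll, resp = 0.0, []
    for xi in x:
        dens = [
            w * math.exp(-0.5 * (xi - m) ** 2 / v) / math.sqrt(2 * math.pi * v)
            for w, m, v in zip(weights, means, variances)
        ]
        total = max(sum(dens), 1e-300)  # guard against log(0) from underflow
        total_ll += math.log(total)
        resp.append([d / total for d in dens])
    return total_ll, resp


def fit_gmm_1d(x, k, seed, n_iter=50):
    """Fit a k-component 1-D Gaussian mixture by EM from a random start."""
    rng = random.Random(seed)
    n = len(x)
    data_var = statistics.pvariance(x)
    means = rng.sample(x, k)  # initial means: k randomly chosen data points
    variances = [data_var] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        _, resp = log_likelihood(x, weights, means, variances)  # E-step
        for j in range(k):  # M-step
            nj = max(sum(r[j] for r in resp), 1e-12)
            means[j] = sum(r[j] * xi for r, xi in zip(resp, x)) / nj
            var_j = sum(r[j] * (xi - means[j]) ** 2 for r, xi in zip(resp, x)) / nj
            variances[j] = max(var_j, 1e-6 * data_var)  # variance floor
            weights[j] = nj / n
    final_ll, _ = log_likelihood(x, weights, means, variances)
    return final_ll


def bic(loglik, k, n):
    """Standard BIC for a 1-D mixture with p = 3k - 1 free parameters."""
    p = 3 * k - 1  # k means + k variances + (k - 1) independent weights
    return p * math.log(n) - 2.0 * loglik


def bic_spread(x, k_values, n_repeats=20):
    """For each K, refit n_repeats times and summarize the BIC values."""
    summary = {}
    for k in k_values:
        scores = [bic(fit_gmm_1d(x, k, seed), k, len(x)) for seed in range(n_repeats)]
        summary[k] = {
            "n": len(scores),
            "min": min(scores),
            "mean": statistics.mean(scores),
            "std": statistics.pstdev(scores),
        }
    return summary


# Synthetic stand-in data: two well-separated groups (not the Weddell Gyre profiles).
data_rng = random.Random(0)
data = [data_rng.gauss(-1.0, 0.3) for _ in range(100)]
data += [data_rng.gauss(2.0, 0.5) for _ in range(100)]

summary = bic_spread(data, k_values=[1, 2, 3])
for k, stats in summary.items():
    print(f"K={k}: mean BIC={stats['mean']:.1f}, std across repeats={stats['std']:.2f}")
```

If the spread across repeats is small relative to the BIC differences between values of K, the choice of K is unlikely to depend on the number of repeats; a large spread would suggest that it might.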