the Creative Commons Attribution 4.0 License.
Unsupervised classification of the Northwestern European seas based on satellite altimetry data
Abstract. From generating metrics representative of a wide region to saving costs by reducing the density of an observational network, the reasons to split the ocean into distinct regions are many. Traditionally, this has been done somewhat arbitrarily, using the bathymetry and potentially some artificial latitude/longitude boundaries. We use an ensemble of Gaussian Mixture Models (GMM, unsupervised classification) to separate the complex northwestern European coastal region into classes based on sea level variability observed by satellite altimetry. To reduce the dimensionality of the data, we perform a principal component analysis on 25 years of observations and use the spatial components as input for the GMM. The number of classes or mixture components is determined by locating the maximum of the silhouette score and by testing several models. We use an ensemble approach to increase the robustness of the classification and to allow the separation into more regions than a single GMM can achieve. We also vary the number of empirical orthogonal function maps (EOFs) and show that more EOFs result in a more detailed classification. With three EOFs, the area is classified into four distinct regions delimited mainly by bathymetry. Adding more EOFs results in further subdivisions that resemble oceanic fronts. To achieve a more detailed separation, we use a model focused on smaller regions, specifically the Baltic Sea, North Sea, and the Norwegian Sea.
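As a rough illustration of the pipeline described in the abstract, the following sketch runs an EOF decomposition on a synthetic (time × grid point) anomaly matrix and fits a single Gaussian Mixture Model to the leading EOF maps. It assumes scikit-learn's `GaussianMixture` and random stand-in data; the array shapes, the number of EOFs, and the number of classes are illustrative choices, not values taken from the authors' code.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Synthetic stand-in for sea level anomalies: 300 time steps x 500 grid points.
# Two groups of grid points follow different temporal signals plus noise.
n_time, n_points = 300, 500
signal_a = rng.standard_normal(n_time)
signal_b = rng.standard_normal(n_time)
in_group_a = np.arange(n_points) < 250
sla = np.where(in_group_a, signal_a[:, None], signal_b[:, None])
sla = sla + 0.3 * rng.standard_normal((n_time, n_points))

# EOF analysis via SVD of the time-centred data matrix.
anomalies = sla - sla.mean(axis=0)
_, singular_values, vt = np.linalg.svd(anomalies, full_matrices=False)
explained = singular_values**2 / np.sum(singular_values**2)

# Each row of vt is one EOF map; the leading maps become the features
# (one feature vector per grid point) that are passed to the GMM.
n_eofs = 3
features = vt[:n_eofs].T  # shape (n_points, n_eofs)

# Fit one GMM (a single ensemble member) and read off hard and soft labels.
gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0)
labels = gmm.fit_predict(features)
posteriors = gmm.predict_proba(features)  # class probability per grid point
```

An ensemble in the sense of the abstract would repeat the last step with different initialisations and combine the resulting `posteriors`.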
Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Journal article(s) based on this preprint
Interactive discussion
Status: closed
-
RC1: 'Comment on egusphere-2023-1468', Anonymous Referee #1, 05 Sep 2023
Review of “Unsupervised classification of the Northwestern European seas based on satellite altimetry data” by Poropat et al., 2023
In this work, Poropat and colleagues use a Gaussian Mixture Model (GMM) to identify coherent regions of sea-level variability in the Northwestern European Seas. They show how the number of EOFs and the number of classes (mixtures) in the models are important parameters that can result in different patterns, but the main classification remains the same, showing the robustness of their method. The work is focused on the method itself, and I personally missed a bit more discussion into what the identified patterns could actually mean. Nonetheless, I believe this is an important work and a good addition to the scientific community, which could be the base for more process-based studies in the future.
Major comments
- Number of EOFs X Number of Classes: The results from Section 3.1 are very interesting. But for me it wasn’t clear if the different classifications come from adding more EOFs or from changing the number of classes. The authors used a “non subjective” way to choose the number of classes, which is important for several reasons. But I do wonder whether the results from Figure 3 would be similar if they fixed the number of EOFs and changed only the number of classes. Was this tested? Throughout Section 3.1, the authors present the results as an effect of adding more EOFs, but they could just be due to adding more classes. Testing the classification for a fixed number of EOFs while changing the number of classes would make their results and discussion more robust.
- Literature & Discussion: I missed a “discussion” section. I don’t think it’s reasonable to ask the authors to add an entire discussion section, but in some places where the identified features are described it would be a good addition to refer to papers that could bring some insight into the processes behind these patterns (e.g., Mangini et al., 2021; Hermans et al., 2020; Frederikse & Gerkema, 2018; Chafik et al., 2023; Calafat et al., 2013, among others). Also, I would expect the authors to acknowledge the works of Thompson & Merrifield (2014) and of Camargo et al. (2023). Both works performed a classification of ocean regions based on sea level data and seem relevant for the present work. The ocean regions from Thompson & Merrifield have been widely used in sea level studies. Camargo et al. (2023) used two classification methods to identify coherent regions of sea level variability. One of them was SOM, which Poropat et al. mention in the introduction, and hence acknowledging this work there seems fitting.
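The test suggested in Major comment 1 (fix the number of EOFs and vary only the number of classes) could be organised as a small grid sweep. The sketch below is hypothetical and uses synthetic features; `eof_maps`, the tested ranges, and the helper `silhouette_grid` are assumptions for illustration and are not part of the authors' code.

```python
import numpy as np
from sklearn.metrics import silhouette_score
from sklearn.mixture import GaussianMixture


def silhouette_grid(eof_maps, n_eofs_list, k_list, seed=0):
    """Return {(n_eofs, k): mean silhouette score} for every combination.

    eof_maps has shape (n_points, n_eofs_total); column j is the j-th EOF map.
    """
    scores = {}
    for n_eofs in n_eofs_list:
        features = eof_maps[:, :n_eofs]
        for k in k_list:
            gmm = GaussianMixture(n_components=k, random_state=seed)
            labels = gmm.fit_predict(features)
            # The silhouette score needs at least two populated classes.
            if len(np.unique(labels)) < 2:
                scores[(n_eofs, k)] = float("nan")
            else:
                scores[(n_eofs, k)] = silhouette_score(features, labels)
    return scores


# Synthetic stand-in for EOF maps: 4 blobs in a 5-dimensional feature space.
rng = np.random.default_rng(1)
centres = rng.uniform(-5, 5, size=(4, 5))
eof_maps = np.vstack([c + 0.2 * rng.standard_normal((100, 5)) for c in centres])

n_eofs_list = [3, 5]
k_list = [2, 3, 4, 5, 6]
grid = silhouette_grid(eof_maps, n_eofs_list, k_list)

# Holding n_eofs fixed isolates the effect of the number of classes.
for n_eofs in n_eofs_list:
    best_k = max(k_list, key=lambda k: grid[(n_eofs, k)])
    print(f"n_eofs={n_eofs}: best K by silhouette = {best_k}")
```

Reading the grid along one row (fixed `n_eofs`, varying K) and along one column (fixed K, varying `n_eofs`) separates the two effects the comment asks about.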
Minor comments
- L47-50: Isn’t this true for other classification/clustering methods also? Once clusters are identified, they can be transformed into a mask to isolate regions…
- L93-94: It wasn’t clear to me whether it’s common to use EOFs as input for GMMs, or whether this was a “novel” approach that the authors found to reduce noise. It would be good to know in both cases.
- L101-102: Just a comment, but this is also true for SOM.
- L124: How would the mean values give information about processes associated? I can see that the classification will tell you about the dominant EOFs, but the part about which process, it would come from your interpretation of the results, no?
- L133: Add a reference here to ‘silhouette score’.
- L146: Reference for soft voting.
- Maybe add to the methods section which class number Ks are tested.
- Fig 3: Did you test whether using a higher K value with the lower numbers of EOFs would give a similar result? That is, using K=10 for all the EOF combinations. I understand that the K number was chosen by the silhouette score, but this test could further confirm whether your results depend on the number of EOFs or on the K number (see Major Comment 1).
- L202: It splits only in 4 classes, because of the K number, not because of the EOF number per se. (see previous comment and Major Comment 1).
- L208-210: Could this be an indication that you would need one more class to better represent your region? I.e., if you had k=5, then this border might be uniquely classified? (maybe not, because this border remains “difficult” in all other cases). So it might be a hint for an underlying mechanism in this region (for example, see Chafik et al (2023) and Calafat et al (2013))?
- L225: This can also be just because you have too many classes, not necessarily too many EOFs.
- L229-231: It’s not only bathymetry, but the fact that different processes dominate each of those regions. From a sea level perspective: deeper waters have a significant steric expansion, which is not present in shallow seas. Shallow seas, specifically the North Sea, are strongly influenced by winds, which is much less the case in the open ocean. From a physical oceanographic perspective, other processes become important.
- L256: And which was the recommended number of classes according to the silhouette score? I think it would be good to have these results in the supplementary material, so that readers can see for themselves the difference between a K number that “works better” and one that doesn’t.
- L268-270: Some papers come to mind when reading these lines: Mangini et al (2021) and Hermans et al (2020;2022)
- Figure 5: I didn’t fully understand Figure 5, especially columns b to d. It may be ignorance on my side, but I think it’s worth adding a bit more explanation, since other readers might be confused as well. The first column is clear, as it shows in each row the first 7 EOFs. But the next three columns weren’t so clear to me. At each row the classification changes, but the number of EOFs should be the same for the entire column, so what exactly is changing in each row was not clear to me. The way I interpreted it is that at each row you added one more EOF to your classification, so the first row had only 1 EOF for the three models (k=10, k=6, k=1), the second 2 EOFs, and so on… but I’m not sure if that’s the correct interpretation.
- L312: Can you give an example here of a novel idea about the spatial coherence your balance highlighted? (I know you discussed the identified features previously, but quite some of them seemed like you “expected” them…so would be nice to have an example here about a novel spatial structure shown by the GMMs).
- L321-323: And what is the significance of this “spread”? More variability in those classes?
- L351-358: This is just my opinion, so not a “requirement” as a reviewer. This entire paragraph describes characteristics of spatial pattern classification methods in general. Most of it would also be true for SOM and K-means, for example. And it doesn’t seem to be the main take-away message of your article, but just characteristics of GMM. I would suggest ending with a stronger message about your study specifically.
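Soft voting (see the comment on L146) can be sketched as averaging the class membership probabilities of the ensemble members and assigning each grid point to the class with the highest mean probability. This is a generic illustration, not the authors' implementation: the function name `soft_vote` is made up here, and the sketch assumes the class labels of all members have already been matched to one another, which is a separate step for mixture models because the component order is arbitrary.

```python
import numpy as np


def soft_vote(member_probs):
    """Combine ensemble members by averaging their class probabilities.

    member_probs: array of shape (n_members, n_points, n_classes), where
    classes are assumed to be aligned across members (an assumption here).
    Returns hard labels and the mean probability of the winning class.
    """
    mean_probs = np.mean(member_probs, axis=0)  # (n_points, n_classes)
    labels = np.argmax(mean_probs, axis=1)
    likelihood = np.max(mean_probs, axis=1)
    return labels, likelihood


# Three members, two grid points, two classes.
probs = np.array([
    [[0.9, 0.1], [0.4, 0.6]],
    [[0.8, 0.2], [0.6, 0.4]],
    [[0.7, 0.3], [0.3, 0.7]],
])
labels, likelihood = soft_vote(probs)
```

In this toy case the members agree on the first point and disagree on the second, so the second point receives a lower winning probability. Low values of this kind are one way to flag borders that remain "difficult" to classify.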
Technical/editorial comments
- L18: “so” – suggest changing it for “thus” or “therefore”, to avoid repetition (L16), and less colloquial also.
- L38: I’m not sure if you can/should start a sentence with “therefore”.
- L80-86: you repeat “While” three times in these lines. Suggest to modify a bit to avoid repetition.
- L158: Add a comma after voting.
- L246: Referring here to Figure 5a was a bit bothersome for me, and I’m not sure if it’s necessary. I went down to check it, and then got a bit lost in the text.
- L334: Suggest adding “(classes)” after “mixture components”.
- Section 4: This is a “summary” not a “conclusion”.
References
- Calafat, F. M., Chambers, D. P., and Tsimplis, M. N. (2013), Inter-annual to decadal sea-level variability in the coastal zones of the Norwegian and Siberian Seas: The role of atmospheric forcing, J. Geophys. Res. Oceans, 118, 1287–1301, doi:10.1002/jgrc.20106.
- Camargo, C. M. L., Riva, R. E. M., Hermans, T. H. J., Schütt, E. M., Marcos, M., Hernandez-Carrasco, I., and Slangen, A. B. A.: Regionalizing the sea-level budget with machine learning techniques, Ocean Sci., 19, 17–41, https://doi.org/10.5194/os-19-17-2023, 2023.
- Chafik, L., Nilsson, J., Rossby, T., & Kondetharayil Soman, A. (2023). The Faroe-Shetland Channel Jet: Structure, variability, and driving mechanisms. Journal of Geophysical Research: Oceans, 128, e2022JC019083. https://doi.org/10.1029/2022JC019083
- Frederikse, T. and Gerkema, T.: Multi-decadal variability in seasonal mean sea level along the North Sea coast, Ocean Sci., 14, 1491–1501, https://doi.org/10.5194/os-14-1491-2018, 2018.
- Hermans, T. H. J., C. A. Katsman, C. M. L. Camargo, G. G. Garner, R. E. Kopp, and A. B. A. Slangen, 2022: The Effect of Wind Stress on Seasonal Sea-Level Change on the Northwestern European Shelf. J. Climate, 35, 1745–1759, https://doi.org/10.1175/JCLI-D-21-0636.1.
- Hermans, T. H. J., Le Bars, D., Katsman, C. A., Camargo, C. M. L., Gerkema, T., Calafat, F. M., et al. (2020). Drivers of interannual sea level variability on the northwestern European shelf. Journal of Geophysical Research: Oceans, 125, e2020JC016325. https://doi.org/10.1029/2020JC016325
- Mangini, F., Chafik, L., Madonna, E., Li, C., Bertino, L. and Nilsen, J.E.Ø., 2021. The relationship between the eddy-driven jet stream and northern European sea level variability. Tellus A: Dynamic Meteorology and Oceanography, 73(1), p. 1886419. https://doi.org/10.1080/16000870.2021.1886419
- Thompson, P. R. and Merrifield, M. A.: A unique asymmetry in the pattern of recent sea level change, Geophys. Res. Lett., 41, 7675–7683, https://doi.org/10.1002/2014GL061263, 2014.
Citation: https://doi.org/10.5194/egusphere-2023-1468-RC1 -
AC1: 'Reply on RC1', Lea Poropat, 16 Nov 2023
We thank Referee #1 for their effort in reviewing our manuscript and for their positive evaluation. Their comments were invaluable for improving our manuscript. We respond to each of Referee #1's questions and comments in the attached PDF file.
-
RC2: 'Comment on egusphere-2023-1468', Anonymous Referee #2, 20 Sep 2023
Dear authors, congratulations on the work performed.
The manuscript that you presented includes a really interesting and robust technique (GMM) that was used for oceanic regionalisation considering SSH and presenting accurate results. Additionally, I find this technique really promising because it can be applied considering other variables (SST, currents, SSS, chlorophyll concentration, turbidity, ...) from different databases (in situ, remote sensing, numerical models).
The manuscript is really well written, in good English, and easy to follow and to understand. The figures are clear and necessary, the conclusions are in line with the obtained results, and the references are up-to-date.
I have some comments that I expect could help to improve the manuscript.
The major comment that I have is that I missed an explanation of why these regions were split (Figures 3 and 4) and whether they correspond with bathymetry/hydrodynamic characteristics. The technique is really good and the results are promising, but I miss here a little about the physics of the regions, justifying why the classification method selected those regions and which specificities each one of them has that differ from the others, reinforcing and validating the obtained results.
Other comments:
Line 80: I would like to ask the authors why they selected the period 1995 - 2019. The authors explained why they start in 1995, but not why they end in 2019. The selected database is now available until August 2022. I agree with using entire years, and hence with not considering 2022. But why did the authors not consider 2020 and 2021?
Line 84: The authors mentioned that "We also remove the seasonal cycle by subtracting the climatology calculated from the 25 years of data in order to focus on the non-seasonal variability.". And what about the trend? Is it maintained? If yes, could the existence of different trends in the study regions affect the classification?
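The preprocessing quoted in this comment, and the question about the trend, can be illustrated with a short NumPy sketch. It assumes monthly data covering whole years and removes a least-squares linear trend per grid point as one possible way to handle the trend; whether the preprint detrends the data is not stated in this comment, and the function names below are hypothetical.

```python
import numpy as np


def remove_climatology(sla):
    """Subtract the mean seasonal cycle from monthly data.

    sla: array of shape (n_months, n_points), with n_months a multiple of 12
    (whole years are assumed).
    """
    n_months, n_points = sla.shape
    by_year = sla.reshape(n_months // 12, 12, n_points)
    climatology = by_year.mean(axis=0)  # (12, n_points)
    return (by_year - climatology).reshape(n_months, n_points)


def remove_linear_trend(sla):
    """Subtract a least-squares linear trend at each grid point."""
    t = np.arange(sla.shape[0], dtype=float)
    design = np.column_stack([t, np.ones_like(t)])
    coeffs, *_ = np.linalg.lstsq(design, sla, rcond=None)
    return sla - design @ coeffs


# 25 years of synthetic monthly data at 4 grid points with different trends.
rng = np.random.default_rng(2)
months = np.arange(25 * 12)
seasonal = np.sin(2 * np.pi * months / 12)[:, None]
trend = 0.003 * months[:, None] * np.array([1.0, 2.0, 0.5, 1.5])
sla = seasonal + trend + 0.1 * rng.standard_normal((months.size, 4))

deseasoned = remove_climatology(sla)
residual = remove_linear_trend(deseasoned)
```

Running the EOF analysis on `deseasoned` and on `residual` and comparing the resulting classes would be one way to check whether regional trend differences influence the classification.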
Line 133: For the Silhouette coefficient I recommend including a reference. Here I suggest one, but it could be another one. Filaire, T., 2018. Clustering on mixed type data, a proposed approach using R. https://towardsdatascience.com/clustering-on-mixed-type-data-8bbd0a2569c3
Line 136: What is the mean S? I don't understand very well how the S score was used. In my understanding, to define the number of components (K), S is normally computed for a range of values starting from K=2, and the K that gives the best S value is then selected. Please, if possible, include a more detailed explanation here.
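For reference, the standard definition of the silhouette coefficient (Rousseeuw, 1987) is given below; it is the textbook form and is not taken from the preprint. Here a(i) is the mean distance from sample i to the other members of its own class, b(i) is the smallest mean distance from sample i to the members of any other class, and the mean S is the average over all N samples. Under this definition, the K with the highest mean S is the usual choice.

```latex
s(i) = \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}}, \qquad -1 \le s(i) \le 1,
\qquad
\bar{S} = \frac{1}{N} \sum_{i=1}^{N} s(i)
```

Values of s(i) near 1 indicate a sample that sits well inside its class, values near 0 a sample on the border between two classes, and negative values a sample that is closer on average to another class.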
Line 193: This part is not really clear to me. I understand that 11 EOFs represent 85% of the variability. But why consider 11 EOFs when the Silhouette coefficient for this model presents the lowest values of all the run models (Figure 2)?
Line 223: "The likelihood for the classification in the southern part of the North Sea is also significantly reduced, suggesting that the models struggle to properly classify this region, possibly because this many principal components introduce a lot of noise.". Could it not also be related to the value of the S score, which is the lowest of all the run models?
Figure 4: The manuscript did not include an explanation of why those K were selected. Additionally, the S score is not presented. I recommend adding this information.
Line 255: "Note that this number of classes is not chosen with the silhouette score.". If the S score is not used in this region, why did the authors choose 4 classes?
Figure 5 is somewhat confusing. Can you please add more explanation to make it fully understandable?
Citation: https://doi.org/10.5194/egusphere-2023-1468-RC2 -
AC2: 'Reply on RC2', Lea Poropat, 16 Nov 2023
We thank Referee #2 for their effort in reviewing our manuscript and for their positive evaluation. Their comments were invaluable for improving our manuscript. We respond to each of Referee #2's questions and comments in the attached PDF file.
Model code and software
GMM ensemble code Lea Poropat https://github.com/leapor/GMMensemble/blob/main/GMMensemble.ipynb
Viewed
- HTML: 304
- PDF: 113
- XML: 28
- Total: 445
- BibTeX: 19
- EndNote: 19
Dan(i) Jones
Simon D. A. Thomas
Céline Heuzé