the Creative Commons Attribution 4.0 License.
PPCon 1.0: Biogeochemical Argo Profile Prediction with 1D Convolutional Networks
Abstract. Effective observation of the ocean is vital for studying and assessing the state and evolution of the marine ecosystem, and for evaluating the impact of human activities. However, obtaining comprehensive oceanic measurements across temporal and spatial scales and for different biogeochemical variables remains challenging. Autonomous oceanographic instruments, such as Biogeochemical (BGC) Argo profiling floats, have helped expand our ability to obtain subsurface and deep-ocean measurements, but measuring biogeochemical variables such as nutrient concentration still remains more demanding and expensive than measuring physical variables. Therefore, developing methods to estimate marine biogeochemical variables from high-frequency measurements is very much needed. Current Neural Network (NN) models developed for this task are based on a Multilayer Perceptron (MLP) architecture, trained over punctual pairs of input-output features. However, MLPs lack awareness of the typical shape of biogeochemical variable profiles they aim to infer, resulting in irregularities such as jumps and gaps when used for the prediction of vertical profiles. In this study, we evaluate the effectiveness of a one-dimensional Convolutional Neural Network (1D CNN) model to predict nutrient profiles, leveraging the typical shape of vertical profiles of a variable as a prior constraint during training. We will present a novel model named PPCon (Predict Profiles Convolutional), which is trained over a dataset containing BGC Argo float measurements, for the prediction of nitrate, chlorophyll and backscattering (bbp700), starting from the date, geolocation, temperature, salinity, and oxygen. The effectiveness of the model is then accurately validated by presenting both quantitative metrics and visual representations of the predicted profiles. Our proposed approach proves capable of overcoming the limitations of MLPs, resulting in smooth and accurate profile predictions.
Status: closed
-
RC1: 'Comment on egusphere-2023-1876', Anonymous Referee #1, 24 Nov 2023
The paper demonstrates the application of a convolutional neural network (coupled to an MLP) to predict nitrate, chlorophyll and backscattering (bbp700) from temperature, salinity, and oxygen as well as time and geographic coordinates. This work is an improved version of a previous work by the same authors published in 2023 in Applied Sciences (https://doi.org/10.3390/app13095634).
A. My main comment regarding this paper: one should either go into a more detailed comparison with the previously published method, for example by showing some example profiles where the two methods provide different answers and discussing them, or compare the method to another reconstruction technique altogether. So far the comparison is mostly limited to the RMSE. In all cases, one should use the same training/test/validation dataset (I assume that this is already the case, please confirm).
B. In general the structure of the neural network is rather “unorthodox”. It would be good to justify the approach (in particular the succession of convolutional layers; see below).
C. MLP vs CNN: the description is a bit shallow, as mathematically a CNN is a special case of an MLP in the sense that a convolution operation is a special case of a matrix multiplication. CNNs assume translation invariance as additional prior information, which is why they can outperform MLPs (when considering multiple channels, a 1x1 convolution is actually an MLP over the channel dimension).
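Point C can be illustrated with a small NumPy sketch. It is not taken from the manuscript: the kernel values, array sizes, and function names are assumptions chosen only for illustration. It checks that a 1D convolution (as used in CNN layers, i.e. without kernel flipping) equals multiplication by a banded weight matrix, and that a 1x1 convolution equals the same dense map applied independently at every depth level.

```python
import numpy as np

def conv1d_valid(x, k):
    """Slide kernel k over x without padding (cross-correlation, as in CNN layers)."""
    n, m = len(x), len(k)
    return np.array([np.dot(x[i:i + m], k) for i in range(n - m + 1)])

def conv_matrix(k, n):
    """Banded matrix W such that W @ x equals conv1d_valid(x, k) for len(x) == n."""
    m = len(k)
    W = np.zeros((n - m + 1, n))
    for i in range(n - m + 1):
        W[i, i:i + m] = k  # every row holds the same (shared) weights, shifted by one
    return W

rng = np.random.default_rng(0)
x = rng.normal(size=10)                # hypothetical 10-level profile
k = np.array([0.25, 0.5, 0.25])        # hypothetical kernel of size 3
W = conv_matrix(k, len(x))
same_as_dense = np.allclose(W @ x, conv1d_valid(x, k))

# 1x1 convolution over channels: one dense map reused at every depth level.
channels = rng.normal(size=(7, 10))    # (in_channels, depth)
mix = rng.normal(size=(3, 7))          # (out_channels, in_channels)
out_1x1 = mix @ channels               # all depth levels at once
per_depth = np.stack([mix @ channels[:, d] for d in range(10)], axis=1)
same_as_pointwise = np.allclose(out_1x1, per_depth)
```

The banded matrix has only three distinct non-zero values, whereas an unconstrained dense layer of the same shape would have 80 free weights; this weight sharing is the prior that the comment refers to.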
Other comments (mostly minor ones):
“Punctual”: Should it not be “pointwise”? It seems that punctual only has the meaning of on-time (https://dictionary.cambridge.org/no/ordbok/engelsk/punctual).
Line 65: “This originates from the fact that MLPs are trained on individual data points and provide pointwise outputs, which makes the generation of regular profiles challenging as the NN does not take into account the vertical neighbors of predicted variables.”
This description is a bit too simple. This all depends on how the MLP is trained. One could (theoretically) also feed a profile and get a profile in return. For long profiles, however, this would be computationally prohibitive.
Line 105: “we only selected profiles that were marked with quality flags (QFs) of 1, 2, or 8 for variables such as temperature, salinity, nitrate, and chlorophyll.” Please also provide the labels associated with these classes.
Table 1: I think you should mention that 32 is the batch size in the caption. Some authors omit the batch size in such tables as it is considered an adjustable hyperparameter.
Line 133: “the sampling date (specifically year and day), geolocation, and geographic coordinates (latitude and longitude)”:
What is geolocation if not geographic coordinates ?
Line 121: “MLPs employed to transform punctual data into a vectorial shape - necessary for the training of the convolutional component.”
Why did you not consider the more obvious approach of simply repeating the sampling date and coordinates along the depth dimension? After all, all values in a profile are considered to be at the same time and date. With the depth information as an additional input, the NN can specialize on depth and thus become an entirely convolutional neural network.
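The alternative suggested here can be sketched in a few lines of NumPy. This is an illustration only, not the authors' implementation: the function name, the sample values, and the channel order are assumptions, and the profile length of 200 is taken from the 1x200 vectors mentioned elsewhere in the discussion.

```python
import numpy as np

DEPTH_LEVELS = 200  # assumed profile length (1x200 vectors)

def build_input(lat, lon, year, day, temperature, salinity, oxygen):
    """Stack replicated point-wise inputs and measured profiles into a (7, 200) array."""
    scalars = np.array([lat, lon, year, day], dtype=float)
    # Repeat each scalar along the depth axis: shape (4, DEPTH_LEVELS).
    tiled = np.repeat(scalars[:, None], DEPTH_LEVELS, axis=1)
    profiles = np.stack([temperature, salinity, oxygen])  # shape (3, DEPTH_LEVELS)
    return np.concatenate([tiled, profiles], axis=0)

# Hypothetical profile values, for shape checking only.
temperature = np.linspace(20.0, 13.5, DEPTH_LEVELS)
salinity = np.linspace(38.0, 38.7, DEPTH_LEVELS)
oxygen = np.linspace(230.0, 180.0, DEPTH_LEVELS)
x = build_input(42.5, 7.8, 2018, 135, temperature, salinity, oxygen)
```

A depth channel could be appended in the same way, giving eight channels. No trainable parameters are involved in this step, so the convolutional layers alone would have to learn how to use the constant channels.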
Section 3.2 and table 2:
The architecture of the convolutional layers is rather surprising. In a UNet or convolutional autoencoder one would expect to have first all the convolutional layers (with a stride of 2 or with pooling layers) followed by all the deconvolution layers to get back to the original depth dimension. In this paper they are mixed. For the convolutional layers you use kernel sizes of 2, 3 and 4. Can you explain why you use different sizes?
Line 175: “Second, to mitigate overfitting phenomena, a regularization term known as λ-regularization is employed, which penalizes complex curves in proportion to the square of the model’s weights (Zou and Hastie (2005))”
There is no square in equation (2). Can you clarify if you use L1 or L2 regularization?
Equation 2: I think you forgot the alpha coefficient in this equation. Also consider using a different symbol, as alpha is already used for something different in equation (1).
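For reference, the two penalties the question distinguishes can be written as follows. Equation (2) of the manuscript is not reproduced in this discussion, so the symbols here ($\mathcal{L}_{\mathrm{data}}$ for the data-fit term, $\lambda$ for the regularization coefficient, $w_i$ for the network weights) are assumptions and not the authors' notation.

```latex
\mathcal{L}_{\mathrm{L2}} = \mathcal{L}_{\mathrm{data}} + \lambda \sum_i w_i^{2},
\qquad
\mathcal{L}_{\mathrm{L1}} = \mathcal{L}_{\mathrm{data}} + \lambda \sum_i \lvert w_i \rvert
```

Only the L2 form is "in proportion to the square of the model's weights". The cited reference, Zou and Hastie (2005), introduces the elastic net, which combines both terms, so the citation alone does not settle which penalty was used.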
Line 193: “validation sets were chosen as 80, 10, and 10.” Add %
Line 203: “Adadelta (Zeiler (2012)) is the algorithm that is selected as the optimizer for training the network due to its ability to dynamically adapt over time using only first-order information”:
Can you be more specific about what “first-order information” means here?
Table 3: please replace $1e^-7$ by $10^{-7}$. In equations, $e$ denotes Euler's number, so $e^{-7}$ would be $\exp(-7)$ (also in Tables 6, 7, 8).
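The size of the notational difference can be checked with a short Python snippet; the variable names are illustrative only.

```python
import math

scientific = 1e-7             # programming shorthand for 10**-7
euler_power = math.exp(-7)    # e**-7, roughly 9.1e-4

# The two readings differ by about four orders of magnitude.
ratio = euler_power / scientific
```

Read literally, $1e^{-7}$ would therefore denote a value several thousand times larger than the intended $10^{-7}$.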
Line 346: “On the other hand, we have demonstrated that the RMSE for the PPCon architecture is 0.61” Which unit are you using?
Line 27: “These instruments are essential to advance our knowledge of the biogeochemical state of the ocean, as one of their principal advantages is the assimilation into ocean biogeochemical models”
Advantages -> use cases?
References:
Please add a DOI where one is available (see https://www.geoscientific-model-development.net/submission.html)
https://doi.org/https://doi.org/10.17882/42182, 2000 -> , https://doi.org/10.17882/42182, 2000
In general, replace Adadelta (Zeiler (2012)) by Adadelta (Zeiler, 2012) and similar. (this is \citep in latex).
Citation: https://doi.org/10.5194/egusphere-2023-1876-RC1 -
AC1: 'Reply on RC1', Gloria Pietropolli, 05 Jan 2024
We are grateful for the constructive comments and insightful suggestions provided by the Reviewer. Their feedback has been instrumental in refining our manuscript and improving our analysis. We have prepared a general response together with a point-by-point response to each of the Reviewer's comments in the attached file.
We remain open to any further suggestions or discussions that would contribute to the improvement of our work.
-
RC2: 'Comment on egusphere-2023-1876', Anonymous Referee #2, 27 Nov 2023
The authors describe a method that predicts nitrate, chla and bbp vertical profiles from geolocated and dated vertical profiles of temperature, salinity, and oxygen. With the help of CNNs, they use vertical profiles (data values and shape) rather than point-wise data values for these predictions. The method is trained and evaluated on BGC-Argo profiling float data acquired in the Mediterranean Sea.
What is very interesting in this work is that -- in contrast to previous work -- they propose to take stronger advantage of the data context by using a convolutional neural network together with entire profiles for prediction. This is novel and innovative. What requires attention and significant improvement, however, is the presentation and clarity of their work. The manuscript starts off with a well-detailed and well-written introduction, but attention to detail and readability suffer the further one goes into the later sections. Unfortunately, this reaches an extent at which it becomes unclear in places what the authors mean, e.g.:
l.345 "BGC-Argo float data for oxygen, nitrate, and chlorophyll concentrations exhibit RMSE values evaluated at 5.1±0.8μmol/kg, 0.25±0.07μmol/kg, and 0.03±0.01mg/m3, respectively. On the other hand, we have demonstrated that the RMSE for the PPCon architecture is 0.61" -- ??? What's the 0.61 related to? No unit, no oxygen/nitrate/chla given.
This is unfortunate and definitely needs attention before a re-submission. Starting from Section 3 and onwards, a thorough proof-read and maybe re-write would be recommended.
I take it that this manuscript is part of a thesis, with later parts likely written in a rush. This is very understandable, so I would like the authors to take my comments as advice on how to get their excellent idea into a shape that mirrors its worth.
*General comments*:
- While the method can be applied anywhere, the present study focuses on the Mediterranean Sea. This must be mentioned somewhere prominently/early, e.g., either in the title or abstract. A reader cannot be left searching for the regional coverage until somewhere late in section 4, if one happens to glance over the one single sentence in l. 105, with almost all previous text (Intro, Argo description, etc.) referring to the global system.
- In general, the manuscript would benefit from more clarity (one example: Make sure to use the same term for the same thing). And from more specifics throughout the text (one example: "nitrate, chla, and bbp" instead of "all inferred variables"; Does "output variable" in l.124 refer to output of the MLP or output of PPCon?). If you can name it, then name it and do not find an alternative description.
- Who is your target audience with this paper? I believe it's an oceanographic community? For this, it is surprising to not see a map of any sort, which would help a reader to follow along (and to put it into his/her oceanographic background context).
I'd therefore like to see a map to show all BGC-Argo float profile positions used for algorithm training, testing, and (independent) validation (e.g., each with a different colour), to show the specific float positions relating to the selected profile plots (e.g., with a different marker), and to show the different areas for the posterior analysis (e.g., as boxes). If you're limited in number of figures/tables, drop one of the existing ones (or combine figures into one), because a map is much more important than any (anecdotal) illustration.
- Argo (and BGC-Argo) is a living dataset, which constantly evolves both with new data being added but also with existing data being re-evaluated/newly adjusted. Which means that, e.g., today's 2016-sampled profile (even if in delayed mode) may look different from when you downloaded the same profile last year or from when you will download it in 1 year's time (https://argo.ucsd.edu/data/data-faq/#reD). It is therefore important to make one's own work traceable, e.g., by stating the date one downloaded the dataset and from which source (e.g., the GDAC). Even better would be, e.g., use of one of the monthly snapshots of the Argo GDAC doi, which specifically refer to the state of the Argo dataset at the given time (https://argo.ucsd.edu/data/data-faq/#DOI).
- Figure captions should provide a text description of what is presented so that their content can be understood without the main text (applies, e.g., to Figure 1, but all figures in general). Think of a lazy reader, who wants to get the main points of your paper just by looking through your figures. At least what is presented in which colour/marking and against what must become clear (e.g., Fig. A1: continuous vs. dashed lines??; Fig. 2: Which MLP is shown?).
- Check/reconsider the number of significant digits given for statistical metrics throughout the text, and make them coherent (and not more 'accurate' than realistic). (E.g., on the RMSE of nitrate, ... in tables 6-8)
*Section by section comments*: (focus on style/structure)
- The introduction is extensive and well-written with appropriate referencing (though the referencing around global programs is a bit Med Sea centric, which is probably fine once the regional scope becomes clear).
- For the scope of the present paper, section 2 Dataset can be drastically reduced: Lines 87-104 can be condensed into two sentences: "We used data from the evolving BGC-Argo network, which uses profiling floats ... ." and "BGC-Argo data were accessed from the Argo GDAC." Then add which data mode you used (only delayed-mode adjusted data? Or also real-time adjusted data? Hopefully no unadjusted real-time data?), what kind of files (likely the s- rather than b-profiles?), and, especially important, add how many profiles you had in your data selection prior and past QC/preprocessing.
The size of the training dataset remains unclear to me in the present manuscript.
- section 3 PPCon: The first paragraph is a repetition of the introduction. It needs to be there only once and I would recommend it to be only in the introduction.
I assume the "32" in tables 1 and 2 is the minibatch size? As such, it shouldn't be listed as part of the layer size, which I find very confusing. (Consider the prediction step of one profile - no batch size is relevant here.)
- section 4.1 is again rather general (except for one paragraph), and I wonder whether the majority of its content should rather go to the introduction. Section 4.3 would largely benefit from a map, e.g., to illustrate the "uneven spatial and temporal distribution of the profiles" (l. 254) which remains elusive otherwise.
- Section 5 is more clearly written again, which is good!
A question I had while reading the RMSE values/patterns discussion (3rd paragraph) is how this relates to the range of values and variability of nitrate/chla/bbp, e.g., within a given season. Eventually, this information is touched upon, but I'd suggest to move the RMSE discussion, its temporal evolution and patterns (third paragraph) more closely to the last paragraph of part 5.0.
This last paragraph (of section 5.0) and discussion of Figure 5 contains a lot of (valuable) information! I would encourage the authors to spend more time/space on its discussion, so that it can be adequately digested by a reader, and I would like to see this presented more extensively to get better context, e.g., to the RMSE/performance vs. season vs. natural variability.
- Section 6 is logically structured, but the content needs careful inspection so that it contains all information intended/required to be transmitted. (E.g., units/which parameter in l. 346; l. 371: Must add "in the Med Sea" to have this sentence work; ...).
- The conclusion, again, is well-written, concise, and clear.
*Specifics*: (focus on content)
- l. 10: "resulting in irregularities such as jumps and gaps when used for the prediction of vertical profiles".
I would challenge this and think that neither of these is true. For well-trained MLPs, or neural networks in general, regularization causes them to give smooth outputs. Jumpy behaviour is only to be seen in case of overfitting, i.e., where an operator chose to fit the training/testing data set too closely (for apparently good performance statistics) but neglected the regularization term, so that the trained network does not generalize sufficiently (and therefore gives seemingly erratic/jumpy predictions). But this is not something to blame the MLP architecture for, but rather the network training/operator implementation. It seems that this claim is mostly supported by citation of the same authors' work (Pietropolli et al., 2023). Given the way it is presented and that it should stand for MLPs in general, it would largely benefit from (or rather require!) support by other people's/labs' work.
- Same comment on l.64-65. An MLP/point-wise prediction does not take into account neighboring data during prediction, true. -- But they do so during training due to their natural proximity in data state space and the MLP regularization. I.e., input data that are close together in data space (like from a single profile) do get output predictions that are smoothly transitioning from one to the next output value (if adequately regularized/without overfitting). Nonetheless, CNNs can (likely) take advantage of neighboring data also during prediction, so it's worth to study. (This provides sufficient motivation for CNNs from their potential for improvement; there is no need to claim negative aspects on MLPs/current methods beyond this for motivation.)
- Continued: The authors write that (l. 335f) "MLP architectures can provide good training and test errors" (as by Pietropolli and 3 more references for MLPs) while "they have been found to exhibit higher errors when predicting BGC-Argo profiles" (only by Pietropolli but none of the 3 other references for MLPs) -- This should make someone a bit suspicious and double-check whether this holds in general (the way it is presented here).
- l. 384 and l. 355-358 (btw. CANYON-MED with MLP architecture states a nitrate RMSE of 0.78 mmol m-3) will need correction, too.
I suggest taking out the MLP-irregularities-and-jumps claims entirely, as they are not sufficiently supported, and sticking to the fact that CNNs can benefit from data in their vicinity/neighborhood in a better way and explicitly during training (thanks to the conv/deconv layers).
- l. 87: "Array for Real-time Geostrophic Oceanography" This is an interesting fit to match "Argo", but Argo is not an abbreviation. (It is inspired by Greek mythology: https://argo.ucsd.edu/about/)
- The architecture/design of PPCon and their CNN-approach is hard to understand. The authors refer/cross-refence to different elements of their approach, without clear, concise wording. E.g., there are several references to the four point-wise inputs, the seven-channel tensor, or three variables (e.g., l. 150, 145, 143, 139, 131f.) without a clear sentence like: "Per profile, we have 4 point-wise inputs, which are latitude, longitude, (decimal??) year, and year day. In addition, we have three 1x200 input vectors for temperature, salinity, and oxygen profiles, respectively."
- Can Figure 1 be modified so that it mirrors the information from Table 1 and Table 2 on the specific PPCon architecture (e.g., layer sizes on the MLP; actual series of conv/deconv on the CNN)?
- l.232f: Why did the authors decide to not use an input normalization, which is a common approach, with the same advantages as mentioned for batch normalization? It would probably make the hyperparameters in Table 3 more similar to each other, too.
- Why did the authors choose to use a separate MLP for each of the 4 point-wise inputs, to transform it from 1x1 to 1x200 shape? A 200x replication so that, e.g., latitude becomes a 1x200 sized vector with (constant) latitude per profile would have sufficed to concatenate it together with the 3 input profile vectors, with the 1D CNN alone then tasked to find an optimal representation/fitting.
If I interpret Table 1 correctly (1 input, 80 neurons in 1st hidden layer, 140 in 2nd, 200 in 3rd, 200 in output layer), there are ca. 80.000 parameters for each of the 4 MLPs alone. Again, I struggle to understand the actual size of the dataset used for training, but with in total approx. 120.000/70.000 chla/bbp or nitrate BGC-Argo profiles worldwide, and given that we consider a training dataset within the Med Sea, I estimate the number of profiles to be somewhere around 3.000-5.000 or smaller. The MLPs seem to me like a very badly constrained task, even for machine learning and a decent dropout rate. It's an awfully complex MLP just to get something from 1x1 to 1x200 shape, which is then fed into yet another neural network.
- On a similar note: Can the authors please provide information on the amount of parameters (i.e., how flexible the entire PPCon is) vs. the number of profiles for training (i.e., data constraints) so that a reader gets a better idea of how well constrained PPCon is in general?
- Did the authors try to exclude the year from the inputs, with what effect? Given the training data covers only 6 years, it would be very surprising to me if the "year" input had a lot of explanatory power.
- Eq. 2: Is the hyperparameter alpha missing from the equation?
- There needs to be an evaluation against existing methods! Several ones are quoted in the well-written introduction, but they do not appear later in the manuscript. In particular, CANYON-MED, as being of similar scope and specifically trained on the Med Sea, too, is a prime candidate for comparison/evaluation (at least for nitrate). The authors evaluate PPCon against an MLP, the work of Pietropolli et al. 2023a, briefly mentioned, which is by the same authors? (1) This must be made clear on the figures/text and (2) at least CANYON-MED predicted nitrate profiles need to be added in the comparison, both on the individual examples as well as for the overall validation RMSE. (If there exist chla/bbp predictions suitable for the Med Sea, too, they should be added - but I am presently not aware of any.)
- l. 358: CANYON-MED states a nitrate RMSE of 0.78 mmol m-3.
- Please add the float cycle number (i.e., identification of which profile of a given float deployment) to the example profiles (Table 5 and Figure panel titles) so that it becomes clear (and easier to redo/recalculate) for a reader which profile was used.
- Figure 3: All of the Chla examples are of a deep Chla maximum (DCM) shape. At least one example should be a winter deep mixing example, which occurs in the Med Sea, for completeness. Otherwise one could argue that Chla should be at least as 'easy' to predict as nitrate, because the shape is always of a DCM-kind (-> l. 287: If Chla had always a DCM profile shape, then... ).
- l. 286f: "Higher quality in the prediction is achieved for nitrate, followed by chlorophyll and bbp700" - How was this judged/obtained?
- l. 371: [...] application on the GDAC's *BGC*-Argo *Med Sea* float dataset [...]
- l. 365: "cloud coverage" and "incomplete swaths"??
Citation: https://doi.org/10.5194/egusphere-2023-1876-RC2
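The estimate of ca. 80.000 parameters per point-wise MLP given in the comments above can be reproduced with a short calculation. The sketch assumes fully connected layers with one bias per neuron and the layer sizes as read from Table 1 by the referee (1, 80, 140, 200, 200); the helper name is hypothetical, and the actual implementation may differ (e.g., batch-normalization parameters are not counted).

```python
def dense_param_count(layer_sizes):
    """Count weights plus biases of a fully connected network."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

sizes = [1, 80, 140, 200, 200]        # layer sizes as interpreted by the referee
per_mlp = dense_param_count(sizes)    # 160 + 11340 + 28200 + 40200 = 79900
all_four = 4 * per_mlp                # one MLP per point-wise input: 319600
```

Under these assumptions the four MLPs together hold roughly 320.000 parameters, which is the figure to set against the few thousand training profiles estimated in the same comment.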
AC2: 'Reply on RC2', Gloria Pietropolli, 05 Jan 2024
We are grateful for the constructive comments and insightful suggestions provided by the Reviewer. Their feedback has been instrumental in refining our manuscript and improving our analysis. We have prepared a general response together with a point-by-point response to each of the Reviewer's comments in the attached file.
We remain open to any further suggestions or discussions that would contribute to the improvement of our work.
- Check/reconsider the number of significant digits given for statistical metrics throughout the text, and make them coherent (and not more 'accurate' than realistic). (E.g., on the RMSE of nitrate, ... in tables 6-8)
*Section by section comments*: (focus on style/structure)
- The introduction is extensive and well-written with appropriate referencing (though referencing around global programs is a bit Med Sea centric, which is probably fine once the regional scope becomes clear).
- For the scope of the present paper, section 2 Dataset can be drastically reduced: Lines 87-104 can be condensed into two sentences: "We used data from the evolving BGC-Argo network, which uses profiling floats ... ." and "BGC-Argo data were accessed from the Argo GDAC." Then add which data mode you used (only delayed-mode adjusted data? Or also real-time adjusted data? Hopefully no unadjusted real-time data?), what kind of files (likely the s- rather than b-profiles?), and, especially important, add how many profiles you had in your data selection before and after QC/preprocessing.
The size of the training dataset remains unclear to me in the present manuscript.
- section 3 PPCon: The first paragraph is a repetition of the introduction. It needs to be there only once and I would recommend it to be only in the introduction.
I assume the "32" in tables 1 and 2 is the minibatch size? As such, it shouldn't be listed as part of the layer size, which I find very confusing. (Consider the prediction step of one profile - no batch size is relevant here.)
- section 4.1 is again rather general (except for one paragraph), and I wonder whether the majority of its content should rather go to the introduction. Section 4.3 would largely benefit from a map, e.g., to illustrate the "uneven spatial and temporal distribution of the profiles" (l. 254) which remains elusive otherwise.
- Section 5 is more clearly written again, which is good!
A question I had while reading the RMSE values/patterns discussion (3rd paragraph) is how this relates to the range of values and variability of nitrate/chla/bbp, e.g., within a given season. Eventually, this information is touched upon, but I'd suggest moving the RMSE discussion, its temporal evolution and patterns (third paragraph) closer to the last paragraph of part 5.0.
This last paragraph (of section 5.0) and discussion of Figure 5 contains a lot of (valuable) information! I would encourage the authors to spend more time/space on its discussion, so that it can be adequately digested by a reader, and I would like to see this presented more extensively to get better context, e.g., to the RMSE/performance vs. season vs. natural variability.
- Section 6 is logically structured, but the content needs careful inspection so that it contains all information intended/required to be transmitted. (E.g., units/which parameter in l. 346; l. 371: Must add "in the Med Sea" to have this sentence work; ...).
- The conclusion, again, is well-written, concise, and clear.
*Specifics*: (focus on content)
- l. 10: "resulting in irregularities such as jumps and gaps when used for the prediction of vertical profiles".
I would challenge this and think that neither of these is true. For well-trained MLPs or neural networks in general, regularization causes them to give smooth outputs. Jumpy behaviour is only to be seen in case of overfitting, i.e., where an operator chose to fit the training/testing data set too closely (for apparently good performance statistics) but neglected the regularization term, so that the trained network does not generalize sufficiently (and therefore gives seemingly erratic/jumpy predictions). But this is not something to blame the MLP architecture for; the blame lies with the network training/operator implementation. It seems that this claim is mostly supported by citation of the same authors' work (Pietropolli et al., 2023). Given the way it is presented and that it should stand for MLPs in general, it would largely benefit from (or rather require!) support by other people's/labs' work.
- Same comment on l.64-65. An MLP/point-wise prediction does not take into account neighboring data during prediction, true. -- But it does so during training due to their natural proximity in data state space and the MLP regularization. I.e., input data that are close together in data space (like from a single profile) do get output predictions that are smoothly transitioning from one to the next output value (if adequately regularized/without overfitting). Nonetheless, CNNs can (likely) take advantage of neighboring data also during prediction, so it is worth studying. (This provides sufficient motivation for CNNs from their potential for improvement; there is no need to claim negative aspects of MLPs/current methods beyond this for motivation.)
- Continued: The authors write that (l. 335f) "MLP architectures can provide good training and test errors" (as by Pietropolli and 3 more references for MLPs) while "they have been found to exhibit higher errors when predicting BGC-Argo profiles" (only by Pietropolli but none of the 3 other references for MLPs) -- This should make someone a bit suspicious and double-check if this holds in general (the way it is presented here).
- l. 384 and l. 355-358 (btw. CANYON-MED with MLP architecture states a nitrate RMSE of 0.78 mmol m-3) will need correction, too.
I suggest taking out the MLP-irregularities-and-jumps claims entirely, as they are not sufficiently supported, and sticking to the fact that CNNs can benefit from data in their vicinity/neighborhood in a better way and explicitly during training (thanks to the conv/deconv layers).
- l. 87: "Array for Real-time Geostrophic Oceanography" This is an interesting fit to match "Argo", but Argo is not an abbreviation. (It is inspired by Greek mythology: https://argo.ucsd.edu/about/)
- The architecture/design of PPCon and their CNN-approach is hard to understand. The authors refer/cross-reference to different elements of their approach, without clear, concise wording. E.g., there are several references to the four point-wise inputs, the seven-channel tensor, or three variables (e.g., l. 150, 145, 143, 139, 131f.) without a clear sentence like: "Per profile, we have 4 point-wise inputs, which are latitude, longitude, (decimal??) year, and year day. In addition, we have three 1x200 input vectors for temperature, salinity, and oxygen profiles, respectively."
- Can Figure 1 be modified so that it mirrors the information from Table 1 and Table 2 on the specific PPCon architecture (e.g., layer sizes on the MLP; actual series of conv/deconv on the CNN)?
- l.232f: Why did the authors decide to not use an input normalization, which is a common approach, with the same advantages as mentioned for batch normalization? It would probably make the hyperparameters in Table 3 more similar to each other, too.
- Why did the authors choose to use a separate MLP for each of the 4 point-wise inputs, to transform it from 1x1 to 1x200 shape? A 200x replication so that, e.g., latitude becomes a 1x200 sized vector with (constant) latitude per profile would have sufficed to concatenate it together with the 3 input profile vectors, with the 1D CNN alone then tasked to find an optimal representation/fitting.
If I interpret Table 1 correctly (1 input, 80 neurons in 1st hidden layer, 140 in 2nd, 200 in 3rd, 200 in output layer), there are ca. 80,000 parameters for each of the 4 MLPs alone. Again, I struggle to understand the actual size of the dataset used for training, but with in total approx. 120,000/70,000 chla/bbp or nitrate BGC-Argo profiles worldwide, and given that we consider a training dataset within the Med Sea, I estimate the number of profiles to be somewhere around 3,000-5,000 or smaller. The MLPs seem to me like a very badly constrained task, even for machine learning and a decent dropout rate. It's an awfully complex MLP just to get something from 1x1 to 1x200 shape, which is then fed into yet another neural network.
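As a rough check of the estimate above, the parameter count of one such fully connected network can be computed directly from the layer sizes as read off Table 1 (1, 80, 140, 200, 200). The sketch below is illustrative only: the helper name `dense_param_count` is hypothetical, it assumes plain dense layers with one bias per neuron, and it is not taken from the authors' implementation.

```python
def dense_param_count(layer_sizes, bias=True):
    """Count trainable parameters of a fully connected network.

    layer_sizes lists the width of every layer, input first, output last.
    Each pair of consecutive layers contributes n_in * n_out weights,
    plus n_out biases when bias=True.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out
        if bias:
            total += n_out
    return total


# Layer sizes as interpreted from Table 1 in the comment above (assumption).
layer_sizes = [1, 80, 140, 200, 200]

per_mlp = dense_param_count(layer_sizes)
all_mlps = 4 * per_mlp  # one MLP per point-wise input

print(per_mlp)   # 79900
print(all_mlps)  # 319600
```

Under these assumptions each MLP has 79,900 parameters (79,280 without biases), so the four point-wise MLPs together carry roughly 320,000 parameters, to be set against the estimated 3,000-5,000 training profiles.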
- On a similar note: Can the authors please provide information on the amount of parameters (i.e., how flexible the entire PPCon is) vs. the number of profiles for training (i.e., data constraints) so that a reader gets a better idea of how well constrained PPCon is in general?
- Did the authors try to exclude the year from the inputs, with what effect? Given the training data covers only 6 years, it would be very surprising to me if the "year" input had a lot of explanatory power.
- Eq. 2: Is the hyperparameter alpha missing from the equation?
- There needs to be an evaluation against existing methods! Several ones are quoted in the well-written introduction, but they do not appear later in the manuscript. In particular, CANYON-MED, as being of similar scope and specifically trained on the Med Sea, too, is a prime candidate for comparison/evaluation (at least for nitrate). The authors evaluate PPCon against an MLP, the work of Pietropolli et al. 2023a, briefly mentioned, which is by the same authors? (1) This must be made clear on the figures/text and (2) at least CANYON-MED predicted nitrate profiles need to be added in the comparison, both on the individual examples as well as for the overall validation RMSE. (If there exist chla/bbp predictions suitable for the Med Sea, too, they should be added - but I am presently not aware of any.)
- l. 358: CANYON-MED states a nitrate RMSE of 0.78 mmol m-3.
- Please add the float cycle number (i.e., identification of which profile of a given float deployment) to the example profiles (Table 5 and Figure panel titles) so that it becomes clear (and easier to redo/recalculate) for a reader which profile was used.
- Figure 3: All of the Chla examples are of a deep Chla maximum (DCM) shape. At least one example should be a winter deep mixing example, which occurs in the Med Sea, for completeness. Otherwise one could argue that Chla should be at least as 'easy' to predict as nitrate, because the shape is always of a DCM-kind (-> l. 287: If Chla had always a DCM profile shape, then... ).
- l. 286f: "Higher quality in the prediction is achieved for nitrate, followed by chlorophyll and bbp700" - How was this judged/obtained?
- l. 371: [...] application on the GDAC's *BGC*-Argo *Med Sea* float dataset [...]
- l. 365: "cloud coverage" and "incomplete swaths"??
Citation: https://doi.org/10.5194/egusphere-2023-1876-RC2
AC2: 'Reply on RC2', Gloria Pietropolli, 05 Jan 2024
We are grateful for the constructive comments and insightful suggestions provided by the Reviewer. Their feedback has been instrumental in refining our manuscript and improving our analysis. We have prepared a general response together with a point-by-point response to each of the Reviewer's comments in the attached file.
We remain open to any further suggestions or discussions that would contribute to the improvement of our work.