the Creative Commons Attribution 4.0 License.
Refining Marine Net Primary Production Estimates: Advanced Uncertainty Quantification through Probability Prediction Models
Abstract. In marine ecosystems, Net Primary Production (NPP) is pivotal, not merely as a critical indicator of ecosystem health, but also as an integral component in the global carbon cycling process. This study introduces an advanced probability prediction model to refine the precision of NPP estimation and to deepen our comprehension of its inherent uncertainties. A comprehensive comparative analysis is undertaken, juxtaposing a Bayesian probability prediction model, predicated on empirical distribution, with a probability prediction model anchored in deep learning. The objective is to meticulously quantify the uncertainty associated with NPP. The findings underscore the applicability of probability prediction in investigating the uncertainty of marine NPP. Both models proficiently delineate the dynamic trends and inherent uncertainties in NPP, with the neural network model exhibiting superior accuracy and dependability. Additionally, these probability prediction models are adeptly applied to prognosticate NPP in specific marine regions, efficaciously elucidating the interannual trends in NPP variation. This research contributes not only a more precise method for quantifying NPP uncertainty but also bolsters scientific support for the stewardship of marine ecosystems and the preservation of environmental integrity.
Status: open (until 13 Jan 2025)
- RC1: 'Comment on egusphere-2024-3221', Anonymous Referee #1, 17 Dec 2024
Please find the comments in the attached PDF.
- RC2: 'Comment on egusphere-2024-3221', Anonymous Referee #2, 18 Dec 2024
The manuscript presents a comparative analysis of Bayesian and neural network-based probability prediction models for estimating Net Primary Production (NPP) at a location near Weizhou Island (though this spatial focus is not clearly stated in the abstract or introduction). While the study demonstrates interesting methodological approaches to uncertainty quantification, it requires major revisions and clarifications.
general comments
The spatial scope and context of the study need to be clearly defined in the abstract and introduction. The location or spatial extent of the study is not mentioned in the title, abstract or introduction, which suggests a global analysis of marine NPP, when in fact the study focuses on a specific (point) location near Weizhou Island off the Chinese coast. Given the large number of inputs required for the Neural Network (NN) and Bayesian technique used in the study, it would not be easy to scale the approach to a larger region.
A critical limitation of the study is the data used for training the NN and the Bayesian model. The models are trained on outputs from existing NPP models (VGPM, CbPM, and CAFE) rather than directly on NPP data. Effectively, the NN and Bayesian model serve as emulators of the NPP models, inheriting their underlying errors and biases. Thus, the uncertainty estimates reported in the manuscript reflect the uncertainty in emulating the output, but not the uncertainty in estimating actual NPP. Furthermore, as shown in Fig. 3, estimates from VGPM, CbPM, and CAFE differ strongly, and it is not clear which output is more accurate. These points need to be explicitly acknowledged in the manuscript, as it means the reported uncertainty estimates do not represent true NPP estimation uncertainty.
The differences between VGPM, CbPM, and CAFE output raise questions about which model provides the best NPP estimates and the most reliable training data. The manuscript does not initially state which of the three models provided the output used to generate the full time series of NPP estimates near Weizhou Island in Section 3.3. Section 4 finally reveals that CAFE was used to generate the NPP training data, but that choice appears to have been motivated by results showing that the NN and the Bayesian model can emulate CAFE output well, not by evidence that CAFE output best represents true NPP.
In the context of the above comments, it would be interesting for the reader to know what inputs VGPM, CbPM, and CAFE used to generate their results. If the NN or the Bayesian model requires more input data, or input data that are more difficult to measure, than VGPM, CbPM, or CAFE, why use them at all? Similarly, it would be interesting to investigate which of the inputs to the NN or the Bayesian model are actually required to obtain good performance.
The manuscript's writing style suggests the use of AI-assisted writing, which, while not problematic in itself, has led to the use of emphatic language and filler words (such as "pivotal", "integral", "advanced", "comprehensive", "indispensable", "paramount", etc.). The manuscript would benefit from removing these words in places and rewording passages.
A few passages in the manuscript appear to suggest surprise in discovering periodicity in NPP values: "Upon visualizing the values of the three NPP products (VGPM, CbPM, and CAFE) (Fig. 3), it became evident that each exhibits a distinct periodicity" (l 198). "The analysis of the annual change of NPP shows a clear periodicity, which means that the change of NPP is not random, but follows certain laws and patterns." (l 571). Even at 21 degrees north, one can expect seasonal patterns in marine primary production - this context should be provided in the text.
specific comments
L 117: What are "stochastic optimization" and "advanced chance constraints"? They are only used here and nowhere else in the manuscript. It would be useful to describe relevant new concepts to the reader right away, or not mention them when they are not used or described in the manuscript.
L 149: What does "sea accumulation" mean?
L 149: "Surrounded by the sea on all sides, Weizhou Island ...": I think this is the definition of an island.
L 168: "For the analysis of three NPP algorithms - namely, VGPM, CbPM, and CAFE - we acquired datasets at an eight-day temporal resolution ...": Here it is unclear to the reader if the "acquired datasets" are the input required to run the algorithms or their output. I assume it is the latter, but that should be made more explicit.
L 177/Table 2: Just listing the numbers of missing entries is not very informative. At which frequency were they recorded?
L 198: "Upon visualizing the values of the three NPP products (VGPM, CbPM, and CAFE) (Fig. 3), it became evident that each exhibits a distinct periodicity, with the fluctuation ranges remaining stable yet markedly varied among them." What exactly does this mean? Do the signals not have an underlying annual periodicity?
L 311: Samples are mentioned here for the first time and need a better introduction.
Eq. 3: This looks like a recursive definition of CRPS, I would suggest using different names for the "CRPS" used in Eq. 2 and 3.
Eq. 4: The notation is inconsistent: In Eq. 2 and 3, x denotes the observed value and y the predicted value, but in Eq. 4 and 5, y is used for the actual/observed value and y-hat for the predicted value.
L 501: The test data distribution for CAFE NPP does not look similar to that of the train data distribution, suggesting that the test data may not be well-represented by the train data.
L 503: What is the difference between the values shown in Table 5 and Fig. 4? Why not combine the two?
Fig. 2 and 3: The date label locations 2007/1/1, 2008/3/13, 2009/5/25, ... make it difficult to interpret the plot and detect seasonality.
Fig. 4: The caption mentions "input variables". Are these inputs to VGPM, CAFE, and CbPM?
Fig. 5: Why does the y-axis go past 0.8 in panels a, b, d and e, when the values all stay below 0.4? Also, the units are missing.
Fig. 8 and 9: The NPP units here are incorrect. The data appears to have been normalized, but why? Without normalization, it would be easier to interpret for which NPP ranges the NN and the Bayes model over- or underestimate VGPM NPP.
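To illustrate the naming point raised for Eq. 2 and 3, the following is a minimal sketch, not the manuscript's implementation. It assumes that Eq. 2 defines the CRPS of a single forecast and that Eq. 3 averages it over samples; the manuscript's equations are not reproduced here, so this reading may be wrong. The function names `crps_single` and `mean_crps` are hypothetical, and the estimator is the standard sample-based identity CRPS = E|X - x| - 0.5 E|X - X'|, with x the observed value and X, X' independent draws from the predictive distribution.

```python
def crps_single(samples, observed):
    """CRPS of one forecast, estimated from predictive samples.

    Assumed estimator (not taken from the manuscript):
    CRPS = E|X - x| - 0.5 * E|X - X'|, where x is the observed value.
    """
    n = len(samples)
    # Mean absolute distance between the predictive samples and the observation.
    term_obs = sum(abs(s - observed) for s in samples) / n
    # Half the mean absolute distance between all pairs of predictive samples.
    term_spread = sum(abs(a - b) for a in samples for b in samples) / (2 * n * n)
    return term_obs - term_spread


def mean_crps(sample_sets, observations):
    """Average of the per-forecast CRPS over all forecast/observation pairs."""
    scores = [crps_single(s, x) for s, x in zip(sample_sets, observations)]
    return sum(scores) / len(scores)
```

Giving the per-forecast score and its average separate names avoids the apparent recursion, and using one symbol for the observed value in both functions mirrors the notation request made for Eq. 4.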
Citation: https://doi.org/10.5194/egusphere-2024-3221-RC2
Viewed
- HTML: 133
- PDF: 28
- XML: 6
- Total: 167
- BibTeX: 2
- EndNote: 2