Optimizing a large number of parameters in a Biogeochemical Model: A Multi-Variable BGC-Argo Data Assimilation Approach
Abstract. The predictive accuracy of marine biogeochemical models is fundamentally limited by uncertainty in their parameter values. We present a parameter optimization framework using iterative Importance Sampling (iIS) to constrain the PISCES model by leveraging the rich, multi-variable dataset provided by Biogeochemical-Argo (BGC-Argo) floats. Using data from a BGC-Argo float in the North Atlantic, we assimilate a comprehensive suite of 20 biogeochemical metrics to constrain all 95 parameters of the PISCES model within a 1D vertical configuration. Our global sensitivity analysis (GSA) identifies parameters controlling zooplankton dynamics as the dominant source of model sensitivity for this specific site. We compare three strategies: (1) optimizing a subset of parameters for their strong direct influence (Main effects); (2) optimizing a larger subset that also includes parameters influential through non-linear interactions (Total effects); and (3) simultaneously optimizing all 95 parameters. All three approaches achieve a statistically indistinguishable and significant improvement in model skill, reducing Normalized Root Mean Square Error (NRMSE) by 54–56 %. The rich, multi-variable dataset provides sufficient orthogonal constraints to yield posterior parameter distributions with negligible inter-correlation, shifting the long-standing challenge of correlated equifinality to uncorrelated equifinality, where a range of optimal parameter sets can be found independently. Parameter uncertainty is reduced by 16–41 %, and the optimized ensembles demonstrate strong portability. While all strategies produce a similar, tightly constrained predictive spread for the assimilated variables, they differ significantly in computational cost and in their estimation of uncertainty for unobserved parts of the model. The prerequisite GSA was ~40 times more computationally expensive than the optimization, while the All-parameters strategy, by exploring the full parameter space, provides a more comprehensive and robust quantification of the model's uncertainty in unassimilated variables. We therefore conclude that directly optimizing all model parameters is the recommended strategy. This work delivers a validated parameter set for the North Atlantic and demonstrates a scalable framework to advance biogeochemical modeling from using static, globally-uniform parametrization to developing a map of regionally-tuned parameters.
Status: open (until 15 Nov 2025)
- RC1: 'Comment on egusphere-2025-4369', Anonymous Referee #1, 08 Oct 2025
- RC2: 'Comment on egusphere-2025-4369', Anonymous Referee #2, 17 Oct 2025
This paper presents a method for parameter optimisation that determines posterior probability distributions rather than single optimal parameter values.
The aim is to find a suitable probability density function (PDF) for the parameters of a BGC model, given real-world observations for multiple tracers and assumptions about their associated error sources. The authors consider the PISCES BGC model in a 1D setting.
They leverage extensive observational data obtained from BGC-Argo floats and apply iterative Importance Sampling (iIS) to determine a posterior parameter probability distribution (a fine-grained discretisation of its PDF) that is most consistent with the observational data, starting from uniform prior parameter distributions. The available data allow the definition of 20 different tracer state measures, subdivided by the productive zone and the remaining deeper section of the water column.
These measures are applied to both observations and model simulations (interpolated to the same spatio-temporal grid) and used to derive corresponding model-data misfits. A key finding is that, due to the comprehensive BGC-Argo float data and the corresponding information provided by the derived measures, iIS successfully identifies a posterior parameter distribution that significantly reduces model-data misfit in the productive layer of the ocean while maintaining uncorrelated parameter distributions.
However, similar model-data misfits are still observed for different parameter vectors, a situation that the authors call "uncorrelated equifinality".
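For readers less familiar with importance sampling, the standard identity behind such weighted posterior approximations (textbook form, not quoted from the manuscript) is

w_i \propto \frac{p(y \mid \theta_i)\, p(\theta_i)}{q(\theta_i)}, \qquad \theta_i \sim q(\theta),

so that the weighted ensemble {\theta_i, w_i} approximates the posterior p(\theta \mid y); iterating the procedure moves the proposal q towards regions of high posterior probability.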
MAJOR COMMENTS
The approach of leveraging the upcoming enrichment of BGC data by BGC-Argo sounds promising, since a thorough calibration of coupled marine BGC models has so far been limited by a lack of comprehensive observational data for all state variables addressed by BGC models (cf. Kriest et al., 2023; Petrik et al., 2022; Rohr et al., 2023).
According to the results, the applied iIS works quite well for the productive layer of the 1D setup of PISCES, where an ensemble misfit reduction of 50% is reached and model state trajectories are closer to the observational time-series from BGC-Argo.
In the deeper water column, however, both the misfit and most tracer trajectories (except POC) stay close to the reference run, a fact that is attributed to impacts of the ocean circulation that dominate and act on longer time scales than the assimilated time-series in the 1D model setup.
Another emphasis of the paper is the use of a preliminary global sensitivity analysis using Sobol' indices to define three different subsets of the parameters used for optimisation: (1) all parameters, (2) parameters filtered by first order Sobol' indices (parameters that are most sensitive alone), (3) parameters filtered by all-order Sobol' indices (parameters that are most sensitive in combination with other parameters).
Here, the corresponding iIS results are found to be statistically similar for all considered subsets and --- as calculating the Sobol' indices is computationally more expensive than the application of iIS --- optimising all parameters at once is stressed to be the best strategy.
This is quite amazing since I would expect the incorporation of many different parameters to impede convergence of optimisation.
Related to my amazement, as a non-statistician I would like to see a bit more detail on how the measures (mean concentrations of DIC, TA, O₂, NO₃⁻, PO₄³⁻, Si, and POC) enter the quantities of interest (QoI) and the iIS algorithm.
Are these measures actually replacing the large vectors y and s (observed and simulated tracer states) in the iIS?
My first and second guesses at first glance were that you consider a sum of cost terms over all tracers, and that you probably apply iIS in a multi-stage setup w.r.t. disjoint subsets of the parameters.
Therefore, I would like to see more details about it in an Appendix section.
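To make the first of those guesses concrete, one possible form (my own assumption and notation, not taken from the manuscript) would be a single cost summed over all 20 measures,

J(\theta) = \sum_{j=1}^{20} \frac{1}{N_j} \sum_{t} \left( \frac{s_{j,t}(\theta) - y_{j,t}}{\epsilon_{j,t}} \right)^2,

with s_{j,t} and y_{j,t} the simulated and observed values of measure j at time t and \epsilon_{j,t} the assumed observational error. Whether this, a multi-stage formulation over parameter subsets, or something else entirely is used should be stated explicitly.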
Despite having good statistical properties, the posterior parameter distribution does not adequately represent a validation BGC-Argo data set taken in a different ocean basin.
The authors conclude that parameter distributions should be location-dependent.
It would also be interesting to see whether good data assimilation of multiple locations simultaneously is possible.
Since the PISCES model has an elaborate representation of zooplankton processes, it would be worthwhile to consider derived zooplankton observations (cf. Petrik et al., 2022; Rohr et al., 2023) in conjunction with BGC-Argo data and other data sources.
SPECIAL COMMENTS:
Lines 345, 360:
Could you provide an explicit formula for p(y|x) as used in equations (2) and (4) in this study, perhaps in the Appendix?
This will make the methodology clearer to readers, together with the code availability section.
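One common choice, which may or may not match equations (2) and (4) here (an assumption on my part), is an independent Gaussian error model per measure,

p(y \mid x) = \prod_{j} \frac{1}{\sqrt{2\pi}\,\sigma_j} \exp\!\left( - \frac{\left( y_j - s_j(x) \right)^2}{2\sigma_j^2} \right),

where y_j and s_j(x) are the observed and simulated values of measure j and \sigma_j is the assumed error standard deviation; stating the actual form explicitly would remove any ambiguity.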
Line 356:
The phrase "which has as the main objective restraint the sampling to regions of the state space of high probability" does not sound natural to me.
Do you mean "The main objective is to restrict sampling to regions of the state space with high probability"?
Lines 643 ff:
The variables y^M_t and ε^M_t already appear in equation (13) but are only explained after equation (16).
In contrast, p_j and s^M_{j,t} are explained both after equation (16) and before their first use.
I think redundancy is beneficial here, as readers need to understand the meaning of symbols immediately after each formula.
Therefore, y^M_t and ε^M_t should also be defined after equation (13).
REFERENCES
Kriest, I., Getzlaff, J., Landolfi, A., Sauerland, V., Schartau, M., & Oschlies, A. (2023). Exploring the role of different data types and timescales in the quality of marine biogeochemical model calibration. Biogeosciences, 20(13), 2645–2669. https://doi.org/10.5194/bg-20-2645-2023
Petrik, C. M., Luo, J. Y., Heneghan, R. F., Everett, J. D., Harrison, C. S., & Richardson, A. J. (2022). Assessment and constraint of mesozooplankton in CMIP6 Earth system models. Global Biogeochemical Cycles, 36, e2022GB007367. https://doi.org/10.1029/2022GB007367
Rohr, T., Richardson, A. J., Lenton, A., Chamberlain, M. A., & Shadwick, E. H. (2023). Zooplankton grazing is the largest source of uncertainty for marine carbon cycling in CMIP6 models. Communications Earth & Environment, 4(1), 212. https://doi.org/10.1038/s43247-023-00871-w
Citation: https://doi.org/10.5194/egusphere-2025-4369-RC2
Review of the manuscript “Optimizing a large number of parameters in a Biogeochemical Model: A Multi-Variable BGC-Argo Data Assimilation Approach” by Hyvernat et al.
Summary: The study presents a parameter optimization framework using iterative Importance Sampling to constrain the poorly known model parameters in the PISCES biogeochemical model in a 1-dimensional framework. The optimization is based on measurements of a Biogeochemical-Argo float during the year 2015 in the North Atlantic. The authors test different fitting strategies (i.e. fitting all poorly known model parameters or just a subset of them), which all lead to similar, statistically indistinguishable model improvements. Despite having the highest computational cost, the authors still rate the approach of fitting all poorly known model parameters simultaneously as the best option because it gives the "most honest" estimate of the uncertainty of unobserved prognostic variables. Further, the study implies that it could overcome the long-standing challenge of parameter overfitting and parameter (un)identifiability.
Major Comments:
Parametric uncertainties are a major issue in biogeochemical ocean modelling, and the application of Iterative Importance Sampling while testing different estimation strategies is a nice approach. I specifically like that the parameter fits are tested on independent observations which were not used during the fitting process. This should be common, but isn't in ocean biogeochemical modelling due to a general lack of independent observational data. However, in the presented study the analysis of the test data sets appears somewhat hidden and two somewhat contradictory conclusions are drawn from it. The authors use two data sets of seasonal cycles and argue, with a relatively good fit of the optimized model to the first test data set, that the fitted models are "portable" (Ln 944, 985, 988), while the rather poor fit to the second test data set is regarded as an indication that parameters should be estimated depending on location (Ln. 929, 1019ff).
These aspects need some attention, because the authors draw very interesting and far-reaching conclusions by implying that they could overcome the long-standing challenge of parameter overfitting (Matear, 1995) and parameter (un)identifiability (Ln. 20ff, 995ff, 1005). The authors relate their solution to the "richness" of Biogeochemical-Argo floats (Ln. 19, Ln 80ff, 970ff). As a somewhat contradictory conclusion, however, it is stated already in the Abstract that different optimal parameter sets lead to statistically indistinguishable model results (Ln. 18; cf. also Tab. 4). This shows that solutions for an optimal parameter set are not unique, which might point towards overfitting. Thus, the overfitting statement needs, in my eyes, stronger evidence. E.g., it should be ruled out that the calibration may be over-confident due to too few, noisy data during fitting (e.g., Hermans et al., 2022 and Yang & Zhu, 2018 illustrate the problem of too tight posteriors (small variance) and over-confident calibration for such cases; it seems straightforward that correlations will also be impacted). This holds especially because the presented results rely on a single seasonal cycle, while the authors state that their 1D representation of the ocean has major flaws (e.g., Ln 400ff) and some of the observations were "pseudo" (e.g., Ln 165ff). This might be achieved by presenting the performances of all optimal models on independent test data more convincingly. Note that the test data should preferably cover different nutrient/light regimes, seasons and mixing conditions in accordance with what may occur during model application. In case the test data do not suffice to cover this range, it might be interesting to explore whether the different optimal model solutions diverge under changing external conditions (cf. Taucher & Oschlies, 2011; Löptien & Dietze, 2017). Also, it would be nice if the authors could briefly explain somewhere prominently how correlations among parameters relate to overfitting and parameter identifiability to reach a wider audience.
In summary, I think that the study has very nice aspects, but some conclusions need, in my eyes, more evidence, or the authors could tone them down a bit. Further, I found the manuscript partly very hard to follow, and it could be much more concise to reach its full potential. This holds especially true for the Method and Data Sections, where large parts could be moved to an Appendix (e.g., extensive formulae). Currently, this part covers 20 pages and I find it very hard to keep an overview among the many details, which partly occur in subsections that were unexpected to me. The authors could also be more explicit about the underlying ideas.
Specific Comments:
=> Abstract
Ln 12: Please add: PISCES model in a 1-dimensional framework
Ln 13: using seasonal observations of the year 2015
Ln 13: Better? Replace “metrics” by “observed biogeochemical tracers and related properties” or introduce what is meant by “metrics”.
Ln 14: Better? within a “1-dimensional representation of the ocean“ instead of “1D vertical configuration”
Ln 18: add “all 95 poorly known model parameters”
Ln 19: reducing the NRMSE relative to what?
Ln 19ff: These sentences need in my eyes more evidence. It’s especially confusing, since the authors stated just before that different fitting strategies lead to “statistically indistinguishable results” (cf. Tab.4 shows very different parameter sets).
Ln 19: I don’t see why the one seasonal cycle considered here should be “richer” than using long-term observing stations (which has been done extensively for decades; e.g., Hurtt & Armstrong, 1996; Schartau & Oschlies, 2003; Ward et al., 2013). I guess the authors refer to the number of observed prognostic tracers, while some are "pseudo". Please clarify.
Ln 22: Please add “of the seasonal cycle of the year 2015”.
Ln 22: Please add what is the reference for parameter uncertainty which has been improved.
Ln 23: This statement contradicts somehow the conclusions drawn from the other data set: “the optimized ensembles demonstrate strong portability.”
Ln 23: the “tightly constrained predictive spread” should refer to independent test data. Otherwise, reformulate to “lead to similar fits to the observations”.
Ln 26ff: I am not sure where it has been shown that this fitting strategy is more "robust"
Ln 28: The amount of required model simulations would not be feasible in a full 3-dimensional ocean model – delete or reformulate “scalable”.
Ln 29/30: This should be formulated as an outlook
=> Introduction:
Ln 49: structural uncertainty is another problem which could be mentioned
Ln 109/110: Please specify. It should be mentioned that the data refer to a single seasonal cycle and that the PISCES-model is initialized with these observations before starting the approx. 1-year simulation (or please correct me in case I misunderstood).
Ln 131ff: The main idea of iIS is not conveyed.
Ln 230: where does the vertical turbulent diffusion come from?
=> Data and Model Configuration
Ln 130ff BGC-Argo Data
These subsections could be much more concise. It should be stated right in the beginning which time period and location have been considered and how the model is initialized and which independent data have been used for testing. Then indicate which biogeochemical tracers were considered and list the data sources. All other details could be moved to an Appendix (such as the float numbers or properties of other floats and reasons why these were not considered).
=> Model description, framework and configurations
Ln 200ff: I find it very confusing to start with the 3-dimensional ocean model which is not used in the parameter estimation experiments. Otherwise, this subsection describes only PISCES. Rename to “Biogeochemical Model description”?
=> 1D configuration
Benefits and challenges of the 1-dimensional setup do not become clear and I did not understand how ocean mixing was derived. Some important information is scattered; e.g. that some metrics were not usable because the ocean model is too simple. Ultimately, all biogeochemical prognostic variables depend heavily on ocean mixing such that this point needs more careful discussion (this might impose huge uncertainties and has the potential to lead to the situation described by Hermans et al. 2022 and Yang & Zhu, 2018 with too tight posteriors and over-confident calibration).
Ln 217ff: I rather expected now a description on how the ocean is represented in 1D. This part rather refers to Section 3.1.
=> 3D configuration
Ln 260ff: This subsection is a bit confusing since no 3d model has been involved in the fitting process. It might be deleted and the well-known referenced setup could be mentioned when it comes to error estimates.
=> Metrics for sensitivity analysis and parameter optimization
Again, the section should clearly and briefly explain in the beginning what has been done and come to all the details later.
Ln 269, 271ff: I find the use of “metrics” sometimes confusing, and a clear distinction should be made between “biogeochemical tracers and related properties” and objective functions (or whatever the authors want to call them). I am aware that many different naming conventions exist, and it might thus be helpful to clearly introduce what is meant in the beginning. I guess here the “20 metrics” were combined into one objective function (NRMSE)? Please clarify.
Ln 313: It does not get clear which prior estimate for the parameter uncertainty has been chosen. What is the considered range and where does it come from?
=> Parameter optimization method: Iterative Importance Sampling
Ln 330ff: This section is really hard to follow for people not familiar with iIS. It would be important to convey the main idea while most other parts could be moved to the supplement.
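To illustrate what conveying that main idea could look like, here is a minimal, generic sketch of an iIS loop. This is my own illustration under assumed Gaussian proposals and an assumed Gaussian likelihood; the authors' actual proposal updates, likelihood, and stopping criterion may differ, and run_model below is only a toy stand-in for the 1D PISCES configuration.

```python
import numpy as np
from scipy.stats import multivariate_normal

# Toy stand-in for the forward model: any function mapping a parameter
# vector to the vector of simulated metrics would go here (in the paper,
# a 1D PISCES simulation returning the 20 assimilated metrics).
def run_model(theta):
    return np.array([np.sum(theta ** 2), np.sum(theta)])

# Assumed independent-Gaussian misfit over all assimilated metrics.
def log_likelihood(theta, obs, obs_err):
    return -0.5 * np.sum(((run_model(theta) - obs) / obs_err) ** 2)

def iis(prior_lo, prior_hi, obs, obs_err, n_samples=500, n_iter=5, seed=0):
    rng = np.random.default_rng(seed)
    # Initial proposal: mean and variance of the uniform prior.
    mean = 0.5 * (prior_lo + prior_hi)
    cov = np.diag((prior_hi - prior_lo) ** 2 / 12.0)

    for _ in range(n_iter):
        # 1. Draw a parameter ensemble from the current proposal q(theta).
        theta = rng.multivariate_normal(mean, cov, size=n_samples)
        inside = np.all((theta >= prior_lo) & (theta <= prior_hi), axis=1)

        # 2. Importance weights: likelihood * prior / proposal.  The uniform
        #    prior is constant inside its bounds and zero outside.
        log_q = multivariate_normal.logpdf(theta, mean=mean, cov=cov)
        log_w = np.array([log_likelihood(t, obs, obs_err) for t in theta])
        log_w = np.where(inside, log_w - log_q, -np.inf)
        w = np.exp(log_w - np.max(log_w))
        w /= np.sum(w)

        # 3. Refit the proposal to the weighted ensemble, so the next
        #    iteration samples preferentially where the posterior is high.
        mean = w @ theta
        centred = theta - mean
        cov = (w[:, None] * centred).T @ centred

    return theta, w  # weighted posterior ensemble of the final iteration

# Example with a 3-parameter toy problem and two "observed" metrics.
theta, w = iis(np.zeros(3), np.ones(3),
               obs=np.array([0.5, 1.2]), obs_err=np.array([0.1, 0.1]))
```

The key point to convey is simply that each iteration reuses the weighted ensemble to narrow the proposal towards high-probability regions; one or two such sentences before the formal derivation would already help.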
=> Sobol Indices
Ln 432ff: Again, this subsection is far too extensive and a bit unexpected, since Sobol indices were already mentioned before. Again, it is key to clearly convey the main ideas.
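For example, the two quantities could be introduced in one line each (standard definitions, independent of the manuscript's notation):

S_i = \mathrm{Var}\!\left[ \mathbb{E}(Y \mid \theta_i) \right] / \mathrm{Var}(Y), \qquad S_{T_i} = 1 - \mathrm{Var}\!\left[ \mathbb{E}(Y \mid \theta_{\sim i}) \right] / \mathrm{Var}(Y),

i.e. the first-order index S_i measures how much of the output variance is explained by parameter \theta_i alone, while the total-order index S_{T_i} additionally includes all interactions of \theta_i with the remaining parameters \theta_{\sim i}; the parameter subsets in this study then correspond to thresholding S_i and S_{T_i}, respectively.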
Ln 634: I guess I overlooked something. What was the baseline?
Ln 638ff: Could you briefly explain (here or elsewhere) how the correlations among parameters relate to overfitting and parameter identifiability to maintain a broader readership?
Ln 641: I would rather have the main ideas explained. The extensive formulae could go to an Appendix. Most readers will be familiar with an RMSE and the concept of normalization. The thoughts behind using RCRV do not become clear, and an easier metric that compares well to the fitting process might be considered.
=> Results
Ln 780: I still don’t know which reference the improvements refer to and as such all the information has no value. Please clarify.
Ln 898: Please check the wording. This paragraph cannot refer to the predictive skill if no independent data were considered (i.e. not used during fitting). I would say, it rather refers to the goodness of fit to the observational data used during the fitting procedure which can be obtained by the optimization.
Ln 916: Currently, I find the analysis of the test data not convincing enough to rule out overfitting and to prove what the authors term "portability" (this might become clearer when using a metric comparable to the one used during the fitting process for all optimal solutions; also, some more discussion is needed on why the Mediterranean float shows conflicting results). It does not become clear to me how much the "pseudo observations" contribute.
=> Discussion:
Pros and cons of the presented approach could be discussed more clearly. E.g., how does the chosen 1D setup, with its focus on seasonal processes and perfect initial conditions, relate to 3D coupled biogeochemical ocean models used for projections and run for decades or more (where, e.g., the global distribution of nutrients is key)? Also, it would be interesting to discuss the mentioned shortcomings in the 1D representation of the ocean and the planned implementation into a 3-dimensional model (Ln 1041ff). In a 3-dimensional context it has been shown that biogeochemical model parameters can be tuned to compensate for ocean model differences, although this is not without problems for projections (Löptien & Dietze, 2019; Pasquier et al., 2023). From these studies one might conclude that differing parameters will be optimal in the 1D and 3D setups (unless the mixing matches fairly well), and I would be very interested in some thoughts on this.
Ln 995ff: Not agreed (yet?).
Ln 1001ff: I was not aware from the model description that several metrics were optimized simultaneously and assumed that all so-called "metrics" were merged into a single NRMSE. Using more than one objective function is certainly possible, but not trivial (cf. Sauerland et al., 2019). Please clarify.
Ln 1005ff: As outlined in my major comment, some conclusions need more evidence – especially given the fact that very different parameter sets (Abstract, Tab.4) lead to optimal statistically indistinguishable model results.
=> Supplement:
Please add some introductory text on what is shown.
References:
Hermans, J., Delaunoy, A., Rozet, F., Wehenkel, A., & Louppe, G. (2022). A crisis in simulation-based inference? beware, your posterior approximations can be unfaithful. Transactions on Machine Learning Research.
Hurtt, G. C., & Armstrong, R. A. (1996). A pelagic ecosystem model calibrated with BATS data. Deep Sea Research Part II: Topical Studies in Oceanography, 43(2-3), 653-683.
Löptien, U., & Dietze, H. (2017). Effects of parameter indeterminacy in pelagic biogeochemical modules of Earth System Models on projections into a warming future: The scale of the problem. Global Biogeochemical Cycles, 31(7), 1155-1172.
Löptien, U., & Dietze, H. (2019). Reciprocal bias compensation and ensuing uncertainties in model-based climate projections: pelagic biogeochemistry versus ocean mixing. Biogeosciences, 16(9), 1865-1881.
Matear, R. J. (1995). Parameter optimization and analysis of ecosystem models using simulated annealing: A case study at Station P.
Pasquier, B., Holzer, M., Chamberlain, M. A., Matear, R. J., Bindoff, N. L., & Primeau, F. W. (2023). Optimal parameters for the ocean's nutrient, carbon, and oxygen cycles compensate for circulation biases but replumb the biological pump. Biogeosciences, 20(14), 2985-3009.
Sauerland, V., Kriest, I., Oschlies, A., & Srivastav, A. (2019). Multiobjective calibration of a global biogeochemical ocean model against nutrients, oxygen, and oxygen minimum zones. Journal of Advances in Modeling Earth Systems, 11(5), 1285-1308.
Schartau, M., & Oschlies, A. (2003). Simultaneous data-based optimization of a 1D-ecosystem model at three locations in the North Atlantic: Part II—Standing stocks and nitrogen fluxes.
Taucher, J., & Oschlies, A. (2011). Can we predict the direction of marine primary production change under global warming?. Geophysical Research Letters, 38(2).
Ward, B. A., Schartau, M., Oschlies, A., Martin, A. P., Follows, M. J. & Anderson, T. R. (2013). When is a biogeochemical model too complex? Objective model reduction and selection for North Atlantic time-series sites. Progress in Oceanography, 116, 49-65.
Yang, Z., & Zhu, T. (2018). Bayesian selection of misspecified models is overconfident and may cause spurious posterior probabilities for phylogenetic trees. Proceedings of the National Academy of Sciences, 115(8), 1854-1859.