the Creative Commons Attribution 4.0 License.
SeapoPym v0.1: Implementation of the SEAPODYM low and mid trophic levels in Python with a flexible optimisation framework
Abstract. SEAPODYM-LMTL is a global advection-diffusion-reaction model that simulates age-structured zooplankton and micronekton populations driven by physical and biogeochemical forcing. This study introduces SeapoPym, a simplified version of this model that decouples biological dynamics from physical transport and incorporates a Genetic Algorithm (GA) for stochastic parameter estimation within the Python scientific ecosystem. Comparisons with SEAPODYM-LMTL show that omitting transport produces notable discrepancies in highly dynamic warm regions and cold environments with long zooplankton life cycles. However, SeapoPym remains suitable for simulating mesozooplankton across most warm- and temperate-ocean regions. Sobol sensitivity analysis identifies mortality parameters as key drivers of biomass magnitude and variability, with strong parameter interactions. Twin experiments highlight challenges in estimating recruitment-timing parameters and emphasize the importance of data from cold, contrasting environments. SeapoPym provides a flexible, low-cost framework for exploring parameter estimation, designing observational strategies, and addressing challenges in zooplankton model assessment, with the potential to integrate with circulation models or machine-learning emulators.
Status: final response (author comments only)
- RC1: 'Comment on egusphere-2026-711', Anonymous Referee #1, 04 May 2026
- CEC1: 'Comment on egusphere-2026-711', Juan Antonio Añel, 16 May 2026
Dear authors,
Checking the Code and Data Availability section of your manuscript and assessing its compliance with the policy of the journal, I would like to note that, to improve both compliance and the replicability of your manuscript, it would be good if you included in the Zenodo repository the notebooks that you state are available upon request. There is no reason to keep such a barrier, which only makes it more difficult to access them.
Therefore, please, include the notebooks in the repository, and reply to this comment when you have done it.
Juan A. Añel
Geosci. Model Dev. Executive Editor
Citation: https://doi.org/10.5194/egusphere-2026-711-CEC1
- AC1: 'Reply on CEC1', Jules Lehodey, 20 May 2026
Dear Juan A. Añel,
Thank you for your guidance on the Code and Data Availability section of our manuscript.
Following your recommendation, we have prepared a dedicated reproducibility deposit that provides the notebooks, scripts, and configuration files necessary to reproduce the experiments and figures presented in the manuscript. The deposit is openly available, with no further request barrier, at:
- GitHub: https://github.com/SeapoPym/SeapoPym-v0.1-Reproducibility (release v1.0.0)
- Zenodo: https://doi.org/10.5281/zenodo.20305631
The Code Availability section of the manuscript has been updated to point to both URLs and the new DOI. The wording "available upon request" has been removed.
We remain at your disposal for any further clarification.
Sincerely,
Jules Lehodey, on behalf of all co-authors
Citation: https://doi.org/10.5194/egusphere-2026-711-AC1
- RC2: 'Comment on egusphere-2026-711', Anonymous Referee #2, 18 May 2026
General comments:
I find the paper interesting and well written,
and the motivation is very clear. Moreover,
I appreciate the use of several state-of-the-art Python libraries,
and also the clear structure of (i) numerical model evaluation, (ii) studying
the influence of transport (or its absence), (iii) sensitivity analysis,
and (iv) parameter optimization.
Specific comments:
I miss a clear statement of what quantity the model actually computes.
There are symbols in the Appendices; I suggest referring to them earlier in
the paper as well.
I would appreciate a little more detail on the GA realization in Section 2.2.
GAs are used by many people in quite different configurations, so an explanation
(e.g. of the term "tournament") would help. At some points the authors also use
the term evolutionary algorithm, and in the discussion (line 295) they refer to
CMAES, an evolution strategy, which works differently. Specifically for readers
not aware of these differences or relations, I would prefer clarity here.
The explanation of the model parts in Appendices 2 and 3 was difficult for me to
understand. I would appreciate more details on the dependence on the age
\tau.
It was not clear to me whether the forcing was temperature and NPP only, or additionally
the euphotic layer depth (see technical corrections below).
If the former is the case, I wonder why the authors use the spatial domain
for computations. To me it would be equally or even more reasonable to take the
2-D domain of occurring temperature and NPP values as the computational domain,
since the pure spatial location itself does not seem to have any influence then.
I wonder why the GA optimization was stopped before a stagnation in the parameters
was reached.
Technical corrections:
* line 11 abstract: at this point it might not be clear to what
kind of "machine learning emulators" the authors are referring
* line 65: "all data management (including I/O and intermediate states)"
it is not clear what the authors mean by "intermediate states" here: the I/O or storage of
intermediate states? the computation of intermediate states?
* line 79: "traditional methods": I assume gradient methods are meant, but
this is not clear at this point; to many people GAs are "traditional", too
* line 92: here the authors use the term "evolutionary algorithm" (EA).
I agree that GA and EA have many things in common, and often the two terms are used
as synonyms. But I suggest sticking to one term and maybe adding a remark
on the two terms in the introduction or somewhere appropriate. Later the authors also
use CMAES, an evolution "strategy", again another term.
* line 92: what do the authors mean by "orchestration"?
(I assume hyperparameter choice and/or tuning), but this could be
made more precise.
* line 95: here the authors state that the forcings are temperature and NPP;
in line 61 the euphotic layer depth was also mentioned.
* line 124: typo: SI units are not to be written in italics
* line 151: "synthetic plot" please explain
* line 158: "stopping criterion": this is a bit misleading, the authors are using
a maximal number of iterations/generations of 20. It is common
practice in GA optimization to use the number of iterations with no
improvement in the cost function as a stopping criterion, which is a different thing.
In Fig. 7 it can be seen that this was not used here. An obvious question would be:
why not? computational effort?
* line 345: I assume that the values T_env are the given forcing data here?
* Appendix A2 in general: this section is a bit difficult to understand. For somebody not
familiar with the model, the appearance and importance of \tau are not explained;
on the other hand, for people not familiar with the maths, the derivation of the
analytic solution in (A.7) is not clear. I did not consult the mentioned references,
but I suggest including some more explanations here.
(A.4) is not quite a standard PDE, so I think a little bit of discussion would help.
* line 365: "Assuming \mu is integrable": \mu is a step function depending
on \tau, which depends on T_env. It may have many jumps, but will it ever become
non-integrable?
* line 365: "characteristic lines": in the sense of characteristics of partial
differential equations?
* Appendix A1-A3: I cannot see the dependence of R on p here in the equations.
The authors describe it later in Appendix A4 in the discrete setting. Thus, it
should also appear somewhere in the continuous equations.
* code availability: I appreciate the availability of the code. It might be
usable as it is, but it is very difficult to understand, and it lacks comments,
at least in the GitHub version.
Citation: https://doi.org/10.5194/egusphere-2026-711-RC2
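Two of the GA notions raised in RC2, tournament selection and a stopping criterion based on stagnation of the cost function rather than a fixed number of generations, can be illustrated with a minimal pure-Python sketch. It is not the authors' DEAP-based configuration: the function names, the one-parameter quadratic cost, and the values chosen for tournament size, patience, and tolerance are all hypothetical.

```python
import random


def tournament(population, costs, k, rng):
    """Draw k distinct individuals at random and return the one with the lowest cost."""
    contestants = rng.sample(range(len(population)), k)
    winner = min(contestants, key=lambda i: costs[i])
    return population[winner]


def run_ga(cost, bounds, pop_size=30, max_gen=200, patience=10, tol=1e-6, seed=0):
    """Toy single-parameter GA (hypothetical settings, not the manuscript's setup)."""
    rng = random.Random(seed)
    lo, hi = bounds
    population = [rng.uniform(lo, hi) for _ in range(pop_size)]
    best_x, best_cost, stale = None, float("inf"), 0

    for generation in range(1, max_gen + 1):
        costs = [cost(x) for x in population]
        i_best = min(range(pop_size), key=lambda i: costs[i])

        # Stagnation counter: reset only when the best cost improves by more than tol.
        if costs[i_best] < best_cost - tol:
            best_x, best_cost, stale = population[i_best], costs[i_best], 0
        else:
            stale += 1
        if stale >= patience:
            return best_x, best_cost, generation, "stagnation"

        # Build the next generation: tournament selection, blend crossover, mutation.
        children = []
        while len(children) < pop_size:
            parent_a = tournament(population, costs, 3, rng)
            parent_b = tournament(population, costs, 3, rng)
            w = rng.random()
            child = w * parent_a + (1.0 - w) * parent_b
            child += rng.gauss(0.0, 0.05 * (hi - lo))  # Gaussian mutation
            children.append(min(hi, max(lo, child)))   # keep within bounds
        population = children

    return best_x, best_cost, max_gen, "max_generations"


# Hypothetical cost with a known minimum at x = 2.
best_x, best_cost, generations, reason = run_ga(lambda x: (x - 2.0) ** 2, (-10.0, 10.0))
print(best_x, best_cost, generations, reason)
```

The returned `reason` distinguishes the two ways a run can end: a hard cap on generations (the kind of limit discussed for line 158) or `patience` consecutive generations without improvement of the cost.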
- RC3: 'Comment on egusphere-2026-711', Anonymous Referee #3, 22 May 2026
Review of "SeapoPym v0.1: Implementation of the SEAPODYM low and mid trophic levels in Python with a flexible optimisation framework" by Lehodey et al.
This ms presents SEAPOPYM, a Python implementation of a simplified, 0D version of SEAPODYM without physical transport. The authors compare SEAPOPYM and SEAPODYM to analyse the importance of transport and find that it can be neglected where transport is weak. SEAPOPYM is calibrated with a genetic algorithm and a (simplistic) cost function.
General evaluation
This ms makes a half-baked impression at best. The authors use a whole host of terms (DEAP, NSGA, etc.) which are not explained. The same applies to SEAPODYM. Since SEAPOPYM is based on SEAPODYM, I think it is necessary to explain clearly the structure of SEAPODYM. All I could find in the provided references was an improved version of SEAPODYM, again without a proper explanation of the original SEAPODYM. Then, SEAPOPYM is never compared directly to observations in any of the figures. We are only given the NRMSE and MAPE costs. The figures only compare SEAPOPYM and SEAPODYM, but what if both were bad? Another important but unanswered question is how far the parameter optimisation can compensate for the missing transport. The model equations are very difficult to follow, appear partly inconsistent, and are obviously incomplete. Several references to websites appear wrong or too generic. A major problem of these deficiencies is that they make it extremely hard to assess the scientific relevance of this work.
Also, the cost functions appear rather simplistic and are far behind the current state of the art (see below).
What is urgently needed is the following:
1. SEAPODYM must be explained thoroughly. Also, is the SEAPODYM code available?
2. A glossary and a table of abbreviations are needed to explain the many abbreviations and concepts, as mentioned below.
3. Regarding the comparison between SEAPODYM and SEAPOPYM: What is missing is a comparison of the optimised SEAPOPYM with the reference SEAPODYM. This could indicate how far the parameter optimisation may compensate for the missing circulation effects.
4. The model equations need to be made complete, e.g., the effects of DVM and interaction among different zooplankton groups, and better explained (see below).
5. A figure comparing the model results (SEAPODYM and SEAPOPYM with the reference and optimised parameters both globally and per station) with the actual observations used for calibration.
Specific points for revising the ms
L 35: the link https://data.marine.copernicus.eu is too generic, no reference to SEAPODYM seems to exist there
L 38: current C++ implementation: is the SEAPODYM code available?
L 80: operators proposed by the DEAP library: which are these and what is the DEAP library?
L 82: Appendix A describes the model equations, the parameters are listed in Table 1: add this info to Table 1 and refer to it here
L 86: Please define or explain what a Sobol sequence is. I don't know and could not find out from the provided references.
L 89: What are "standard NSGA-II meta-parameters"? You need to explain briefly at least the concept.
L 108: Please refer to Section 2.4.3 as a reference explaining what a Sobol analysis is.
L 128: CMEMS product (Ref. QUID, 2024): need explanation and a better reference, the URL quoted on L 395 does not show any LMTL simulation and searching for "QUID, 2024" also does not help
L 149, 150: SALib library: please specify which functions
L 157: section 2.4.2: did you mean 2.4.3?
L 161: "by experiment" should be "for the GA" (or something similar)
L 169–174: please explain the symbols in the equations (n, yi, ŷi, σ)
L 179: The cost function is the metric. Not clear what you want to say here.
L 198, 199: The low RMSE may be misleading in these regions. Large MAPE values could well co-occur with small RMSE, simply because of the low biomass concentrations.
L 201–206: I suspect that this reflects a shortcoming of the Sobol' analysis, as it is a common result often seen in large ensemble simulations, when several parameters are changed simultaneously to obtain a better coverage of the parameter space. What may appear to indicate effects of parameter interaction is in reality just the result of changing several parameters at a time. If the Sobol' analysis somehow accounts for this problem, this needs to be explained.
L 224–225: The NRMSE is certainly not the best and also not the most robust cost function for parameter optimisation, since it rests on the assumption of normally distributed observations. This is also indicated by the discrepancy between the lowest MAPE and NRMSE in Fig. 7. See, e.g., Thorarinsdottir et al., SIAM/ASA Journal on Uncertainty Quantification 1 (2013):522, for better alternatives.
Section 3.4.2: Please list the optimised parameter estimates for all stations and globally, e.g., as additional columns in Table 1. This is important information for assessing the efficiency of the GA and the cost function, and also the appropriateness of the reference parameter values.
L 259–261: This seems to be a circular conclusion, since the influence on model output is gauged by the NRMSE. It could well be that a better cost function would result in a different parameter sensitivity.
L 290: How is the DVM implemented? It is not mentioned in the model equations. Also, it is unclear to me how this is done in a 0-D modelling framework (e.g., L 323).
L 294–297: But what about different cost functions (see above)?
L 304: "SeapoPym is specifically designed to integrate this increasing data heterogeneity": How is this achieved? This remains completely unclear from the model description in this ms.
L 314, 315: "the modular architecture allows for straightforward extension to micronekton groups": Again, how this is implemented remains unclear from the presented equations.
Appendix A: The equations are apparently only for a single zooplankton group. The main text stresses the ability of SEAPOPYM to accommodate multiple groups, but then the equations should somehow allow for interactions among these groups, e.g., if one eats the other etc.
It remains also unclear how many age classes are resolved.
Eq. (A.4): I don't understand this equation. L 348–349 mention source and sink terms but (A.4) has only one (product) term on the right. The left-hand-side looks like a partial differential equation but the term ∂p/∂τ seems to make no sense, since τ (age class) is a discrete variable (according to L 349) and hence ∂τ appears meaningless. This needs to be explained clearly (or corrected). Also, μ is termed recruitment rate on L 349 but transfer rate on L 358.
L 369: p was defined as production on L 347. Now you say here that for total absorption, p(t,τr) = 0. But if nothing is produced, nothing could be transferred, which seems illogical. I may be missing something here, but please explain this clearly.
L 374: Did you mean Numerical Solution?
L 376: What is the Courant-Friedrichs-Lewy (CFL) condition? What is Δτ (τ being a discrete variable)?
Fig. 3: The figure shows at least 3 dashed lines but the legend only 2, Asymptote 0°C and Theoretical asymptote (whose colour doesn't appear to match any of the dashed lines).
Fig. 4: Please add a panel showing the percentage error MAPE. The RMSE may be more relevant for assessing the model output as input (e.g., food) for higher-trophic-level models, but the MAPE also provides essential information on model performance, particularly in oligotrophic regions.
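The point made for L 198, 199 and again for Fig. 4, that a small RMSE can coexist with a large MAPE where biomass is low, can be illustrated with a minimal sketch. The biomass values below are hypothetical and in arbitrary units; they are not taken from the manuscript. By construction the "oligotrophic" prediction is off by a factor of two (MAPE of 100 %), while the "productive" prediction is 10 % too high.

```python
import math


def rmse(obs, pred):
    """Root-mean-square error, in the units of the data."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mape(obs, pred):
    """Mean absolute percentage error, in percent (observations must be non-zero)."""
    return 100.0 * sum(abs((p - o) / o) for o, p in zip(obs, pred)) / len(obs)


# Hypothetical low-biomass region: the model doubles every observation.
oligo_obs = [0.10, 0.12, 0.08, 0.10]
oligo_pred = [2.0 * o for o in oligo_obs]

# Hypothetical high-biomass region: the model is 10 % too high everywhere.
prod_obs = [10.0, 12.0, 8.0, 10.0]
prod_pred = [1.1 * o for o in prod_obs]

oligo_rmse, oligo_mape = rmse(oligo_obs, oligo_pred), mape(oligo_obs, oligo_pred)
prod_rmse, prod_mape = rmse(prod_obs, prod_pred), mape(prod_obs, prod_pred)

print(f"oligotrophic: RMSE={oligo_rmse:.3f}, MAPE={oligo_mape:.1f} %")
print(f"productive:   RMSE={prod_rmse:.3f}, MAPE={prod_mape:.1f} %")
```

In this toy case the low-biomass region has the smaller RMSE but the larger MAPE, which is why a map of RMSE alone can make oligotrophic regions look better simulated than they are.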
Citation: https://doi.org/10.5194/egusphere-2026-711-RC3
The manuscript describes SeapoPym, a simplified version of SEAPODYM-LMTL. The authors primarily demonstrate the model’s performance and provide a comparison between SeapoPym and SEAPODYM-LMTL. The development of a simplified model focused on zooplankton represents a valuable contribution to understanding/modelling zooplankton dynamics.
That said, the manuscript would benefit from revisions to the written presentation, as well as improvements to the figures. The Methods, Results, and Discussion sections sometimes contain very short paragraphs (‘one sentence’) or subsections consisting of only a single paragraph, which do not fully convey the scope of the work or the significance of the results. Given the amount of work presented, these sections could be reorganized and expanded to better communicate the authors’ findings. I give some detailed comments below.
The most challenging aspect of the manuscript is the lack of comparison with observational data for zooplankton biomass. Although this is a model development study, it is not sufficient to compare only two modeled zooplankton outputs. Including comparisons with observational datasets would significantly strengthen the study. This is an essential point that would also enrich both the Results and Discussion sections. I therefore encourage the authors to include more systematic comparisons, both within model results and against observations.
Detailed comments:
Introduction
Line 35: The reference is not traceable. It appears to be a general website address, but a proper, citable source is needed.
Lines 38-40: The sentence beginning with “The existing …” requires a reference.
Paragraphs 3-4: These paragraphs could be merged. The transition to “addressing these implementation challenges…” is difficult to follow, as the last sentences of the previous paragraph focus on the zooplankton group descriptions in SEAPODYM-LMTL and the vertical layer structure. Clarifying what “these challenges” refers to, and combining the paragraphs, would improve readability and flow.
Methods
The Methods section contains many short subsections and dense text. A schematic figure summarizing the workflow would greatly improve clarity, especially since multiple analyses and simulations are conducted.
Line 63: There are several one-sentence paragraphs throughout the manuscript. These should either be merged with surrounding text or expanded for clarity.
Table 1: Are there reference values for these parameters available in the literature? If so, please cite them.
Lines 87–90: Another one-sentence paragraph, consider revising.
Figure 2: The caption mentions light blue dots (39,853). What are these?
Table 2: This table only lists the latitude and longitude of stations that are not analyzed in detail. Its inclusion as a separate table may not be necessary, as this information can also be given elsewhere. Instead, you could consider providing a table summarizing the simulations that have been conducted.
Line 128: The reference “REF. QUID, 20224” cannot be found in the reference list. Please check and correct.
Subsection 2.5: This section includes three equations but minimal written explanation, along with several very short paragraphs. If these equations are important enough to include, they should be properly explained in the text.
Results
Similar to the Methods section, there are many short paragraphs and some subsections consisting of only one paragraph. Consider reorganizing the Results around the main messages for better readability. Some sentences are difficult to connect with the corresponding figures.
Line 187 (“Impact of transport”): This section compares zooplankton biomass from SeapoPym and the original model (SEAPODYM-LMTL), but the comparison is difficult to interpret. The authors could consider (1) presenting both results in the same figure (Figure 4) and (2) showing absolute differences. The RMSE subfigure can be kept as it is in the same figure.
Additionally, comparing model results with zooplankton datasets (e.g., MAREDAT, COPEPOD, KRILLBASE) would strengthen the manuscript. A comparison with outputs from global biogeochemical models (e.g., CMIP datasets) would also provide useful context for discussion.
Line 226 (Subsection 3.4): This paragraph seems more appropriate for the Methods section.
Figure 7: The caption should be improved for clarity.
Line 250: The term “HDI” is introduced without explanation. This should be defined earlier, likely in the Methods section.
Discussion
The Discussion would benefit from further development. In particular, there is little comparison between model results and observational data. Including such comparisons would significantly strengthen the manuscript and its impact. There is also a one-sentence paragraph, which should be revised for better flow.
Line 303: “UPV” likely refers to “UVP”