the Creative Commons Attribution 4.0 License.
Prediction of present and future spatial occurrence of cyanobacteria and the toxin nodularin in the Baltic Sea
Abstract. Blooms of filamentous cyanobacteria are recurrent phenomena in the brackish Baltic Sea. These blooms often include toxin producing species, however, predicting and modeling the toxins spatial distribution poses great challenges. In addition, projected rising temperature due to climate change is expected to increase the occurrence of cyanobacterial blooms, making it vital to understand the distribution of the blooms and the associated cyanotoxins across ecosystems. Herein, we integrated measured concentration of the cyanotoxin nodularin, abundance of the toxin producer Nodularia spumigena, and environmental variables using Empirical Bayesian Kriging (EBK) regression prediction, ensemble learning, and stacked species distribution modeling (SSDM). This setup was used to predict and interpret the current and future area distribution of N. spumigena and nodularin across the Baltic Sea. Predictions were based on results from biogeochemical models describing current and projected future concentrations of near surface chlorophyll, nitrate, phosphate, salinity, and temperature along with nitrate-to-phosphate ratio and a geographical variable of distance to shore. Prediction for the future distribution was performed using projected climate change scenarios in the year 2100. Findings show that the predicted area distribution of nodularin is determined by concentrations and interaction effects of salinity, temperature, phosphate, nitrate to phosphate ratio, and distance to shore, and is associated with the predicted area distribution of N. spumigena. Predicted site distribution shows increased nodularin occurrences in the Eastern and Western Gotland Sea, the Northern Baltic Proper, southern parts of the Bothnian Sea, and in the Arkona basin. 
By the year 2100, area distribution of nodularin is predicted to increase in the northern part of the Eastern Gotland Sea, Northern Baltic Proper, Åland Sea, southern parts of the Bothnian Sea, Arkona Basin, and slightly into the Bothnia Bay in response to projected climate change scenarios. Our developed modeling approach is useful for risk assessment and management of cyanotoxins where toxicological data are insufficient.
Status: final response (author comments only)
-
RC1: 'Comment on egusphere-2025-3290', Anonymous Referee #1, 13 Aug 2025
-
AC2: 'Reply on RC1', Mohanad Abdelgadir, 29 Jan 2026
RC1
Review of the manuscript “Prediction of present and future spatial occurrence of cyanobacteria and the toxin nodularin in the Baltic Sea” by Mohanad Abdelgadir, Bengt Karlson, Elin Dahlgren, Malin Olofsson
Summary: The authors use Empirical Bayesian Kriging (EBK) regression prediction, ensemble learning, and stacked species distribution modelling (SSDM) to predict and interpret the current and future area distribution of Nodularia sp. and the toxin nodularin across the Baltic Sea. The underlying data base consists of 139 observed samples, combined with numerical model data from various sources. Predictions for the future distribution of Nodularia sp. blooms and nodularin are based on projected climate change scenarios in the year 2100.
Major comments:
The subject of the study is of importance and general interest. The overall approach is interesting and all the work that went into this study is greatly appreciated.
Re: We greatly appreciate the effort made by the reviewer (RC1) on our manuscript.
Unfortunately, however, I found it very difficult to rate methods and results of the presented study because I got lost in many details while I still lack a description of important key aspects. For my feeling the authors try to do too much in just one publication and I lack a clear aim of all their experiments.
Re: In the revised version of the manuscript, we will add a short paragraph early in the introduction that clearly states the primary aim of the study and explains how the different analyses and methods fit together to address that aim.
I thus strongly recommend to rather focus on one specific aspect. Potential candidates I could imagine are: (1) a comparison of different techniques for enhancing the output of climate models to resolve local cyanobacteria blooms and toxins (here I don’t regard the Baltic Sea as a perfect candidate because many important processes are not resolved in climate models). (2) a comparison of different methods to interpolate nodularin measurements in space (if the available data do not suffice considerations could be added on how many extra data would be required and where) or (3) suggestions how to implement nodularin into a prognostic Baltic Sea model which already contains cyanobacteria. This list is certainly not comprehensive.
Re: We do understand that this model setup can seem complex and unnecessarily complicated, and we will, in the revised version of the manuscript, aim to clarify this setup and its benefits. As reviewer 2 (RC2) did not raise this issue, we leave it to the editor to decide if this major change would be needed for the future version of the manuscript.
As the study is designed now, I have several major concerns which in my eyes need to be addressed:
(1) There seem to be very few measurements of the toxin nodularin available (as I understood 139 observed samples were investigated while most of them are clustered at some coasts). Relating these few measurements to a multitude of environmental factors and their interactions will most likely lead to overfitting. While the authors state that they used part of the data for testing, I did not find clear evidence which could rule out this concern. I would expect something like a direct comparison plot of the best prediction versus observed nodularin for some independent test data.
Re: That’s true. We have 139 observations across the study area. To overcome the overfitting issue, we generated what are called “pseudoabsence data points,” which represent those unsampled sites across the study area. The general idea behind it is to generate points in the study area that will be used to compare the observed environment (represented by the presence) against what is available. Those points are NOT to be considered as absences and rather represent the available environment (see Barbet-Massin et al., 2012). Moreover, when learning dependence from data, to avoid overfitting, it is important to divide the data into the training set and the testing set. We first train our model on the training set, and then we use the data from the testing set to gauge the accuracy of the resulting model. Here, the dataset is split into two segments: 70% training and 30% testing data, which allows the model to learn on one subset of the data (training set) and evaluate its performance on another subset (testing set). This ensures that the model generalizes well to new data, making it more robust and reliable. For more information, see below how the model was generated in R code:
#Current prediction
#Here, I add the pseudo-absence data to the background. I add 10000 pseudo-points as background using "bg" argument
dc <- sdmData(species~., train=newssp, predictors= env.c, bg=list(n=10000))
#let's fit the model by adding different algorithm methods, replication techniques and K-fold cross-validation
mc <- sdm(species ~ . , dc, methods=c('GBM','RF','GLM','MAXENT','MARS','CART'), replication=c("boot"),k=10,test.percent=30)
#Future prediction
#Here, I add the pseudo-absence data to the background. I add 10000 pseudo-points as background using "bg" argument
df <- sdmData(species~., train=newssp, predictors= env.f, bg=list(n=10000))
#let's fit the model by adding different algorithm methods, replication techniques and K-fold cross-validation
mf <- sdm(species ~ . , df, methods=c('GBM','RF','GLM','MAXENT','MARS','CART'), replication=c("boot"),k=10,test.percent=30)
Abbreviations: dc/df = data object (current c/future f); bg = background data object consisting of 10000 random pseudoabsence points; newssp = data object containing the occurrences; species = column of the data object containing the species (Nodularia sp.); methods = set of algorithms used; replication = replication technique, here “bootstrapping”; test.percent = percentage of the dataset used for testing, here 30%; k = 10-fold bootstrapping; mc/mf = model setting (current c/future f).
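As an illustration of the 70/30 hold-out idea described above, the following is a minimal base R sketch, not the authors' sdm workflow. The data are synthetic, and the predictor names, the coefficients, and the use of a single GLM are assumptions made only for this example.

```r
# Illustrative sketch only: synthetic data, base R, not the authors' pipeline.
set.seed(42)

# Hypothetical presence/background table with two made-up predictors.
n <- 500
dat <- data.frame(
  temperature = rnorm(n, mean = 16, sd = 3),
  salinity    = rnorm(n, mean = 7, sd = 1.5)
)
# Synthetic response: presence is more likely in warmer water (assumption).
p <- plogis(-8 + 0.5 * dat$temperature)
dat$species <- rbinom(n, size = 1, prob = p)

# 70/30 hold-out split: the test rows are never used during fitting.
test_idx <- sample(seq_len(n), size = round(0.3 * n))
train <- dat[-test_idx, ]
test  <- dat[test_idx, ]

# Fit on the training set only (a GLM stands in for any single algorithm).
fit <- glm(species ~ temperature + salinity, data = train, family = binomial)

# Predict on the held-out rows and keep observed and predicted side by side.
pred <- predict(fit, newdata = test, type = "response")
holdout <- data.frame(observed = test$species, predicted = pred)
```

Plotting `holdout$predicted` against `holdout$observed` gives the kind of direct comparison on independent data that the reviewer asks for.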
Furthermore, our study design was intended to handle such a limited dataset, taking into account the large spatial extent of the study area and the moderate clustering of the points. For example, EBK regression prediction is known to produce more robust and more accurate predictions than other kriging techniques, both for small datasets and when the data are locally moderately non-stationary. In addition, EBK regression prediction can accurately simulate these regional variations and take regional influences into consideration.
In the revised version of the manuscript, we will add a statement describing how the method handles the limited number of samples. We will also provide the code if needed, and extra figures in the supplement illustrating a comparison between the different datasets.
(2) For the occurrence of Nodularia Sp. biogeochemical model data of the ERGOM model are combined with the available observations. This approach might lead to inconsistencies and since this is the key aspect I would like to see at least some quality assessment of both, the numerically simulated and predicted Nodularia sp. blooms. Since satellite data for Nodularia sp. are provided e.g. by SMHI, it would be fairly easy to show 2-dimensional plots of the extent of a particularly large and small bloom during the recent years as simulated (by the numerical model) and as predicted (by the methods of this study) in direct comparison to satellite observations. Additionally, the in-situ samples for Nodularia sp. could be plotted against the respective model data to ensure that these data sources can be combined without problems.
Re: We are fully aware of the available satellite data on the bloom, including those from SMHI, yet our study was mainly concerned with what drives toxin concentration at specific sampling sites and not with detecting the bloom per se. Technically, our approach integrates ensemble learning (multiple models instead of a single model), which is known to quantify uncertainty in long-term projections. In addition, using climate change scenarios coupled with real-time monitoring data from satellite imagery altogether allows for long-term trends to be estimated. Combining the in-situ data, the nodularin concentration, along with previous data further improves the accuracy of models. This altogether allows for accurate predictions for the next 75 years (2100 climate scenario) and reveals uncertainty in long-term projections regardless of how long the period is. We also considered having the same high-resolution scenarios for the current selected variables from all sources (near surface chlorophyll, nitrate, phosphate, salinity, and temperature) to ensure the data integrity and that the model is internally consistent.
Along with integrity and accuracy, our study design ensures consistency that the transition from current data to forecasted data (future predictions) is seamless and logical, preventing abrupt, unrealistic shifts in the model’s trajectory. This can clearly be observed in model accuracy and variable contribution, where our modeling approach captured the spatial changes in nodularin occurrence over the next 75 years with excellent performance.
In the revised version of the manuscript, we will add in the supplementary data 2-dimensional plots showing the assessment of both ERGOM and Nodularia Sp. datasets.
(3) The authors then combine many different data sources from global to regional models which are almost for certain inconsistent and might make it very difficult to draw robust conclusions. E.g. global models do not resolve coastal upwelling and it is very unlikely that these models capture the complex salinity dynamics, nutrient inputs or sediment processes of the Baltic Sea. These aspects need at least attention.
Re: We fully agree about the challenge of combining data from different sources. However, our study addresses this challenge. We decided on having the same high-resolution scenarios for the currently selected variables from all sources (near-surface chlorophyll, nitrate, phosphate, salinity, and temperature) to ensure the data integrity and that the model is internally consistent. Along with integrity and accuracy, our study design ensures consistency that the transition from current data to forecasted data (future predictions) is seamless and logical, preventing abrupt, unrealistic shifts in the model’s trajectory. This can clearly be observed in model accuracy and variable contribution, where our modeling approach captured the spatial changes in nodularin occurrence over the next 75 years with excellent performance.
Moreover, to handle such complex datasets of different origins, the approach integrates the power of kriging interpolation and ensemble learning. The first method is known to produce more robust and more accurate predictions than other kriging techniques, both for small datasets and when the data are locally moderately non-stationary and autocorrelated. The second uses multiple models instead of a single model to quantify uncertainty in long-term projections. Please refer to Goovaerts, 1997; Krivoruchko and Gribov, 2019, and to lines 67-74 of the introduction.
In the revised version of the manuscript, we will add a statement in the introduction highlighting how our approach can resolve the inconsistency and increase the prediction performance. Such a statement could read: “In addition, using climate change scenarios coupled with real-time monitoring data from satellite imagery altogether allows for long-term trends to be estimated. By combining the in-situ data, the nodularin concentration, along with previous data, further improves the accuracy of models. This altogether allows for accurate predictions for the next 75 years (2100 climate scenario) and reveals uncertainty in long term projections regardless of how long the period is”.
(3) I did not find an independence test for the predictors. Then, I am surprised that distance to the shore has been used as predictor. I am well aware of a prominent study which uses this factor when investigating the onset of blooms – still blooms may then drift to the shore and frequently do so.
Re: Our model finding suggests the importance of including distance to shore as a geographical variable in future modeling of cyanobacteria and cyanotoxin, particularly in nutrient-depleted areas across the Baltic Sea. When combined with temperature, chlorophyll, nutrients, and salinity, having ‘distance to shore’ as a predictor tends to reflect the spatial, physical, and ecological conditions that promote nodularin expansion. This is seen where ‘distance to shore’ shows an interaction effect with other variables in promoting nodularin occurrence (see Table 1A, Figure 3E, and Figure 4G).
An independence test for each predictor was performed. We provided the distributions of the cross-validation statistics of kriging (estimated using kernel density), showing the prediction regression scatterplot, the regression function, and the measured (blue line) and predicted (red line) values of nodularin (µg l⁻¹). These figures (see Figures S1, S2, and S3) also contain the equation of the regression function. For ensemble learning we provided both current and future model performances for each algorithm, quantified by the area under the curve (AUC) and the True Skill Statistic (TSS); see Figure S4. All predictions, either by kriging or ensemble, showed high prediction capability, indicated by high values of the regression function, AUC, TSS, Kappa, and the other statistical metrics in the study.
In the supplementary figures (see Figure S5) we also provided the response curves that demonstrate the independent effect of each predictor on nodularin concentration.
In the revised version of the manuscript, and in addition to the supplementary figures mentioned above, we will add 2-dimensional plots as an independence test for the predictors.
(5) I did not find convincing evidence for reliable predictions for any of the many methods. I would like to see clear visual comparisons to independent test data.
Re: In the supplementary figures, we provided the distributions of the cross-validation statistics of kriging (estimated using kernel density), showing the prediction regression scatterplot, the regression function, and the measured (blue line) and predicted (red line) values of nodularin (µg l⁻¹). These figures (see Figures S1, S2, and S3) also contain the equation of the regression function. For ensemble learning we provided both current and future model performances for each algorithm, quantified by the area under the curve (AUC) and the True Skill Statistic (TSS); see Figure S4. All predictions, either by kriging or ensemble, showed high prediction capability, indicated by high values of the regression function, AUC, TSS, Kappa, and the other statistical metrics in the study.
In the revised version of the manuscript, and in addition to the supplementary figures mentioned above, we will add 2-dimensional plots as comparisons to independent test data.
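For readers unfamiliar with the two metrics named above, the following base R sketch shows how AUC and TSS can be computed from observed labels and predicted probabilities. The helper names `auc` and `tss`, the 0.5 threshold, and the six-value toy vectors are assumptions for illustration only; they are not taken from the study.

```r
# Illustrative sketch only: hypothetical helper functions, base R.

# AUC via the rank (Mann-Whitney) formulation.
auc <- function(observed, predicted) {
  r  <- rank(predicted)               # average ranks handle ties
  n1 <- sum(observed == 1)
  n0 <- sum(observed == 0)
  (sum(r[observed == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# True Skill Statistic = sensitivity + specificity - 1 at a given threshold.
tss <- function(observed, predicted, threshold = 0.5) {
  called <- as.integer(predicted >= threshold)
  sens <- sum(called == 1 & observed == 1) / sum(observed == 1)
  spec <- sum(called == 0 & observed == 0) / sum(observed == 0)
  sens + spec - 1
}

# Toy held-out data: three presences, three background points.
obs  <- c(1, 1, 1, 0, 0, 0)
pred <- c(0.9, 0.8, 0.4, 0.6, 0.2, 0.1)

auc_value <- auc(obs, pred)   # 8 of 9 presence/background pairs ranked correctly
tss_value <- tss(obs, pred)   # sensitivity 2/3, specificity 2/3
```

Both metrics are only informative about generalization when `obs` and `pred` come from data that were not used to fit the model.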
(6) Even if the authors revise and illustrate robust relations to predictors for Nodularia Sp. bloom occurrence and nodularin under present climate conditions, it is still not at all guaranteed that these could be extrapolated to a much warmer climate (e.g. species composition and competitive advantages might well change). I do not at all recommend to base predictions or even recommendations for politics on such uncertain ground.
Re: We value your recommendation, yet, if carefully applied, this approach can be tested for future prediction and potential implication for management/guidance. These approaches could also help prioritize surveillance and implement earlier sampling efforts in areas predicted to have high cyanotoxin concentration. We have already highlighted this in the conclusion section; see line 334. We are fully aware that species composition and toxin occurrence might well change over the long term under future scenarios and increased warming. To address prediction problems associated with the long-term projection, our approach integrates ensemble learning (multiple models instead of a single model), which is known to quantify uncertainty in long-term projections. In addition, using climate change scenarios coupled with real-time monitoring data from satellite imagery altogether allows for long-term trends to be estimated. Combining the in-situ data, that is, the nodularin concentration, further improves the accuracy of models. This altogether allows for accurate predictions for the next 75 years (2100 climate scenario) and reveals uncertainty in long-term projections regardless of how long the period is. We also have decided to have the same high-resolution scenarios for the currently selected variables from all sources (near-surface chlorophyll, nitrate, phosphate, salinity, and temperature) to ensure the data integrity and that the model is internally consistent. Along with integrity and accuracy, our study design ensures consistency so that the transition from current data to forecasted data (future predictions) is seamless and logical, preventing abrupt, unrealistic shifts in the model’s trajectory.
In the revised version of the manuscript, we will clarify in a statement how the approach applied here can resolve species composition and toxin occurrence changes over time and space.
Specific comments:
Ln 13: change “blooms often include toxin producing species” to “blooms can contain toxin producing strains,”
Re: We will change it to “blooms can contain toxin producing strains,” in the revised version of the manuscript.
Ln 14: change “climate change is expected to increase the occurrence of cyanobacterial blooms” to something like that “climate change may increase the occurrence of cyanobacterial blooms”
Re: We will change it to “climate change may increase the occurrence of cyanobacterial blooms” in the revised version of the manuscript.
Ln 17-18: The choice of each method should be motivated.
Re: The choice of each method, the pros and cons of each, and why both methods were integrated are described in detail in lines 71-89. In the revised version of the manuscript, this section will be clarified, and the methods will be motivated.
Ln 29: I do not recommend any risk assessment or management decisions at current state.
Re: Risk assessment and management will be removed from the sentence, which will be changed to “Our developed modeling approach is useful where toxicological data such as cyanotoxins are insufficient” in the revised version of the manuscript.
Ln 64: Empirical Bayesian Kriging (EBK) regression prediction depends heavily on the density and distribution of the underlying samples. In Figure 1 it appears that most of the few samples are clustered at some coasts. Also, I doubt that all samples were taken at the same time and it is not clear which state of the system the Kriging refers to. It would be good if the time aspect was clarified and I would like to see a comparison to independent test data.
Re: That is true; kriging depends on the distribution of sampling points. However, kriging handles spatially autocorrelated and clustered points well. In detail, applying the semi-variogram geostatistical tool in the kriging algorithm allows quantifying the spatial autocorrelation by determining the range, that is, the distance within which points are correlated, and the sampling error or fine-scale variation called the "nugget". Moreover, kriging assigns weights to neighboring points based on their spatial correlation, meaning that kriging directly considers the autocorrelation of points, not just their physical distance. By utilizing autocorrelation, kriging removes spatially correlated noise and creates a fine and less smeared kriged map. Kriging also ensures that the final residuals are not autocorrelated by modeling the main predictors and creating the best linear unbiased predictions. Taken together, we believe that kriging is able not just to resolve the spatial autocorrelation in the dataset but also to create fine unbiased predictions. It is worth mentioning that the sampling period spanned from June to September 2023, and the predictors (chlorophyll, NO₃, PO₄, salinity, and temperature) were downloaded from databases and limited to the same period.
In the revised version of the manuscript, we will add a paragraph like the one above clarifying how the kriging algorithm handles spatially autocorrelated and clustered points. We will also add 2-dimensional plots in the supplement illustrating a comparison to independent test data.
For the independence test of the data and predictors, please refer to my response to points #3 and #5 in your list of questions.
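To make the semivariogram terms used above concrete, here is a minimal base R sketch of a plain empirical semivariogram. It is not EBK regression prediction and not the authors' analysis; the coordinates, the values, and the 10 km lag bins are synthetic assumptions chosen only for illustration.

```r
# Illustrative sketch only: synthetic coordinates and values, base R.
set.seed(1)

# Hypothetical sampling sites (km) with a smooth spatial trend plus noise.
n <- 60
x <- runif(n, 0, 100)
y <- runif(n, 0, 100)
z <- sin(x / 20) + cos(y / 20) + rnorm(n, sd = 0.2)

# Pairwise separation distances and squared value differences.
d     <- as.matrix(dist(cbind(x, y)))
sqdif <- outer(z, z, "-")^2
pairs <- upper.tri(d)                 # count each pair once

# Bin pairs by distance; semivariance is half the mean squared difference.
breaks <- seq(0, 60, by = 10)
bin    <- cut(d[pairs], breaks = breaks)
gamma  <- tapply(sqdif[pairs], bin, mean) / 2
npairs <- table(bin)

variogram <- data.frame(
  lag_km       = head(breaks, -1) + 5,   # bin midpoints
  semivariance = as.numeric(gamma),
  n_pairs      = as.integer(npairs)
)
print(variogram)
```

In such a table, the semivariance extrapolated to zero lag corresponds to the nugget, and the lag at which the semivariance levels off corresponds to the range. The `n_pairs` column shows how many point pairs support each lag, which is where strong clustering of sites becomes visible.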
Ln 84: It does not become clear how these approaches could overcome the above-mentioned problems.
Re: Please read my responses to your previous questions on how kriging and machine learning handle complex, spatially autocorrelated datasets, which are merged and reproduced here:
Kriging handles spatially autocorrelated and clustered points well. In detail, applying the semi-variogram geostatistical tool in the kriging algorithm allows quantifying the spatial autocorrelation by determining the range, that is, the distance within which points are correlated, and the sampling error or fine-scale variation called the "nugget". Moreover, kriging assigns weights to neighboring points based on their spatial correlation, meaning that kriging directly considers the autocorrelation of points, not just their physical distance. By utilizing autocorrelation, kriging removes spatially correlated noise and creates a fine and less smeared kriged map. Kriging also ensures that the final residuals are not autocorrelated by modeling the main predictors to create the best linear unbiased predictions.
The second method of ensemble learning generates what are called “pseudoabsence data points,” which represent those unsampled sites across the study area. The general idea behind it is to generate points in the study area that will be used to compare the observed environment (represented by the presence) against what is available. Those points are NOT to be considered as absences and rather represent the available environment (see Barbet-Massin et al., 2012). Moreover, when learning dependence from data, to avoid overfitting, it is important to divide the data into the training set and the testing set. We first train our model on the training set, and then we use the data from the testing set to gauge the accuracy of the resulting model. Here, the dataset is split into two segments: 70% training and 30% testing data, which allows the model to learn on one subset of the data (training set) and evaluate its performance on another subset (testing set). This ensures that the model generalizes well to new data, making it more robust and reliable.
In the revised version of the manuscript, we will add a paragraph clarifying how both methods and the overall approach implemented in the study overcome the mentioned issue. The paragraph will be the same as the response provided above and the response to the review question marked Ln 64.
Ln 87: Again, I do not at all recommend to base future predictions on such uncertain ground.
Re: See my response to your question #6 “(6) Even if the authors revise and illustrate robust relations to predictors for Nodularia Sp. bloom occurrence and nodularin under present climate conditions, it is still not at all guaranteed that these could be extrapolated to a much warmer climate (e.g. species composition and competitive advantages might well change). I do not at all recommend to base predictions or even recommendations for politics on such uncertain ground.”
Ln 91: How many of the samples did contain nodularin? Were all samples taken during Nodularia bloom conditions? Were other variables, such as nutrients, salinity or temperature, measured as well?
Re: Yes, the data were collected during Nodularia bloom conditions (June to September). The predictors (nutrients, salinity or temperature) were downloaded from databases for the same sampling sites and the same period of time. This will be clarified in the revised version of the manuscript.
Ln 105ff: Blending the nutrients from the SMHI forecast with chlorophyll_a simulated by the ERGOM-model needs some good motivation because the respective fields will not be consistent.
Re: Yes, we fully agree, yet our study aims to resolve possible inconsistency by integrating two different approaches with different procedures. For how each method handles the data, please refer to my previous responses to your questions #3, #6, and the question related to Ln 84. This will also be clarified in the revised version of the manuscript.
Ln 118: I could not find a meaningful comparison in the supplement.
Re: The supplementary figures, Figures S2 and S3, illustrate that the geographical estimations and maps produced by the kriging and ensemble-based models in this study are aligned despite the differences in approaches. The supplement was provided for the reader to compare, e.g., Figure S2 with Figure 4 and Figure 7 in the main text. In other words, this shows how nodularin spatially expands in response to the same set of predictors taken from two different databases. These figures will be referred to in the methods section in the revised version of the manuscript.
Ln 120: It does not become clear how the predictions were “tested” based on the mentioned Copernicus data.
Re: Please see the supplementary figures and tables. Multiple figures were presented to show how nodularin and Nodularia sp. respond to different predictors from Copernicus (Figures 3, 4 and 5) and from ERGOM (Figures S2, S3, S4). These figures were supported by statistical metrics and regression function equations. In the revised version of the manuscript, we will motivate and clarify this in the text.
Ln 123: I would be very interested to see 2-dimensional plots on predicted Nodularia Sp. and nodularin when using the climate models under present climate conditions in comparison the predictions based on the combination of numerical Baltic Sea models.
Re: Please refer to the previous response to your question Ln 120. Check the supplementary figures and tables. Multiple figures were presented to show how nodularin and Nodularia sp. respond to different predictors from Copernicus (Figures 3, 4 and 5) and from ERGOM (Figures S2, S3, S4). These figures were supported by statistical metrics and regression function equations.
Ln 185 Something went wrong here.
Re: Latin text (from the journal template) will be removed in the revised manuscript. The sentence in the revision will start with “Prediction of nodularin spatial occurrence…”.
Ln 187ff: From here on I am somehow lost in many details, different methods (all with different design choices) and various feature selections while I am lacking a clear purpose of all the experiments.
Re: I assume you refer to the analysis of Bayesian linear regression. To interpret the results and statistical terms of the Bayesian linear regression, please refer to Table 1. In brief, the Bayes factor BF10 is a ratio which quantifies evidence in favor of an effect (represented by “1”) versus no effect (represented by “0”). BF10 > 1 indicates evidence in favor of an effect; 0 < BF10 < 1 indicates evidence in favor of no effect. The P(M) indicates that the prior probabilities of the other models are equal, and P(M|data) refers to the posterior probability of each model after seeing the data, while BFM compares each model to the average P(M|data) of the other models.
If needed or requested by the reviewer RC1, an additional paragraph can be provided in the table caption in the revised version explaining each statistical term in detail. We can also add a paragraph in the methods section, if requested, clarifying the purpose of using each method.
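The quantities named in the reply above can be written compactly as follows. The first line is the standard definition of the Bayes factor. The second line gives BF_M as the ratio of posterior to prior model odds, which is the convention used in common Bayesian regression software output; whether Table 1 follows exactly this convention is an assumption here.

```latex
\mathrm{BF}_{10} = \frac{p(\text{data} \mid H_1)}{p(\text{data} \mid H_0)},
\qquad
\mathrm{BF}_{01} = \frac{1}{\mathrm{BF}_{10}}

\mathrm{BF}_{M} =
\frac{P(M \mid \text{data}) \,/\, \bigl(1 - P(M \mid \text{data})\bigr)}
     {P(M) \,/\, \bigl(1 - P(M)\bigr)}
```

As a worked example under these definitions, a model with prior probability P(M) = 0.2 and posterior probability P(M|data) = 0.5 has prior odds 0.25 and posterior odds 1, so BF_M = 4.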
Citation: https://doi.org/10.5194/egusphere-2025-3290-AC2
-
RC2: 'Comment on egusphere-2025-3290', Anonymous Referee #2, 30 Nov 2025
The research topic is very interesting, and the model formulation is novel and conceptually strong. The things I find problematic are the limited data points the model is based on. There are 139 points from 54 locations in 1 year. I think the model will benefit from atleast a multi-year data set. Is the data from previous years not available? Why was it limited to one year. I have serious doubts about conclusions drawn from a year’s worth of data to interpolate results for 75 years from now. The current prediction period is too long a time frame to predict using modelling, especially for something so dynamic as algal blooms which are sensitive to a lot of factors. Why not use a shorter time frame like 2040 or 2050. Also, could the reason for similar results from all GCMs be because the model has some flaws that remain unaddressed.
In general, kriging requires data that is spatially and temporally well distributed. The data here is pretty limited with 54 locations across an over 300,000 sq km area. What is the spatial correlation between data? How was the variogram like? Are the model results biased because of availability of more data in one section of the study area such as West and East Gotland Sea. I would have like to seen more about the limitations of the model structure over the results for 75 years which can vary substantially from the current modelled scenarios. Applying this model framework in other locations, needs context on strengths and weaknesses of the formulation so that it can be modified accordingly to meet the requirements of that problem.
Additionally, the results are presented in a confusing manner; there is too much back and forth between too many scenarios, with numbers presented in both tables and text. I had a hard time following the narrative. I would recommend that the questions be stated clearly in the introduction and that the results be focused on answering those particular questions. Maybe present the numbers in tables and keep the text for the narrative, with numbers brought up only where they are critical for the narrative. Could the reason for no nodularin in Bothnian and Southern Kattegat be that the data did not detect any nodularin at the time of data collection? I have serious doubts about a model reporting such high accuracy when it was developed using such limited data.
If the paper discusses the causes for limited data, potential fallacies in its current formulation and the limitations of the results in its current state, I think it would be much more beneficial than just discussing how great the model performance is.
There are also some grammatical and structural corrections which can be fixed pretty easily. Attached below are specific locations I noticed.
Specific corrections:
Line 13: species. However,
Line 75: This could be because the phenomenon….
Line 78 : … identification of patterns … ??
Line 80 : multiple variables
Line 141: It would be better if this line were written differently. Starting a line with "how" is not recommended.
Figure 2. Good consolidation of information. Resolution needs to be improved.
Line 185: Is it starting with Latin?
Line 314: ….ensemble learning….
Citation: https://doi.org/10.5194/egusphere-2025-3290-RC2
AC1: 'Reply on RC2', Mohanad Abdelgadir, 29 Jan 2026
RC2
The research topic is very interesting, and the model formulation is novel and conceptually strong.
Re: We greatly appreciate the effort made by the reviewer (RC2) on our manuscript.
What I find problematic is the limited number of data points the model is based on: 139 points from 54 locations in one year. I think the model would benefit from at least a multi-year data set. Are data from previous years not available? Why was the study limited to one year? I have serious doubts about conclusions drawn from one year’s worth of data to interpolate results for 75 years from now.
Re: That is true. We have 139 observations across the study area, and unfortunately no data are available from the previous year. However, the study design was intended to handle such a limited dataset, taking into account the large spatial extent of the study area and the moderate clustering of the points. For example, EBK regression prediction is known to give more robust and more accurate results than other kriging techniques for small datasets, even when the data are locally moderately non-stationary. In addition, EBK regression prediction can accurately simulate regional variations and take regional influences into consideration.
For details, see also my response to your next question: “In general, kriging requires data that are spatially and temporally well distributed. The data here are quite limited, with 54 locations across an area of over 300,000 sq km. What is the spatial correlation between the data? What did the variogram look like? Are the model results biased by the availability of more data in one section of the study area, such as the Western and Eastern Gotland Sea?”
The 139 data points show moderate clustering (Moran’s I = 0.22), which is why we applied the semi-variogram geostatistical tool implemented in the kriging algorithm. The semi-variogram quantifies the spatial autocorrelation by determining the range (the distance within which points are correlated) and the "nugget" (the sampling error or fine-scale variation). Moreover, kriging assigns weights to neighboring points based on their spatial correlation, meaning that kriging directly considers the autocorrelation of points, not just their physical distance. By utilizing autocorrelation, kriging removes spatially correlated noise and creates a fine, less smeared kriged map. Kriging also ensures that the final residuals are not autocorrelated by modeling the main predictors and creating the best linear unbiased predictions. Taken together, we believe that kriging is able not just to resolve the spatial autocorrelation in the dataset but also to create fine, unbiased predictions.
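To make the two quantities named in this reply concrete, the following minimal Python sketch computes a global Moran's I and an empirical semi-variogram. It is not the EBK regression prediction used in the study. The toy 4 x 4 grid, the binary distance-band weights, the lag width, and the function names `morans_i` and `empirical_semivariogram` are all assumptions made for illustration.

```python
import math

def morans_i(coords, values, bandwidth):
    """Global Moran's I with binary weights: w_ij = 1 if distance <= bandwidth."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    cross = 0.0   # sum over i != j of w_ij * dev_i * dev_j
    w_sum = 0.0   # sum of all weights
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(coords[i], coords[j]) <= bandwidth:
                cross += dev[i] * dev[j]
                w_sum += 1.0
    return (n / w_sum) * (cross / sum(d * d for d in dev))

def empirical_semivariogram(coords, values, lag_width, n_lags):
    """Half the mean squared difference of point pairs, grouped by distance bin."""
    sums = [0.0] * n_lags
    counts = [0] * n_lags
    for i in range(len(values)):
        for j in range(i + 1, len(values)):
            k = int(math.dist(coords[i], coords[j]) // lag_width)
            if k < n_lags:
                sums[k] += (values[i] - values[j]) ** 2
                counts[k] += 1
    # Bins without any pair are reported as None
    return [s / (2 * c) if c else None for s, c in zip(sums, counts)]

# Hypothetical data: a smooth gradient on a regular grid (not the study's samples)
coords = [(x, y) for x in range(4) for y in range(4)]
values = [float(x + y) for x, y in coords]

i_value = morans_i(coords, values, bandwidth=1.0)
gamma = empirical_semivariogram(coords, values, lag_width=1.5, n_lags=2)
print("Moran's I:", round(i_value, 3))
print("Semi-variance per lag bin:", [round(g, 3) for g in gamma])
```

A positive Moran's I and a semi-variance that grows with lag distance both indicate that nearby points are more alike than distant ones. The range and nugget are then read from a model curve fitted to such binned values.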
For the second pathway of the analysis (i.e., the ensemble learning), and to overcome the overfitting issue, we generated so-called “pseudo-absence data points,” which represent unsampled sites across the study area. The general idea is to generate points in the study area that are used to compare the observed environment (represented by the presences) against what is available. Those points are NOT to be considered absences; rather, they represent the available environment (see Barbet-Massin et al., 2012). Moreover, when learning dependence from data, it is important to divide the data into a training set and a testing set to avoid overfitting. We first train the model on the training set and then use the testing set to gauge the accuracy of the resulting model. Here, the dataset is split into 70% training and 30% testing data, which allows the model to learn on one subset of the data and to be evaluated on another. This is intended to ensure that the model generalizes well to new data, making it more robust and reliable.
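The two steps described in this paragraph can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the presence coordinates, the bounding box, the number of background points, and the helper names `sample_background` and `train_test_split` are hypothetical, and a rectangular box ignores the land mask that a real Baltic Sea analysis would need.

```python
import random

def sample_background(presences, bbox, n_points, rng):
    """Draw random background (pseudo-absence) points inside bbox,
    skipping any point that coincides with a presence location."""
    xmin, ymin, xmax, ymax = bbox
    occupied = set(presences)
    points = []
    while len(points) < n_points:
        p = (round(rng.uniform(xmin, xmax), 3), round(rng.uniform(ymin, ymax), 3))
        if p not in occupied:
            points.append(p)
    return points

def train_test_split(records, train_fraction, rng):
    """Shuffle a copy of the records and cut it into training and testing parts."""
    shuffled = records[:]
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

rng = random.Random(42)  # fixed seed so the split is reproducible

# Hypothetical presence sites as (longitude, latitude); not the study's data
presences = [(18.5, 57.2), (19.1, 58.0), (20.3, 56.4), (17.8, 58.9), (19.9, 59.3)]
bbox = (10.0, 54.0, 30.0, 66.0)  # assumed rectangular study extent

background = sample_background(presences, bbox, 15, rng)

# Label 1 = presence, 0 = background (available environment, not a true absence)
records = [(p, 1) for p in presences] + [(p, 0) for p in background]
train, test = train_test_split(records, 0.7, rng)
print(len(train), "training records,", len(test), "testing records")
```

With spatially clustered samples, a purely random split can place near-duplicate neighbours in both subsets, so a spatially blocked split is a common alternative when an independent test is required.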
In the revised version of the manuscript, we will clarify this approach and the benefits of the model setup, and explain how we ensure that it works even on small datasets. We will add a paragraph, similar to the statement above, clarifying how the different methods address the limited sample size (i.e., number of observations) and how the overall approach handles autocorrelation and moderate clustering of points.
The current prediction period is too long a time frame to predict using modelling, especially for something as dynamic as algal blooms, which are sensitive to many factors. Why not use a shorter time frame, such as 2040 or 2050? Also, could the similar results from all GCMs be because the model has flaws that remain unaddressed?
Re: The study design and methods are intended to account for the long period. First, we selected this scenario (SSP85-2100) because it is the worst-case climate scenario, with the highest emissions and severe global warming, in which temperatures are projected to rise significantly by the end of the century. Technically, our approach integrates ensemble learning (multiple models instead of a single model), which is known to quantify uncertainty in long-term projections. In addition, coupling climate change scenarios with real-time monitoring data from satellite imagery allows long-term trends to be estimated. Combining the in-situ nodularin concentration data with previous data further improves the accuracy of the models. Altogether, this allows for accurate predictions for the next 75 years (2100 climate scenario) and reveals uncertainty in long-term projections regardless of how long the period is. We also used the same high-resolution scenarios for the selected variables from all sources (near-surface chlorophyll, nitrate, phosphate, salinity, and temperature) to ensure data integrity and that the model is internally consistent. Along with integrity and accuracy, our study design ensures that the transition from current data to forecasted data (future predictions) is seamless and logical, preventing abrupt, unrealistic shifts in the model’s trajectory. This can clearly be observed in the model accuracy and variable contribution, where our modeling approach captured the spatial changes in nodularin occurrence over the next 75 years with excellent performance.
In the revised version of the manuscript, we will rewrite the above-provided clarification in a paragraph to explain how our approach handles such uncertainty associated with future scenarios. We will also, in the revision, add one sentence on the motivation of using a long-term scenario.
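As a rough illustration of how an ensemble can express uncertainty, the sketch below summarises hypothetical occurrence probabilities from three models by their per-cell mean and spread. The model names, the probabilities, and the function `ensemble_summary` are assumptions for the example and do not reproduce the ensemble used in the manuscript.

```python
from statistics import mean, pstdev

# Hypothetical occurrence probabilities for four grid cells from three models
predictions = {
    "model_a": [0.82, 0.40, 0.10, 0.65],
    "model_b": [0.78, 0.55, 0.05, 0.30],
    "model_c": [0.85, 0.25, 0.12, 0.90],
}

def ensemble_summary(predictions):
    """Return (mean, standard deviation) across models for each grid cell."""
    per_cell = list(zip(*predictions.values()))
    return [(mean(cell), pstdev(cell)) for cell in per_cell]

summary = ensemble_summary(predictions)
for idx, (m, s) in enumerate(summary):
    print(f"cell {idx}: mean={m:.2f}, spread={s:.2f}")
```

The spread measures disagreement among ensemble members only. Members that share the same inputs or assumptions can agree closely and still be wrong together, which is why a small spread is not by itself evidence of an accurate long-term projection.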
In general, kriging requires data that are spatially and temporally well distributed. The data here are quite limited, with 54 locations across an area of over 300,000 sq km. What is the spatial correlation between the data? What did the variogram look like? Are the model results biased by the availability of more data in one section of the study area, such as the Western and Eastern Gotland Sea?
Re: Please refer to my response to your first question above, which covers the moderate clustering of the 139 data points (Moran’s I = 0.22), the semi-variogram used within kriging, the pseudo-absence points generated for the ensemble learning, and the 70%/30% training/testing split.
In the revised version of the manuscript, we will clarify this approach and the benefits of the model setup, including how it works on small datasets. We will add a paragraph, similar to the statement above, clarifying how the different methods address the limited sample size (number of observations) and how the overall approach handles autocorrelation and moderate clustering of points.
I would have liked to see more about the limitations of the model structure over the results for 75 years, which can vary substantially from the current modelled scenarios.
Re: The study design is intended to address these limitations. For instance, ensemble learning (multiple models instead of a single model) is known to quantify uncertainty in long-term projections. In addition, coupling climate change scenarios with real-time monitoring data from satellite imagery allows long-term trends to be estimated. Combining the in-situ data with previous data further improves the accuracy of the models. Altogether, this allows for accurate predictions for the next 75 years (2100 climate scenario) and reveals uncertainty in long-term projections regardless of how long the period is.
In the revised version of the manuscript, we will add one or more paragraphs clarifying how our model handles future scenarios and uncertainty, and describing possible limitations.
Applying this model framework in other locations needs context on the strengths and weaknesses of the formulation so that it can be modified to meet the requirements of that problem.
Re: We fully agree. Regardless of the model's high accuracy and performance, we highlighted the limitations of the study, namely the limited number of data points and the short sampling period of only one year; see lines 416-418.
In the revised version of the manuscript, we will provide an additional statement on possible model limitations.
Additionally, the results are presented in a confusing manner; there is too much back and forth between too many scenarios, with numbers presented in both tables and text. I had a hard time following the narrative. I would recommend that the questions be stated clearly in the introduction and that the results be focused on answering those particular questions. Maybe present the numbers in tables and keep the text for the narrative, with numbers brought up only where they are critical for the narrative. Could the reason for no nodularin in Bothnian and Southern Kattegat be that the data did not detect any nodularin at the time of data collection? I have serious doubts about a model reporting such high accuracy when it was developed using such limited data.
Re: Please refer to the response above regarding the limited dataset: the moderate clustering of the 139 data points (Moran’s I = 0.22), the semi-variogram used within kriging to account for spatial autocorrelation, the pseudo-absence points generated for the ensemble learning (see Barbet-Massin et al., 2012), and the 70%/30% training/testing split.
In the revised version of the manuscript, we will add a paragraph based on the statement above, addressing the doubts of RC2 about the limited dataset and explaining how the approach handles this issue.
If the paper discusses the causes for limited data, potential fallacies in its current formulation and the limitations of the results in its current state, I think it would be much more beneficial than just discussing how great the model performance is.
Re: The data limitation was addressed as a caveat of the study; see lines 416-418.
There are also some grammatical and structural corrections which can be fixed pretty easily. Attached below are specific locations I noticed.
Specific corrections:
Line 13: species. However,
Re: Will be corrected to “species. However,” in the revised version.
Line 75: This could be because the phenomenon….
Re: Will be corrected to “This could be because the phenomenon….” in the revised version.
Line 78 : … identification of patterns … ??
Re: Will be corrected to “identified of patterns” in the revised version.
Line 80 : multiple variables
Re: Will be corrected to “multiple variables” in the revised version.
Line 141: It would be better if this line were written differently. Starting a line with "how" is not recommended.
Re: The line will be rewritten according to the suggestion by RC2 in the revised version of the manuscript.
Figure 2. Good consolidation of information. Resolution needs to be improved.
Re: The resolution of Figure 2 will be increased in the revised version.
Line 185: Is it starting with Latin?
Re: The Latin text (from the journal template) will be removed in the revised manuscript. The sentence in the revision will start with “Prediction of nodularin spatial occurrence…”.
Line 314: ….ensemble learning….
Re: Will be corrected to “ensemble learning” in the revised version.
Citation: https://doi.org/10.5194/egusphere-2025-3290-AC1
Review of the manuscript “Prediction of present and future spatial occurrence of cyanobacteria and the toxin nodularin in the Baltic Sea” by Mohanad Abdelgadir, Bengt Karlson, Elin Dahlgren, Malin Olofsson
Summary: The authors use Empirical Bayesian Kriging (EBK) regression prediction, ensemble learning, and stacked species distribution modelling (SSDM) to predict and interpret the current and future area distribution of Nodularia sp. and the toxin nodularin across the Baltic Sea. The underlying data base consists of 139 observed samples, combined with numerical model data from various sources. Predictions for the future distribution of Nodularia sp. blooms and nodularin are based on projected climate change scenarios in the year 2100.
Major comments:
The subject of the study is of importance and general interest. The overall approach is interesting and all the work that went into this study is greatly appreciated. Unfortunately, however, I found it very difficult to rate methods and results of the presented study because I got lost in many details while I still lack a description of important key aspects.
In my view, the authors try to do too much in just one publication, and I miss a clear aim for all their experiments. I thus strongly recommend focusing on one specific aspect. Potential candidates I could imagine are: (1) a comparison of different techniques for enhancing the output of climate models to resolve local cyanobacteria blooms and toxins (here I do not regard the Baltic Sea as a perfect candidate because many important processes are not resolved in climate models); (2) a comparison of different methods to interpolate nodularin measurements in space (if the available data do not suffice, considerations could be added on how much extra data would be required and where); or (3) suggestions on how to implement nodularin in a prognostic Baltic Sea model that already contains cyanobacteria. This list is certainly not comprehensive.
As the study is designed now, I have several major concerns which in my eyes need to be addressed:
(1) There seem to be very few measurements of the toxin nodularin available (as I understand it, 139 observed samples were investigated, most of which are clustered along some coasts). Relating these few measurements to a multitude of environmental factors and their interactions will most likely lead to overfitting. While the authors state that they used part of the data for testing, I did not find clear evidence that could rule out this concern. I would expect something like a direct comparison plot of the best prediction versus observed nodularin for some independent test data.
(2) For the occurrence of Nodularia sp., biogeochemical model data from the ERGOM model are combined with the available observations. This approach might lead to inconsistencies, and since this is the key aspect, I would like to see at least some quality assessment of both the numerically simulated and the predicted Nodularia sp. blooms. Since satellite data for Nodularia sp. are provided, e.g., by SMHI, it would be fairly easy to show 2-dimensional plots of the extent of a particularly large and a particularly small bloom during recent years as simulated (by the numerical model) and as predicted (by the methods of this study) in direct comparison to satellite observations.
Additionally, the in-situ samples for Nodularia sp. could be plotted against the respective model data to ensure that these data sources can be combined without problems.
(3) The authors then combine many different data sources, from global to regional models, which are almost certainly inconsistent and might make it very difficult to draw robust conclusions. For example, global models do not resolve coastal upwelling, and it is very unlikely that these models capture the complex salinity dynamics, nutrient inputs, or sediment processes of the Baltic Sea. These aspects need at least some attention.
(4) I did not find an independence test for the predictors. I am also surprised that distance to the shore has been used as a predictor. I am well aware of a prominent study that uses this factor when investigating the onset of blooms; still, blooms may then drift to the shore and frequently do so.
(5) I did not find convincing evidence for reliable predictions for any of the many methods. I would like to see clear visual comparisons to independent test data.
(6) Even if the authors revise the study and demonstrate robust relations to predictors for Nodularia sp. bloom occurrence and nodularin under present climate conditions, it is still not at all guaranteed that these could be extrapolated to a much warmer climate (e.g., species composition and competitive advantages might well change). I do not at all recommend basing predictions, or even policy recommendations, on such uncertain ground.
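The comparison against independent test data asked for in points (1) and (5) could start from summary statistics such as those in the minimal Python sketch below, computed on held-out samples and shown alongside a scatter plot of predicted versus observed values. The concentration values and the helper names `rmse` and `pearson_r` are hypothetical and are not results from the manuscript.

```python
import math

def rmse(observed, predicted):
    """Root-mean-square error between paired observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical held-out nodularin concentrations and model predictions
observed = [0.0, 0.5, 1.2, 2.0, 3.1]
predicted = [0.2, 0.4, 1.0, 2.4, 2.8]

print("RMSE:", round(rmse(observed, predicted), 3))
print("Pearson r:", round(pearson_r(observed, predicted), 3))
```

Such statistics are only informative if the held-out samples were excluded from every fitting and tuning step.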
Specific comments:
Ln 13: change “blooms often include toxin producing species” to “blooms can contain toxin producing strains,”
Ln 14: change “climate change is expected to increase the occurrence of cyanobacterial blooms” to something like “climate change may increase the occurrence of cyanobacterial blooms”
Ln 17-18: The choice of each method should be motivated.
Ln 29: I do not recommend any risk assessment or management decisions at the current state.
Ln 64: Empirical Bayesian Kriging (EBK) regression prediction depends heavily on the density and distribution of the underlying samples. In Figure 1 it appears that most of the few samples are clustered along some coasts. Also, I doubt that all samples were taken at the same time, and it is not clear which state of the system the kriging refers to. It would be good if the time aspect were clarified, and I would like to see a comparison to independent test data.
Ln 84: It does not become clear how these approaches could overcome the above-mentioned problems.
Ln 87: Again, I do not at all recommend basing future predictions on such uncertain ground.
Ln 91: How many of the samples contained nodularin? Were all samples taken during Nodularia bloom conditions? Were other variables, such as nutrients, salinity or temperature, measured as well?
Ln 105ff: Blending the nutrients from the SMHI forecast with chlorophyll_a simulated by the ERGOM-model needs some good motivation because the respective fields will not be consistent.
Ln 118: I could not find a meaningful comparison in the supplement.
Ln 120: It does not become clear how the predictions were “tested” based on the mentioned Copernicus data.
Ln 123: I would be very interested to see 2-dimensional plots of predicted Nodularia sp. and nodularin when using the climate models under present climate conditions, in comparison to the predictions based on the combination of numerical Baltic Sea models.
Ln 185: Something went wrong here.
Ln 187ff: From here on I am somehow lost in many details, different methods (all with different design choices) and various feature selections while I am lacking a clear purpose of all the experiments.