the Creative Commons Attribution 4.0 License.
Selecting and weighting dynamical models using data-driven approaches
Abstract. In geosciences, multi-model ensembles are helpful to explore the robustness of a range of results. To obtain a synthetic and improved representation of the studied dynamic system, the models are usually weighted. The simplest method, known as model democracy, gives equal weights to all models, while more advanced approaches base weights on agreement with available observations. Here, we focus on determining weights for various versions of an idealized model of the Atlantic Meridional Overturning Circulation. This is done by assessing their performance against synthetic observations (generated from one of the model versions) within a data assimilation framework using the ensemble Kalman filter (EnKF). In contrast to traditional data assimilation, we implement data-driven forecasts using the analog method based on catalogs of short-term trajectories. This approach allows us to efficiently emulate the model's dynamics while keeping computational costs low. For each model version, we compute a local performance metric, known as the contextual model evidence, to compare observations and model forecasts. This metric, based on the innovation likelihood, is sensitive to differences in model dynamics and considers forecast and observation uncertainties. Finally, the weights are calculated using both model performance and model codependency, and then evaluated on climatologies of long-term simulations. Results show that the method performs well in identifying the numerical simulations that best replicate observed short-term variations. Additionally, the method outperforms benchmark approaches such as model democracy or climatology-based strategies when reconstructing missing distributions. These findings encourage the application of the proposed methodology to more complex datasets in the future, like climate simulations.
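As an illustration of the kind of metric the abstract describes, the following is a minimal sketch of a Gaussian innovation log-likelihood for a single assimilation cycle. It is not claimed to reproduce the paper's definition of the contextual model evidence, which is not given on this page. The function name, the linear observation operator `H`, the Gaussian assumption, and the use of the sample covariance of the forecast ensemble are all assumptions made for the example.

```python
import numpy as np

def innovation_log_likelihood(y, H, x_forecast_ens, R):
    """Gaussian log-likelihood of the innovation d = y - H x_f (hypothetical helper).

    y              : observation vector, shape (n_obs,)
    H              : linear observation operator, shape (n_obs, n_state) (assumed linear)
    x_forecast_ens : forecast ensemble, shape (n_members, n_state)
    R              : observation error covariance, shape (n_obs, n_obs)
    """
    # Forecast mean and sample covariance estimated from the ensemble.
    x_mean = x_forecast_ens.mean(axis=0)
    P_f = np.atleast_2d(np.cov(x_forecast_ens, rowvar=False))

    # Innovation and its covariance, combining forecast and observation uncertainty.
    d = y - H @ x_mean
    S = H @ P_f @ H.T + R

    # Mahalanobis distance of the innovation and log-determinant of its covariance.
    maha = d @ np.linalg.solve(S, d)
    _, logdet = np.linalg.slogdet(S)

    n_obs = y.size
    return -0.5 * (maha + logdet + n_obs * np.log(2.0 * np.pi))
```

Under these assumptions, the value is computed at every assimilation cycle for each model version; a higher value means the observation is more plausible given that version's forecast and the stated uncertainties. How such values are averaged and turned into weights, including the codependency term, follows the paper and is not sketched here.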
Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Interactive discussion
Status: closed
RC1: 'Comment on egusphere-2023-2649', Anonymous Referee #1, 15 Jan 2024
Review of “Selecting and weighting dynamical models using data-driven approaches”, egusphere-2023-2649
General comments:
This study developed a new approach for selecting and weighting dynamical models within a multi-model ensemble. The new points are the estimation of weights in the data assimilation framework of EnKF and the data-driven approach to forward the evolution of states. These two aspects may have the potential for further applications. In addition, the manuscript was well written and logically organized. However, some major comments have to be addressed before the manuscript can be considered for publication.
Major comments:
1. It seems that the posterior probability density function (PDF) obtained by assimilating observations with the EnKF algorithm is key to determining the weights of each dynamical system. One limitation of this methodology is the assumption that the variables among multi-model ensembles follow a Gaussian distribution. Systematic errors and non-Gaussianity would make the performance of this algorithm suboptimal. My question is whether a nonlinear particle filter used in the same framework could enhance the performance of the new approach. This is worth some discussion in the conclusion section.
2. The new approach was only compared to benchmark approaches such as model democracy. However, various approaches that generate unequal weights have already been proposed. The authors also gave an overview of such approaches. Why do the authors not compare the performance of their new approach to other unequally weighted approaches?
3. About Eq. 5, the authors demonstrate that the contextual model evidence (CME) takes into account both reliability information and accuracy information. It is unclear which terms in Eq. 5 are associated with reliability or accuracy. Please give more description and discussion. Please also discuss why both reliability and accuracy should be considered in defining the CME. Note that most existing data-driven models use a loss function that is solely a function of the mean square error.
4. For the data-driven approach, how long must the historical data be for a robust estimation of the forward propagator?
5. Section 2.2.2: what is the difference between the method used here and the widely used multi-linear regression?
Grammatical errors:
- Ln25, “produces” -> “produced”
Citation: https://doi.org/10.5194/egusphere-2023-2649-RC1
RC2: 'Comment on egusphere-2023-2649', Anonymous Referee #2, 23 Jan 2024
Review of “Selecting and weighting dynamical models using data-driven approaches”, egusphere-2023-2649
This manuscript describes the use of the Bayesian data assimilation framework to estimate the weights for an ensemble of models.
Using the weights thus obtained, the authors compute the climatological distributions during independent experiments and compare the method's performance to that of other well-known weighting methods.
The manuscript is well-written, presents a novel approach and is in general well organized. One of the assets of this paper is that it proposes a methodology which is - to my knowledge - directly applicable to real world predictions (thanks to AnDA), however with the caveats presented in the conclusion.
Therefore I recommend accepting it for publication once the comments below are addressed.
Comments:
- The authors emphasize that the CME method is based on the models' "short-term" dynamics, i.e. that the CME weights are obtained during the DA cycles. This is also emphasized in Appendix A, where it is said that this approach provides "more informative insights into current conditions, including forecast states with their uncertainties." However, in the end, it is the average of the CME over a number K of DA cycles which is used to compute the weights, covering the "attractor" of the models. So I don't understand this claim. To me, a "climatology" of the CME is constructed and used, and so I have some trouble understanding how this differs from climatology-based weighting. I would not call this weighting method a local one.
- Section 2.2.2: Since this is the heart of the methodology, I would have expected more details about it. In particular, the link between Eq. (7d) (and the equations below) and Figure 1 should be made.
- Line 225: "(which avoids computational divergence for certain attractors with highly nonlinear dynamics...)" I do not understand this. Does that mean that some models are unstable? Or that their nonlinearity is too high compared to the others? Or something else? Please clarify.
Citation: https://doi.org/10.5194/egusphere-2023-2649-RC2
AC1: 'Comment on egusphere-2023-2649', Pierre Le Bras, 22 Apr 2024
Dear Reviewers,
Thank you for your insightful comments and constructive feedback on our manuscript titled "Selecting and weighting dynamical models using data-driven approaches". We greatly appreciate the time and effort you have dedicated to evaluating our work.
We have addressed each of the points raised in your reviews and incorporated necessary revisions to enhance the quality and clarity of the manuscript.
Attached, please find our responses to both reviews in a PDF file titled "NPG_WeightingIdealizedModel_AnswerToReviewers.pdf". Pages 1 to 6 of the response are dedicated to addressing the comments of Reviewer 1, while pages 7 to 10 address the comments of Reviewer 2. Additionally, we have included a PDF file of the revised manuscript with changes highlighted compared to the old version for your convenience.
Thank you for your time and consideration.
Yours sincerely,
Pierre Le Bras
Viewed
HTML | PDF | XML | Total | BibTeX | EndNote |
---|---|---|---|---|---|
226 | 102 | 25 | 353 | 16 | 20 |
Pierre Le Bras
Florian Sévellec
Pierre Tandeo
Juan Ruiz
Pierre Ailliot