the Creative Commons Attribution 4.0 License.
Improving the prediction of the Madden-Julian Oscillation of the ECMWF model by post-processing
Abstract. The Madden-Julian Oscillation (MJO) is a major source of predictability on the sub-seasonal (10- to 90-day) time scale. An improved forecast of the MJO may have important socioeconomic impacts due to the influence of the MJO on both tropical and extratropical weather extremes. Although in recent decades state-of-the-art climate models have proved capable of forecasting the MJO with a prediction skill exceeding 5 weeks, there is still room for improvement. In this study we use Multiple Linear Regression (MLR) and a Machine Learning (ML) algorithm as post-processing methods to improve the forecast of the model that currently holds the best MJO forecasting performance, the European Centre for Medium-Range Weather Forecasts (ECMWF) model. We find that both MLR and ML improve the MJO prediction and that ML outperforms MLR. The largest improvement is in the prediction of the MJO geographical location and intensity.
Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Interactive discussion
Status: closed
-
RC1: 'Comment on egusphere-2022-2', Anonymous Referee #1, 21 Mar 2022
General Comments
This paper presents the use of Multiple Linear Regression (MLR) and a Machine Learning (ML) algorithm as post-processing methods to improve the MJO forecast of the European Centre for Medium-Range Weather Forecasts (ECMWF) model. It is generally well written and showcases successful results in improving MJO forecasts. Still, the manuscript needs improvement in describing the technical implementation, which is somewhat short and confusing in places, such as the relation between input and output neurons and lead time, as well as the MLR implementation.
Specific Comments
- Line 105: “After selecting the number of output neurons (which is even and in fact defines our lead time, τ = Nh/2)” – shouldn’t be Nout instead of Nh?
- Line 110: It appears to me that for each lead time L (1<L<46), ML takes as input the predicted ECMF trajectory RMM1,2 up to day L, and as output RMM1,2 in ERA5 observations up to day L+3 – please elaborate and clarify by confirming or correcting as necessary. Also, what is done when L=44,45 and 46?
- Section 2.5: the implementation of MLR is barely described at all; please expand, e.g., do you use regularization to avoid overfitting, etc...
- Line 115: Please explain what a “walk-forward validation” is.
Citation: https://doi.org/10.5194/egusphere-2022-2-RC1
AC1: 'Reply on RC1', Riccardo Silini, 22 Mar 2022
We thank the reviewer for a careful revision of our manuscript that has allowed us to improve our work. With respect to the comments of the reviewer:
- Line 105: “After selecting the number of output neurons (which is even and in fact defines our lead time, τ = Nh/2)” – shouldn’t be Nout instead of Nh?
Authors’ response: We thank the reviewer for noticing this typo. Yes indeed, τ = Nout/2.
- Line 110: It appears to me that for each lead time L (1<L<46), ML takes as input the predicted ECMF trajectory RMM1,2 up to day L, and as output RMM1,2 in ERA5 observations up to day L+3 – please elaborate and clarify by confirming or correcting as necessary. Also, what is done when L=44,45 and 46?
Authors’ response: We have revised the manuscript to clarify this point. The number of inputs is generally larger than the number of outputs, since we use the information we have about future ECMWF-predicted RMMs as input. So, for each lead time L (1 < L < 46), we have (L+3)*2 inputs, i.e., Nin = Nout + 6 (line 110). If L is 44-46, Nin is equal to Nout, since we do not have access to the future values (line 110: Nin = Nout + 6, with an upper limit of 92 inputs). For lead times longer than 30-35 days, the prediction skill becomes poor (COR and RMSE have already crossed the 0.5 and 1.4 thresholds), and thus the last lead times (44-46 days) are not crucial.
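The layer-size rule described in this reply can be sketched as follows. This is a hypothetical helper: the function name and the handling of the 44-46-day edge case are our reading of the reply, not the authors' code.

```python
def layer_sizes(lead_time_days):
    """Return (Nin, Nout) for a lead time tau in days (1 <= tau <= 46).

    Nout = 2*tau (RMM1 and RMM2 for each predicted day); Nin = Nout + 6
    while three extra future ECMWF-predicted days are available, and
    Nin = Nout for tau = 44-46, where no future values exist.
    """
    n_out = 2 * lead_time_days
    n_in = n_out + 6 if lead_time_days <= 43 else n_out
    return n_in, n_out

# For tau = 10 this gives (26, 20); at tau = 43 the 92-input upper limit
# is reached (92, 86); at the full 46-day horizon it is (92, 92).
```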
- Section 2.5: implementation of MLR is barely described at all, please expand, such as do you use regularization to avoid overfitting, etc...
Authors’ response: We have revised the manuscript to clarify this point. We do not include a regularization term as in Ridge or Lasso; MLR here is the ordinary least squares (OLS) linear regression. This choice is also made for consistency with Kim et al. (2021).
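A minimal sketch of OLS post-processing as described here, with illustrative shapes and random placeholder data rather than the authors' actual dataset:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1700, 26))   # training inputs: samples x Nin (ECMWF-predicted RMMs)
Y = rng.normal(size=(1700, 20))   # training targets: samples x Nout (observed RMMs)

X1 = np.hstack([np.ones((len(X), 1)), X])      # prepend an intercept column
coef, *_ = np.linalg.lstsq(X1, Y, rcond=None)  # plain OLS: no Ridge/Lasso penalty
Y_hat = X1 @ coef                              # post-processed (corrected) RMMs
```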
- Line 115: Please explain what a “walk-forward validation” is.
Authors’ response: We have revised the manuscript to clarify what “walk-forward validation” is. The procedure is as follows. First, we train the network on an expanding train set, and then we test its performance on a validation set containing the N samples that follow the train set. In our case, we found the best minimum number of samples for the train set, out of the 2200 available, to be 1700. Then, the train set is extended by 100 samples (∼1 year) for each run and validated on the subsequent 200 samples (∼2 years). This walk-forward validation ensures that no information from the future of the test set is used to train the model.
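The expanding-window scheme described in this reply can be sketched as follows. This is a hypothetical helper using the sample counts from the reply (2200 samples, 1700 minimum train, +100 per run, 200-sample validation), not the authors' code.

```python
def walk_forward_splits(n_samples=2200, min_train=1700, step=100, val_size=200):
    """Yield (train_indices, val_indices) pairs for walk-forward validation.

    The train window always starts at sample 0 and expands by `step` each run;
    the validation window is the `val_size` samples immediately after it, so
    no future information leaks into training.
    """
    train_end = min_train
    while train_end + val_size <= n_samples:
        yield list(range(train_end)), list(range(train_end, train_end + val_size))
        train_end += step

# With the defaults this produces 4 runs: train on 0-1699 / validate on
# 1700-1899, then 0-1799 / 1800-1999, and so on up to 0-1999 / 2000-2199.
```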
Citation: https://doi.org/10.5194/egusphere-2022-2-AC1
-
RC2: 'Comment on egusphere-2022-2', Anonymous Referee #2, 19 Jul 2022
Report on "Improving the prediction of the Madden-Julian Oscillation of the ECMWF model by post-processing” by Riccardo Silini et al
Silini et al. show that post-processing of forecasts of an ECMWF MJO prediction can improve its prediction, in both amplitude and phase. Besides this practical application, their work addresses the interesting question of whether it is advisable to use a simple forecast model in conjunction with a deep neural network (which is expensive to train), as done in Kim et al. (2021), or the most advanced forecast model (i.e., the ECMWF model in the context of the MJO) in conjunction with a shallow single-layer neural network. The authors suggest that their approach of combining the advanced forecast model with a single-layer network is preferable in the context of MJO prediction. The authors train on (past and future) observations, in particular on the two leading EOFs. The authors also compare simple multiple linear regression and a single neural network as post-processing methodologies. The questions and results the authors consider are definitely of wide interest, and the comparison with Kim et al. (2021) is illuminating. I would, however, like to see a few clarifications:
- the authors find that their ML post-processing improves MJO propagation mostly across the MC. Given the different performance across the different phases/sectors, would a stratification of the training data with respect to the initial conditions and their respective sectors be a good idea?
- the supplementary material Silini 2021b shows Wheeler-Hendon phase diagrams for all the data. I am not an expert on MJO but I was surprised how bad their forecasts perform. What is missing to get a better prediction? I imagine that RMM1 and RMM2 are not sufficient to predict the error between the forecast and the observations. Does one get better results if more EOFs are taken as input training data? Or can one include other variables as input training data?
- a minor comment: it may help the reader to begin the Discussion section with an introductory sentence which relates to their initial question of forecast model/postprocessing method complexity rather than starting without motivation “It is interesting to compare …”.
Citation: https://doi.org/10.5194/egusphere-2022-2-RC2
AC2: 'Reply on RC2', Riccardo Silini, 22 Jul 2022
We thank the reviewer for their comments, which allow us to improve our work, in particular the discussion. The corresponding author (RS) has recently defended his doctoral thesis, and the reviewer’s pertinent and relevant comments overlap in part with those of the expert PhD committee.
1. The reviewer wonders “would a stratification of the training data with respect to the initial conditions and their respective sectors be a good idea?” Intuitively this is a great idea, but technically it is not possible because of the lack of data to train the networks. We work with the ECMWF predictions, which are available every two weeks, for 20 years. If we split the dataset into 8 (phases), we wouldn’t have enough data to perform a reliable training of the network. In fact, we can see already from our training that we found the minimum number of samples needed for the training to be 1700 out of the 2200 available. We will include a comment about this point in the revised manuscript.
2. The forecasts on the Wheeler-Hendon phase diagram using machine learning techniques as standalone tools are not as good as those obtained with state-of-the-art dynamical models. The reviewer wonders “What is missing to get a better prediction?”
We believe that it might be possible to obtain better results with more EOFs; it probably depends on the importance of the associated variance. Since we were proposing a different approach to predicting the MJO, we had to be able to fairly compare our results with other models, and the best models use RMM1 and RMM2 only. Regarding including other variables as input data, it is a very promising idea; in fact, RS and CM developed in 2021 (Silini & Masoller, Sci. Rep. 11, 8423) a fast and effective metric to compute information transfer among variables, which would allow identifying variables that can be used as informative inputs. This is the object of future work, and in the revised manuscript we will include a comment about this point.
3. We agree with the reviewer’s suggestion, and we will modify the revised manuscript accordingly.
Citation: https://doi.org/10.5194/egusphere-2022-2-AC2
Riccardo Silini
Sebastian Lerch
Nikolaos Mastrantonas
Holger Kantz
Marcelo Barreiro
Cristina Masoller