the Creative Commons Attribution 4.0 License.
Dynamic weighted ensemble of geoscientific models via automated machine learning-based classification
Abstract. Despite recent developments in geoscientific (e.g., physics/data-driven) models, effectively assembling multiple models to approach a benchmark solution remains challenging in many sub-disciplines of the geosciences. Here, we proposed an automated machine learning-assisted ensemble framework (AutoML-Ens) designed to address this challenge. Details of the methodology and workflow of AutoML-Ens were provided, and a prototype model was realized, with the key strategy of mapping the probabilities derived from the machine learning classifier to the dynamic weights assigned to the candidate ensemble members. Based on the newly proposed framework, its applications for two real-world examples (i.e., mapping global soil water retention parameters and estimating remotely sensed cropland evapotranspiration) were investigated and discussed. Results showed that, compared to conventional ensemble approaches, AutoML-Ens was superior across the datasets (training, testing, and overall) and environmental gradients, with improved performance metrics (e.g., coefficient of determination, Kling-Gupta efficiency, and root mean squared error). The better performance suggested the great potential of AutoML-Ens for improving quantification and reducing uncertainty in estimates, owing to its two unique features, i.e., assigning dynamic weights to candidate models and taking full advantage of the AutoML-assisted workflow. In addition to the representative results, we also discussed the interpretational aspects of the framework and its possible extensions. More importantly, we emphasized the benefits of combining data-driven approaches with physics constraints for geoscientific model ensemble problems with high dimensionality in space and non-linear behaviors in nature.
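The key strategy described above, reusing a classifier's class probabilities as per-sample ensemble weights, can be sketched as follows. This is a minimal illustration with synthetic data and scikit-learn, not the paper's code; all variable names and the choice of GradientBoostingClassifier are ours.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)

# Synthetic setup: n samples, k candidate-model predictions per sample.
n, k = 500, 3
X = rng.normal(size=(n, 5))      # environmental covariates
preds = rng.normal(size=(n, k))  # outputs of the k candidate models

# Build a synthetic truth close to one randomly chosen candidate per sample,
# then label each sample with the index of its best-performing candidate.
truth = preds[np.arange(n), rng.integers(0, k, n)] + rng.normal(0, 0.1, n)
best = np.argmin(np.abs(preds - truth[:, None]), axis=1)

# Train a classifier to predict which candidate is best from the covariates.
clf = GradientBoostingClassifier().fit(X, best)

# Dynamic weights = class probabilities; ensemble = probability-weighted mean.
w = clf.predict_proba(X)       # shape (n, k); each row sums to 1
ens = (w * preds).sum(axis=1)  # per-sample weighted ensemble estimate
```

Because the weights vary sample by sample, the ensemble can lean on different candidate models in different regions of the covariate space, which is the "dynamic" aspect of the framework.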
Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Journal article(s) based on this preprint
Interactive discussion
Status: closed
-
RC1: 'Comment on egusphere-2022-1326', Anonymous Referee #1, 06 Feb 2023
Review of Dynamic weighted ensemble of geoscientific models via automated machine learning-based classification by Chen et al.
This manuscript demonstrates the merits of automated ML (AutoML) for two geoscience use cases. In general, the paper is well written. The authors developed an ML workflow to find the best combination of models or the optimal model. They used the term ML classifier; it took me a while to understand that this differs from the conventional classification problem, in which the goal is to identify class labels for each sample. Instead, the goal in this work is to find the weights for combining the physics-based model ensemble.
My main question is whether it is necessary to use the ensemble-based AutoML in your use cases. Can you simply use a single ML model, e.g., XGBoost, to find the model weights/probabilities? Your workflow sounds like an ensemble of ML models for an ensemble of physics models. Is this right? If so, the computational burden may be overwhelming.
Other minor comments:
Figure 3 (d)-(j). It seems all models fall outside the gray uncertainty envelope related to the 17 models. AutoML also represents an ensemble of ML models. In addition to plotting the ensemble mean from AutoML, can you develop an uncertainty envelope based on the AutoML ensemble?
Figure 7. Both AutoML-Ens and STIC use very similar reddish colors. Can you make the contrast stronger?
Citation: https://doi.org/10.5194/egusphere-2022-1326-RC1 -
CC1: 'Reply on RC1', Hao Chen, 03 May 2023
We greatly value your feedback. It is also the finest review of our work that we have received in quite some time, as you have recognized what is truly innovative about this paper. As you note, our proposed method differs from conventional machine learning classification models, and in the revised manuscript we will try to make this distinction clearer so that readers grasp it more quickly.
Regarding your major concerns about the necessity of utilizing AutoML and related inquiries, we would like to respond in three aspects:
1. The primary contribution of the paper is a classifier that provides dynamic weights for the participating sub-models, so the framework is feasible whether it is built on an ensemble or on an individual machine learning model. As you mentioned, we use the ensemble technique in the AutoML platform; the specific approach (bagging, boosting, or stacking) varies across AutoML platforms, but ensembles have been shown to potentially outperform individual machine learning algorithms.
2. AutoML has proven essential in at least these two cases, since machine learning model selection and hyperparameter optimization are necessary steps in both examples. The most direct indication is that the stacked ensemble models outperform any single machine learning algorithm; moreover, the two examples yield two different relatively optimal single models, XGBoost and XRT, respectively. In addition, different machine learning algorithms, and variants of the same algorithm with different hyperparameters, can differ substantially in performance; for instance, in the ET ensemble, the classification error of the GBM-type models ranges from 0.615 to 0.653 (Table 1). Without practical tools to help choose and tune a model (if, say, we only trained XGBoost), the result would likely not be optimal for the particular problem, even if it came second to the best-performing ensemble model. By avoiding the errors that can result from focusing on a single model, AutoML makes it easier to identify a plausible and relatively optimal model. Finally, if the optimal model for a problem turns out to be XGBoost alone rather than an ensemble, we can focus on tuning just that model, which is easy to do in the AutoML platform, for instance via the include_algos parameter (https://docs.h2o.ai/h2o/latest-stable/h2o-docs/data-science/algo-params/include_algos.html). In light of this, AutoML is necessary.
3. Regarding the computational effort of individual versus ensemble model training: although this varies with the problem at hand, it is not a significant concern. Recent studies have shown, for instance, that with the H2O AutoML platform we use, the AutoML ensemble may even be more time-efficient than training a single model on its own; even for the same XGBoost model, training within an AutoML workflow can run considerably more efficiently than training it alone (see Ferreira et al., 2021, for a benchmark of various AutoML platforms). Moreover, most of the problems we solve are one-time cases that do not require ongoing retraining of the model, so runtime may matter little relative to the accuracy requirements, even if processing tens of millions of records takes a few days. For real-time ensemble forecasting, the initial AutoML run may take more time, but with a simple setup it can rapidly select a relatively optimal model (say, XGBoost or GBM) for further investigation. Thus computational cost does not undermine the case for AutoML, although the balance will depend somewhat on data volume and model complexity; occasionally, such trade-offs need to be considered.
L. Ferreira, A. Pilastri, C. M. Martins, P. M. Pires, and P. Cortez, "A Comparison of AutoML Tools for Machine Learning, Deep Learning and XGBoost," in 2021 International Joint Conference on Neural Networks (IJCNN), 18-22 July 2021, pp. 1-8, doi: 10.1109/IJCNN52387.2021.9534091.
Consequently, AutoML is still necessary. On the one hand, this is reflected in better prediction results, even if AutoML sometimes only slightly outperforms some sub-models. On the other hand, it helps us subsequently focus on specific sub-models. Furthermore, because of its inherent advantages, AutoML may take no more time and computational resources than training individual models, and rapidly developing computational resources will in many cases allow AutoML to realize its full potential.
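The workflow described in point 2, restricting the automated search to a single algorithm family and tuning only its hyperparameters (which H2O AutoML exposes via include_algos), can be sketched with an analogous scikit-learn setup. This is an illustrative sketch, not our actual H2O configuration; the dataset and parameter grid are ours.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import RandomizedSearchCV

# Toy classification data standing in for the ensemble-member labels.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# Restrict the search to one algorithm family and tune its hyperparameters,
# analogous to H2O AutoML with include_algos=["XGBoost"].
search = RandomizedSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_distributions={
        "n_estimators": [50, 100, 200],
        "learning_rate": [0.01, 0.05, 0.1],
        "max_depth": [2, 3, 4],
    },
    n_iter=8,
    cv=3,
    random_state=0,
)
search.fit(X, y)
best_model = search.best_estimator_  # the tuned single-algorithm model
```

The same two-stage pattern applies on the AutoML platform: first let the full search identify the strongest family, then rerun with the search space narrowed to that family.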
For the minor comments:
Regarding Figure 3, we would first like to clarify that the gray bands represent the predictions of the 13 PTF models, which explains why the ensemble class of models, and AutoML-Ens in particular, does not fall within these bands. We will update this figure in the revised manuscript and expand the relevant section to better reflect the differences between the machine learning methods.
Regarding Figure 7, replacing the color with one that contrasts more sharply is not a problem; thank you for the suggestion.
Once again, on a personal note, I am grateful for your encouragement and apologize for my delayed reply.
Citation: https://doi.org/10.5194/egusphere-2022-1326-CC1 -
AC1: 'Reply on RC1', Tiejun Wang, 26 Jun 2023
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2023/egusphere-2022-1326/egusphere-2022-1326-AC1-supplement.pdf
-
RC2: 'Comment on egusphere-2022-1326', Anonymous Referee #2, 25 May 2023
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2023/egusphere-2022-1326/egusphere-2022-1326-RC2-supplement.pdf
-
AC2: 'Reply on RC2', Tiejun Wang, 26 Jun 2023
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2023/egusphere-2022-1326/egusphere-2022-1326-AC2-supplement.pdf
Peer review completion
Viewed
- HTML: 773
- PDF: 211
- XML: 18
- Total: 1,002
- Supplement: 61
- BibTeX: 8
- EndNote: 13
Tiejun Wang
Yonggen Zhang
Yun Bai
Xi Chen