Dynamic weighted ensemble of geoscientific models via automated machine learning-based classification

Chen, Hao; Wang, Tiejun; Zhang, Yonggen; Bai, Yun; Chen, Xi

https://doi.org/10.5194/egusphere-2022-1326

05 Jan 2023

Abstract. Despite recent developments in geoscientific (e.g., physics/data-driven) models, effectively assembling multiple models for approaching a benchmark solution remains challenging in many sub-disciplines of geoscientific fields. Here, we proposed an automated machine learning-assisted ensemble framework (AutoML-Ens) that attempts to resolve this challenge. Details of the methodology and workflow of AutoML-Ens were provided, and a prototype model was realized with the key strategy of mapping between the probabilities derived from the machine learning classifier and the dynamic weights assigned to the candidate ensemble members. Based on the newly proposed framework, its applications for two real-world examples (i.e., mapping global soil water retention parameters and estimating remotely sensed cropland evapotranspiration) were investigated and discussed. Results showed that compared to conventional ensemble approaches, AutoML-Ens was superior across the datasets (the training, testing, and overall datasets) and environmental gradients with improved performance metrics (e.g., coefficient of determination, Kling-Gupta efficiency, and root mean squared error). The better performance suggested the great potential of AutoML-Ens for improving quantification and reducing uncertainty in estimates due to its two unique features, i.e., assigning dynamic weights for candidate models and taking full advantage of AutoML-assisted workflow. In addition to the representative results, we also discussed the interpretational aspects of the used framework and its possible extensions. More importantly, we emphasized the benefits of combining data-driven approaches with physics constraints for geoscientific model ensemble problems with high dimensionality in space and non-linear behaviors in nature.
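
The key strategy named in the abstract, mapping classifier probabilities to dynamic weights for the ensemble members, can be illustrated with a minimal sketch. It rests on assumptions that are not confirmed by this page: that each training sample is labelled with the candidate model closest to the observation, and that the weights are simply the normalized class probabilities (the paper's exact mapping may differ). The function names `best_member_labels` and `dynamic_weighted_ensemble` are hypothetical and are not part of AutoML-Ens or any library.

```python
from typing import List, Sequence


def best_member_labels(
    member_preds: Sequence[Sequence[float]],
    observations: Sequence[float],
) -> List[int]:
    """Label each sample with the index of the member closest to the observation.

    Assumption: these labels are what the classifier would be trained on.
    """
    labels = []
    for preds, obs in zip(member_preds, observations):
        errors = [abs(y - obs) for y in preds]
        labels.append(errors.index(min(errors)))
    return labels


def dynamic_weighted_ensemble(
    class_probs: Sequence[Sequence[float]],
    member_preds: Sequence[Sequence[float]],
) -> List[float]:
    """Combine member predictions sample by sample.

    class_probs[i][k]  : classifier probability that member k is best at sample i
    member_preds[i][k] : prediction of member k at sample i
    Assumption: the weights are the probabilities, renormalized to sum to one.
    """
    combined = []
    for probs, preds in zip(class_probs, member_preds):
        if len(probs) != len(preds):
            raise ValueError("one probability is needed per ensemble member")
        total = sum(probs)
        if total <= 0:
            raise ValueError("probabilities must have a positive sum")
        combined.append(sum(p * y for p, y in zip(probs, preds)) / total)
    return combined


# Two samples, two candidate models; the weights differ from sample to sample.
probs = [[0.5, 0.5], [1.0, 0.0]]
preds = [[2.0, 4.0], [10.0, 20.0]]
estimates = dynamic_weighted_ensemble(probs, preds)
```

Because the weights are recomputed for every sample, the second sample here relies entirely on the first model, whereas a conventional static ensemble would apply one fixed weight vector everywhere.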

Received: 23 Nov 2022 – Discussion started: 05 Jan 2023

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Download & links

Preprint (PDF, 5818 KB)

Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Supplement (358 KB)

Journal article(s) based on this preprint

12 Oct 2023

Dynamically weighted ensemble of geoscientific models via automated machine-learning-based classification

Hao Chen, Tiejun Wang, Yonggen Zhang, Yun Bai, and Xi Chen

Geosci. Model Dev., 16, 5685–5701, https://doi.org/10.5194/gmd-16-5685-2023, 2023

Interactive discussion

Status: closed

RC1: 'Comment on egusphere-2022-1326', Anonymous Referee #1, 06 Feb 2023

Review of Dynamic weighted ensemble of geoscientific models via automated machine learning-based classification by Chen et al.
This manuscript demonstrates the merits of automatic ML (AutoML) for two geoscience use cases. In general, the paper is well written. The authors developed an ML workflow to find the best combination of models or the optimal model. They used the term ML classifier. It took me a while to understand that this is different from the conventional classification problem, for which the goal is to identify class labels for each sample. Instead, the goal in this work is to find the weights for combining the physics-based model ensemble.
My main question is whether it is necessary to use the ensemble-based AutoML in your use cases. Can you simply use a single ML model, e.g., XGBoost, to find the model weights/probabilities? Your workflow sounds like an ensemble of ML models for an ensemble of physics models. Is this right? If so, the computational burden may be overwhelming.
Other minor comments:
Figure 3 (d)-(j). It seems all models fall outside the gray uncertainty envelope related to the 17 models. AutoML also represents an ensemble of ML models. In addition to plotting the ensemble mean from AutoML, can you develop an uncertainty envelope based on the AutoML ensemble?
Figure 7. Both AutoML-Ens and STIC use very similar reddish color. Can you make a stronger contrast?

Citation: https://doi.org/10.5194/egusphere-2022-1326-RC1
- CC1: 'Reply on RC1', Hao Chen, 03 May 2023
  
  We greatly value your feedback. It is also the finest review of our work that we have received in quite some time, as someone has finally paid attention to what is truly innovative about this paper. As you mentioned, our proposed method differs from conventional machine learning classification models. In the revised manuscript, we will make this distinction clearer so that readers can grasp it more quickly.
  Regarding your major concerns about the necessity of utilizing AutoML and related inquiries, we would like to respond in three aspects:
  1. The primary contribution of the paper is a classifier that provides dynamic weights for the participating sub-models, so the approach is feasible whether the classifier is an ensemble or an individual machine learning model. As you mentioned, we use the ensemble technique in the AutoML platform; the technique varies across AutoML platforms (bagging, boosting, or stacking), but it has been shown to have the potential to outperform individual machine learning algorithms.
  2. AutoML has proven essential in at least these two cases. Machine learning model selection and hyperparameter optimization are necessary steps in these examples. The most direct indication is that the stacked ensemble models outperform the single machine learning algorithms, and that the two examples yield two different relatively optimal single models, XGBoost and XRT, respectively. In addition, machine learning algorithms and their variants with different hyperparameters can differ significantly in performance; for instance, in the ET ensemble, the classification error of the GBM-like models ranges from 0.615 to 0.653 (as shown in Table 1). Moreover, without practical tools or methods to help us choose and tune a model (for instance, if we only train XGBoost), the result is likely not optimal for a particular problem, although it may be second only to the best-performing ensemble model. By avoiding the errors that could result from focusing on a single model, AutoML makes it easier to determine a plausible and relatively optimal model. Further, if the optimal model for a problem turns out to be XGBoost alone rather than the ensemble, we can focus on tuning only the XGBoost model, which is easy to do in the AutoML platform, for instance by using the include_algos parameter (https://docs.h2o.ai/h2o/latest-stable/h2o-docs/data-science/algo-params/include_algos.html). In light of this, AutoML is necessary.
  3. Regarding the computational effort of individual versus ensemble model training: although it varies with the problem at hand, this is not a significant matter. Recent studies have shown, for instance, that on the H2O-AML platform we are using, the AutoML ensemble model may be even more time-efficient than training a single model on its own; even for the same XGBoost model, training within AutoML can be far more efficient than training it alone [see Ferreira et al. 2021 for a benchmark of various AutoML platforms]. Moreover, most of the problems we solve are one-time tasks that do not require ongoing retraining of the model; therefore, time may not be a deciding factor relative to the accuracy requirements, even though it may take a few days to process tens of millions of data points. For real-time ensemble forecasting, for example, the initial AutoML run may take more time, but with a simple setup it can help us rapidly select a relatively optimal model from, say, XGBoost and GBM and proceed with additional investigation. So, this does not affect the necessity of utilizing AutoML, although it will depend somewhat on the quantity of data and model complexity. Occasionally, such trade-offs need to be considered.
  L. Ferreira, A. Pilastri, C. M. Martins, P. M. Pires, and P. Cortez, “A Comparison of AutoML Tools for Machine Learning, Deep Learning and XGBoost,” in 2021 International Joint Conference on Neural Networks (IJCNN), 18-22 July 2021, pp. 1–8, doi: 10.1109/IJCNN52387.2021.9534091.
  Consequently, AutoML is still necessary. On the one hand, it yields better prediction results, although it sometimes only slightly outperforms some sub-models. On the other hand, it helps us subsequently focus on specific sub-models. Furthermore, owing to the advantages of AutoML itself, it may not take more time or computational resources than training individual models, which adds to its potential, and rapidly developing computational resources will in many cases enable AutoML to realize its full possibilities.
  For the minor comments:
  Regarding Figure 3, we would first like to clarify that the gray bands represent the predictions of 13 PTF models, which explains why the ensemble class of models, and AutoML-Ens in particular, does not fall within this range of bands. We will update this figure in the revised manuscript and expand on our findings in the relevant section to better reflect the differences between various machine learning methods.
  Regarding Figure 7, switching to a more strongly contrasting color is not a problem; thank you for the suggestion.
  Once again, on a personal note, I am grateful for your encouragement and apologize for my delayed reply.
  
  Citation: https://doi.org/10.5194/egusphere-2022-1326-CC1
- AC1: 'Reply on RC1', Tiejun Wang, 26 Jun 2023
  
  The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2023/egusphere-2022-1326/egusphere-2022-1326-AC1-supplement.pdf
  
  Citation: https://doi.org/10.5194/egusphere-2022-1326-AC1
RC2: 'Comment on egusphere-2022-1326', Anonymous Referee #2, 25 May 2023

The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2023/egusphere-2022-1326/egusphere-2022-1326-RC2-supplement.pdf

Citation: https://doi.org/10.5194/egusphere-2022-1326-RC2
- AC2: 'Reply on RC2', Tiejun Wang, 26 Jun 2023
  
  The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2023/egusphere-2022-1326/egusphere-2022-1326-AC2-supplement.pdf
  
  Citation: https://doi.org/10.5194/egusphere-2022-1326-AC2

Peer review completion

AR – Author's response | RR – Referee report | ED – Editor decision | EF – Editorial file upload

AR by Tiejun Wang on behalf of the Authors (26 Jun 2023) Author's response Author's tracked changes Manuscript

ED: Referee Nomination & Report Request started (11 Jul 2023) by Klaus Klingmüller

RR by Anonymous Referee #2 (25 Jul 2023)

RR by Anonymous Referee #1 (04 Aug 2023)

ED: Publish subject to minor revisions (review by editor) (18 Aug 2023) by Klaus Klingmüller

AR by Tiejun Wang on behalf of the Authors (20 Aug 2023) Author's response Author's tracked changes Manuscript

ED: Publish as is (07 Sep 2023) by Klaus Klingmüller

AR by Tiejun Wang on behalf of the Authors (08 Sep 2023) Manuscript

Supplement

https://doi.org/10.5194/egusphere-2022-1326-supplement

Viewed

Total article views: 2,077 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	Supplement	BibTeX	EndNote
1,103	915	59	2,077	272	46	92

Views and downloads (calculated since 05 Jan 2023)

Month	HTML	PDF	XML	Total
Jan 2023	142	33	4	179
Feb 2023	100	25	2	127
Mar 2023	96	25	0	121
Apr 2023	71	11	0	82
May 2023	81	17	3	101
Jun 2023	69	28	5	102
Jul 2023	68	19	3	90
Aug 2023	69	20	1	90
Sep 2023	55	26	0	81
Oct 2023	22	7	0	29
Nov 2023	0
Dec 2023	0
Jan 2024	0
Feb 2024	0
Mar 2024	0
Apr 2024	0
May 2024	11	12	2	25
Jun 2024	17	11	4	32
Jul 2024	7	11	6	24
Aug 2024	10	5	4	19
Sep 2024	7	9	4	20
Oct 2024	4	4	1	9
Nov 2024	10	2	1	13
Dec 2024	8	10	0	18
Jan 2025	11	9	2	22
Feb 2025	17	11	3	31
Mar 2025	7	15	1	23
Apr 2025	9	31	0	40
May 2025	8	21	0	29
Jun 2025	27	34	1	62
Jul 2025	15	26	0	41
Aug 2025	9	27	0	36
Sep 2025	14	39	1	54
Oct 2025	11	32	0	43
Nov 2025	14	68	0	82
Dec 2025	15	31	1	47
Jan 2026	17	31	3	51
Feb 2026	23	28	4	55
Mar 2026	22	47	1	70
Apr 2026	37	190	2	229
May 2026	0

Viewed (geographical distribution)

Total article views: 2,079 (including HTML, PDF, and XML) Thereof 2,079 with geography defined and 0 with unknown origin.

Latest update: 02 May 2026

Short summary

Effectively assembling multiple models for approaching a benchmark solution remains a long-standing issue for various geoscience domains. We here proposed an automated machine learning-assisted ensemble framework (AutoML-Ens) that attempts to resolve this challenge. Results demonstrated the great potential of AutoML-Ens for improving estimations due to its two unique features, i.e., assigning dynamic weights for candidate models and taking full advantage of AutoML-assisted workflow.

