This work is distributed under the Creative Commons Attribution 4.0 License.
A methodological framework for improving the performance of data-driven models: a case study for daily runoff prediction in the Maumee domain, U.S.
Abstract. Geoscientific models are simplified representations of complex earth and environmental systems (EESs). Compared with physics-based numerical models, data-driven modeling has gained popularity, due mainly to data proliferation in EESs and the ability to make predictions without requiring an explicit mathematical representation of complex biophysical processes. However, because of the black-box nature of data-driven models, their performance cannot be guaranteed. To address this issue, we developed a generalizable framework to improve the efficiency and effectiveness of model training and to reduce model overfitting. The framework consists of two parts: hyperparameter selection based on Sobol global sensitivity analysis, and hyperparameter tuning using a Bayesian optimization approach. We demonstrated the framework's efficacy through a case study of daily edge-of-field (EOF) runoff prediction by a tree-based data-driven model using the eXtreme Gradient Boosting (XGBoost) algorithm in the Maumee domain, U.S. This framework contributes towards improving the performance of a variety of data-driven models and can thus help promote their application in EESs.
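For readers unfamiliar with the first part of the framework, the sketch below shows one way the Sobol-based hyperparameter selection described in the abstract could be set up: hyperparameters of an XGBoost regressor are sampled, a cross-validated error metric is computed for each sample, and total-order Sobol indices rank the hyperparameters by influence. The SALib library, the hyperparameter names, bounds, and sample sizes are illustrative assumptions for this sketch, not the configuration used in the paper.

```python
import numpy as np
import xgboost as xgb
from SALib.sample import saltelli
from SALib.analyze import sobol
from sklearn.model_selection import cross_val_score

# Illustrative hyperparameter space; names, bounds, and sample size are
# assumptions made for this sketch, not the paper's configuration.
problem = {
    "num_vars": 4,
    "names": ["max_depth", "learning_rate", "subsample", "min_child_weight"],
    "bounds": [[2, 10], [0.01, 0.3], [0.5, 1.0], [1, 10]],
}

def cv_rmse(theta, X, y):
    """Cross-validated RMSE of an XGBoost regressor for one hyperparameter vector."""
    model = xgb.XGBRegressor(
        max_depth=int(round(theta[0])),
        learning_rate=float(theta[1]),
        subsample=float(theta[2]),
        min_child_weight=float(theta[3]),
        n_estimators=200,
    )
    scores = cross_val_score(model, X, y, scoring="neg_root_mean_squared_error", cv=3)
    return -scores.mean()

def sobol_screen(X, y, n_base=128):
    """Rank hyperparameters by their total-order Sobol index on the CV error."""
    samples = saltelli.sample(problem, n_base)      # n_base * (2*num_vars + 2) rows
    Y = np.array([cv_rmse(theta, X, y) for theta in samples])
    Si = sobol.analyze(problem, Y)
    return dict(zip(problem["names"], Si["ST"]))    # total-order sensitivity indices
```

Hyperparameters whose total-order index falls below a chosen threshold would be fixed at default values, leaving only the influential ones for the tuning step.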
Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Interactive discussion
Status: closed
RC1: 'Comment on egusphere-2022-815', Anonymous Referee #1, 04 Dec 2022
This manuscript proposed an AI framework consisting of hyperparameter selection and training parts to improve training efficiency and reduce overfitting. The case study of daily runoff prediction in the Maumee domain demonstrated the framework's potential. In general, the manuscript is well written and the conclusion is reasonable. Therefore, I think the manuscript can be published in EGUsphere after minor revision.
(a) When discussing the hyperparameter SS, the paper mentioned that the daily EOF runoff data are imbalanced. This is usually very important for AI model training. Can the authors explain how they dealt with the imbalanced data? If the imbalanced data were not addressed, I recommend the authors preprocess the data to improve their usability in model training.
(b) I suggest the authors add a flowchart to the manuscript, which would help readers understand the framework.
Citation: https://doi.org/10.5194/egusphere-2022-815-RC1
AC1: 'Reply on RC1', Yao Hu, 02 Feb 2023
We thank the reviewers for their insightful comments and constructive suggestions that have led to the improvement of our paper. Our responses to all the comments and suggestions are detailed below.
Reviewer #1: This manuscript proposed an AI framework consisting of hyperparameter selection and training parts to improve training efficiency and reduce overfitting. The case study of daily runoff prediction in the Maumee domain demonstrated the framework's potential. In general, the manuscript is well written and the conclusion is reasonable. Therefore, I think the manuscript can be published in EGUsphere after minor revision.
(a) When discussing the hyperparameter SS, the paper mentioned that the daily EOF runoff data are imbalanced. This is usually very important for AI model training. Can the authors explain how they dealt with the imbalanced data? If the imbalanced data were not addressed, I recommend the authors preprocess the data to improve their usability in model training.
Thanks. We agree with the reviewer that imbalanced data pose great challenges to model training. There are several ways to improve training quality on imbalanced data: 1) select an effective machine learning (ML) algorithm that has built-in mechanisms to deal with imbalanced data; 2) choose a good cross-validation strategy to ensure that the training and test datasets follow a similar distribution of the target variable; and 3) set class weights on the target classes to give more weight to the minority class. In our study, as our focus is to identify the influential hyperparameters for the regression models trained to predict the magnitude of daily EOF runoff, we chose an effective ML algorithm, the eXtreme Gradient Boosting (XGBoost) algorithm, for model training; XGBoost offers a range of hyperparameters that give fine-grained control over training performance on imbalanced data. For example, we used stratified K-fold cross-validation to ensure that the training and test datasets follow a similar distribution, and we defined a loss function that penalizes missed predictions of non-zero runoff events (the minority class in this study) more heavily.
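For concreteness, a minimal sketch of this kind of training setup, assuming a zero/non-zero runoff indicator is used both to stratify the folds and to up-weight the minority (non-zero) events; the weight value, fold count, and XGBoost settings are illustrative assumptions, not the exact configuration used in the paper.

```python
import numpy as np
import xgboost as xgb
from sklearn.model_selection import StratifiedKFold

def stratified_weighted_cv(X, y, minority_weight=5.0, n_splits=5):
    """Cross-validate an XGBoost regressor on zero-inflated runoff data.

    Folds are stratified on a zero/non-zero indicator so training and test
    splits share a similar runoff distribution; non-zero (minority) samples
    receive a larger weight so missed runoff events are penalized more heavily.
    """
    X, y = np.asarray(X), np.asarray(y)
    event = (y > 0).astype(int)                       # 1 = non-zero runoff (minority class)
    weights = np.where(event == 1, minority_weight, 1.0)
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=0)

    rmse = []
    for train_idx, test_idx in skf.split(X, event):   # stratify on the indicator
        model = xgb.XGBRegressor(n_estimators=200, max_depth=6, learning_rate=0.1)
        model.fit(X[train_idx], y[train_idx], sample_weight=weights[train_idx])
        pred = model.predict(X[test_idx])
        rmse.append(np.sqrt(np.mean((pred - y[test_idx]) ** 2)))
    return float(np.mean(rmse))
```

In this sketch the imbalance is handled through sample weights rather than a fully custom objective; XGBoost also accepts a user-defined objective function if a more specific penalty on missed non-zero events is needed.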
(b) I suggest the authors add a flowchart to the manuscript, which would help readers understand the framework.
Thanks. Figure 1 explains the different components of the methodological framework. To further clarify, we followed the reviewer's suggestion and added the following description of the workflow:
We first choose a machine learning algorithm and its associated hyperparameters. Then, we feed the initial hyperparameters (1) into the hyperparameter selection (HS) module to determine the influential hyperparameters (2). Once initial values are assigned to the influential hyperparameters (4), we use the hyperparameter tuning (HT) module to identify their optimal values (3), which allows the algorithm to achieve its best training performance. A case study is used to illustrate the workflow in more detail.
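As an illustration of how the tuning step of this workflow might look in code, the sketch below uses Gaussian-process-based Bayesian optimization from scikit-optimize over a search space restricted to the hyperparameters retained by the HS step; which hyperparameters are influential, their bounds, the objective, and the call budget are assumptions made for this sketch rather than the settings used in the paper.

```python
import xgboost as xgb
from sklearn.model_selection import cross_val_score
from skopt import gp_minimize
from skopt.space import Integer, Real

# Search space limited to hyperparameters flagged as influential by the HS step;
# which ones those are, and their bounds, are assumptions for this sketch.
space = [
    Integer(2, 10, name="max_depth"),
    Real(0.01, 0.3, prior="log-uniform", name="learning_rate"),
]

def tune(X, y, n_calls=40):
    """Bayesian optimization (Gaussian-process surrogate) of the influential
    XGBoost hyperparameters against a cross-validated RMSE objective."""
    def objective(params):
        max_depth, learning_rate = params
        model = xgb.XGBRegressor(
            max_depth=int(max_depth),
            learning_rate=float(learning_rate),
            n_estimators=200,
        )
        scores = cross_val_score(model, X, y,
                                 scoring="neg_root_mean_squared_error", cv=5)
        return -scores.mean()                 # gp_minimize minimizes the CV RMSE

    result = gp_minimize(objective, space, n_calls=n_calls, random_state=0)
    best = dict(zip(["max_depth", "learning_rate"], result.x))
    return best, result.fun
```

A typical usage would be `best_params, best_rmse = tune(X_train, y_train)`, after which the model is refit on the full training set with `best_params`.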
RC2: 'Comment on egusphere-2022-815', Anonymous Referee #2, 09 Jan 2023
The authors developed a generalizable framework for the improvement of the efficiency and effectiveness of model training and the reduction of model overfitting. This study attempts to predict daily runoff based on data-driven models. The two parts of the proposed framework, hyperparameter selection and hyperparameter tuning, are significant for machine learning. However, I suggest the authors provide more complete explanations of the results. I recommend the article for acceptance after minor revision.
- The data-driven model uses eXtreme Gradient Boosting. There are many machine learning methods, and I suggest the authors explain this choice in the introduction.
- I think the authors should give a brief introduction to the study area (Figure 2), such as its climate and soils.
- Figure 5: it seems that the runoff training samples (July 2012 to December 2013) are larger than the runoff test samples (January 2014 to January 2016). What is the accuracy if, conversely, the training samples are taken from January 2014 to January 2016 and the test samples from July 2012 to December 2013?
- Figure 6(a): it seems that HS is better than HT when the measured EOF has larger values. How does HS-HT perform when the measured EOF has larger values?
- The input file can be found on Zenodo, and it contains many inputs for this study, such as soil moisture. I suggest the authors give a brief introduction to the input data in the methods (Part 2).
- In the input data, the soil temperature appears to drop below 0 °C in winter. Does frozen soil influence the runoff simulation based on the proposed framework?
Citation: https://doi.org/10.5194/egusphere-2022-815-RC2
AC2: 'Reply on RC2', Yao Hu, 02 Feb 2023
We thank the reviewers for their insightful comments and constructive suggestions that have led to the improvement of our paper. Our responses to all the comments and suggestions are detailed below.
Reviewer #2. The authors developed a generalizable framework for the improvement of the efficiency and effectiveness of model training and the reduction of model overfitting. This study attempts to predict daily runoff based on data-driven models. The two parts of the proposed framework, hyperparameter selection and hyperparameter tuning, are significant for machine learning. However, I suggest the authors provide more complete explanations of the results. I recommend the article for acceptance after minor revision.
1. The data-driven model uses eXtreme Gradient Boosting. There are many machine learning methods, and I suggest the authors explain this choice in the introduction.
The framework is generally applicable to data-driven models built with different machine learning algorithms that need to be fine-tuned through hyperparameters. In this study, we chose a model built with the eXtreme Gradient Boosting (XGBoost) algorithm, as it has been demonstrated to be effective for a wide range of regression and classification problems, including imbalanced-data problems such as the runoff prediction in our case. We included an explanation of the choice of the XGBoost algorithm.
2. I think the authors should give a brief introduction to the study area (Figure 2), such as its climate and soils.
Thanks for the suggestion. We included an introduction to the study site (See Section 2.3 Case Study).
3. Figure 5: it seems that the runoff training samples (July 2012 to December 2013) are larger than the runoff test samples (January 2014 to January 2016). What is the accuracy if, conversely, the training samples are taken from January 2014 to January 2016 and the test samples from July 2012 to December 2013?
Thanks. A training dataset is typically larger than a test dataset, as the purpose of training is to expose the model to more data from which to learn meaningful patterns. If we reverse the two periods, we can expect worse test performance for the period from July 2012 to December 2013 for both the HT and HS-HT cases. Meanwhile, Figure 5 is intended to show that models preceded by both the Hyperparameter Selection (HS) and Hyperparameter Tuning (HT) approaches are less prone to overfitting than the case with the HT approach alone. When the training and test datasets are swapped in both scenarios, the new results still support this conclusion, i.e., models are less prone to overfitting when using both the HS and HT approaches than when using only the HT approach.
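To make the split being discussed concrete, a minimal sketch of a chronological train/test split for a daily runoff series; the date boundary follows the periods mentioned above, while the date-indexed DataFrame layout is an assumption for this sketch.

```python
import pandas as pd

def chronological_split(df: pd.DataFrame, split_date: str = "2014-01-01"):
    """Split a date-indexed daily runoff DataFrame into an earlier training
    period and a later test period; reversing the roles of the two periods
    reproduces the swapped experiment discussed above."""
    boundary = pd.Timestamp(split_date)
    train = df.loc[df.index < boundary]    # e.g., July 2012 - December 2013
    test = df.loc[df.index >= boundary]    # e.g., January 2014 - January 2016
    return train, test
```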
4. Figure 6(a): it seems that HS is better than HT when the measured EOF has larger values. How does HS-HT perform when the measured EOF has larger values?
Thanks. As shown in Figure 6(a), more blue dots lie close to the Y=X line when the measured EOF runoff exceeds 10 mm/d, indicating that the model preceded by the HT approach predicts larger EOF values better than the case preceded by the HS approach. Similarly, the model preceded by the HS-HT approach predicts larger EOF values better than the one with the HS approach.
5. The input file can be found on Zenodo, and it contains many inputs for this study, such as soil moisture. I suggest the authors give a brief introduction to the input data in the methods (Part 2).
Thanks for the suggestion. We included a data description in the Method section (See Section 2.3.1 Data Preparation) and the supporting information (See Table S2: Influential variables for the Maumee Domain).
6. In the input data, the soil temperature appears to drop below 0 °C in winter. Does frozen soil influence the runoff simulation based on the proposed framework?
Thanks. When predicting the edge-of-field (EOF) runoff, the input variables were identified and selected from a previous study to predict runoff for both winter and non-winter seasons. For example, the influence of frozen soil can be captured by two input variables: ACSNOM (accumulated melt water out of the snow bottom) and SOIL_T (soil temperature). Please see the modifications in Section 2.3.1, Data Preparation.
Citation: https://doi.org/10.5194/egusphere-2022-815-AC2
AC3: 'Reply on RC2', Yao Hu, 02 Feb 2023