1Institute of Surface-Earth System Science, School of Earth System Science, Tianjin University, Tianjin, 300072, China
2Tianjin Key Laboratory of Earth Critical Zone Science and Sustainable Development in Bohai Rim, Tianjin University, Tianjin, 300072, China
3Tianjin Bohai Rim Coastal Earth Critical Zone National Observation and Research Station, Tianjin University, Tianjin, 300072, China
4State Key Laboratory of Remote Sensing Science, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, 100101, China
5Research Center for Remote Sensing Information and Digital Earth, College of Computer Science and Technology, Qingdao University, Qingdao, 266071, China
Received: 23 Nov 2022 – Discussion started: 05 Jan 2023
Abstract. Despite recent developments in geoscientific (e.g., physics- or data-driven) models, effectively assembling multiple models to approach a benchmark solution remains challenging in many geoscientific sub-disciplines. Here, we propose an automated machine learning-assisted ensemble framework (AutoML-Ens) that attempts to resolve this challenge. We describe the methodology and workflow of AutoML-Ens and implement a prototype model whose key strategy is a mapping between the probabilities derived from the machine learning classifier and the dynamic weights assigned to the candidate ensemble members. We then investigate and discuss applications of the framework to two real-world examples (i.e., mapping global soil water retention parameters and estimating remotely sensed cropland evapotranspiration). Compared with conventional ensemble approaches, AutoML-Ens performed better across the datasets (the training, testing, and overall datasets) and across environmental gradients, with improved performance metrics (e.g., coefficient of determination, Kling-Gupta efficiency, and root mean squared error). This better performance suggests the great potential of AutoML-Ens for improving quantification and reducing uncertainty in estimates, owing to its two unique features: assigning dynamic weights to candidate models and taking full advantage of the AutoML-assisted workflow. In addition to the representative results, we discuss the interpretability of the framework and its possible extensions. More importantly, we emphasize the benefits of combining data-driven approaches with physics constraints for geoscientific model ensemble problems that are high-dimensional in space and non-linear in nature.
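As an illustration of the key strategy named in the abstract, the following minimal sketch turns per-sample classifier probabilities into dynamic ensemble weights. It rests on assumptions not stated in the source: the classifier is taken to output, for each sample, the probability that each candidate model is the most accurate one, and the mapping from probabilities to weights is taken to be a simple normalization. The function names and the numbers are hypothetical, and the actual mapping used by AutoML-Ens may differ.

```python
from typing import List, Sequence


def dynamic_weights(probabilities: Sequence[float]) -> List[float]:
    """Map classifier probabilities to weights that sum to 1.

    Assumption: probabilities[k] is the classifier's probability that
    candidate model k is the best model for this sample.
    """
    total = sum(probabilities)
    if total <= 0:
        # No usable information: fall back to equal weights.
        return [1.0 / len(probabilities)] * len(probabilities)
    return [p / total for p in probabilities]


def ensemble_estimate(predictions: Sequence[float],
                      probabilities: Sequence[float]) -> float:
    """Weighted sum of candidate predictions for a single sample."""
    weights = dynamic_weights(probabilities)
    return sum(w * y for w, y in zip(weights, predictions))


# Hypothetical data: two samples, three candidate models each.
preds = [[0.30, 0.40, 0.50], [1.0, 2.0, 3.0]]
probs = [[0.7, 0.2, 0.1], [0.1, 0.1, 0.8]]

# The weights differ from sample to sample, unlike a fixed-weight ensemble.
estimates = [ensemble_estimate(y, p) for y, p in zip(preds, probs)]
```

In this sketch the first sample leans on the first candidate model and the second sample on the third, which is what distinguishes dynamic weighting from a conventional ensemble that applies one weight vector to every sample.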
Effectively assembling multiple models to approach a benchmark solution remains a long-standing issue in various geoscience domains. Here, we propose an automated machine learning-assisted ensemble framework (AutoML-Ens) that attempts to resolve this challenge. Results demonstrate the great potential of AutoML-Ens for improving estimates, owing to its two unique features: assigning dynamic weights to candidate models and taking full advantage of the AutoML-assisted workflow.