The Calibrated Rapid Assimilation and Forecasting Technique (CRAFT) for Earth system and ecological modeling using machine learning and Bayesian estimation
Abstract. Increasing the mechanistic complexity of Earth system and ecological models provides the opportunity for improved understanding through numerical experimentation. However, greater complexity also makes it harder to constrain parameters with data. Determining plausible parameter combinations requires a method for incorporating data streams, field observations, and their uncertainty. Bayesian methods of integrating datasets are often limited by the computational cost of running these complex mechanistic models. Machine learning can effectively integrate data and physical models by constructing emulators of the finite set of simulations needed for parameterization. We present the CRAFT (Calibrated Rapid Assimilation and Forecasting Technique) framework for ecological model parameterization and test it using the mechanism-rich ecosystem demographic model FATES-HYDRO (testing 42 parameters and evaluating 6 outputs). The framework uses emulation and parameter reduction to construct rapidly running emulators and to estimate posterior parameter distributions given observational data. Using synthetic model runs, we assess whether the framework can emulate the model outputs, reproduce the variance across the parameter space, and support future prediction (simulations for 2020–2100). Overall, random forest emulators had an out-of-sample accuracy of 92–99 % in reconstructing observational periods and showed no significant difference from the physical model in the response to changes in most parameters (283 of 288 parameter and output combinations). Parameter values within the 95 % CI posterior ranges produced FATES-HYDRO runs with RMSEs, relative to the synthetic data, of 3.748 g C month⁻¹ for gross primary productivity (GPP), 1.33 mm H₂O month⁻¹ for evapotranspiration (ET), 0.005 m² m⁻² for soil moisture, 0.381 MPa for maximum leaf water potential (LWPmax), 0.44 MPa for minimum leaf water potential (LWPmin), and 1.80 m² m⁻¹ for runoff (RO).
Compared with the synthetic dataset, future simulations had RMSEs of 22.55 g C m⁻² month⁻¹ for GPP, 7.82 mm H₂O month⁻¹ for ET, 88.80 mm H₂O month⁻¹ for RO, 0.145 MPa for monthly leaf water potential, and 0.0138 m² m⁻² for soil water content at 20 cm. Overall, we show that the CRAFT framework is a rapid and accurate semi-automated method for assimilating data and calculating posterior distributions in complex physical models. This framework could accelerate scientific discovery by rapidly improving the accuracy of process-based models and by enabling more mathematically robust predictions with constrained uncertainty in model parameters.
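
To make the workflow summarized above concrete, the following is a minimal sketch of emulator-based Bayesian calibration against synthetic data. It is not the CRAFT implementation: the one-parameter `toy_model` is a hypothetical stand-in for a mechanistic model such as FATES-HYDRO, a piecewise-linear interpolator is swapped in for the random forest emulator so that the example needs only the Python standard library, and the prior range, observation error `SIGMA`, and grid sizes are assumptions chosen purely for illustration.

```python
import math
from bisect import bisect_right

def toy_model(theta, month):
    """Hypothetical stand-in for one expensive mechanistic model run."""
    return 0.1 * theta ** 2 * (1.0 + 0.5 * math.sin(2.0 * math.pi * month / 12.0))

MONTHS = list(range(12))
PRIOR_LO, PRIOR_HI = 0.0, 10.0   # assumed uniform prior range
SIGMA = 0.5                      # assumed observation error (same units as output)
TRUE_THETA = 3.7                 # parameter used to generate the synthetic data

# 1. Run the "expensive" model at a finite set of design points over the prior.
design = [PRIOR_LO + i * (PRIOR_HI - PRIOR_LO) / 20 for i in range(21)]
runs = [[toy_model(t, m) for m in MONTHS] for t in design]

def emulate(theta):
    """Cheap emulator: linear interpolation between the stored model runs."""
    i = min(max(bisect_right(design, theta) - 1, 0), len(design) - 2)
    w = (theta - design[i]) / (design[i + 1] - design[i])
    return [(1.0 - w) * a + w * b for a, b in zip(runs[i], runs[i + 1])]

def rmse(pred, obs):
    """Root-mean-square error between two equally long series."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

# 2. Synthetic "observations" generated from a known parameter value.
obs = [toy_model(TRUE_THETA, m) for m in MONTHS]

# 3. Posterior on a dense grid, evaluated with the emulator only.
def log_likelihood(theta):
    pred = emulate(theta)
    return -0.5 * sum(((p - o) / SIGMA) ** 2 for p, o in zip(pred, obs))

grid = [PRIOR_LO + i * (PRIOR_HI - PRIOR_LO) / 2000 for i in range(2001)]
log_w = [log_likelihood(t) for t in grid]
max_log_w = max(log_w)
weights = [math.exp(lw - max_log_w) for lw in log_w]  # uniform prior cancels
total = sum(weights)

def quantile(q):
    """Smallest grid value whose cumulative posterior mass reaches q."""
    acc = 0.0
    for t, w in zip(grid, weights):
        acc += w / total
        if acc >= q:
            return t
    return grid[-1]

ci_lo, ci_hi = quantile(0.025), quantile(0.975)
post_mean = sum(t * w for t, w in zip(grid, weights)) / total
emulator_rmse = rmse(emulate(TRUE_THETA), obs)

print(f"emulator RMSE at true parameter: {emulator_rmse:.4f}")
print(f"posterior mean: {post_mean:.2f}, 95% CI: [{ci_lo:.2f}, {ci_hi:.2f}]")
```

Because the posterior is evaluated with the emulator rather than the model itself, the number of expensive runs is fixed by the design (21 here) regardless of how densely the parameter space is explored afterwards. With 42 parameters a grid is infeasible, so a sampling scheme such as MCMC over the emulator would take the place of step 3.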