https://doi.org/10.5194/egusphere-2026-2136
22 Jun 2026
Status: this preprint is open for discussion and under review for Geoscientific Model Development (GMD).

Spatial Predictor Selection for Next-Day Minimum Temperature Forecasting: An Automated Machine Learning Framework Applied Across European Climate Regimes

Éric Duhamel

Abstract. Accurate prediction of near-surface air temperature remains a central challenge in geoscientific modeling, particularly when integrating high-dimensional spatial predictors derived from reanalysis datasets. While Model Output Statistics (MOS) approaches have been widely used, the systematic selection of spatially distributed predictors remains an open methodological issue.

This study proposes a genetic algorithm (GA) framework for automated predictor selection in daily minimum temperature forecasting. The method operates on spatially structured inputs derived from ERA5 reanalysis and is evaluated using observed temperature data from multiple European locations. The GA is designed to explore high-dimensional predictor spaces while controlling model complexity and ensuring compatibility with non-linear learning algorithms.
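The general idea of GA-based predictor selection can be illustrated with a minimal sketch. It is not the framework described in this preprint: the encoding (one boolean per predictor), the operators (tournament selection, uniform crossover, bit-flip mutation, elitism), the per-predictor complexity penalty, the least-squares stand-in model, and the synthetic data are all assumptions chosen for brevity. A non-linear learner would slot in by replacing the hypothetical `validation_mae` function.

```python
import random

def solve(A, b):
    """Solve the small linear system A x = b by Gaussian elimination."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[pivot] = M[pivot], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def validation_mae(mask, train, val, ridge=1e-6):
    """Fit ridge-stabilised least squares on the masked columns; return validation MAE."""
    (Xtr, ytr), (Xva, yva) = train, val
    cols = [j for j, keep in enumerate(mask) if keep]
    rows = [[1.0] + [r[j] for j in cols] for r in Xtr]  # intercept + selected columns
    k = len(cols) + 1
    A = [[sum(r[a] * r[b] for r in rows) + (ridge if a == b else 0.0)
          for b in range(k)] for a in range(k)]
    rhs = [sum(r[a] * t for r, t in zip(rows, ytr)) for a in range(k)]
    w = solve(A, rhs)
    preds = [w[0] + sum(wj * r[j] for wj, j in zip(w[1:], cols)) for r in Xva]
    return sum(abs(p - t) for p, t in zip(preds, yva)) / len(yva)

def run_ga(fitness, n_features, rng, pop_size=20, generations=15, p_mut=0.05):
    """Minimise `fitness` over boolean predictor masks."""
    pop = [[rng.random() < 0.3 for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(generations):
        scores = [fitness(ind) for ind in pop]
        elite = pop[min(range(pop_size), key=lambda i: scores[i])]

        def tournament():
            a, b = rng.randrange(pop_size), rng.randrange(pop_size)
            return pop[a] if scores[a] <= scores[b] else pop[b]

        children = [elite[:]]  # elitism: the best mask always survives
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            child = [g1 if rng.random() < 0.5 else g2 for g1, g2 in zip(p1, p2)]
            children.append([(not g) if rng.random() < p_mut else g for g in child])
        pop = children
    return min(pop, key=fitness)

# Synthetic stand-in data (assumption): only predictors 0, 3 and 7 carry signal.
rng = random.Random(0)

def make_data(n, p=12):
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [2.0 * r[0] - 1.5 * r[3] + r[7] + rng.gauss(0, 0.3) for r in X]
    return X, y

train, val = make_data(120), make_data(60)
PENALTY = 0.02  # assumed cost per selected predictor, to control model complexity

def fitness(mask):
    return validation_mae(mask, train, val) + PENALTY * sum(mask)

best = run_ga(fitness, n_features=12, rng=rng)
print(sorted(j for j, keep in enumerate(best) if keep))
```

The penalty term is one simple way to favour compact subsets; because the search is stochastic, a different seed can return a different mask.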

The approach is assessed using a one-day-ahead forecasting setup and compared against a LASSO-based baseline. Results show that the GA identifies compact predictor subsets that achieve predictive performance comparable to, or slightly better than, the baseline. Across test locations, mean absolute error values remain stable and indicate robust generalization.
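For readers unfamiliar with how a LASSO baseline performs selection, the following is a minimal sketch using cyclic coordinate descent with soft-thresholding. The objective form, the regularisation strength `alpha`, and the synthetic data are assumptions for illustration; the preprint's abstract does not specify how its baseline was configured.

```python
import random

def soft_threshold(rho, alpha):
    """Shrink rho toward zero by alpha; values within [-alpha, alpha] become 0."""
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0

def lasso_cd(X, y, alpha, n_iter=100):
    """Minimise (1/2n)*||y - Xw - b||^2 + alpha*||w||_1 by coordinate descent."""
    n, p = len(X), len(X[0])
    x_mean = [sum(r[j] for r in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[r[j] - x_mean[j] for j in range(p)] for r in X]  # centred predictors
    resid = [v - y_mean for v in y]                          # residual for w = 0
    z = [sum(r[j] ** 2 for r in Xc) / n for j in range(p)]
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            if z[j] == 0.0:
                continue  # constant column carries no information
            # Correlation of column j with the residual excluding its own contribution.
            rho = sum(Xc[i][j] * (resid[i] + w[j] * Xc[i][j]) for i in range(n)) / n
            new_w = soft_threshold(rho, alpha) / z[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * Xc[i][j]
                w[j] = new_w
    intercept = y_mean - sum(w[j] * x_mean[j] for j in range(p))
    return w, intercept

# Synthetic stand-in data (assumption): only predictors 0 and 3 carry signal.
rng = random.Random(1)
X = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(200)]
y = [2.0 * r[0] - 1.5 * r[3] + rng.gauss(0, 0.3) for r in X]

w, b0 = lasso_cd(X, y, alpha=0.1)
selected = [j for j, wj in enumerate(w) if wj != 0.0]
print(selected)
```

Predictors whose coefficients are driven exactly to zero are dropped, so the selected subset is read directly from the non-zero entries of `w`. Unlike a GA, this selection is deterministic for a given `alpha` and is tied to a linear model.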

Analysis of selected predictors highlights the existence of stable variable categories, although individual spatial selections exhibit variability across runs, reflecting the stochastic nature of the optimization process. These results suggest that predictor relevance should be interpreted in terms of distributions rather than fixed sets.
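One way to read relevance as a distribution is to count, over repeated runs, how often each individual predictor and each variable category is selected. The sketch below assumes each run returns a set of `(variable, grid_offset)` pairs; the variable names and the four toy runs are invented for illustration and are not results from the study.

```python
from collections import Counter

def selection_frequencies(runs):
    """Return per-predictor and per-variable selection frequencies across runs."""
    n_runs = len(runs)
    # Each predictor is counted at most once per run because runs are sets.
    point_counts = Counter(pred for run in runs for pred in run)
    # A variable counts once per run if any of its grid points was selected.
    var_counts = Counter(var for run in runs for var in {v for v, _ in run})
    point_freq = {pred: c / n_runs for pred, c in point_counts.items()}
    var_freq = {var: c / n_runs for var, c in var_counts.items()}
    return point_freq, var_freq

# Toy selections from four hypothetical runs (invented, not study output).
runs = [
    {("t2m", (0, 0)), ("d2m", (0, 1)), ("tcc", (1, 0))},
    {("t2m", (0, 1)), ("d2m", (0, 0)), ("tcc", (1, 0))},
    {("t2m", (0, 0)), ("d2m", (1, 1)), ("u10", (0, 0))},
    {("t2m", (1, 0)), ("d2m", (0, 1)), ("tcc", (0, 0))},
]

point_freq, var_freq = selection_frequencies(runs)
print(var_freq)
print(max(point_freq.values()))
```

In this toy case no single grid point is chosen in more than half of the runs, while two variables appear in every run, which mirrors the distinction the abstract draws between unstable spatial selections and stable variable categories.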

The proposed framework provides a flexible and reproducible approach to spatial feature selection in geoscientific applications. Its compatibility with complex models and high-dimensional inputs makes it a promising tool for improving forecasting systems based on reanalysis data. A key finding is that spatial predictor selection is inherently non-unique yet statistically stable at the variable level, which supports interpreting predictor relevance probabilistically rather than deterministically.

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: open (until 17 Aug 2026)

Latest update: 25 Jun 2026
Short summary
This study presents a method to improve daily minimum temperature forecasting using an evolutionary algorithm. It selects relevant predictors from high-dimensional ERA5 (ECMWF Reanalysis v5) data and is evaluated across multiple European locations. Results show compact predictor sets with performance comparable to or better than a LASSO (Least Absolute Shrinkage and Selection Operator) baseline, while revealing stable statistical patterns despite variability in individual selections.