A Machine Learning Method for Estimating Atmospheric Trace Gas Concentration Baselines
Abstract. Estimates of trace gas baseline mole fractions in high-frequency atmospheric measurement records are crucial for analysing long-term changes in atmospheric composition. Baseline mole fractions are those that would be observed far from emission sources, and hence are representative of background conditions, at specific latitudes in the atmosphere. Previous methods for inferring baseline mole fractions have used statistical or meteorological approaches or, where available, co-measured tracer species thought to be emitted only from non-baseline wind sectors. Combinations of these techniques have also been employed in some applications. Statistical methods typically fit a baseline to the observations themselves, while meteorological methods use atmospheric models of varying complexity to categorise air mass origins. In this paper, we present a novel machine learning method for estimating trace gas baseline mole fractions, which retains the physical basis of model-based filtering without the need to run an expensive simulator. Our approach offers the accessibility and computational cost-effectiveness of statistical models, without the associated smoothing or difficulty in identifying rapid baseline variations. By training on historical Lagrangian particle dispersion model outputs, our model learns to predict baseline mole fractions directly from meteorological fields. This advancement opens new avenues for low-latency analysis of trace gas time series, reconstruction of historical baseline trends, and better use of tracer-based air mass classification methods.