A machine-learning reference dataset for SO2 plumes observed by TROPOMI: uncertainties and emission estimates
Abstract. Sulphur dioxide (SO2) is a major atmospheric pollutant emitted by fossil fuel combustion, metal smelting, and volcanic degassing, with impacts on human health, acid deposition, and climate forcing. Existing emission inventories are often temporally lagged and spatially coarse, and so fail to capture high-intensity, sporadic events. To address this, we present a novel, near-real-time approach that uses a U-Net image segmentation model to automatically isolate SO2 plumes in more than 31,000 TROPOMI satellite swaths (January 2019 to December 2024). The model identified 53,993 individual plumes. The annual detection rate was highest in 2019, which we attribute to the massive stratospheric SO2 injections from the Raikoke and Ulawun volcanic eruptions. Clustering analysis confirmed that plume origins concentrate around expected volcanic and industrial hotspots (e.g., Iztaccíhuatl, Norilsk), with volcanic sources dominating the top ten clusters. We derived rapid, physics-informed emission rate estimates for each plume and found a median rate of 14,629 kg hr-1. The detection threshold of this approach, which we estimate to be ~524 kg hr-1, is four orders of magnitude larger than typical fluxes in the EDGAR inventory, indicating that the plume database is suited to detecting extreme, high-intensity events. However, the algorithm struggles to detect sources in high-background regions such as China, where widespread elevated SO2 likely prevents the isolation of individual plumes. This study demonstrates that machine learning is a powerful tool for atmospheric monitoring, providing the high-cadence, fine-grained quantification of SO2 emissions needed to validate global inventories and support effective environmental management.
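The abstract does not specify how the per-plume emission rates are computed. As an illustration only, the sketch below shows one common mass-balance approach, the integrated mass enhancement (IME) method, which may differ from the method used in this study. It assumes SO2 vertical columns in mol m-2, a boolean plume mask such as a segmentation model might produce, a single representative wind speed, and a plume length scale taken as the square root of the plume area. The function name, arguments, and the example pixel size of 5.5 km x 3.5 km are hypothetical choices, not values from the paper.

```python
import numpy as np

M_SO2_KG_PER_MOL = 64.066e-3  # molar mass of SO2


def ime_emission_rate_kg_per_hr(vcd_mol_m2, plume_mask, pixel_area_m2, wind_speed_m_s):
    """Rough IME mass-balance estimate Q = IME * U / L, returned in kg per hour.

    Assumptions (not taken from the paper): background is the median of the
    non-plume pixels, and the plume length L is sqrt(total plume area).
    """
    vcd = np.asarray(vcd_mol_m2, dtype=float)
    mask = np.asarray(plume_mask, dtype=bool)
    area = np.broadcast_to(np.asarray(pixel_area_m2, dtype=float), vcd.shape)

    if not mask.any():
        return 0.0  # no plume pixels, nothing to quantify

    # Background column from pixels outside the plume (0 if none are available).
    background = np.nanmedian(vcd[~mask]) if (~mask).any() else 0.0

    # Column enhancement above background, clipped so noise cannot go negative.
    enhancement = np.clip(vcd[mask] - background, 0.0, None)

    # Integrated mass enhancement: sum of (mol m-2 * m2) converted to kg.
    ime_kg = np.nansum(enhancement * area[mask]) * M_SO2_KG_PER_MOL

    # Characteristic plume length scale in metres.
    plume_length_m = np.sqrt(area[mask].sum())

    # kg s-1 converted to kg hr-1.
    return ime_kg * wind_speed_m_s / plume_length_m * 3600.0


# Synthetic example: a 2 x 2 pixel plume on a clean 4 x 4 scene.
scene = np.zeros((4, 4))
mask = np.zeros((4, 4), dtype=bool)
scene[1:3, 1:3] = 1.0e-3  # mol m-2 enhancement
mask[1:3, 1:3] = True
rate = ime_emission_rate_kg_per_hr(scene, mask, 5500.0 * 3500.0, 5.0)
print(f"Estimated emission rate: {rate:.0f} kg/hr")
```

In this formulation the estimate scales linearly with the assumed wind speed, so the choice of wind product is likely to be a leading source of uncertainty in any estimate of this kind.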