Choosing an operational inference pipeline for internal solitary wave detection in Sentinel-1 SAR imagery: EVA02-Large+XGBoost versus SAR_CNN v2 (Lux.jl)
Abstract. This paper presents a systematic comparative evaluation of two machine learning inference pipelines developed for the Internal Waves Service (IWS), an operational platform for the continuous automated detection of oceanic internal solitary waves (ISWs) in Sentinel-1 synthetic aperture radar (SAR) Wave mode imagery. The IWS ingests imagery from the live Sentinel-1 feed, scaling to approximately 4,000 images per day as the constellation reaches full operational capacity, and is systematically acquiring a historical archive, dating back to 2014, estimated at up to 17 million images. The two pipelines are a Python pipeline that pairs EVA02, a 305-million-parameter pretrained vision transformer, with an XGBoost classifier, and a Julia pipeline built around a 283,329-parameter convolutional neural network implemented in Lux.jl and trained from scratch on domain-specific SAR imagery. Each pipeline was benchmarked on GPU and on CPU, giving four deployment configurations, all on the service's production server hardware; the benchmarks measured classification accuracy, inference throughput, GPU energy consumption, and memory footprint. The Python pipeline achieves higher classification accuracy (F1 96.26 % versus 95.00 %; AUC-ROC 99.29 % versus 98.90 %), a difference attributable to the representational capacity of the pretrained vision transformer. The Julia pipeline is 132 times faster on GPU (3,396 versus 25.6 images per second) and consumes 267 times less energy per image (43.7 versus 11,690 mJ), so a full archive reprocessing pass takes 1.4 hours rather than 7.7 days. The Julia pipeline's classifications are bit-for-bit identical on GPU and CPU, so the deployment target can be chosen on operational grounds without an accuracy trade-off. Per-image metrics are projected to operational volumes, quantifying annual GPU occupation (2.9 versus 384 hours at the current reprocessing cadence) and throughput headroom for future constellation expansion.
Based on these findings, the IWS deploys the Julia pipeline on GPU for all inference, accepting the 1.26-percentage-point F1 deficit in exchange for same-day archive reclassification and minimal contention on shared institutional GPU infrastructure. The evaluation methodology, which benchmarks on production hardware and projects the results to operational volumes, is directly transferable to other Earth observation services evaluating inference pipeline options.
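The projection from per-image measurements to archive-scale figures is simple arithmetic. The following minimal Python sketch illustrates it using only the throughput, energy, and archive-size values quoted above. The function and variable names are hypothetical, chosen for illustration, and are not taken from the IWS codebase. The sketch assumes the upper archive estimate of 17 million images and sustained GPU throughput over the whole pass.

```python
# Figures quoted in the abstract (GPU configurations only).
ARCHIVE_IMAGES = 17_000_000  # upper estimate of the historical archive size

THROUGHPUT_IPS = {"julia_gpu": 3396.0, "python_gpu": 25.6}   # images per second
ENERGY_MJ = {"julia_gpu": 43.7, "python_gpu": 11_690.0}      # millijoules per image


def archive_pass_hours(n_images: int, images_per_second: float) -> float:
    """Wall-clock hours for one full pass, assuming sustained throughput."""
    return n_images / images_per_second / 3600.0


def ratio(larger: float, smaller: float) -> float:
    """How many times larger the first quantity is than the second."""
    return larger / smaller


julia_hours = archive_pass_hours(ARCHIVE_IMAGES, THROUGHPUT_IPS["julia_gpu"])
python_days = archive_pass_hours(ARCHIVE_IMAGES, THROUGHPUT_IPS["python_gpu"]) / 24.0
speedup = ratio(THROUGHPUT_IPS["julia_gpu"], THROUGHPUT_IPS["python_gpu"])
energy_saving = ratio(ENERGY_MJ["python_gpu"], ENERGY_MJ["julia_gpu"])

print(f"Julia archive pass:  {julia_hours:.1f} h")
print(f"Python archive pass: {python_days:.1f} days")
print(f"Throughput ratio:    {speedup:.1f}x")
print(f"Energy ratio:        {energy_saving:.1f}x")
```

The same two functions can be reused with a daily volume of about 4,000 images in place of the archive size to estimate the GPU time needed for the live feed.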