https://doi.org/10.5194/egusphere-2026-1798
17 Jun 2026
Status: this preprint is open for discussion and under review for Geoscientific Model Development (GMD).

Choosing an operational inference pipeline for internal solitary wave detection in Sentinel-1 SAR imagery: EVA02-Large+XGBoost versus SAR_CNN v2 (Lux.jl)

João Pinelo, Arun Shukla, Gilberto Titericz, Adriana Santos-Ferreira, João Gonçalves, and João Moniz

Abstract. This paper presents a systematic comparative evaluation of two machine learning inference pipelines developed for the Internal Waves Service (IWS), an operational platform for the continuous automated detection of oceanic internal solitary waves (ISWs) in Sentinel-1 synthetic aperture radar (SAR) Wave mode imagery. The IWS ingests imagery from the live Sentinel-1 feed — scaling to approximately 4,000 images per day as the constellation reaches full operational capacity — and is systematically acquiring a historical archive estimated at up to 17 million images back to 2014. The two pipelines compared are a Python pipeline pairing EVA02, a 305-million-parameter pretrained vision transformer, with an XGBoost classifier; and a Julia pipeline built around a 283,329-parameter convolutional neural network implemented in Lux.jl and trained from scratch on domain-specific SAR imagery. Both pipelines were benchmarked across four deployment configurations (each on GPU and CPU) on the service's production server hardware, measuring classification accuracy, inference throughput, GPU energy consumption, and memory footprint. The Python pipeline achieves higher classification accuracy (F1 96.26 % versus 95.00 %; AUC-ROC 99.29 % versus 98.90 %), attributable to the representational capacity of the pretrained vision transformer. The Julia pipeline is 132 times faster on GPU (3,396 versus 25.6 images per second) and consumes 267 times less energy per image (43.7 versus 11,690 mJ), completing a full archive reprocessing pass in 1.4 hours versus 7.7 days. Classification is bit-for-bit identical across GPU and CPU for the Julia pipeline, confirming that the deployment target can be chosen on operational grounds without accuracy trade-offs. Per-image metrics are projected to operational volumes, quantifying annual GPU occupation (2.9 versus 384 hours at the current reprocessing cadence) and throughput headroom for future constellation expansion. 
Based on these findings, the IWS deploys the Julia pipeline on GPU for all inference, accepting the 1.26-percentage-point accuracy trade-off in exchange for same-day archive reclassification and minimal contention on shared institutional GPU infrastructure. The evaluation methodology — benchmarking on production hardware and projecting to operational volumes — is directly transferable to other Earth observation services evaluating inference pipeline options.
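
The headline ratios and archive reprocessing times quoted in the abstract follow from the per-image figures by simple arithmetic. The sketch below is illustrative only and is not the authors' benchmarking code. It uses only numbers stated in the abstract (the 17 million image upper estimate, GPU throughput, and GPU energy per image); the function and variable names are hypothetical, and it assumes throughput stays constant over a full archive pass.

```python
# Figures quoted in the abstract (GPU configurations only).
ARCHIVE_IMAGES = 17_000_000                             # upper estimate of the archive size
THROUGHPUT_IPS = {"julia": 3396.0, "python": 25.6}      # images per second
ENERGY_MJ = {"julia": 43.7, "python": 11690.0}          # GPU energy per image, millijoules


def archive_pass_hours(images: int, images_per_second: float) -> float:
    """Wall-clock hours for one pass, assuming constant throughput."""
    return images / images_per_second / 3600.0


def archive_pass_energy_kwh(images: int, mj_per_image: float) -> float:
    """GPU energy for one pass in kWh (mJ -> J is 1e-3, J -> kWh is 1 / 3.6e6)."""
    return images * mj_per_image * 1e-3 / 3.6e6


# Ratios between the two pipelines.
speedup = THROUGHPUT_IPS["julia"] / THROUGHPUT_IPS["python"]
energy_ratio = ENERGY_MJ["python"] / ENERGY_MJ["julia"]

print(f"speedup: {speedup:.1f}x, energy ratio: {energy_ratio:.1f}x")
for name in ("julia", "python"):
    hours = archive_pass_hours(ARCHIVE_IMAGES, THROUGHPUT_IPS[name])
    kwh = archive_pass_energy_kwh(ARCHIVE_IMAGES, ENERGY_MJ[name])
    print(f"{name}: {hours:.1f} h ({hours / 24:.2f} days), {kwh:.2f} kWh per archive pass")
```

Under these assumptions the ratios come out slightly above 132 and 267, and the pass times round to 1.4 hours and 7.7 days, consistent with the abstract. The annual GPU occupation figures depend on the reprocessing cadence, which the abstract does not state, so they are not reproduced here.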


Status: open (until 12 Aug 2026)

Latest update: 17 Jun 2026
Short summary
Satellite radar images can reveal large underwater waves that affect ocean mixing, ecosystems, and offshore operations. We run a global monitoring service that classifies thousands of these images daily using machine learning. We compared two pipelines on our operational server: one using a large general-purpose model, the other a small purpose-built model. The smaller model is 132 times faster and uses 267 times less energy, with only a minor accuracy trade-off. We chose it.