This work is distributed under the Creative Commons Attribution 4.0 License.
A Data-Efficient Deep Transfer Learning Framework for Methane Super-Emitter Detection in Oil and Gas Fields Using Sentinel-2 Satellite
Abstract. Efficiently detecting large methane point sources (super-emitters) in oil and gas fields is crucial for informing stakeholders and guiding mitigation actions. Satellite measurements by multispectral instruments, such as Sentinel-2, offer global and frequent coverage. However, methane signals retrieved from satellite multispectral images are prone to surface and atmospheric artifacts that vary spatially and temporally, making it challenging to build a detection algorithm that applies everywhere. Hence, laborious manual inspection is often necessary, hindering widespread deployment of the technology. Here, we propose a novel deep-transfer-learning-based methane plume detection framework. It consists of two components: an adaptive artifact removal algorithm (low reflectance artifact detection, LRAD) to reduce artifacts in methane retrievals, and a deep subdomain adaptation network (DSAN) to detect methane plumes. To train the algorithm, we compile a dataset comprising 1627 Sentinel-2 images from 6 known methane super-emitters reported in the literature. We evaluate the ability of the algorithm to discover new methane sources with a suite of transfer tasks, in which training and evaluation data come from different regions. Results show that the DSAN (average macro-F1 score 0.86) outperforms two convolutional neural networks (CNNs), MethaNet (average macro-F1 score 0.70) and ResNet-50 (average macro-F1 score 0.77), in transfer tasks. The transfer-learning algorithm overcomes the issue of conventional CNNs that their performance degrades substantially in regions outside the training data. We apply the algorithm trained with known sources to an unannotated region in the Algerian Hassi Messaoud oil field and reveal 34 anomalous emission events during a one-year period, which are attributed to 3 methane super-emitters associated with production and transmission infrastructure.
These results demonstrate the potential of our deep-transfer-learning-based method towards efficient methane super-emitter discovery using Sentinel-2 across different oil and gas fields worldwide.
Status: final response (author comments only)
RC1: 'Comment on egusphere-2024-2565', Anonymous Referee #1, 06 Nov 2024
[Overview / General comments]
The paper collects a dataset of methane leak events using data from the Sentinel-2 satellite (mainly over 6 known sites, but also tested on a new area of interest) and proposes several methods for the detection of methane leaks, with an emphasis on efficient data sampling.
First, a series of steps collectively named "low reflectance artifact detection" (LRAD) is proposed to pre-filter the data before methane detection occurs. Several threshold-based rules are used to rule out pixels linked to confounding features such as flares, smoke, and water bodies. (Open question: would it be possible to leverage a trained model to perform these detections instead of relying on these classically used thresholds?)
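To illustrate the kind of threshold rule meant here, a minimal sketch of such a pre-filter (band names follow Sentinel-2 conventions; all threshold values are placeholders for illustration, not the paper's):

```python
import numpy as np

def lrad_mask(b11, b12, ndwi, b11_min=0.05, b12_min=0.05, ndwi_max=0.0):
    """Illustrative threshold-style pre-filter in the spirit of LRAD:
    flag low-SWIR-reflectance pixels (e.g. flares, smoke shadows) and
    water bodies before computing the methane band-ratio product.
    All thresholds are placeholders, not the paper's values."""
    dark = (b11 < b11_min) | (b12 < b12_min)  # low reflectance -> likely artifact
    water = ndwi > ndwi_max                   # positive NDWI -> likely water
    return ~(dark | water)                    # True = clear pixel to keep
```

A learned classifier could replace these hand-set thresholds, which is exactly the open question above.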
For the remaining clear pixels, a methane enhancement product is calculated (using B11 and B12, or the current and a past clear scene) with methods similar to the band ratios proposed by [Varon, D. J., et al.: High-frequency monitoring of anomalous methane point sources with multispectral Sentinel-2 satellite observations, Atmos. Meas. Tech., 14, 2771–2785, https://doi.org/10.5194/amt-14-2771-2021, 2021].
Secondly, the "deep subdomain adaptation network" (DSAN) model is used, which extracts high-dimensional features from two images (one from the source domain, one from the target domain) and aligns them using both classification losses and domain adaptation losses, in an unsupervised learning manner. The purpose of this training scheme is to limit the number of labelled samples required (in the target domain).
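The band-ratio enhancement referred to above can be sketched as follows. This is a simplified reading of Varon et al.'s single-pass (MBSP) and multi-pass (MBMP) formulas; the median-based scaling factor is an assumption standing in for their least-squares fit:

```python
import numpy as np

def mbsp(num, den):
    """Single-pass multi-band ratio: c * (num/den) - 1, rescaled so that a
    methane-free scene averages ~0. Varon et al. (2021) fit c by least
    squares; using the ratio of scene medians is a simplifying assumption."""
    c = np.nanmedian(den) / np.nanmedian(num)
    return c * (num / den) - 1.0

def mbmp(b11, b12, b11_ref, b12_ref):
    """Multi-band multi-pass: the single-pass signal of the current scene
    minus that of a plume-free reference scene, suppressing static
    surface features that appear in both passes."""
    return mbsp(b12, b11) - mbsp(b12_ref, b11_ref)
```

Methane absorbs more strongly in B12 than in B11, so plume pixels appear as negative fractional changes in this product.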
For detected methane leak events, the IME method is used for quantification. As an additional experiment, the paper explores a new area using a year's worth of Sentinel-2 data with the proposed approach (with models pre-trained on the training datasets) for the detection of methane leak events.
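For reference, the IME source-rate estimate generally takes the form Q = U_eff · IME / L (Varon et al., 2018). A minimal sketch, in which the variable names and the square-root plume-length choice are illustrative assumptions rather than the paper's exact implementation:

```python
import numpy as np

def ime_source_rate(plume_mask, delta_omega, pixel_area_m2, u_eff_ms):
    """Source rate Q = U_eff * IME / L (Varon et al., 2018).
    `delta_omega` is the methane column enhancement in kg/m^2; taking the
    plume length scale L as the square root of the plume area is one
    common choice, used here for illustration."""
    ime_kg = np.nansum(delta_omega[plume_mask]) * pixel_area_m2  # total excess mass, kg
    plume_length = np.sqrt(plume_mask.sum() * pixel_area_m2)     # L, m
    return u_eff_ms * ime_kg / plume_length                      # kg/s
```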
The work shows improved F1 scores over ResNet-50-based and MethaNet models (trained on the same data). In addition, it reports detection scores over the newly observed area: 33 true positive detections (although 30 came from one location) out of 369 proposed by the model, with 1 false negative detection.
Overall, the paper's motivation is solid, the experiments were conducted with care for detail, and the results are interesting. It is also appreciated that the created datasets will be made public. However, there are a few shortcomings in the current version of the paper write-up.
-------------------
[Points / Specific comments]
[Major point 1]
The dataset description needs to be improved. As it stands, the description of the various parts of the used data is split across many sections: the description of the training dataset starts in Section 2.1, the validation subset is detailed later in Section 2.5.1, and the test set is initially mentioned in Section 2.5.2 but explained in more detail in Section 3. Overall this makes it quite difficult to follow which data was used when. -> Please unify this and clean up the existing split sections.
Additionally, some practical details are missing that would make it easier to understand the exact training process used. For example, it is not clear what input resolution the data has: is it 200x200 px or 224x224 px? Please clarify this and state it explicitly in the paper. Please also detail the exact number of tiles used in the different dataset splits.
Please also highlight whether, and which, measures were used to detect overlap between samples in the created train/val/test datasets (temporal split? spatial split?). This is quite clear for the train/test sets, but less so for the validation set used for the other model architectures.
A more explicit description of these details is expected in typical machine learning literature (beyond work that merely reuses existing benchmark datasets).
[Major point 2]
Right now, it is not clear how much the proposed steps have helped for the real-data prediction. There, we see only one result, but it would be informative to see results from the other models used. If possible and feasible, please add an ablation study: what would happen if the proposed LRAD pre-filtering step were not used? Would the scores degrade significantly? These results need not be analysed in as much detail as the rest; a single row of results in Table 4 would suffice (as the labelling has already been done for this data, it should be easy to recalculate these scores for other model variants).
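For concreteness, the macro-F1 score reported for such an ablation row could be computed as below; the per-tile labels and predictions here are purely hypothetical:

```python
def macro_f1(y_true, y_pred, classes=(0, 1)):
    """Macro-averaged F1: the unweighted mean of per-class F1 scores,
    matching the metric reported in the paper."""
    f1s = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
    return sum(f1s) / len(f1s)

# Purely hypothetical per-tile labels (1 = plume) for two pipeline variants:
y_true           = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred_with_lrad = [1, 0, 1, 1, 0, 0, 0, 0]
y_pred_no_lrad   = [1, 1, 0, 1, 0, 1, 0, 0]
```

Since the labelling of the real-data tiles already exists, re-running this metric over each model variant's predictions is cheap.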
[Minor point 1]
The background literature section is missing some recent works. For example, [Růžička, V., Mateo-Garcia, G., Gómez-Chova, L., et al.: Semantic segmentation of methane plumes with hyperspectral machine learning models, Sci Rep 13, 19999 (2023), https://doi.org/10.1038/s41598-023-44918-6] proposed a U-Net-based model working with a mixture of source instruments and shows performance for both multispectral (WorldView-2) and hyperspectral data (AVIRIS-NG and EMIT). Relevant to this work, it also demonstrated zero-shot generalisation, that is, using a model pre-trained on a first dataset (with relatively local samples) on data from a near-global sensor (with a larger diversity of background scenes). -> Please relate this to the source/target domain adaptation used in this work. It should also be added to page 4 among the other deep learning techniques used to detect methane leaks in multispectral and hyperspectral data.
[Minor point 2]
Please describe which exact model variant was used for the real data prediction (results shown on Fig 11) - can this be related to one of the already used scenarios (1-1, 5-1, ...)?
[Minor point 3]
On page 8, clarify whether the used formula is the multi-band multi-pass (MBMP) method of [Varon, D. J., et al.: High-frequency monitoring of anomalous methane point sources with multispectral Sentinel-2 satellite observations, Atmos. Meas. Tech., 14, 2771–2785, https://doi.org/10.5194/amt-14-2771-2021, 2021] and name it as such, or highlight any notable differences.
Citation: https://doi.org/10.5194/egusphere-2024-2565-RC1
RC2: 'Comment on egusphere-2024-2565', Anonymous Referee #2, 03 Dec 2024
This study developed a deep transfer learning framework with an artifact removal algorithm to detect methane emitters from Sentinel-2 satellite images. The manuscript is well written and the results are presented with informative figures. The detection model is compared with other models and applied to discover new methane sources. My detailed comments are listed below:
- In the introduction section, the authors discussed two general challenges of satellite-based methane detection; however, much previous work has been done on satellite-based methane detection, and the authors need to summarize the limitations of these previous works.
- To generate the training dataset, Sentinel-2 images of the six emitters were collected during different time periods. Why not use the same sampling time period for all the emitters?
- Fig. 2, Step 2 ("classify 3D ΔR"): does "3D" here mean that a series of images from different times were classified separately, or that some designed pairs of images were classified here?
- Although the model is evaluated with the 1-to-1 task and the 5-to-1 task, the test samples are limited to the 6 emitters, and the model performance is relatively poor for emitter #6, which has a heterogeneous background. Is it possible to collect more emitter images with varying backgrounds for testing? The DSAN model employs ResNet-50 for feature extraction; thus, the performance of the DSAN model is expected to be better than that of ResNet-50. I would like to see comparison results of the LRAD-DSAN model against other previously reported methane detection models.
- The new-emitter detection experiment was conducted in the Hassi Messaoud O&G field in Algeria, which has a homogeneous background very similar to training cases #4 and #5. Since this detection model is highlighted for its transfer ability, I wonder about the model's performance in areas with different backgrounds.
Citation: https://doi.org/10.5194/egusphere-2024-2565-RC2
Viewed

| HTML | PDF | XML | Total | Supplement | BibTeX | EndNote |
|---|---|---|---|---|---|---|
| 279 | 86 | 10 | 375 | 22 | 5 | 3 |