This work is distributed under the Creative Commons Attribution 4.0 License.
Scalable Feature Extraction and Tracking (SCAFET): A general framework for feature extraction from large climate datasets
Abstract. This study describes a generalized framework, Scalable Feature Extraction and Tracking (SCAFET), to extract and track features from large climate datasets. SCAFET utilizes novel shape-based metrics that can efficiently identify and compare features across different mean states, datasets, and regions. Features of interest are extracted by segmenting the data based on a scale-independent, bounded variable called the shape index (SI). SI gives a quantitative measurement of the local geometric shape of the field with respect to its surroundings. To demonstrate the capabilities of the method, we illustrate the detection of atmospheric rivers, tropical and extratropical cyclones, sea surface temperature fronts, and jet streams. Cyclones and atmospheric rivers are extracted from the ERA5 reanalysis dataset to show how the algorithm extracts both locations and areas from climate datasets. The extraction of sea surface temperature fronts exemplifies how SCAFET effectively handles curvilinear grids. Lastly, jet streams are extracted to demonstrate how the algorithm can also detect 3D features. SCAFET can be implemented to extract and track most weather and climate features.
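As a rough illustration of the shape index mentioned in the abstract, the sketch below computes SI on a regular 2D grid from the eigenvalues of the local Hessian, using the standard definition SI = (2/π) arctan((κ2 + κ1)/(κ2 − κ1)) with κ1 ≥ κ2. It is a minimal sketch under stated assumptions, not the SCAFET implementation: the function name `shape_index`, the uniform grid spacing, and the finite-difference derivatives are all illustrative choices, and the ridge range used at the end is the conventional shape-index category, not necessarily the threshold used in the paper.

```python
import numpy as np

def shape_index(field, spacing=1.0):
    """Shape index of a 2D field on a uniform grid (illustrative sketch).

    Values lie in [-1, 1]: about +1 for caps (peaks), +0.5 for ridges,
    0 for saddles, -0.5 for troughs, and -1 for cups (pits).
    """
    # First derivatives along rows (y) and columns (x).
    fy, fx = np.gradient(field, spacing)
    # Second derivatives, i.e. the entries of the Hessian.
    fyy, _ = np.gradient(fy, spacing)
    fxy, fxx = np.gradient(fx, spacing)
    # Eigenvalues of the symmetric 2x2 Hessian are mean +/- diff.
    mean = 0.5 * (fxx + fyy)
    diff = np.sqrt((0.5 * (fxx - fyy)) ** 2 + fxy ** 2)
    # (k2 + k1) / (k2 - k1) equals -mean / diff; arctan2 avoids dividing
    # by zero where both eigenvalues are equal.
    return (2.0 / np.pi) * np.arctan2(-mean, diff)

# Usage: a ridge running along y has SI close to 0.5 away from the edges.
y, x = np.mgrid[-5:6, -5:6].astype(float)
si_ridge = shape_index(-x ** 2)
# Conventional "ridge" category of the shape index (assumed, illustrative).
ridge_mask = (si_ridge > 0.375) & (si_ridge < 0.625)
```

Because SI depends only on the ratio of the curvatures, rescaling the field by a positive constant leaves it unchanged, which is the scale independence the abstract refers to.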
Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Interactive discussion
Status: closed
RC1: 'Comment on egusphere-2023-592', Anonymous Referee #1, 06 Jul 2023
The authors present SCAFET as a new framework to extract weather features from large climate datasets. SCAFET follows the standard paradigm of segment, filter, and track. The traditional approach to segmentation utilizes absolute thresholding of appropriate climate variables, which is well known to be sensitive to the particular model, climate state, and even spatial location. This makes it difficult to have a standardized detection algorithm that works uniformly across models, warming scenarios, etc. SCAFET is instead shape-based, giving a relative and directional thresholding, which makes it a much more robust approach for weather feature identification. The authors demonstrate its utility by identifying and tracking atmospheric rivers, cyclones, fronts, and jet streams. SCAFET is a significant advance over the traditional absolute thresholding methods currently used by climate practitioners. With some minor revisions (see below), I recommend the manuscript for publication.
- My main comment or question is related to how sensitive feature identification is to SCAFET parameters. You have shown that it is possible to identify weather features with SCAFET, which is great, but there is no discussion on how sensitive the results are. For example, how sensitive is the detection of ARs in Figure 4 to the parameters used in Table 1? On the one hand, it is intuitive to identify ARs as long, narrow shapes with (relatively) high IVT and precipitation. But concrete numbers must be used to implement that intuition. If you slightly change the SI threshold for ridges, or the minimum length, or angle coherence, etc., does this totally change the kind of objects identified so that they no longer resemble ARs (I wouldn't think so, but perhaps), or does it slightly change the details of the ARs detected? If it is the latter case, how did you decide on the exact values used in Table 1 for the best identification of ARs? I see there is one sentence, "The quantitative values for the properties are obtained from a consensus of previous studies referenced within each section.", but I think this requires more elaboration.
- My second question is, what are we supposed to take away from Section 4.1 on Jet Streams? It shows some proof-of-concept that the method can be applied, in principle, to 3D data. To my eye, I don't see a clear jet stream identified by SCAFET in (b), (d), and (f) of Figure 7. So while the method can be applied to 3D data, it is not clear that it is successful in identifying features in 3D data.
- Third, while I think SCAFET is indeed a significant advance, I believe there are some statements made in the paper which are not justified, or I have misunderstood what you are trying to say.
  - Around line 40 there is discussion of dataset pre-processing, such as computing IVT fields for AR detection, and how this becomes infeasible for high resolutions and large ensembles. At first I read this as implying pre-processing as a downside of traditional methods, but something that SCAFET would bypass. However, SCAFET itself uses these pre-processed fields in the identification of ARs and cyclones.
  - Starting at the end of line 327, there is the sentence "Due to its design, SCAFET does not require a priori climate information to identify features." I am not sure what is meant by this sentence. In the work presented, the shape-based component is only one piece of the full pipeline to identify weather features. Most obviously, the shapes are extracted from the pre-processed fields IVT and RV, which are created from "climate information". Even knowing what generic shapes are appropriate for particular weather features I see as climate information.
- Finally, the writing and grammar etc. of the paper need to be cleaned up. Below are some instances I found during my reading:
  - line 15: "… and value (5Vs) of climate data (REFs) of climate data."
  - starting in line 162: "In the current study, a simple radius is defined and the closest object within the given radius to each object at time n is clustered and identified from time n+1 as the same object in motion." I get the general idea of what you are saying here, but I found this sentence hard to parse.
  - line 186-187: "…derive this threshold from dataset directly, …"
  - line 204-205: "… each object is used as to filter …"
  - line 274: "…, SSTFs are not tracked as ocean fronts are stationary rather than…"
  - in the middle of the Figure 6 caption, "In the next step, ridges, caps, and domes are extracted from (b) and weak and small…", do you mean for this to be (a) instead of (b)?
  - line 293-294: "Since the scope of this section is limited to the validation of the detection method, we have only shown jet detection in three selected time steps." I'm not fully sure what you are trying to say here. Do you mean that filtering and tracking steps have not been performed here?
  - Figure 7 caption: "The 3D jet streams extracted for the corresponding time period is show in …"
  - line 362-363: "change of direction of a along the curve."
  - In Figure A2, please adjust the legend so it can be read more clearly.
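One possible reading of the tracking sentence quoted from line 162 is a nearest-neighbour link within a fixed search radius: each object at time n is matched to the closest object at time n+1, provided that object lies within the radius. The sketch below illustrates that reading only. The function name `link_objects`, the use of object centroids, and the planar Euclidean distance are assumptions made for illustration (on a sphere a great-circle distance would be the natural choice), and nothing here is taken from the SCAFET code.

```python
import numpy as np

def link_objects(prev_xy, next_xy, radius):
    """Link objects at time n to objects at time n+1 (illustrative sketch).

    prev_xy, next_xy: sequences of (x, y) centroids in planar coordinates.
    Returns one integer per object at time n: the index of the closest
    object at time n+1 within `radius`, or -1 if none is close enough.
    """
    prev_xy = np.asarray(prev_xy, dtype=float).reshape(-1, 2)
    next_xy = np.asarray(next_xy, dtype=float).reshape(-1, 2)
    links = np.full(len(prev_xy), -1, dtype=int)
    if len(prev_xy) == 0 or len(next_xy) == 0:
        return links
    # Pairwise distances, shape (number at n, number at n+1).
    dist = np.linalg.norm(prev_xy[:, None, :] - next_xy[None, :, :], axis=-1)
    nearest = dist.argmin(axis=1)
    within = dist[np.arange(len(prev_xy)), nearest] <= radius
    links[within] = nearest[within]
    return links

# Usage: the first object moves one unit; the second has no match in range.
links = link_objects([(0.0, 0.0), (10.0, 10.0)], [(1.0, 0.0), (50.0, 50.0)], radius=2.0)
```

Under this reading, two objects at time n can be linked to the same object at time n+1; whether that is treated as a merger or resolved differently is not stated in the quoted sentence.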
Citation: https://doi.org/10.5194/egusphere-2023-592-RC1
- AC1: 'Reply on RC1', Arjun Nellikkattil, 20 Aug 2023
RC2: 'Comment on egusphere-2023-592', Anonymous Referee #2, 10 Jul 2023
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2023/egusphere-2023-592/egusphere-2023-592-RC2-supplement.pdf
- AC2: 'Reply on RC2', Arjun Nellikkattil, 20 Aug 2023
Data sets
Scalable Feature Extraction and Tracking (SCAFET): A general framework for feature extraction from large climate datasets Arjun Babu Nellikkattil https://doi.org/10.5281/zenodo.7767301
Model code and software
Scalable Feature Extraction and Tracking (SCAFET): A general framework for feature extraction from large climate datasets Arjun Babu Nellikkattil https://doi.org/10.5281/zenodo.7767301
Viewed
HTML | PDF | XML | Total | Supplement | BibTeX | EndNote |
---|---|---|---|---|---|---|
407 | 173 | 22 | 602 | 35 | 12 | 14 |
Arjun Babu Nellikkattil
Travis Allen O’Brien
Danielle Lemmon
June-Yi Lee
Jung-Eun Chu