the Creative Commons Attribution 4.0 License.
CoCoMET v1.0: A Unified Open-Source Toolkit for Atmospheric Object Tracking and Analysis
Abstract. Advances in performance and analysis capabilities have accelerated the development of object tracking algorithms for atmospheric research. This has resulted in a growing number of studies using Lagrangian tracking techniques to analyze the evolution of atmospheric phenomena and the underlying processes. However, the increasing complexity and variety of tracking algorithms present a steep learning curve for new users and make it difficult for existing users to compare algorithm performance.
We introduce CoCoMET (Community Cloud Model Evaluation Toolkit), an open-source toolkit that addresses these issues. CoCoMET simplifies the process of running multiple tracking algorithms simultaneously and analyzing objects in both model and observational datasets by specifying parameters in a single configuration file. It standardizes input data from different sources into a consistent format and unifies the tracking output across algorithms. CoCoMET enhances the functionality of existing tracking methods by calculating additional properties such as cell growth and dissipation rates, perimeter, surface area, convexity, and irregularity. In addition, CoCoMET includes a novel method for identifying mergers and splits in 2D and 3D tracks and supports the integration of external Eulerian/stationary datasets for process studies. Its potential utility is demonstrated through examples of model intercomparison, model evaluation against observations, and comparisons between tracking algorithms. Designed for open-source environments, CoCoMET will continue to expand with future releases, incorporating more input data types and tracking algorithms.
Status: final response (author comments only)
RC1: 'Comment on egusphere-2025-1328', Anonymous Referee #1, 25 Apr 2025
The manuscript introduces CoCoMET v1.0, an open-source Python toolkit that
- ingests a variety of model and observational data sets,
- launches several established tracking algorithms (tobac, MOAAP, TAMS) from a single configuration file, and
- harmonises and enriches the resulting tracks with advanced diagnostics (perimeter, volume, convexity, detection of merging and splitting convective cells in 2-D and 3-D, etc.).
Case studies demonstrate applications to model intercomparison, model–observation evaluation and tracker intercomparison. By unifying disparate trackers under a common interface and producing a standard, analysis-ready output, CoCoMET tackles a recognised bottleneck in object-based cloud research. The toolkit is potentially a valuable community resource and, once the revisions below are addressed, I expect it to be suitable for publication in GMD. The manuscript would also benefit from a careful language edit before resubmission.
General comments:
Please state an SPDX-recognised licence (e.g. MIT, BSD-3-Clause, GPL-3.0), the minimum supported Python version and the recommended installation method (pip or conda).
The manuscript needs a thorough language edit. Several forward-looking sentences currently placed in Sections 2 and 3 would read more naturally in Section 4 (Future development).
Although the text advertises “parallelised execution”, no timing or scaling information is given. Even approximate wall-clock times and memory footprints for different cases would greatly assist prospective users.
Section 2.4 claims a novel merger–split method but offers only schematic examples; please provide quantitative validation (e.g. skill scores or comparison with tobac’s own split/merge routine).
Output format – CoCoMET saves results in Python pickles, which are opaque outside the Python ecosystem and brittle across versions. Please justify this choice and consider offering an export option to a self-describing format (e.g. NetCDF).
Terminology – The distinction between feature, cell and object is unclear. A short glossary or an early discussion (in Section 2) would help. Explicitly state that v1.0 is designed and validated for convective phenomena (and convective cells), even though the architecture could handle other objects.
Newly developed function for detecting 2-D and 3-D merging and splitting: Could you provide an intercomparison between the merging and splitting detection of other trackers and your method?
CoCoMET output format: The manuscript highlights that CoCoMET simplifies running trackers on different types of data, analysing the trackers’ outputs and intercomparing them. However, most trackers write NetCDF output, which is easy to use and process, whereas CoCoMET writes binary pickle files, which cannot be read without CoCoMET-specific routines. Please comment on whether this choice could limit long-term usability for users who rely on NetCDF.
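The output-format concern can be illustrated with a minimal sketch. It assumes hypothetical track records (the field names `cell_id`, `time` and `area_km2` are illustrative, not CoCoMET's actual schema) and uses JSON from the standard library as a stand-in for a self-describing format. With xarray, `Dataset.to_netcdf` would be the NetCDF equivalent.

```python
import json
import pickle

# Hypothetical track records; the field names are illustrative only.
tracks = [
    {"cell_id": 1, "time": "2015-06-01T12:00:00", "area_km2": 42.5},
    {"cell_id": 1, "time": "2015-06-01T12:05:00", "area_km2": 47.0},
]


def export_self_describing(pickled_bytes):
    """Unpickle track records and return JSON text that carries its own metadata.

    Note: pickle.loads must only be used on data from a trusted source.
    """
    records = pickle.loads(pickled_bytes)
    document = {
        # Units travel with the data, so no Python-specific reader is needed.
        "attributes": {"units": {"area_km2": "km2", "time": "ISO 8601"}},
        "variables": sorted(records[0].keys()),
        "data": records,
    }
    return json.dumps(document, indent=2)


exported = export_self_describing(pickle.dumps(tracks))
restored = json.loads(exported)  # readable by any JSON-capable tool
```

The point of the sketch is only that the exported text documents its own variables and units, which a pickle does not.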
Specific comments
Introduction (lines 55–64). The problem statement would be stronger if you cited concrete examples of prior tracker-intercomparison and model-evaluation studies.
Introduction (74–79). Here the terms feature/cell are introduced but not defined; please add definitions and explain that CoCoMET v1.0 focuses on convective clouds and MCSs.
Also clarify the meaning of “life cycle.” Which life cycle are you referring to: that of the “cell” or that of the “feature”?
Introduction (closing paragraph). Highlight explicitly that CoCoMET can accommodate both observational and model data, and summarise its community value.
Lines 87–95 and 404–406. These sentences discuss future extensions (e.g. extension to ERA5 data) and belong in Section 4.
L 90-95. Do you perform any grid remapping before applying the tracking? How does the package handle different grids, as it would have to for ICON or other models with non-regular meshes?
Section 2
Figure 2 and the surrounding text leave it ambiguous whether CoCoMET executes atmospheric models. Please state explicitly whether model execution is in scope.
In Section 2.1.1 the final paragraph contains essential information but lacks context. A short lead-in sentence would help. In addition, please explain why precipitation rates are recomputed for RAMS and WRF (lines 137–140), whether these fields are required for identifying precipitation cores in MCS tracking, and, if so, consider adding a short dedicated sub-subsection on precipitation. At present this need becomes clear only later in the Meso-NH discussion (line 149).
L139-140. Move variable names RAINC and RAINNC inside parentheses and move the explanatory text outside; re-phrase for clarity.
L 144- Add references for SURFEX and the cloud-microphysics schemes used in Meso-NH. Throughout, write “Meso-NH,” not “MesoNH.”
Generally in Section 2, make it clear what belongs to the models and what belongs to CoCoMET.
Section 2.2 (Implemented trackers)
Briefly explain how additional trackers can be integrated via plug-ins, so the community understands how to contribute.
L146, numerical variables or meteorological variables?
L146-147. Specify whether radar reflectivity is calculated in Meso-NH or inside CoCoMET.
L161 Briefly explain the rationale for selecting 30, 40 and 50 dBZ thresholds.
L170 – Please replace “recent updates” with a version-specific reference. Ten years from now, the updates will no longer be recent.
L189 and L206. Clarify what “types of model outputs” means: file formats, variable sets or both? The three weather models you are describing in the paper?
Section 2.3 (Output unification). State which new diagnostics are unique to CoCoMET and which already exist in individual trackers. Provide a brief explanation of 2-D vs 3-D features for non-experts.
L209- Are the deep convective cell/MCSs trackers really sharing outputs? Or are they just similar?
L221-223. Please explain the new merging/splitting method, or refer to section 2.4.
L252-253. This is nice. I would like to see more examples for other features.
Equation 1. V and A are introduced without their subscripts. Consider adding the subscripts in the text.
Section 2.3.6. Clarify whether 3-D features are reconstructed from stacked 2-D features.
Section 2.4 (title and content). Define “merge” and “split” early, in the subsection title or the first sentence.
As it stands, mergers and splits are defined only later (L315) and read like jargon.
Explain all free parameters (20 % perimeter, 110 % search radius, 50 % overlap, look-ahead of 2 time-steps) and discuss sensitivity.
L322- pi should be written with its symbol here. For “rsearch”, consider subscripting “search”.
Equations 2 and 3. The symbol V is reused with a different meaning; please adopt a new symbol for the background threshold, and clarify whether the square root applies to 2 alone or to 2 times r_search.
L358: This forward-looking sentence belongs in the Introduction.
Equation 3: same comments as for equation 2.
L369. Specify what the third dimension represents (vertical coordinate?).
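The square-root ambiguity raised for Equations 2 and 3 can be written out explicitly. The manuscript's equations are not reproduced here, so this only contrasts the two possible readings of the term in question:

```latex
\[
\sqrt{2}\, r_{\mathrm{search}}
\qquad \text{versus} \qquad
\sqrt{2\, r_{\mathrm{search}}}
\]
```

If the term is compared with a distance, only the first reading is dimensionally consistent, since it keeps units of length while the second has units of length to the power one half.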
Section 2.5
L390- clarify whether CoCoMET currently ingests UAV data or whether this is planned.
Section 3
The configuration system is not fully clear to me: does CoCoMET simply provide a master file that forwards settings to each tracker’s native configuration, or does it translate all tracker parameters into a single, unified schema with consistent key names? If the former is true, please clarify the practical benefit of advertising “one configuration file” and explain how this approach adds value relative to running each tracker individually.
L437. Check date format against journal style.
L439. Use consistent SI notation (e.g. m s⁻¹).
L446- You mention several times that a config file is available in Weiner et al. (2025). It would be helpful either to include the config file in your repository or to briefly describe what it contains.
L450. “increasing trend”: rephrase so that the statement does not depend on when it is read.
L455- Since Hahn et al. is still in review, consider placing the figure in supplementary material.
L460 and Fig. 8 The caption uses “life-cycle bin” whereas the axis label reads “normalised lifetime”. In which unit is the lifetime? Hours?
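The configuration question raised for Section 3 contrasts a master file that forwards native settings with a unified schema that is translated per tracker. A minimal sketch of the second option follows. Every key name and both trackers (`tracker_a`, `tracker_b`) are hypothetical, chosen for illustration; they are not the schemas of CoCoMET, tobac, MOAAP or TAMS.

```python
# Unified settings a user would write once (key names are hypothetical).
UNIFIED_CONFIG = {"reflectivity_threshold_dbz": 40.0, "min_lifetime_minutes": 30}


def to_tracker_a(config, time_step_minutes):
    """Translate unified keys into the made-up native schema of 'tracker_a'."""
    return {
        "threshold": config["reflectivity_threshold_dbz"],
        # tracker_a is assumed to express lifetime as a number of time steps.
        "min_time_steps": config["min_lifetime_minutes"] // time_step_minutes,
    }


def to_tracker_b(config, time_step_minutes):
    """Translate unified keys into the made-up native schema of 'tracker_b'."""
    return {
        "dbz_min": config["reflectivity_threshold_dbz"],
        # tracker_b is assumed to express lifetime in hours.
        "min_duration_hours": config["min_lifetime_minutes"] / 60,
    }


TRANSLATORS = {"tracker_a": to_tracker_a, "tracker_b": to_tracker_b}


def build_native_configs(config, time_step_minutes):
    """Return one native configuration per tracker from a single unified one."""
    return {
        name: translate(config, time_step_minutes)
        for name, translate in TRANSLATORS.items()
    }


native = build_native_configs(UNIFIED_CONFIG, time_step_minutes=5)
```

Under a unified schema of this kind, the same physical criterion reaches every tracker in its own units, which is what makes a tracker intercomparison like-for-like. A forwarding master file leaves that conversion to the user.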
Section 4 (Future development). Sections 4.2 and 4.3 both discuss linking to additional external data sets; consider merging them or clarifying the distinction.
Explain what you mean by “thermodynamic data sets” and by “external” data.
L492. Check spelling for “PyFLEXTRKR” and cite TempestExtremes.
The final sentence of Section 4 repeats material already discussed. Please also check “Stage VI” throughout the manuscript.
Scaling limitations. Briefly discuss potential bottlenecks for global domains or multi-year archives (memory, I/O, parallel efficiency).
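The wall-clock and memory figures requested in the general comments, which also bear on these scaling limitations, could be gathered with standard-library tools. The sketch below is a minimal example under stated assumptions: `stand_in_tracking` is a placeholder workload, not a CoCoMET function, and `tracemalloc` reports only memory allocated through Python's allocator, so buffers held by compiled libraries may be under-reported.

```python
import time
import tracemalloc


def profile(func, *args, **kwargs):
    """Run func once and return (result, wall-clock seconds, peak memory in MB)."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = func(*args, **kwargs)
    finally:
        elapsed = time.perf_counter() - start
        _, peak_bytes = tracemalloc.get_traced_memory()
        tracemalloc.stop()
    return result, elapsed, peak_bytes / 1e6


def stand_in_tracking(n_cells):
    """Placeholder workload standing in for one tracking run."""
    return [i * i for i in range(n_cells)]


result, seconds, peak_mb = profile(stand_in_tracking, 100_000)
print(f"wall-clock: {seconds:.3f} s, peak traced memory: {peak_mb:.1f} MB")
```

Repeating such a measurement for a few domain sizes and worker counts would give the approximate scaling table the comment asks for.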
Fig. 2. What do the grey arrows mean? The top ones appear to indicate preprocessing or linking of data, and the bottom ones appear to indicate running the trackers. Does the big blue arrow indicate the output of CoCoMET? It would be helpful to show clearly which elements are inputs and which are outputs.
Fig. 3. It would be nice to see a feature with convexity = 1 and a feature with convexity very close to 0.
Fig. 4. Explain the colour scheme.
Figures 5–6 (merge/split examples) lack variable names and thresholds.
Fig. 7. Caption: “MesoHN” → “Meso-NH” and unify axis units.
Fig. 8. Observed and simulated, please specify which one is which.
Fig. 9. Consider adding a tick for each hour for clarity.
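The two limiting cases requested for Fig. 3 can be constructed with a short sketch. It assumes that convexity is the ratio of a shape's area to the area of its convex hull. This is a common definition, but it is an assumption here and may differ from the one used in CoCoMET. All function names are illustrative.

```python
def polygon_area(points):
    """Area of a simple polygon from its ordered vertices (shoelace formula)."""
    total = 0.0
    for i, (x1, y1) in enumerate(points):
        x2, y2 = points[(i + 1) % len(points)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def convex_hull(points):
    """Convex hull in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half_hull(sequence):
        chain = []
        for p in sequence:
            # Drop points that would make a clockwise or straight turn.
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]

    return half_hull(pts) + half_hull(reversed(pts))


def convexity(polygon):
    """Assumed definition: shape area divided by convex-hull area, in (0, 1]."""
    return polygon_area(polygon) / polygon_area(convex_hull(polygon))


# A square is its own convex hull, so its convexity is 1.
square = [(0, 0), (4, 0), (4, 4), (0, 4)]
# A thin L-shape covers very little of its hull, so its convexity is near 0.
thin_l = [(0, 0), (10, 0), (10, 0.1), (0.1, 0.1), (0.1, 10), (0, 10)]

print(convexity(square), convexity(thin_l))
```

Under this definition the square gives exactly 1 and the thin L-shape gives roughly 0.04.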
References: Check DOI formatting; several have duplicated “https://doi.org/https://doi.org”.
Replace straight quotes with typographic quotes, and use consistent en-/em-dashes.
Citation: https://doi.org/10.5194/egusphere-2025-1328-RC1
RC2: 'Comment on egusphere-2025-1328', Julia Kukulies, 21 May 2025
Review for "CoCoMET v1.0: A Unified Open-Source Toolkit for Atmospheric Object Tracking and Analysis"
The manuscript at hand introduces a new Python-based open-source software tool that allows for the application of multiple algorithms that identify and track convective clouds and storms in diverse datasets. The software tool primarily standardizes the input and output data to facilitate a workflow where multiple tracking algorithms are applied to datasets in different data formats. In addition, the tool offers new features and thereby enhances the capabilities of existing tracking tools. These new features include, for instance, the computation of additional storm characteristics such as cell growth and dissipation rates, convexity and irregularity, as well as the implementation of a novel method for merging and splitting.
First of all, I would like to apologize for my delayed evaluation. This project uses state-of-the-art software tools and standards, follows open-source and open science principles and addresses a major challenge in weather and climate science data: How to best unify the different data formats and tools that we have available to enable more systematic data analyses of observations and models. The paper is interesting and generally well-organized. The software package is explained in a clear manner, including examples for analyses and applications. I will recommend this paper for publication after my comments have been addressed. I see the need for clarifications at some locations in the text. In particular, the potential for enhancements and how to add new datasets need to be discussed more clearly.
General comments
- Introduction - I think the introduction could be improved by mentioning a few studies that have done model and dataset intercomparisons in a more complicated way. This is to highlight and better explain what kind of studies would benefit from the presented framework.
- Generalizability - I can truly see the challenge that is addressed with this project, since running multiple trackers on multiple datasets can be cumbersome. However, it remains unclear to me how abstract and modular the implementation of this framework actually is, and therefore how easily new trackers and datasets can be added. Could you clarify what pre-processing steps would be needed to add new datasets, and describe more generally what the data structure needs to be? Appendix A addresses this partly, but I am not sure whether there is some flexibility in the variable names, etc. My understanding is also that there would be a tradeoff between a generalizable approach that focuses on using a specific tracker like MOAAP on any dataset (the focus would be on making all data formats and variables work with MOAAP in a pre-processing step) vs. making all trackers compatible with a certain dataset. What is the best approach for an abstract implementation of this?
- Global vs. regional - Related to the question above, are there any restrictions on regional vs. global datasets? The analysis and examples seem to be focused on CONUS, and I am curious if one could easily add global models and observational datasets.
- Unstructured grids - How will this package address the challenge of unstructured grids? It is stated in l. 93, that there is a plan to include models such as ICON. Will there be a pre-processing step in which the data are regridded onto a common grid or how will the different grids be handled? I am also curious if the handling of such models will be model-specific or generally applicable such that other models can be easily added (e.g. Model Prediction Across Scales/MPAS which is the successor and replacement of many WRF applications).
- Version handling - A general challenge that I see with this package is that it is heavily dependent on the versions of the supported tracking algorithms. For example, tobac just released a new version (v1.6.0) in which xarray is supported eliminating the need for iris. How do you plan to maintain this tool ensuring compatibility with the versions of the tracking algorithms? Are you requiring certain versions of the latter or could, for instance, the latest version of tobac be run with the latest installation of CoCoMET? Do you plan to have any active communication and collaboration with the developers of the supported tracking libraries?
- Unit testing - related to the former comment, it seems like unit testing and continuous integration would be very valuable for this type of package. Otherwise, it could be pretty hard to identify where the code breaks when datasets and the tracking algorithms change.
- Atmospheric features beyond deep convective systems - From the README file, it looks like TAMS and MOAAP are only supported for MCS tracking. Is that right or is it, for instance, possible to use MOAAP to track atmospheric rivers in WRF model output using the current version of CoCoMET?
- Documentation - Since the python package has no formal documentation page, I think it is important to mention the user guide. I strongly recommend filling in the section of “how to set up a config file” because that is still not very easy to figure out from the examples.
- Merging and splitting - Is it possible to use the suggested framework to compare the merging and splitting capabilities of MOAAP, TAMS, tobac as well as the novel method?
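The version-handling and unit-testing comments above could be addressed in part by a small compatibility check that runs at import time or in continuous integration. The following is a minimal sketch: the supported range for tobac is a hypothetical placeholder, not a statement about what CoCoMET supports, and the function names are illustrative.

```python
import re
from importlib import metadata

# Hypothetical supported ranges: lowest version included, highest excluded.
SUPPORTED = {"tobac": ((1, 5), (2, 0))}


def parse_major_minor(version_text):
    """Extract (major, minor) from a version string such as '1.6.0'."""
    match = re.match(r"(\d+)\.(\d+)", version_text)
    if match is None:
        raise ValueError(f"unrecognised version string: {version_text!r}")
    return int(match.group(1)), int(match.group(2))


def is_supported(package, version_text, supported=SUPPORTED):
    """True if the given version falls inside the declared range."""
    lowest, highest = supported[package]
    return lowest <= parse_major_minor(version_text) < highest


def check_installed(supported=SUPPORTED):
    """Map each tracker to True/False, or None if it is not installed."""
    report = {}
    for package in supported:
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            report[package] = None
        else:
            report[package] = is_supported(package, installed, supported)
    return report


print(check_installed())
```

A check like this reports a version mismatch directly rather than leaving it to surface as an obscure failure inside a tracking run, and the pure functions are easy to cover with unit tests.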
Detailed comments
- 78 - maybe worth mentioning that this follows tobac’s nomenclature (see e.g., Sokolowsky et al., 2024)
- 164: The segmentation module of tobac enables the spatial definition of the identified or tracked objects and the calculation of bulk statistics of each object. However, it should be highlighted that the segmentation procedure can be done based on the output of the feature detection (as a second step, as currently noted in the manuscript) but also based on the output of the tracking procedure (based on the cell locations as a third step). In addition, users could also decide to not apply the segmentation module, as this is not a required step. This may be useful and save a lot of computation time and data storage when the user mainly needs the time and locations of tracked cells.
- 156: It could be useful to highlight here that both of the implemented trackers as well as the trackers that you plan to implement are all members of the MCSMIP intercomparison (Feng, Z., Prein, A. F., Kukulies, J., Fiolleau, T., Jones, W. K., Maybee, B., ... & Mejia, J. F. (2025). Mesoscale convective systems tracking method intercomparison (MCSMIP): Application to DYAMOND global km‐scale simulations. Journal of Geophysical Research: Atmospheres, 130(8), e2024JD042204.).
- 427: Does this standard output follow tobac output structure or is it different? And do you leave it up to the user to save this in whatever file format they want or is there a standard for this, too?
Fig. 9 - To understand the differences between the trackers in this specific example, could you please clarify what minimum area and lifetime you have chosen for the different trackers, and whether the three trackers offer the same initial criteria? I am wondering, for example, whether MOAAP shows significantly fewer cells because it has, by default, a stricter requirement for grid cell connectivity and spatial continuity, whereas in tobac you can set a minimum area for each object as well as thresholds.
Citation: https://doi.org/10.5194/egusphere-2025-1328-RC2