the Creative Commons Attribution 4.0 License.
Baseline Climate Variables for Earth System Modelling
Abstract. The Baseline Climate Variables for Earth System Modelling (ESM-BCVs) are defined as a list of 132 variables which have high utility for the evaluation and exploitation of climate simulations. The list reflects the most heavily used elements of the Coupled Model Intercomparison Project phase 6 (CMIP6) archive. Successive phases of CMIP have supported strong results in science and substantial influence in international climate policy formulation. This paper responds both to interest in exploiting CMIP data standards in a broader range of climate modelling activities and to a need to achieve greater clarity about the significance and intention of variables in the CMIP Data Request. As Earth System Modelling (ESM) archives grow in scale and complexity, there are emerging problems associated with weak standardisation at the variable collection level. That is, there are good standards covering how specific variables should be archived, but this paper fills a gap in the standardisation of which variables should be archived. The ESM-BCV list is intended as a resource for ESM Model Intercomparison Projects (MIPs) developing requests to enable greater consistency among MIPs, and as a reference for modelling centres to enhance consistency within MIPs. Provisional planning for the CMIP7 Data Request exploits the ESM-BCVs as a core element. The baseline variables list includes 98 variables which have modest or minor data volume footprints and could be generated systematically when simulations are produced and archived for exploitation by the WCRP community. A further 34 variables are classed as high volume and are only suitable for production when the resource implications are justified.
Status: open (until 26 Oct 2024)
CC1: 'Comment on egusphere-2024-2363', Anne Marie Treguier, 30 Aug 2024
Dear authors,
Congratulations on this manuscript! I would like to share a suggestion. Some of the variables proposed in the list are not simple physical parameters like temperature. An example is Omon.mlotst, the ocean mixed layer depth, whose computation requires making nontrivial choices. It would be useful to add, for each variable, a reference to the paper that documents the method, for example Griffies et al., 2016 https://doi.org/10.5194/gmd-9-3231-2016 for ocean variables. If a change in method is decided relative to the existing reference, this change should also be documented and referenced (this may be the case for Omon.mlotst).
Best regards,
Anne Marie Treguier
Citation: https://doi.org/10.5194/egusphere-2024-2363-CC1
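As a rough illustration of why the choices mentioned in this comment matter, the sketch below computes a mixed layer depth from a single density profile with a density-threshold criterion. The function name `mixed_layer_depth`, the sample profile, and the default values (0.03 kg m-3 relative to the level closest to 10 m, a criterion quoted in another comment in this discussion) are assumptions made for illustration. It is not the method documented in Griffies et al. (2016), nor that of any particular model.

```python
from typing import Sequence


def mixed_layer_depth(depth_m: Sequence[float],
                      sigma_kg_m3: Sequence[float],
                      threshold: float = 0.03,
                      ref_depth_m: float = 10.0) -> float:
    """Depth where density first exceeds the reference-level density by `threshold`.

    Hypothetical helper: `depth_m` must increase downward and `sigma_kg_m3`
    holds potential density anomalies on the same levels.
    """
    if len(depth_m) != len(sigma_kg_m3) or len(depth_m) == 0:
        raise ValueError("depth and density profiles must be non-empty and equal in length")
    if threshold <= 0:
        raise ValueError("threshold must be positive")

    # Choice 1: the reference level is the model level closest to ref_depth_m.
    ref_index = min(range(len(depth_m)), key=lambda k: abs(depth_m[k] - ref_depth_m))
    target = sigma_kg_m3[ref_index] + threshold

    for k in range(ref_index + 1, len(depth_m)):
        if sigma_kg_m3[k] >= target:
            upper, lower = sigma_kg_m3[k - 1], sigma_kg_m3[k]
            # Choice 2: interpolate linearly between the two bracketing levels.
            frac = (target - upper) / (lower - upper)
            return depth_m[k - 1] + frac * (depth_m[k] - depth_m[k - 1])

    # Choice 3: if the threshold is never reached, report the deepest level.
    return depth_m[-1]


# Made-up profile, for illustration only.
depths = [5.0, 12.0, 25.0, 35.0, 45.0, 55.0]
sigma = [25.00, 25.00, 25.01, 25.02, 25.06, 25.20]

print(mixed_layer_depth(depths, sigma))                   # default 0.03 kg m-3
print(mixed_layer_depth(depths, sigma, threshold=0.125))  # larger threshold
```

On this made-up profile the two thresholds give depths of about 37.5 m and 49.6 m, which is the kind of method dependence that a per-variable reference would document.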
CC2: 'Comment on egusphere-2024-2363', Isla Simpson, 05 Sep 2024
I have just a minor comment. I was using this paper while preparing some opportunities for the CMIP7 data request, and I couldn't find what the pressure levels actually are for the various options, i.e. the 19, 8 and 3 pressure level options for the atmosphere. It would be helpful to list those pressure levels so that the paper could serve as a stand-alone resource for people looking into the options available to them from these baseline variables. Sorry if I've missed it somewhere.
Citation: https://doi.org/10.5194/egusphere-2024-2363-CC2
CC3: 'Comment on egusphere-2024-2363', Alistair Adcroft, 06 Sep 2024
I'm surprised to see Oday.sos (surface salinity) but not Oday.zos (table A6). It is unclear to me what purpose sos serves at such high frequency. I believe daily zos (and zostoga) would be more widely used (e.g. for local sea-level analysis, mesoscale activity, ...) and should be a baseline variable.
Citation: https://doi.org/10.5194/egusphere-2024-2363-CC3
CC4: 'Comment on egusphere-2024-2363', Baylor Fox-Kemper, 06 Sep 2024
This is a critically important topic, and it will inform all of the CMIP7 results. I have two suggestions (at this moment) for alterations.
1) Omon.zos should be converted to Oday.zos. The daily sea level is important for extreme event diagnosis (as tos and sos are). This variable is a critical one for both impacts and input for downscaling, and is particularly revealing in showing the *failures* of coarse resolution models to reproduce SSH variance as high resolution models do (see Fig. 9.12 of AR6 WGI, panels g-i, which had to be created using resources outside of CMIP6 ones because Oday.zos was not included).
2) There is an issue with only collecting bigthetao, in that most ocean models do not use the TEOS-10 equation of state. McDougall et al. made a recommendation to address this point (https://doi.org/10.5194/gmd-14-6445-202), but the data details for thetao and bigthetao presently do not allow this option (use whichever is the model "native" variable to calculate OHCA). Thus, at a bare minimum *either* bigthetao or thetao, whichever is the model native, should be in Omon here. Furthermore, there is an ongoing assessment within the OMIP team noting that bigthetao cannot be compared easily to observations, which are presently mostly categorized in observational climatologies via thetao. Thus, even if a model is using bigthetao, a comparison to observations (e.g., the many AR6 figures comparing temperature and OHCA to observed temperatures and OHCA in Chps 2, 7, 9, 10, 11...) would still need thetao.
Citation: https://doi.org/10.5194/egusphere-2024-2363-CC4
CC5: 'Comment on egusphere-2024-2363', Nathan Gillett, 19 Sep 2024
Congratulations on this manuscript! I have one suggestion. We expect that emissions-driven simulations will play a bigger role in CMIP7, and expect that more models in CMIP7 will include coupled carbon cycles than in CMIP6. If groups submit emissions-driven simulations, it will be essential to know the simulated CO2 concentration in order to interpret the results; and if groups submit concentration-driven simulations it would be very helpful to be able to diagnose compatible CO2 emissions, for example to calculate remaining carbon budgets. Also, a calculation of compatible emissions in the 1pctCO2 simulations would be needed to diagnose Transient Climate Response to Emissions (TCRE). These calculations would require monthly mean atmosphere-ocean CO2 flux, atmosphere-land CO2 flux, and atmospheric CO2 concentration or mass. These variables are included in the “Constructing a Global Carbon Budget” opportunity, but that opportunity includes a large number of other variables, and it is possible that some modelling centres would decide not to output these variables. I suggest adding this minimal set of carbon cycle variables to the baseline – with the understanding that of course these can only be provided for models with a carbon cycle.
Citation: https://doi.org/10.5194/egusphere-2024-2363-CC5
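The diagnosis described in this comment rests on a carbon mass balance: compatible emissions equal the growth in atmospheric carbon plus the carbon taken up by the ocean and the land. The sketch below shows that bookkeeping under stated assumptions. It works on annual totals rather than the monthly means mentioned above, takes fluxes as positive when carbon leaves the atmosphere, and uses an approximate conversion of 2.124 PgC per ppm of CO2. The function names and all input numbers are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence

# Approximate mass of carbon per ppm of atmospheric CO2 (assumed conversion factor).
PGC_PER_PPM = 2.124


def compatible_emissions(co2_ppm: Sequence[float],
                         ocean_uptake_pgc: Sequence[float],
                         land_uptake_pgc: Sequence[float]) -> List[float]:
    """Annual compatible emissions (PgC/yr) from a simple mass balance.

    co2_ppm holds n + 1 values (start of each year plus the final end value);
    the uptake sequences hold n annual totals, positive into ocean or land.
    """
    n = len(ocean_uptake_pgc)
    if len(land_uptake_pgc) != n or len(co2_ppm) != n + 1:
        raise ValueError("expected n flux values and n + 1 concentration values")
    emissions = []
    for i in range(n):
        atmospheric_growth = PGC_PER_PPM * (co2_ppm[i + 1] - co2_ppm[i])
        emissions.append(atmospheric_growth + ocean_uptake_pgc[i] + land_uptake_pgc[i])
    return emissions


def tcre(warming_k: float, emissions_pgc: Sequence[float]) -> float:
    """Warming per 1000 PgC of cumulative emissions (K per 1000 PgC)."""
    cumulative = sum(emissions_pgc)
    if cumulative <= 0:
        raise ValueError("cumulative emissions must be positive")
    return 1000.0 * warming_k / cumulative


# Invented numbers, for illustration only.
co2 = [300.0, 303.0, 306.0]   # ppm at the start of year 1, year 2, and end of year 2
ocean = [0.5, 0.6]            # PgC/yr taken up by the ocean
land = [0.4, 0.5]             # PgC/yr taken up by the land

annual = compatible_emissions(co2, ocean, land)
print(annual)
print(tcre(0.03, annual))
```

All three inputs named in the comment appear in the balance, so a missing flux or concentration field leaves the emissions undetermined.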
CC6: 'Comment on egusphere-2024-2363', Christopher Danek, 19 Sep 2024
Hi
Thanks a lot for your efforts! Please see the following comments.
1) In 2.2 it's not clear to me how r1 and r2 are defined, i.e. how "downloads" are measured. It is certainly possible to count the number of download-clicks in a browser or the number of wget-scripts generated via a browser. But what about direct data usage via ssh access to an ESGF node, which I assume a lot of scientific users have? That cannot be counted I guess? Also, can the (successful) execution of a wget command be counted? If yes, that means I could tweak the download statistics by running a trillion wget-cronjobs of an unpopular variable? I would like to see a sentence more about this technical aspect (I could not find any details on this in the two given references Fiore et al. 2021 and the ESGF dashboard).
2) In my view it would make sense to add seawater density to the baseline variables. It's an important variable but does not get much attention in the literature, at least that is my impression. At the same time, it's rather cumbersome to post-process seawater density: (a) Downloading the two high volume 4D variables thetao and so is time consuming. (b) Applying seawater equation-of-state software (e.g. gsw from TEOS10) to this large amount of data is time consuming as well. (c) Some ocean model output is not provided on its native grid (`gn`) but horizontally and/or vertically interpolated (`gr`). Hence, if I post-process seawater density from such interpolated thetao and so, the obtained result is a less accurate (?) representation of the actual density during ocean model runtime. I am aware that seawater density would be a high volume 4D variable, but I wonder whether it's worth including because of the above points.
3) I would find it useful to add global averages/sums of important variables to the baseline variables (e.g. tosga, sosga, siarean, siareas, siextentn, siextents, sivoln, sivols) as they are 1) easy to compute for the modeling centers but not for the user (downloading a lot of data is necessary) and 2) only need a tiny amount of resources.
4) I would find it user-friendly if the potential density threshold and reference level used were added to the title and/or CF standard name of the mixed layer depth (mlotst), e.g. "... Defined by Sigma T of 0.03 kg m-3 wrt model level closest to 10 m depth" or similar.
5) In the appendix tables, why is "Radiation" a realm, and what does "Weighted Time-Mean" mean (e.g. SImon.siconc)?
Thanks a lot and cheers,
Chris
Citation: https://doi.org/10.5194/egusphere-2024-2363-CC6
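Point 3 of this comment concerns global averages such as tosga. A minimal sketch of the area-weighted mean involved is given below, assuming the field and the grid-cell areas (for CMIP ocean output these would typically come from a cell-area field such as areacello) have been flattened into plain lists, with `None` marking masked cells. The function name and the sample numbers are hypothetical.

```python
from typing import Optional, Sequence


def area_weighted_mean(values: Sequence[Optional[float]],
                       cell_area_m2: Sequence[float]) -> float:
    """Mean of `values` weighted by cell area, skipping masked (None) cells."""
    if len(values) != len(cell_area_m2):
        raise ValueError("values and cell areas must have the same length")
    weighted_sum = 0.0
    total_area = 0.0
    for value, area in zip(values, cell_area_m2):
        if value is None:
            continue  # masked cell (e.g. land) contributes neither value nor area
        weighted_sum += value * area
        total_area += area
    if total_area == 0.0:
        raise ValueError("no unmasked cells to average")
    return weighted_sum / total_area


# Three cells with unequal areas; the last one is masked.
sample_values = [10.0, 20.0, None]
sample_areas = [1.0, 3.0, 5.0]
print(area_weighted_mean(sample_values, sample_areas))  # prints 17.5
```

The computation itself is small, but a user needs the full 2D field and the matching cell areas to reproduce it, which is the download cost the comment refers to.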
RC1: 'Comment on egusphere-2024-2363', Claire Macintosh, 23 Sep 2024
General comments
This paper represents a substantial and important step forward for the CMIP community looking towards CMIP7. The presented BCV list will form the core of the CMIP7 data request, with the underlying groundwork and philosophy having wide ranging implications across the ESM community.
There is some tension in the paper between the concept of a BCV list as it applies to the WCRP modelling multiverse generally, and the specific implementation of this list as the core of the CMIP7 Data Request and its associated tight timescale. I have tried to make clear in this review which aspect is being addressed by each comment.
In addition to the carefully considered results presented here, the author team should be acknowledged for their approach to the transparency of process in the development of the BCVs, which is an excellent example of good practice in the field.
Specific comments
Please note that I have been asked to give this review in part to provide perspective from the observational community. Some comments reflect that request.
Table 1. Stakeholders of the CMIP DR. Row 1. Examples of “communities studying the global climate” is currently restricted to MIP communities. Other direct users of CMIP data also exist outside of the MIP framework, not least a large number of scientific researchers using CMIP to elucidate specific processes or aspects of the climate system outside a specific MIP.
Section 3 Line 288, Line 303 – see Section 5 comment.
Section 3.4 Role from the data user’s perspective
The BCV list as a whole is aimed primarily at modelling centres. However, the manuscript would benefit from more careful consideration of the wider CMIP user and associated observational communities.
The example of the need of some users for high temporal resolution data presented here is important, but by no means the only consideration from the perspective of the wider CMIP user community.
For context: a search of Scopus lists 4189 papers containing “CMIP6” in the title or abstract. Of these, 1371 (33%) also contain at least one observational keyword (observations OR satellite OR in-situ OR reanalysis)[1], increasing to 1720 (41%) if the word “evaluation” is also included. This non-exhaustive list of keywords therefore represents a lower bound on the fraction of the CMIP6 community that is using at least one auxiliary dataset alongside CMIP6 data.
The implications for the BCV list are clear. Given that more than a third of the CMIP community is using some kind of observational data, a key role of the BCV list must be not only that it is common across CMIP modelling centres, but also that it provides enough information to downstream users for observational comparisons and evaluation to be possible. This includes e.g. information on pressure levels, variable names that are consistent across the ECV-BCV boundary, variable choices that are suitable for observational evaluation, considerations of relevant observing resolutions, and clear information on methodological choices to generate BCVs. In short, it must ensure that it is externally facing such that it is sufficient for these analyses.
Some discussion of implications and additional requirements for the BCV list for external communities would be beneficial:
- In the general case: What are the implications for exploitation of the BCVs with and without coordination/interoperability with equivalent observational parameter lists (ECVs, GCIs etc.)?
- Do these differ for direct vs indirect users (the latter being more likely to be using derived metrics, where the original form and nuance of both the BCV DR and observational data may be obscured)?
- In this phase of CMIP: What input is needed from observational or other auxiliary data communities to maximise the interoperability aims of the BCV DR (for example, development of variables that are more directly comparable with model output, e.g. trivially skin vs layer temperature, or techniques and documentation where comparisons are nuanced, e.g. vertical integration to a small number of layers vs observing resolution, pitfalls for regional analysis)?
- What actions can the BCV DR take aimed at maximising the uptake of BCV DR across this interface and therefore achieving the overarching aims of the exercise, both for this phase and beyond?
- What gaps exist at the interface that should be filled?
- How might future iterations of the list more systematically address the widespread use of auxiliary datasets in analysis of CMIP or ESM MIP data? What is needed in the longer term?
Section 5 Conclusions: The BCV list has wide implications for ESM MIPs generally, but will also in the near future form the core of the CMIP7 data request. Given that there will be immediate and substantial CMIP community interest in the practical implementation of the list, and that numerous downstream communities will begin to make decisions on their respective implementations in preparation for CMIP7, Section 5 would benefit from some discussion on immediate and future next steps, and an aggregation and expansion of relevant issues identified elsewhere in the paper.
Please note that it is not for the authors to necessarily answer in detail to all aspects of the implementation phase, but rather to highlight in this paper issues that must be addressed by next steps, any potential pitfalls, and further community engagement that is needed in order to maximally exploit the careful and detailed work presented here. For example:
- Governance: how will the list be managed and updated? What issues must be addressed?
- For this phase of CMIP: How will any updates or amendments be transparently curated, deployed and communicated to the community?
- For this phase of CMIP: How will this list interact with the wider CMIP7 data request? For example, the passing on of specific variable requests to the wider DR communities, where they are assessed as not part of the BCVDR. What action is needed from within and without the BCV community?
- For future phases: Line 303. How might new or emerging variables be fairly and transparently assessed for inclusion (e.g. new land surface or biosphere variables, that may be disproportionately important in the climate services and impacts communities, but do not appear prominently in the CMIP6 data request, or variables that have an easily assessable observational counterpart but may not be essential for model intercomparisons). How might user groups such as those illustrated by the high-resolution example in Sec 3.4 be identified systematically, rather than ad-hoc[2]?
- Future phases: By definition, the existence of this list will create a feedback effect on the most downloaded variables, a core component of its initial derivation. What are the implications for the methodology to update the list going forward? What other issues must be addressed for evolution of the list in the longer term?
- Curation of the list for this phase of CMIP
- What auxiliary information that is not described in this paper is needed for the full implementation of the BCV list? Where will it be available?
- e.g. Table A2, A3 details on pressure levels if needed, any other methodological details required for derivation of BCVs. (I would also strongly suggest some version control and numbering).
- Line 288 Section 3.1 How will new naming conventions be developed and disseminated to the community, or what is needed to address this? Does this need to happen before the AR7 Fast Track runs begin?
- Implications from external/adjacent communities on maximum exploitation of the BCVDR in this phase of CMIP
- Modelling centres and working groups: Are there issues arising from e.g. methodological choices of modelling centres, that are not the responsibility of the BCV list, but that may directly impact its utility (for example, do definitions of mixed layer depth affect how these variables can be intercompared)? Are there additional engagement and documentation needs directly relating to BCVs, are there implications for the BCVs from a lack of this engagement, and how can the respective communities collaborate, including across the wider CMIP7DR?
- Observational community – addressed in earlier comment
- Other neighbouring auxiliary data communities, e.g. downstream modelling exercises, communities using CMIP as boundary conditions, etc. As for the observational community, what is needed in terms of engagement on both sides to maximally exploit the BCVDR?
- Immediate next steps of the BCV community.
Technical/minor comments
Ln125: “from”-> “to”?
Footnote 3 on ECVs. The GCOS ECVs span all observation types including in-situ observations, they are not restricted to Earth Observation.
Section 3 title: Second “and” should be “of”?
Table A3 Omon.masscello is missing descriptors in its row
With thanks to the author team,
Claire Macintosh, ESA.
[1] Equivalent numbers from Dimensions.ai (free to access): 5944 articles mention CMIP6 in the title + abstract, of which 2099 (35%) include an observational keyword. This increases to 2538 (43%) if the word ‘evaluation’ is included, which typically implies some kind of auxiliary data source. Searches conducted 16-Sept-24.
[2] For illustration: Dimensions.ai search “CORDEX” returns 2364 title + abstract results, “CMIP5” returns 6439, but the overlap (CMIP5 AND CORDEX) is only 262, as the majority of the CORDEX community are indirect users of CMIP data. This community will not show up in the methodology as described but is very large and currently not accounted for except via user engagement surveys. The principle of assessment of indirect users is more widely applicable to the BCV concept.
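For readers who want to check the percentages quoted in this review, the short sketch below recomputes them from the counts given in the text and in footnotes [1] and [2]. The helper name `share_percent` is hypothetical; the counts are copied from the review, and the CMIP5/CORDEX overlap share is not stated in the review but follows from the counts in footnote [2].

```python
def share_percent(part: int, whole: int) -> int:
    """Share of `part` in `whole`, rounded to the nearest whole percent."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100 * part / whole)


# Scopus counts quoted in the review (papers with "CMIP6" in title or abstract).
scopus_total = 4189
print(share_percent(1371, scopus_total))  # observational keyword
print(share_percent(1720, scopus_total))  # observational keyword or "evaluation"

# Dimensions.ai counts from footnote [1].
dimensions_total = 5944
print(share_percent(2099, dimensions_total))
print(share_percent(2538, dimensions_total))

# Footnote [2]: papers matching both CMIP5 and CORDEX, as a share of CMIP5 results.
print(share_percent(262, 6439))
```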
Citation: https://doi.org/10.5194/egusphere-2024-2363-RC1