the Creative Commons Attribution 4.0 License.
Typhoon statistics in variable resolution Asia-Pacific CAM-SE
Abstract. Three Asia-centric configurations of the Community Atmosphere Model with the Spectral Element dynamical core (CAM-SE) were set up, with horizontal resolutions of approximately 1° globally, 1° increasing to 0.5° over the Asia-Pacific, and 1° increasing to 0.25°. A typhoon tracking algorithm was developed to extract the tracks of typhoons generated by the simulations. The typhoon intensities were bias corrected using scale conversion factors calculated from a comparison of tracks extracted from the European Centre for Medium-Range Weather Forecasts Reanalysis version 5 (ERA5) and the International Best Track Archive for Climate Stewardship (IBTrACS). Typhoon frequency, track density, genesis locations, and energy were calculated from 20 years of equilibrium climate simulations using the three configurations, then compared with the statistics from ERA5 and IBTrACS. The 1° and 0.5° CAM-SE simulations were unable to produce enough “Super Typhoons” (maximum sustained central wind speed ⩾ 51 m s-1) even after bias correction. The 0.25° simulation managed to produce enough “Super Typhoons”, indicating that at least 0.25° horizontal resolution is advisable for global climate simulations to produce appropriate “Super Typhoon” statistics. The regionally refined 0.25° CAM-SE configuration was estimated to be at least two times faster than a globally 0.25° typical configuration.
Status: final response (author comments only)
RC1: 'Comment on egusphere-2024-2415', Anonymous Referee #1, 11 Nov 2024
This paper evaluates the effectiveness of variable-resolution (VR) configurations of the Community Atmosphere Model with the Spectral Element dynamical core (CAM-SE) for simulating tropical cyclones (TCs) in the Western Pacific. Using three resolutions - 1deg globally (ne30), 1deg to 0.5deg refinement (ne30x2), and 1deg to 0.25deg refinement (ne30x4) - the authors conducted 20-year equilibrium climate simulations to compare typhoon statistics, such as frequency, intensity, and tracks, against IBTrACS observations and ERA5 reanalysis. While 1deg struggled to reproduce intense TCs, the 0.25deg generated realistic statistics, demonstrating that at least 0.25deg resolution is important for TC climatology in the region. The study notes the computational efficiency of regionally refined grids.
First, I want to note that I genuinely believe the authors have put some effort into preparing this manuscript. I agree that testing VR simulations is useful. I also concur with the finding that 0.25deg does better simulating TCs than 1deg.
However, to be frank, I am unsure what hypothesis the paper is testing or the question being answered. In the abstract, the authors note that these simulations provide evidence that 0.25deg simulations can be used for global TC statistics. However, there are numerous examples of 0.25deg simulations showing this and more; there are open-source, public datasets using 0.25deg simulations with a wide range of models (e.g., see Roberts et al., papers on HighResMIP from ~2020), including 0.25deg versions of the model used here. Why did the authors need to run new simulations just to find this? What value-added exists in this deck that can benefit the community? So this falls flat to me and doesn't seem to offer a lot of novelty (or at least the authors haven't made a compelling case). Of note, the authors *do* cite some of this work, but the paper would be well-served by a better comparison versus just "this has been observed elsewhere..."
I would be more amenable to this publication if the authors were seeking a deeper understanding of the performance of the variable-resolution CAM-SE model, but there are many places (I have highlighted a few below) where they seem almost dismissive or uninterested in a deeper understanding of the configuration. Topography is merely interpolated rather than using supported tools. Timesteps are set very long, seemingly for the sake of computational expediency. Sensitivity analyses of various tunable parameters and/or settings are not used. There is also a large amount of literature regarding variable-resolution climate models that now exist (some names off the top of my head in the US are Herrington, X. Huang, Ullrich, Rahimi, Rhoades, Sakaguchi, Q. Tang, Zarzycki, but there are others) that could help with the setup/configuration/discussion here. Again, I am left wanting.
Lastly, I will also note that the paper was relatively confusing and difficult to follow due to typographical issues and grammar. I am sympathetic to the fact that the authors are not likely native English speakers, but there are many confusing passages ("evaluation of done closer to equilibrium" is one such example) that require multiple re-reads. Any revision desperately needs thorough proofreading.
As is, I cannot recommend publication. The paper needs to be more cohesive, and more care is needed to describe the modeling setup and motivate design choices. A lot of "hand-waving" has been done (which I note below) regarding model configuration and topography. The authors explain that they did not have the resources to test some assumptions but performed a rudimentary timing simulation. The actual results are relatively poorly motivated: the authors spend a great deal of time talking about a tracker that seems to combine previously published trackers but requires a lot of "massaging" to get reasonable climatologies. A lot of the results are based (I believe) on using a linear correction factor based on maximum surface wind (MSW) and 10m winds as output from ERA5 and CAM. At least some formal testing of statistically significant differences needs to be done.
In general, there really isn't anything scientific of substance resulting, with the paper reading very much as a superficial description of model output. Any revision should explicitly state how these simulations offer an improved understanding of how 0.25deg models (either VR or just such models globally in general) simulate TCs for this particular region. Added benefits would be highlighting specific situations where these simulations would be useful and suggesting potential biases to be improved by model developers.
Major comments:
Honestly, there are many things that left me a bit perplexed. I have listed some here, although this is a partial list.
Model configuration. First, the authors offload a significant amount of model description to the appendix -- it should be moved to the main text. Upon initial reading, I was pretty unclear how meshes were generated, what configuration was used, etc. Second, I have many concerns that indicate the authors may not be overly familiar with the modeling system. They use an 1800s (30-minute) physics timestep for even their 0.25deg simulations. This is uncommon in the high-resolution modeling community, where this timestep is probably almost always <=10 minutes at these scales. The authors do note that this choice was mainly for "computational speed," although then the authors make statements like "the differences most probably would not affect the results of this study" with little/no basis I can find. The papers they cite (e.g., Williamson, Reed) actually find that the timestep is an important sensitivity in model precipitation.
Topography. The authors eschew the use of the supported "TOPO" software for interpolated high-resolution data. But one of the critical aspects of the TOPO tool is to be able to differentially smooth topography over variable-resolution meshes, which the authors describe in the paper: https://gmd.copernicus.org/articles/8/3975/2015/ ... Also, the software (at least v1.0) was published almost 10 years ago, so I'm not sure about the comment, "At the time of this study, the TOPO toolkit was still under development, and we excluded its use..." -- how old are these simulations? I am quite concerned about the simulations being valid in general -- I would suspect a lot of gravity wave noise to be generated in the low-resolution part of the domain associated with the interpolated topography (i.e., investigate the mean 500 hPa vertical velocity field).
Tracking methodology. The tracking algorithm, while similar to some other published ones, seems overly complicated, as the authors note. Multiple publicly available tracking algorithms exist, two of which (TempestExtremes and TRACK) are commonly used in the field for evaluating high-resolution climate models. Why did the authors create their own? What are the benefits and drawbacks of their approach? If the results behave similarly to existing algorithms, that would be acceptable, but then the discussion of trackers can be moved to an appendix.
Wind correction. After re-reading a few times, I *think* I've figured out that the authors correct what they view as a low bias in ERA5 wind (shown in Fig. 2) by applying a linear scaling factor between ERA5 winds and IBTRACS winds and then apply that moving forward. That explains why ERA5 in Fig. 3 has typhoon+ storms, but Fig. 2 does not. This needs to be clarified, and it currently reads very opaquely. Suppose the authors are concerned with a low bias in storm intensity in ERA5. In that case, I encourage them to use sea level pressure and then some sort of pressure-wind correction rather than just using a linear correction as a bias correction.
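For concreteness, a pressure-wind correction of the kind suggested here could look like the following sketch. The Atkinson-Holliday style power law and the 1010 hPa environmental pressure are illustrative assumptions, not values taken from the manuscript or this review.

```python
# Hedged sketch of a pressure-wind correction: estimate maximum sustained
# wind from minimum central sea level pressure via an Atkinson-Holliday
# style power law (V in knots = 6.7 * dP^0.644). The coefficients and the
# 1010 hPa environmental pressure are illustrative assumptions.
def wind_from_pressure(p_min_hpa, p_env_hpa=1010.0):
    """Estimate maximum sustained wind (m/s) from minimum central pressure (hPa)."""
    dp = p_env_hpa - p_min_hpa
    if dp <= 0.0:
        return 0.0
    v_knots = 6.7 * dp ** 0.644   # Atkinson-Holliday form, wind in knots
    return v_knots * 0.514444     # knots -> m/s
```

A relation like this would convert model or reanalysis minimum SLP into an equivalent wind, avoiding a purely linear wind-on-wind scaling.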
Computational scaling numbers: I am baffled as to how the SE core at ne120 has 5x more cells than one of the VR grids but runs 9x slower. That shouldn't be the case. If anything, the additional "overhead" in the VR simulation would cause a slight penalty versus purely linear scaling (although I would hope it is minor). I suspect the authors are using a longer physics timestep for the SE ne120 runs. What they are seeing in the scaling numbers is that the physics timestep is taking more time than expected (i.e., it's not just a number of cells issue, but rather how often the cells are being calculated upon). However, given that they specifically chose the 1800s, I am not sure why it's worth discussing scaling through this lens (i.e., it makes the VR simulation look "better" on a per-element basis, but it's because they are taking longer physics timesteps, even if the 0.25deg mesh).
The comment, "Hence there would be no reason to use the SE dynamical core except for its variable resolution capabilities," is a reasonably bold stance in a throwaway paragraph without formal timing comparisons. Aside from the fact that a handful of simulations does not a formal timing estimate make (as any high-performance computing software engineer would note), this ignores previous literature quantitatively comparing the dynamical cores with respect to numerical diffusion and weak/strong parallel scalability.
I encourage the authors to read papers such as:
- Dennis et al., 2011 --> https://journals.sagepub.com/doi/10.1177/1094342011428142 -- authors show local SE method has much better parallel scaling than FV.
- Evans et al., 2013 --> https://journals.ametsoc.org/view/journals/clim/26/3/jcli-d-11-00448.1.xml -- authors show SE has a better "effective resolution" than FV.
Other comments:
Line 181. Why is Bluestein (1992) (a synoptic meteorology textbook) being cited for "the central difference method"?
Line 294. "In the discussion that follows, values are considered different if their ranges extended by standard deviations do not overlap." This is confusing, but my reading of this is if the -1 STD to +1 STD range for variable A and variable B differ, they are considered statistically significantly different. However, no significance testing is performed following this statement that I can find.
Line 446. "The latitudinal bias may be partly explained by the location of the annual mean subtropical high in the simulations." The authors produce no evidence of this statement.
Line 489-490. I cannot find a good explanation as to why the authors analyze two overlapping time periods (2001-2020, 2011-2020). As best I can tell, the 2011-2020 period was chosen because that was all the ERA5 data the authors had available, but they ran the model simulation from 2001-2020. If that's the case, why wasn't the ERA5 data analyzed back to 2001?
Line 636. "Warm start" in a modeling sense usually means a balanced and/or assimilated state. These would be more like a "cold start" simulation. I am not sure what a "cold start" aquaplanet simulation means in this context.
Citation: https://doi.org/10.5194/egusphere-2024-2415-RC1
AC1: 'Reply on RC1', Duofan Zheng, 01 Dec 2024
Dear Reviewer,
Thank you for your comments on our manuscript. We appreciate the time and effort you have invested in providing valuable feedback, and have provided preliminary responses to your primary comments below. We hope you can find some time to review these responses and offer us further comments or suggestions based on your expertise.
Regarding the structure of our manuscript, we will move the model description section to the main text and relocate the part concerning the tracking algorithm to the appendix.
Model configuration
Regarding your query on our use of an 1800 s physics time step in the 0.25deg simulation, we discussed this issue in lines 617–624 of the manuscript. Our choice of physics time step was informed by the following literature. Williamson (2013) found that the deep convection scheme in CAM becomes ineffective at very short time steps. Additionally, Reed et al. (2012) concluded that a time step of 1800 s results in a more reasonable partitioning of convective and large-scale precipitation and, hence, more realistic tropical cyclone intensities. Finally, the 1800 s setting was also employed by Zarzycki and Jablonowski (2014). We tested different values of the physics time step, se_nsplit and se_rsplit in Table A1 (lines 637–639 of the manuscript) for the ne30x4 simulation. In the ne30x2 and ne30 simulations, the parameters se_nsplit and se_rsplit were not consistent with the ne30x4 simulation, and our statement "The differences most probably would not affect the results of this study" referred to this. The reviewer seems to disagree, and we are uncertain whether these discrepancies would significantly impact the results. Does the reviewer recommend that we rerun the ne30x2 and ne30 simulations with consistent se_nsplit and se_rsplit settings?
Of course, we have not tested every single possible parameter in the model configuration, which is not feasible. We were unable to find sufficient information in the literature we accessed regarding the effects of all these parameters. Could you please provide some references or resources on the settings of these parameters and their potential impacts, or recommend some critical parameters that we should test?
Topography
Regarding the topography issue, we acknowledge that not using the TOPO toolkit developed by NCAR is a limitation of our simulations. We prepared the topography in August and September 2022. While preparing the topography files, we found this bug report on the website https://wiki.ucar.edu/display/MUSICA/Bugs+and+Updates#BugsandUpdates-Topographyfileerror. Based on that report, had we used topography files generated by the TOPO toolkit at the time, we might also have faced gravity wave noise; the issue might have existed regardless of the method used to generate the topography files. We do intend to use the TOPO tool in our future research.
During our testing, we found some abnormalities over topographically complex regions. However, since our study primarily investigates typhoons and the region of interest is the ocean, we do not consider the choice of topography files a critical deal-breaker.
We do recognise the TOPO toolkit as an important tool for variable resolution simulation studies. We would also like to explain that we have tested the impact of topography files generated by the TOPO toolkit and bilinear interpolation on the experimental results. Using the F2000climo case, we selected the results of the FV dynamical core at one-degree resolution as the control group. We chose surface temperature and precipitation as the variables and found that the results of the bilinear interpolation experiments showed only minor differences compared to the FV experiments over most land areas. This is one of the reasons we chose to use bilinear interpolation topography files. We must admit that due to our limited knowledge, we did not consider the potential issue of gravity wave noise caused by bilinear interpolation topography files. We sincerely apologize for this oversight.
Tracking methodology
We are indeed aware of the existence of TempestExtremes and TRACK. However, as a student, I aim to understand the details of tropical cyclone tracking algorithms and chose to write my own code rather than directly adopting the tools developed by esteemed predecessors. I admit that my code may not be as refined as TempestExtremes and TRACK, but it represents a small step in my academic journey.
Wind correction
We will refine our descriptions of the methodology to make it easier for readers to understand.
Regarding wind speed correction, we tried some simple scaling methods based on sea level pressure, but the results were not ideal. This could be because the 10-min maximum sustained wind speed is a measure of gust strength.
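As a hedged illustration of the linear scale conversion discussed in this thread (this is our reading of the method, not the authors' actual code), the fit and correction might be sketched as follows; the sample arrays are synthetic.

```python
import numpy as np

# Sketch of the linear scale conversion as described in the discussion:
# regress IBTrACS MSW onto tracker-derived ERA5 10 m winds for matched
# track points, then apply the fitted line to model winds.
def fit_scale_conversion(u10_tracker, msw_ibtracs):
    """Least-squares fit msw = a * u10 + b; returns (a, b)."""
    a, b = np.polyfit(np.asarray(u10_tracker, dtype=float),
                      np.asarray(msw_ibtracs, dtype=float), 1)
    return a, b

def correct_winds(u10_model, a, b):
    """Apply the fitted scale conversion to a set of model 10 m winds."""
    return a * np.asarray(u10_model, dtype=float) + b
```

One limitation of such a linear fit, as the reviewers note, is that it cannot recover intensity variance that the coarse grid never resolved in the first place.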
Computational scaling numbers
Regarding the computational scaling numbers, we have checked our settings and confirm that we made an error in our ne120 simulation by using the default setting of a 450 s physics time step. We will conduct a simulation with the intended 1800 s physics time step. We agree with the reviewer that our statement "Hence there would be no reason to use the SE dynamical core except for its variable resolution capabilities" is excessively strong. We had overlooked the literature raised by the reviewer and will edit the manuscript accordingly.
Other comments:
Line 181. We cited this article because the NCL website references it regarding the use of the central difference method for calculating relative vorticity. Here is the link to the NCL webpage: https://www.ncl.ucar.edu/Document/Functions/Built-in/uv2dv_cfd.shtml. We acknowledge that we may not have clearly stated this in our manuscript.
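As a hedged illustration of the central-difference vorticity calculation in the spirit of NCL's uv2vr_cfd (this sketch is not the authors' code, and it omits the spherical metric term u*tan(lat)/R and the boundary handling that the NCL routine includes):

```python
import numpy as np

# Illustrative central-difference relative vorticity,
# zeta = dv/dx - du/dy, on a regular lat-lon grid.
R_EARTH = 6.371e6  # Earth radius in metres (assumed value)

def relative_vorticity(u, v, lat_deg, lon_deg):
    """u, v: 2-D (lat, lon) wind arrays in m/s; returns zeta in 1/s."""
    lat = np.deg2rad(np.asarray(lat_deg, dtype=float))
    lon = np.deg2rad(np.asarray(lon_deg, dtype=float))
    dy = R_EARTH * lat  # meridional coordinate in metres
    # np.gradient uses central differences in the interior
    dvdx = np.gradient(v, lon, axis=1) / (R_EARTH * np.cos(lat)[:, None])
    dudy = np.gradient(u, dy, axis=0)
    return dvdx - dudy
```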
Line 294. The reviewer’s reading is correct. We will revise the manuscript so that this is clear. Tables 2 and 3 show bolded values which were considered significant by this method. We are considering using a standard statistical test instead.
Line 446. We made this statement because we observed that the center position of the 500 hPa geopotential height field in our model simulations is located further north compared to that in reanalysis data.
Line 489-490. The reviewer’s point is noted. We will extend the ERA5 analysis to 2001–2020.
Line 636. We will revise the manuscript to clarify that this means spin-up from an initially unbalanced state, using zonally symmetric SST. To ensure that we had not made errors in creating the variable-resolution meshes, simulations were initially run without topography, driven only by zonally symmetric SST.
Thank you once again for taking the time to review this manuscript. Your constructive feedback has been immensely valuable to us. If you have any further comments or suggestions regarding the manuscript, we would greatly appreciate any additional feedback you can provide.
References
Williamson, D. L.: The effect of time steps and time-scales on parametrization suites, Q. J. R. Meteorol. Soc., 139, 548–560, https://doi.org/10.1002/qj.1992, 2013.
Reed, K. A., Jablonowski, C., and Taylor, M. A.: Tropical cyclones in the spectral element configuration of the Community Atmosphere Model, Atmos. Sci. Lett., 13, 303–310, https://doi.org/10.1002/asl.399, 2012.
Zarzycki, C. M. and Jablonowski, C.: A multidecadal simulation of Atlantic tropical cyclones using a variable‐resolution global atmospheric general circulation model, J. Adv. Model. Earth Syst., 6, 805–828, https://doi.org/10.1002/2014MS000352, 2014.
Citation: https://doi.org/10.5194/egusphere-2024-2415-AC1
RC2: 'Comment on egusphere-2024-2415', Anonymous Referee #2, 04 Dec 2024
This paper examines typhoon statistics in three configurations of the CAM-SE model, two of which have a variable resolution grid with increased resolution over Asia/west Pacific. A new tracking algorithm has been developed to extract the typhoon tracks. The tracking algorithm has been run on ERA5 to compare its performance to IBTrACS and to calculate bias correction factors to apply to the model wind speeds. They find that the two lower resolution models are unable to produce enough super typhoons even after bias correction.
Although it’s well known that higher resolution models can simulate more intense wind speeds in tropical cyclones, I can see value in documenting the performance of these model configurations as well as the tracking algorithm, especially if the authors are planning to use these configurations in further studies. However, whilst I can believe the higher resolution model has better typhoon statistics, I don’t think that has been convincingly demonstrated here. Before I can recommend publication I would like to see some additional analysis, as well as some areas of clarification, therefore I suggest major revisions. Specific comments are listed below.
Major comments:
- Bias correction/scale conversion: Given that the same scaling conversion has been applied to the wind speeds for each resolution, I am not surprised that the bias correction doesn’t correct the number of super typhoons in the lower resolution models. Around L290 it’s stated that the regression coefficients of IBTrACS MSW onto ERA5 U10m for the matched tracks are similar across all resolutions, but I would like to see the scatter plots to see how good the fits are. For the lower resolutions, I suspect the fit is not so good for the highest wind speeds.
- Related to the above, I also wonder if the bi-linear interpolation of ERA5 to lower resolutions is sufficient to mimic the effect of running models at lower resolution? The relationship between grid spacing and resulting wind averaging is discussed in Davis (2018). Could you comment on this?
- Given the issues with the bias correction method for wind speeds, drawing conclusions on cyclone intensity based on wind speeds alone could be misleading. I think the authors need to show how the distribution of TCs as a function of another measure such as minimum MSLP changes with resolution.
- Significance testing: L294 states “values are considered different if their ranges extended by the standard deviation do not overlap” but comparing the annual mean TC/ACE/TC lifetime numbers in this way does not take into account expected noise due to sample size. A more accepted way to compare means is a t-test, or use the standard error (standard deviation/sqrt(sample size)) to estimate the confidence intervals (with 95% confidence interval = mean +/- 2*standard error). In Figure 4 it would also be helpful to delineate regions where the differences in track density are statistically significant.
- It is confusing that two time periods have been analysed in observations – why was this? It would greatly simplify the figures and text if ERA5 could be analysed for the longer period.
- In Figure 3, why does the new tracking algorithm on ERA5 appear to have far fewer tracks tracking north of 40 degrees compared to IBTrACS (comparing fig 3c and b)?
- L445: Why only plot the ne30x4 genesis locations in Fig 6? Are they improved compared to ne30x2? It’s hard to tell from Figure 5. Does the position of the subtropical high change with resolution?
- There are several typos/grammar errors. I have listed a couple in the minor comments but there are more – the manuscript needs a full proofread.
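The standard-error comparison suggested in the significance-testing comment above could be sketched as follows; the data are synthetic and this is an illustration of the suggested test, not anything from the manuscript.

```python
import numpy as np

# Approximate 95% confidence intervals as mean +/- 2 * standard error,
# and call two annual-mean statistics different only if the intervals
# do not overlap, as suggested in the review.
def ci95(samples):
    """Return (low, high) of an approximate 95% CI for the sample mean."""
    x = np.asarray(samples, dtype=float)
    se = x.std(ddof=1) / np.sqrt(x.size)  # standard error of the mean
    return x.mean() - 2.0 * se, x.mean() + 2.0 * se

def significantly_different(a, b):
    """True if the two approximate 95% CIs do not overlap."""
    lo_a, hi_a = ci95(a)
    lo_b, hi_b = ci95(b)
    return hi_a < lo_b or hi_b < lo_a
```

A two-sample t-test (e.g. `scipy.stats.ttest_ind`) would be the more formal alternative the reviewer mentions.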
Minor comments:
L79-83: The authors state that the SSTs are prescribed and based on a variety of observational datasets, but they don’t say whether the SSTs are constant or time-varying. If the latter, what observational time period do they cover? What is the time frequency that the SST boundary conditions update (e.g. are they 6-hourly, daily, monthly, etc means)?
L237: “extremely rate” -> extremely rare
Figure 2 caption: It would be good to mention that these are tracks before scaling
L302: “less inaccurate” – don’t you mean “less accurate”?
L485: ACE is usually calculated only for track points above 34 knots – did you use this threshold?
L686: The link is for the wrong report (“Climate Assessment for 1998” instead of 1999)
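The ACE convention raised in the comment on L485 can be sketched as follows; a minimal illustration with synthetic values, assuming the usual 6-hourly track points.

```python
# Accumulated cyclone energy (ACE): sum the squares of the 6-hourly
# maximum sustained winds (in knots) only for track points at or above
# tropical-storm strength (34 kt), scaled by 1e-4, per the standard
# convention referenced in the review.
def ace(msw_knots, threshold_kt=34.0):
    """ACE contribution (in the usual 1e4 kt^2 units) from one track."""
    return 1e-4 * sum(v * v for v in msw_knots if v >= threshold_kt)
```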
References:
Davis, C. A. (2018). Resolving tropical cyclone intensity in models. Geophysical Research Letters, 45, 2082–2087. https://doi.org/10.1002/2017GL076966.
Citation: https://doi.org/10.5194/egusphere-2024-2415-RC2