Intercomparison of tropopause height climatologies: High-Resolution radiosonde measurements versus ERA5 reanalysis
Abstract. The tropopause plays a critical role in stratosphere-troposphere exchange and climate change. Its height is widely defined based on the World Meteorological Organization (WMO) threshold temperature gradient. High-resolution (5–10 m) soundings, therefore, are expected to substantially minimize uncertainties of tropopause height (TH) arising from limited vertical resolution and imprecise temperature measurements. The near-global coverage of high-resolution radiosonde data, accumulated from 2000 to 2023, offers valuable insights into climatological tropopause variability. While radiosonde observations are limited by spatiotemporal coverage, European Centre for Medium–Range Weather Forecasts Reanalysis v5 (ERA5) reanalysis datasets offer globally complete tropopause representations. To leverage both the high resolution of radiosonde measurements and the global coverage of ERA5, this study compares their tropopause height estimates and analyzes long-term trends across different latitude zones and seasons. The results indicate that the mean and absolute differences (radiosonde minus ERA5) in TH were 32 m and 336 m, respectively, with larger discrepancies observed during the spring season in the tropics (±20°). Overall, point-to-point comparisons (with strict spatio-temporal matching) indicate that ERA5 effectively captures climatological tropopause height variations in both time and space. Long-term trend analyses revealed increases of +5 m/year (radiosonde) and +3 m/year (ERA5) based on point-to-point comparisons. However, these site-specific trends may differ substantially from the long-term trends observed in ERA5 with complete spatiotemporal resolution, even showing opposite trends. Therefore, continued accumulation of high-resolution radiosonde profile data is crucial to further characterize tropopause changes in a warming climate.
Review of "Intercomparison of tropopause height climatologies: High-Resolution radiosonde measurements versus ERA5 reanalysis" by Yu Gou et al.
General comments
This study examines differences and consistencies between high-resolution (5–10 m) radiosonde observations and ERA5 reanalysis data in estimating tropopause height (TH). Using data spanning the years 2000–2023, the authors find strong agreement overall, with radiosonde THs averaging about 30 m higher than ERA5 and a mean absolute difference of about 340 m. Seasonal and latitudinal analyses reveal consistent variability, although discrepancies are more pronounced in the subtropics, likely linked to the subtropical tropopause break. Long-term trend analysis shows a significant upward shift in TH, with radiosonde data indicating a rise of +50 m/decade compared to +30 m/decade in ERA5. The study highlights ERA5’s reliability for large-scale climatological assessments but underscores the necessity of continued accumulation of high-resolution radiosonde data to refine understanding of fine-scale tropopause dynamics and their role in climate change.
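For readers less familiar with the two summary statistics quoted above, the following minimal Python sketch shows how a mean difference (bias) and a mean absolute difference can diverge for the same set of collocated profiles. The function name `bias_and_mad` and all numbers are hypothetical and synthetic; they are not taken from the manuscript and do not reproduce the authors' processing.

```python
import numpy as np

def bias_and_mad(th_radiosonde_m, th_era5_m):
    """Return (mean difference, mean absolute difference) in metres,
    using the radiosonde-minus-ERA5 sign convention."""
    diff = np.asarray(th_radiosonde_m, dtype=float) - np.asarray(th_era5_m, dtype=float)
    return float(diff.mean()), float(np.abs(diff).mean())

# Synthetic tropopause heights (m) for four hypothetical collocated profiles.
sonde = np.array([16500.0, 11200.0, 9800.0, 15900.0])
era5 = np.array([16100.0, 11500.0, 9900.0, 15300.0])

bias, mad = bias_and_mad(sonde, era5)
print(bias, mad)  # 150.0 350.0

# A trend of 5 m/year corresponds to 50 m/decade.
trend_m_per_decade = 5.0 * 10.0
```

Because positive and negative differences partly cancel in the mean, a small bias can coexist with a much larger mean absolute difference, which is the pattern summarized above.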
Overall, the study addresses an important and timely topic, as accurate assessments of TH are critical for understanding climate variability and long-term atmospheric changes. The scope is clearly focused on the intercomparison between high-resolution radiosonde observations and ERA5 reanalysis, making the paper short, concise, and well-structured. I found it interesting and informative to read, with results that are both relevant and valuable to the scientific community. However, I believe there are several major and minor issues that need to be addressed before the manuscript can be considered for publication.
Specific comments
- Line 14: The authors describe the high-resolution radiosonde dataset as providing 'near-global coverage.' However, this characterization is misleading. The distribution of stations is heavily concentrated in North America (particularly the United States) and Europe, with far fewer stations in Africa, South America, Australia, and large parts of Asia. There is also no coverage over the oceans. This uneven spatial distribution means the dataset cannot reasonably be described as 'global' or 'near-global.' I recommend revising this phrasing to more accurately reflect the actual coverage.
- Line 34: The phrase 'so-called very short-lived substances' could be reconsidered. The tropopause may not be especially relevant for the category of very short-lived species; just 'short-lived' may be more appropriate here. Also, the qualifier 'so-called' seems unnecessary and could be removed.
- Line 69: The manuscript refers to ERA-Interim as a 'modern' reanalysis dataset. However, ERA-Interim was introduced nearly two decades ago and has since been replaced by ERA5 as the state-of-the-art product. Please rephrase, as calling ERA-Interim 'modern' is outdated.
- Line 88: I wonder if it is accurate to generalize that IGRA radiosonde data have a vertical resolution of approximately 300–400 m. IGRA typically includes measurements at standard mandatory pressure levels plus significant levels, which are reported when notable deviations in lapse rate occur. This means the vertical resolution varies substantially between soundings and over time. Please verify this with IGRA documentation and clarify.
- Line 117: Please consider rewording 'Globally, radiosondes are launched...' for consistency with the earlier comment on line 14.
- Section 2.1: The information on radiosondes and ERA5 is somewhat mixed together. I suggest splitting this into two subsections for clarity. In addition, please provide more technical details on ERA5 (e.g., hourly temporal resolution, horizontal resolution, vertical resolution).
- Section 2.1: Please clarify whether the high-resolution radiosonde data used here are assimilated into ERA5. This is very likely the case, and if so, the datasets are not independent. Thus, this study cannot be considered a 'validation' of ERA5; it should be framed as an 'evaluation' or 'intercomparison' (as is already properly reflected in the title). Please make this distinction explicit.
- Line 137: Why was cubic spline interpolation chosen to resample the radiosonde data from their original 5–10 m spacing to a uniform 10 m grid? At such fine spacing, cubic splines can introduce oscillations. (This question is likely not very relevant for the present study, since the authors used derived data, but it would be helpful if they could clarify the rationale.)
- Figure 2: It may be informative to show not only the nearest ERA5 profile but those from the four surrounding grid points. This would illustrate local variability in temperature and tropopause structure and show how representative the nearest grid point is.
- Figure 3: The scatterplot shows some very large outliers. For example, ERA5 tropopause heights of 15–17 km (typical for the tropics) correspond to radiosonde values as low as 5–6 km. Please comment on these extreme discrepancies. Do they reflect limitations of the WMO definition applied to high-resolution profiles, local inversions, or possible data issues?
- Lines 189–194: The authors state that the mean difference (bias) in TH improves over time while the mean absolute difference (MAD) remains roughly constant. However, Table 1 shows a clear transition around 2006, coinciding with the introduction of COSMIC GPS-RO assimilation in the ECMWF reanalyses. After 2006, the radiosonde–ERA5 bias decreases, but MAD increases from ~250 m to ~350 m. This suggests that assimilation of GPS-RO data reduced the bias but increased the spread of differences, disrupting time series homogeneity. Please revisit and discuss this interpretation.
- Section 3.2: The methodology for trend analysis needs clarification. Was multivariate regression applied to account for factors such as seasonality, QBO, ENSO, and volcanic activity, or were linear fits applied directly? Simple linear fits can be misleading, as TH variability is strongly modulated by these processes. Multivariate regression has become standard in TH trend analyses; please clarify your approach and discuss limitations if only linear fits were used.
- Table 2: The purpose of the 'ERA5-F' zonal mean comparison is unclear. The one-to-one 'ERA5-P' comparison based on collocated profiles seems more appropriate and shows better agreement. Please clarify why ERA5-F is included, and consider emphasizing ERA5-P.
- Section 4: The findings on TH uncertainties and trends of this study should be placed more explicitly in the context of recent reanalysis studies (e.g., Xian and Homeyer, 2019; Tegtmeier et al., 2020; Hoffmann and Spang, 2022; Zou et al., 2023). This would situate the results within the broader literature and highlight the contribution of this work.
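To illustrate the point raised for Section 3.2, the following minimal Python sketch contrasts a direct linear fit with a multiple linear regression that also includes an annual harmonic and one additional regressor. Everything here is an assumption for illustration: the monthly series is synthetic, `enso_proxy` is a made-up periodic signal rather than a real climate index, the function name `fit_trend` is hypothetical, and QBO and volcanic terms are omitted. It is not the authors' method.

```python
import numpy as np

def fit_trend(t_years, th_m, extra_regressors=()):
    """Least-squares fit of TH on a constant, a linear trend, an annual
    harmonic, and any extra regressors; returns the trend in m/year."""
    t = np.asarray(t_years, dtype=float)
    columns = [np.ones_like(t), t,
               np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)]
    columns += [np.asarray(r, dtype=float) for r in extra_regressors]
    X = np.column_stack(columns)
    coeffs, *_ = np.linalg.lstsq(X, np.asarray(th_m, dtype=float), rcond=None)
    return float(coeffs[1])  # coefficient on the linear-trend column

# Synthetic monthly series covering 24 years (assumed values throughout).
rng = np.random.default_rng(0)
t = np.arange(288) / 12.0                    # time in years
enso_proxy = np.sin(2 * np.pi * t / 3.7)     # hypothetical stand-in for an index
true_trend = 5.0                             # m/year, chosen for the example
th = (12000.0 + true_trend * t
      + 300.0 * np.sin(2 * np.pi * t)        # seasonal cycle
      + 100.0 * enso_proxy                   # interannual variability
      + rng.normal(0.0, 30.0, t.size))       # noise

trend_multi = fit_trend(t, th, extra_regressors=[enso_proxy])
trend_simple = float(np.polyfit(t, th, 1)[0])  # direct linear fit, no covariates
print(f"multivariate: {trend_multi:.2f} m/yr, simple: {trend_simple:.2f} m/yr")
```

In a real analysis the extra regressors would be observed indices, and the trend uncertainty would need to account for autocorrelation in the residuals, which this sketch does not attempt.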
Technical corrections
- The language is generally clear, but the authors should proofread carefully, as there are minor grammatical and stylistic issues.
- Line 167: The term 'rightward shift' of the temperature profile is awkward; consider using 'warm bias' instead.
- Lines 176 and 180: Similarly, please rephrase 'lower-left region' and 'upper-right region' of the plots for clarity.