Atmospheric geopotentials from ERA5 linked to the daily maximum temperature record-breaking in Spain (1960–2023)

Barrio-Torres, Elsa; Abaurrea, Jesús; Asín, Jesús; Castillo-Mateo, Jorge; Cebrián, Ana Carmen; Gracia-Tabuenca, Zeus

doi:10.5194/egusphere-2026-832

Preprints

10 Apr 2026

Abstract. As the frequency of extreme temperature events increases, so does the need for robust tools to understand them. This work develops and applies a methodological framework to model the occurrence of calendar-day records of daily maximum temperature (T_x) and their relationship with geopotentials. The analysis uses T_x data from 36 Spanish stations (1960–2023) and geopotentials at 300, 500, and 700 hPa. Exploratory analysis revealed a non-stationary trend in records, a higher frequency in the interior of the peninsula, and spatial co-occurrence that decreases with distance. A hierarchical spatio-temporal logistic regression algorithm prioritizing interpretability was designed. The approach involves: (1) fitting local models per station; (2) applying a spatial consensus filter to reduce the initial 1620 parameters to 17 in a base model; and (3) incorporating interaction terms. Among the tested models, a global model that extends the base model with geodetic interactions was selected for its balance between predictive performance and complexity. Geopotentials at 700 hPa are the most relevant for characterizing records, while 300 hPa dominates in the southern corners and 500 hPa in the northern corners. The model demonstrates high predictive accuracy at interior stations and good performance at coastal stations, and it adequately reproduces the persistence of record runs and spatial co-occurrence.
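
A calendar-day record, as the term is used in the abstract, occurs when the T_x value on a given calendar day exceeds every value observed on that same calendar day in earlier years. A minimal Python sketch of the corresponding binary indicator follows. The function name `calendar_day_records`, the (years × days) array layout, the strict-inequality treatment of ties, and the absence of missing values are all assumptions made for illustration and are not taken from the authors' implementation.

```python
import numpy as np


def calendar_day_records(tx):
    """Flag calendar-day records in a (n_years, n_days) array of T_x values.

    Row t holds year t; column d holds calendar day d. An entry is a record
    when it strictly exceeds all earlier years' values for the same day.
    The first year is a record by convention. Assumes no missing values.
    """
    tx = np.asarray(tx, dtype=float)
    # Running maximum over years, computed separately for each calendar day.
    running_max = np.maximum.accumulate(tx, axis=0)
    records = np.ones_like(tx, dtype=bool)
    # Compare each year with the maximum of all previous years.
    records[1:] = tx[1:] > running_max[:-1]
    return records


# Toy example (hypothetical values): 4 years, 2 calendar days.
toy_tx = [[30.0, 25.0],
          [29.0, 26.0],
          [31.0, 26.0],
          [31.0, 27.0]]
toy_records = calendar_day_records(toy_tx)
print(toy_records.astype(int))
```

Stacking these indicators across stations and days would give the binary response for a logistic regression of the kind described above.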

Received: 11 Feb 2026 – Discussion started: 10 Apr 2026

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Download & links

Preprint (PDF, 3983 KB)

Supplement (2316 KB)

Status: final response (author comments only)

RC1: 'Comment on egusphere-2026-832', Anonymous Referee #1, 12 May 2026

The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2026/egusphere-2026-832/egusphere-2026-832-RC1-supplement.pdf

Citation: https://doi.org/10.5194/egusphere-2026-832-RC1
- AC2: 'Reply on RC1', Elsa Barrio-Torres, 27 Jul 2026
  
  Please find our detailed response in the attached PDF.
  
  Citation: https://doi.org/10.5194/egusphere-2026-832-AC2
RC2: 'Comment on egusphere-2026-832', Anonymous Referee #2, 20 May 2026

GENERAL COMMENTS
This manuscript presents a timely and interesting contribution to the statistical modelling of heat-related extremes in Spain. The authors develop a hierarchical spatio-temporal logistic-regression framework to model the occurrence of daily maximum-temperature calendar-day records using ERA5 geopotential covariates at 300, 500, and 700 hPa. The topic is relevant to NHESS because record-breaking temperatures are high-impact climate hazards with implications for public health, agriculture, drought stress, wildfire risk, energy demand, and heat-stress management.
The strengths of the manuscript are the use of a long observational record over Spain, the focus on calendar-day records with well-known theoretical properties under stationarity, the interpretable modelling strategy, and the attempt to link surface temperature records to large-scale atmospheric circulation. The open repository is also a positive aspect for reproducibility.
However, several methodological and presentation issues should be addressed before publication. In particular, the manuscript needs a clearer treatment of temporal non-stationarity, the intrinsic time dependence of record probabilities, calibration of rare-event probabilities, uncertainty in variable selection, and consistency in the training/testing periods. I therefore recommend major revisions.
SPECIFIC COMMENTS
1. The manuscript is relevant to natural hazards, but the introduction and discussion should make the hazard link clearer. Record-breaking daily maximum temperatures should be framed not only as a statistical climate indicator, but also as an operationally relevant heat-hazard signal. The authors should explain how this record-based approach complements classical heatwave definitions based on absolute or percentile thresholds.
2. The title and abstract refer to 1960–2023, but the data description states that the testing period covers 2012–2024, while later sections refer to validation over 2011–2023. This is confusing and must be corrected. Please clearly state the exact observational period, training years, validation/testing years, and whether 2024 is included or not.
3. Include a baseline record-probability structure: a key property of record events is that, under stationarity and independence, the probability of a record at time t is 1/t. The manuscript discusses this in the exploratory analysis, but it is not clear whether the final logistic models include this baseline record-age effect. Please compare the proposed model against at least three baselines: (i) a stationary 1/t record-probability model; (ii) a time-trend-only or year-smooth model; and (iii) a geopotential-only model. This would show the added value of the atmospheric covariates beyond the changing record baseline and the long-term warming trend.
4. The manuscript shows that both temperature records and geopotentials exhibit upward trends. This creates a possible confounding issue: the model may partly capture long-term warming rather than a dynamically meaningful circulation–record relationship. Please test whether the selected geopotential predictors retain skill after detrending or anomaly-standardizing them relative to a moving climatology. A useful sensitivity analysis would compare raw geopotentials, standardized anomalies, detrended geopotentials, and models including an explicit year effect.
5. The manuscript relies strongly on AUC. AUC is useful but not sufficient for rare-event prediction, especially when non-records dominate the dataset. Please add calibration and rare-event metrics, including Brier score, log score, reliability diagrams, calibration slope/intercept, precision–recall AUC, and sensitivity/specificity at selected thresholds. If the model is proposed as a downscaling or prediction tool, calibration is as important as discrimination.
6. The proposed local-to-global selection procedure is interesting, but it relies on stepwise regression, z-score thresholds, and AIC-like penalties. Stepwise procedures can be unstable with correlated atmospheric predictors and spatially/temporally dependent observations. Please add a stability analysis, for example using block bootstrapping by year or leave-one-year/block-out resampling. Report how often each predictor is selected and provide uncertainty intervals for key model coefficients or partial effects.
7. The global model concatenates binary indicators across stations and days, but the manuscript itself shows strong temporal persistence and spatial co-occurrence. This dependence may affect standard errors, z-scores, and variable-selection thresholds. Please clarify whether standard errors are corrected for clustering by day, year, or station. If not, either add cluster-robust/block-bootstrap uncertainty estimates or state clearly that coefficient significance is used only heuristically for screening.
8. The exploratory analysis shows strong persistence in record occurrence, and the manuscript evaluates whether simulated series reproduce run lengths. However, the fitted model does not appear to include an explicit previous-day record term or lagged Tx anomaly. Please explain whether persistence is reproduced indirectly through geopotential persistence. A useful sensitivity test would compare the selected model with and without a previous-day record indicator or lagged Tx anomaly.
9. Discuss spatial co-occurrence more cautiously: the model-simulated Jaccard indices show a positive relationship with the empirical Jaccard indices, but the simulated values are systematically lower in magnitude. Therefore, the model partially captures the spatial co-occurrence pattern but underestimates its strength. Please avoid wording suggesting that spatial dependence is fully reproduced.
10. Provide more detail on data quality and station homogenization: please specify whether the ECA&D Tx series are homogenized, how missing values are handled in the record calculation, and whether urban heat island or station-change effects were screened.
11. The use of 12:00 geopotential at 1° resolution should be better justified. Since Tx often occurs in the afternoon, the authors should explain why 12:00 was selected instead of 00:00, 06:00, 18:00, daily mean fields, or geopotential-height anomalies. It would also be useful to clarify whether results change if geopotential is expressed as geopotential height in meters, which is more interpretable meteorologically.
12. Clarify the spatial prediction maps: the paper presents maps of predicted record probability during the August 2023 event. Please explain exactly how these maps are produced from a station-based model. Are predictions made only at station locations and interpolated, or are covariates and geodetic terms evaluated continuously over the grid? What climatic covariates are assigned to non-station grid cells?
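
The stationary baseline mentioned in comment 3 can be sketched as follows. For an i.i.d. continuous series, the probability of a record in year t of the series is 1/t, so the expected number of records per calendar day over n years is the harmonic sum 1 + 1/2 + ... + 1/n. The function names are hypothetical, and n = 64 assumes the 1960–2023 period given in the title.

```python
import numpy as np


def stationary_record_probability(n_years):
    """Return P(record in year t) = 1/t for t = 1..n_years.

    Valid for an i.i.d. continuous series (stationarity and independence).
    """
    t = np.arange(1, n_years + 1)
    return 1.0 / t


def expected_record_count(n_years):
    """Expected number of records in n_years: the harmonic number H_n."""
    return float(stationary_record_probability(n_years).sum())


# Assumed study length: 1960-2023 inclusive, i.e. 64 years.
baseline_p = stationary_record_probability(64)
baseline_expected = expected_record_count(64)
print(baseline_p[-1], baseline_expected)
```

Observed record counts well above this expectation in the later years would be consistent with the non-stationarity discussed in the comments; the 1/t probabilities can also serve as a reference forecast when scoring a fitted model.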
TECHNICAL CORRECTIONS
The manuscript is generally readable, but several editorial corrections are needed. The authors should:
- correct the study-period inconsistency: 1960–2023, 2012–2024, and 2011–2023 cannot all be correct simultaneously.

- correct “eventSs” in the Introduction.

- complete the sentence referring to NOAA/NCEI records; the phrase “see Similarly” appears to be missing a link or punctuation.

- write “Mediterranean Sea,” “Pacific–North American pattern,” “North Atlantic Oscillation,” and “Pacific Decadal Oscillation” with proper capitalization.

- correct “grid were geopotential variables were extracted” to “grid where geopotential variables were extracted.”

- use “co-occurrence” consistently; “co-ocurrence” appears in several places.

- remove the comma in “50,km”.

- define all abbreviations at first use, including Tx, AUC, AIC, ROC, Jaccard index, and LOESS.

- use consistent notation for pressure levels: 300 hPa, 500 hPa, and 700 hPa.

- consider reporting geopotential height units in meters in addition to geopotential units in m² s⁻² for meteorological interpretability.

- ensure all equation numbers are referenced in the text.

- avoid overly strong wording such as “high predictive accuracy” unless supported by calibrated rare-event metrics, not only AUC.

- replace “This type of tools have” with “This type of tool has” or “These types of tools have.”

- ensure that all supplementary figures and tables are cited in numerical order.
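
For the unit suggestion in the list above, geopotential in m² s⁻² is conventionally converted to geopotential height in metres by dividing by standard gravity, g0 = 9.80665 m s⁻². A minimal sketch, with a hypothetical function name and an illustrative input value rather than ERA5 data:

```python
STANDARD_GRAVITY = 9.80665  # m s^-2, conventional standard value g0


def geopotential_to_height(geopotential):
    """Convert geopotential (m^2 s^-2) to geopotential height (m)."""
    return geopotential / STANDARD_GRAVITY


# Illustrative value only: a geopotential chosen to give exactly 3000 m.
example_geopotential = 3000.0 * STANDARD_GRAVITY
example_height = geopotential_to_height(example_geopotential)
print(example_height)
```

Because the conversion is a constant rescaling, it changes the units and magnitude of fitted coefficients but not the fit of a logistic regression that uses the rescaled covariate.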
RECOMMENDATION
I recommend major revisions. The current version requires stronger treatment of temporal non-stationarity, baseline record probability, rare-event calibration, uncertainty in variable selection, and validation-period consistency. Addressing these points would substantially improve the robustness and NHESS relevance of the study.
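
As one illustration of the rare-event calibration metrics requested above, the Brier score and a Brier skill score relative to a reference forecast can be computed as in the sketch below. The function names and the toy probabilities are hypothetical and are not results from the manuscript. The reference forecast here is a constant climatological frequency, though a 1/t baseline could be substituted.

```python
import numpy as np


def brier_score(y_true, p_pred):
    """Mean squared difference between forecast probability and outcome."""
    y_true = np.asarray(y_true, dtype=float)
    p_pred = np.asarray(p_pred, dtype=float)
    return float(np.mean((p_pred - y_true) ** 2))


def brier_skill_score(y_true, p_model, p_reference):
    """1 - BS_model / BS_reference; positive values beat the reference."""
    bs_model = brier_score(y_true, p_model)
    bs_reference = brier_score(y_true, p_reference)
    return 1.0 - bs_model / bs_reference


# Toy data (hypothetical): one record among four station-days.
toy_y = [0, 0, 0, 1]
toy_p_model = [0.1, 0.1, 0.1, 0.7]
toy_p_reference = [0.25, 0.25, 0.25, 0.25]  # constant base rate
toy_bs = brier_score(toy_y, toy_p_model)
toy_bss = brier_skill_score(toy_y, toy_p_model, toy_p_reference)
print(toy_bs, toy_bss)
```

Unlike AUC, the Brier score penalizes miscalibrated probabilities, so a model can rank events well and still score poorly.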

Citation: https://doi.org/10.5194/egusphere-2026-832-RC2
- AC1: 'Reply on RC2', Elsa Barrio-Torres, 27 Jul 2026
  
  Please find our detailed response in the attached PDF.
  
  Citation: https://doi.org/10.5194/egusphere-2026-832-AC1

Supplement

https://doi.org/10.5194/egusphere-2026-832-supplement

Viewed

Total article views: 582 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	Supplement	BibTeX	EndNote
356	193	33	582	59	25	26

Views and downloads (calculated since 10 Apr 2026)

Month	HTML	PDF	XML	Total
Apr 2026	166	46	16	228
May 2026	145	99	8	252
Jun 2026	15	9	4	28
Jul 2026	30	36	5	71
Aug 2026	0	3	0	3

Viewed (geographical distribution)

Total article views: 572 (including HTML, PDF, and XML). Thereof 572 with geography defined and 0 with unknown origin.

Latest update: 02 Aug 2026

Short summary

Extreme temperatures are becoming more frequent, posing risks to people and the environment. We analyzed the link between historical temperature records from Spanish stations and atmospheric geopotentials at multiple heights to reveal patterns of record-breaking heat in peninsular Spain. Our method identifies where and when extreme events occur and which geopotentials drive them, offering a tool to characterize and predict extreme heat events applicable to any region worldwide.

