the Creative Commons Attribution 4.0 License.
The CMIP6-downscaled CORDEX-Southeast Asia (SEA) ensemble: evaluation and benchmarking for megacities of SEA
Abstract. A 21-member ensemble of regional climate simulations has been produced for Southeast Asia (SEA) by dynamically downscaling Coupled Model Intercomparison Project Phase 6 (CMIP6) Global Climate Models (GCMs) under the World Climate Research Programme’s Coordinated Regional Climate Downscaling Experiment (CORDEX). The ensemble was generated by several modelling institutes using three regional climate models (RCMs) with eight distinct model configurations, resulting in a total of 62 simulations spanning the historical period and multiple future emissions scenarios. Model performance for mean, daily maximum/minimum temperature, and precipitation was evaluated against multiple observations at annual, seasonal, and daily time scales over SEA and its two subregions: Mainland and Maritime Continent (MC). Despite large observational uncertainties in precipitation intensity, the CMIP6 CORDEX-SEA ensemble captures the spatial and seasonal rainfall distribution reasonably well but tends to substantially overestimate observed rainfall. Wet biases, evident in about two-thirds of the models, are regionally and seasonally heterogeneous and larger over monsoon-dominated regions and seasons (e.g., MC during November–April and the Mainland during May–October). All RCMs showed widespread, statistically significant cold biases in daily mean temperature, which were largest during boreal winter, over the Mainland, and in simulations that have significant wet biases. These cold biases primarily arise from the models’ underestimation of daily maximum temperature. The MC remains a challenging region since models struggle to accurately capture the spatial variability of rainfall and the internal variability of temperature. A standardised benchmarking framework was applied to precipitation and temperature, which ultimately identified 15 historical simulations that met our a priori model performance expectations. 
Analysing the range of future projections and model independence shows that simulations from the same RCM family exhibit similar bias structures, highlighting the importance of RCM setup and the selection of statistically independent models. From this process, eight simulations spanning three RCM configurations were selected for further kilometre-scale dynamical downscaling over megacities of SEA.
Status: final response (author comments only)
-
CEC1: 'Comment on egusphere-2026-1325 - No compliance with the policy of the journal', Juan Antonio Añel, 28 Mar 2026
-
AC1: 'Reply on CEC1" Revised Code and Data availability"', Phuong Loan Nguyen, 29 Mar 2026
Dear Chief Editor,
Thank you very much for your careful assessment of our manuscript and for highlighting the issues related to compliance with the GMD's Code and Data Policy. We sincerely appreciate your detailed guidance and fully understand the importance of ensuring transparency, reproducibility, and long-term accessibility of code and data.
In response to your comment, we have taken substantial steps to bring our manuscript into compliance with the GMD Code and Data Policy. The Code and Data Availability section has been fully revised, and the required repositories and persistent identifiers are now provided as follows:
Code and Data Availability
Code for benchmarking the CMIP6-downscaled CORDEX-Southeast Asia (SEA) performance (Isphording, 2024) is available at https://doi.org/10.5281/zenodo.8365065 (Isphording, 2023).
Model source codes used in this study are available as follows:
- RegCM4 is available at https://zenodo.org/records/4603556 (Coppola et al., 2021)
- RegCM5 is available at https://zenodo.org/records/17348623 (Giorgi et al., 2024)
- CCAM-2307 source code is available at https://doi.org/10.5281/zenodo.19303856 (Nguyen et al., 2026)
- HadGEM3-RA07 configuration and model information are available through https://doi.org/10.5194/gmd-16-1713-2023 (Bush et al., 2023)
The supporting dataset for this study is archived in Zenodo:
Nguyen, P. L., Alexander, L., Ngo-Duc, T., Cruz, F. A., Santisirisomboon, J., Juneng, L., Permana, D. S., Chung, J. X., Dado, J. M., McGregor, J. L., Redmond, G., Po, W., Tangang, F., Phan-Van, T., Truong, S. C. H., Thatcher, M., Trinh-Tuan, L., Ma'rufah, U., Tibay, J., White, S. (2026). Dataset for "The CMIP6-downscaled CORDEX-Southeast Asia (SEA) ensemble: evaluation and benchmarking for megacities of SEA" (Version v1). Zenodo. https://doi.org/10.5281/zenodo.19279601
This repository contains representative subsets of CMIP6-downscaled CORDEX-SEA simulations and observational datasets used in the analysis, which are sufficient to reproduce the evaluation workflow and figures presented in the manuscript.
Full CMIP6-downscaled CORDEX-SEA simulation data are being made available through the Earth System Grid Federation (ESGF) and the Australian National Computational Infrastructure (NCI) and are planned to be publicly released in 2026 due to their large data volume.
The original observational and reanalysis datasets used in this study are publicly available from the following sources:
- APHRODITE version V1101R1 and V1101XR (Yatagai et al., 2012): https://www.chikyu.ac.jp/precip/english/index.html
- SACA&D (Van den Besselaar et al., 2017): https://sacad.bmkg.go.id/
- CRU TS4.08 (Harris et al., 2020): https://crudata.uea.ac.uk/cru/data/hrg/
- ERA5 (Hersbach et al., 2020): https://doi.org/10.24381/cds.bd0915c6 (Hersbach et al., 2023)
- Berkeley Earth Surface Temperature (BEST) (Rohde and Hausfather, 2020): https://berkeleyearth.org/data/
- CHIRPS-v2, REGEN_ALL, and GPCC_FDD_2022 from the FROGs database (Roca et al., 2019): https://frogs.ipsl.fr/
Note that this part, along with full citation for the main DOI links included in the section "Code and data availability" will be added to the "References" section of the manuscript in the next revision.
We hope that this revision will address the concerns raised and bring the manuscript into compliance with the GMD Code and Data Policy.
Thank you once again for your guidance, and we are happy to make any further adjustments if required.
Kind regards,
Phuong Loan Nguyen
on behalf of all co-authors
Citation: https://doi.org/10.5194/egusphere-2026-1325-AC1 -
CEC2: 'Reply on AC1', Juan Antonio Añel, 30 Mar 2026
Dear authors,
Many thanks for your quick reply. Unfortunately, you have not addressed the issues with the data. You continue to say that the output of the simulations will be made available during 2026, which we cannot accept. In such a case, the sensible thing to do regarding your manuscript is to stall the review process and wait until the data are public. Actually, good practice would have been not to submit the manuscript until the data were available. As I mentioned in my previous comment, if the size of the simulations makes it impossible to store them elsewhere, then we would expect you to state this, providing the total size of the data, and to share a repository that contains at least a sensible number of sample files containing the necessary variables.
Also, for the "original observational and reanalysis datasets" you continue to cite webpages that are not repositories, which we cannot accept. You should store the observational and reanalysis data that you have used in an appropriate repository, unless the terms and conditions of such datasets do not allow it.
Therefore, please, reply to this comment addressing the mentioned issues.
Juan A. Añel
Geosci. Model Dev. Executive Editor
Citation: https://doi.org/10.5194/egusphere-2026-1325-CEC2 -
AC2: 'Reply on CEC2', Phuong Loan Nguyen, 31 Mar 2026
Dear Chief Editor,
Thank you very much for your patience and for your clarification. As requested, we have updated the Code and Data Availability section of the manuscript to fully address the concerns raised in your previous comment.
In particular, we have now archived the observational datasets used in our analysis (APHRODITE, REGEN_ALL, GPCC_FDD, CHIRPSv2, SACA&D, BEST, ERA5) on Zenodo: https://doi.org/10.5281/zenodo.19334323. We have also uploaded the simulation data, including all variables used in the analysis (pr, tas, tasmax, tasmin), to Zenodo at https://doi.org/10.5281/zenodo.19334179. The Code and Data Availability section has been revised accordingly and is provided below for your review.
Code and Data Availability
Code for evaluation and benchmarking of the CMIP6-downscaled CORDEX-Southeast Asia (SEA) ensemble is available at https://doi.org/10.5281/zenodo.8365065 (Isphording, 2023).
Model source codes used in this study are available as follows:
- RegCM4: https://zenodo.org/records/4603556 (Coppola et al., 2021)
- RegCM5: https://zenodo.org/records/17348623 (Giorgi et al., 2024)
- CCAM-2307: https://doi.org/10.5281/zenodo.19303856 (Nguyen et al., 2026a)
- HadGEM3-RA07: https://doi.org/10.5194/gmd-16-1713-2023 (Bush et al., 2023)
All simulations and variables used in this study are archived in repositories as follows:
- Simulation data: https://doi.org/10.5281/zenodo.19334179 (Nguyen et al., 2026b)
- Reference datasets: https://doi.org/10.5281/zenodo.19334323 (Nguyen et al., 2026c)
We hope that these updates fully meet GMD’s requirements for transparency, reproducibility, and compliance with the journal’s Code and Data Policy. Please let us know if any further adjustments or additional information is required. We would be happy to make any further revisions if needed.
Thank you again for your guidance and support in ensuring that the manuscript complies with GMD’s Code and Data Policy.
Best regards,
Phuong Loan Nguyen
On behalf of all authors
Citation: https://doi.org/10.5194/egusphere-2026-1325-AC2 -
CEC3: 'Reply on AC2', Juan Antonio Añel, 01 Apr 2026
Dear authors,
Thanks for addressing this issue so quickly. I have checked the repositories, and we can now consider the current version of your manuscript to be in compliance with the code policy of the journal.
Juan A. Añel
Geosci. Model Dev. Executive Editor
Citation: https://doi.org/10.5194/egusphere-2026-1325-CEC3
-
RC1: 'Comment on egusphere-2026-1325', Anonymous Referee #1, 13 Apr 2026
1. The introduction is comprehensive and well supported by relevant literature. However, the overall narrative could be clearer. The progression from CORDEX to CMIP5 and CMIP6 is somewhat non-linear. The distinction between background, previous studies, and the specific contribution of this study is not always clear. As a result, the main focus of the paper, standardized evaluation and model selection for future city-scale downscaling, is not sufficiently emphasized. A clearer structure would improve readability and highlight the key contribution.
2. The description of the study objectives is somewhat unclear. In Section 2.3.1, model evaluation is referred to as the “second objective”, and in Section 2.3.2, model selection is described as the “third objective”. However, the first objective is not explicitly defined. It seems that the construction or documentation of the CMIP6 CORDEX-SEA ensemble (Section 2.1) may correspond to the first objective, but this is not clearly stated. It would help to clearly define all study objectives earlier in the introduction. In addition, the distinction between Sections 2.2 (data) and 2.3.1 (evaluation methodology) could be made clearer to avoid possible overlap.
3. Around line 321, the text refers to Fig. 1 for tas biases, while the results appear to correspond to Fig. 2.
4. The discussion of the seasonal-cycle benchmark over the Maritime Continent is reasonable. However, a brief explanation of why the unimodal benchmark is not applicable in this region would improve clarity. Adding a supporting reference may also help.
Overall, this is a solid and valuable study with clear relevance to regional climate modelling. The methodology is sound and the results are useful, although the presentation and narrative structure could be improved to better highlight the key contributions.
Citation: https://doi.org/10.5194/egusphere-2026-1325-RC1 -
RC2: 'Comment on egusphere-2026-1325', Anonymous Referee #2, 02 Jun 2026
The manuscript describes an ensemble of CMIP6 downscaling simulations and evaluates them to identify a subset that can be further downscaled to city scales for climate impact assessment. As per the GMD Review Criteria, my Scientific Significance score is Excellent (it furnishes a wealth of information on the performance of different regional climate models and their configurations); Scientific Quality is Excellent (the simulations have huge potential for further research leading to significant scientific results); Scientific Reproducibility is Excellent (data and models are sufficiently described and all links in the Code and Data Availability section work); and Presentation Quality is Good (figures are clear, manuscript is well organised, but some descriptions are a little confusing). Overall the manuscript is perfectly suited for GMD and the authors have done a good job evaluating such a large ensemble of data and distilling actionable information from it.
My comments mainly concern the organisation of the manuscript and ensuring clarity in the communication of results, where I found some descriptions confusing. There are also a few topics I would like to see more discussion on. These issues should be straightforward to address and I expect the manuscript can be published thereafter. I recommend major revisions rather than minors because of the number of points I’ve raised.
Major Comments
- The objectives of the study could be made clearer: they are not stated up front but are referred to throughout the manuscript. In the benchmarking section, line 256 states “the third objective of this study, which is to identify a subset of CMIP6-downscaled RCMs that meet predefined performance criteria so as to be considered for further high-resolution dynamical downscaling”. On line 235: “the second objective of this study, which is to evaluate the ability of the CMIP6 CORDEX-SEA ensemble in simulating climatology of four model core variables”. And later on in line 695: “the objective here is to select suitable RCMs for km-scale downscaling”. Please make these objectives explicit in the introduction (e.g. the penultimate paragraph).
- I’m not familiar with dendrograms (Figure 14) and would appreciate a little more description on how to read this figure. What does it tell us? I can see that CCAM models are yellow and RegCM models are green, and so are similar (which makes sense). However, I’m not seeing what the manuscript says, which is that there are three “clusters” in Figs 14a,c in contrast to two “branches” in Figs 14b. I see two branches in each panel (yellow and green). Moreover, I’m not sure how this figure informs us about which models to select for further dynamical downscaling. Unless its utility to this task can be clarified, I recommend dropping this figure.
- I think some more discussion on the potential causes of the models’ inabilities is warranted. A principal finding is that downscaled simulations substantially overestimate precipitation over the maritime continent, especially during the monsoon season. I think it’s worth adding some hypothesising as to why the models have large wet and cold biases. My first thought on the wet bias was that perhaps the models are not adequately simulating the temperature gradient between the continent and the ocean. However, another principal finding is that the simulations tended to underestimate near-surface temperature (particularly over the mainland). Perhaps the models do not represent regional SST patterns and magnitudes. I think this intrigue deserves some discussion.
Minor Comments
- Line 28. General Circulation Model.
- Line 51. Missing word in “accurately mesoscale”?
- Line 54. There is a mix of British English (e.g., modelling, standardise) and American English spellings (e.g. standardize, urbanization, parameterization). The choice is unimportant, of course, but I wanted to point it out if the authors wish to consistently adopt one over the other.
- Line 70. Resolution and grid spacing are not technically the same thing. Recommend replacing “at 25-km resolution” with “with a grid spacing of 25 km”.
- Line 74–75. This sentence is nonsensical as it compares a multi-model ensemble with itself: The multi-model ensemble provides an improved representation of climatological precipitation but also exhibits larger inter-model variability compared with multi-model ensemble.
- Line 84. “the observed and present-day climate” are the same thing. Better to say “reproducing the historical climate”.
- Line 90. The previous sentence specifies what the authors will present. The first sentence on line 90, however, I think is a general statement. When it says “Cross-ensemble evaluations of model ensembles over SEA have been conducted using various approaches” I think this is a statement about what’s in the literature, not what the authors have done. Potential confusion could be avoided here by inserting “using various approaches in the literature …”.
- Line 90–91. The sentence “The most common…specific applications” seems like it’s missing a word or was maybe meant to be tied to the previous or next sentence?
- Line 95. I think this is the first use of “RegCM4” but has not been defined.
- Line 129. I think this sentence flows better when deleting the “and” before “spanning”.
- Line 253. Since the area score metric (ASM) is not a traditional metric and won’t be familiar to readers, I think more details are warranted (e.g. how it is calculated, what a perfect score would be, and appropriate equations). Since ASM is the area between the model distribution and the observed one, is it always positive i.e. can it differentiate between model underestimation and overestimation?
- Line 275. Since one aim of the article is to identify models for further downscaling I think this sentence would work better if it defines the models for inclusion, rather than exclusion.
- Line 297. Should this be “It’s worth noting”?
- Line 298. “also exhibits a cold biases”.
- Line 299. “exhibits”.
- Line 308. Is this a two-sided test on the observed and model mean bias errors (MBE)?
- Line 351. “ASM” undefined.
- Line 379. “performance over the mainland is somewhat improved compared with that in the Mainland.” Wrong word?
- Line 392–394. “Despite this, several simulations (e.g., CCAM-Mod2021-ACCESS-CM2, CCAM-2017-ACCESS-CM2, CCAM-2017-EC-Earth3-Veg, and HadGEM3-RA7-EC-Earth3-Veg) show relatively good agreement with observations across the full distribution and across multiple reference datasets (Figs. S8-10).” This seems like an important point when identifying simulations for further downscaling. Maybe make sure this is highlighted in the conclusions.
- Line 404. Should this sentence end “tas and tasmin” rather than “tas and tasmax”?
- Line 406–408. I think it could be beneficial to mention here which models did actually do a good job at capturing the spatial temperature distribution.
- Line 420. “the bias- and variance-corrected SST forcing”. Does HadGEM3-RA7 also use bias- and variance-corrected SST, as mentioned for CSIRO-CCAM on line 174? If so, please note it in the methods section.
- Lines 423–433. When describing model performance please cite the appropriate figure panels.
- Line 427–428. Why is the GCM nudging technique mentioned when it’s the same for both models and CCAM-2017 improved over CCAM-2021? Wouldn’t it be better to cite differences in the model configurations?
- Line 430. “The pattern of the NDJFMA reveals…” Missing word?
- Line 430–431. “RCMs show better skill in capturing the spatial distribution of precipitation, with dry biases over the mainland and wet biases over MC.” The first time I read this sentence it sounded perfectly straightforward, but on a second reading I’m a little confused. What is being compared to when you say “better”? It doesn’t seem to follow that the models capture the spatial distribution of precipitation better than they do that of temperature because Scors are substantially higher in Figs 3 and 4’s annotations than in Figs 6 and 7’s. Please clarify.
- Lines 446–448. “The RCMs generally reproduce the peak rainfall over the Mainland; however, most RegCM simulations fail to adequately capture the precipitation peaks over the MC.” Is this true? Looking at Fig S12, there is indeed a large wet bias like over the Mainland, but most models seem to dip during June–October. Also, rather than “peaks” isn’t there only a single peak over the maritime continent (Nov–April) that only appears as two because of how it’s plotted (with Jan and Dec on opposite ends)?
- Lines 478–480. Initialisms help us say things more efficiently, but there are nine in this sentence. I think it’s easier to understand when writing out MSM and BMF.
- Line 492. Annual average of daily mean near-surface temperature?
- Line 497. “testing” or further dynamical downscaling for CARE for SEA cities?
- Line 500. This sentence is a little confusing. What is the “strategy” mentioned here to identify thresholds? They simply seem chosen. I think the manuscript can be clearer about why these thresholds were chosen.
- Line 503. “for [the] spatial correlation metric”?
- Line 506. Floating &?
- Line 509–510. In what sense are gauge stations drier? Relative to what? Is this sentence trying to say that gauge stations in regional datasets tend to be from climatologically dry regions? Please clarify.
- Line 511–512. Delete the period. I’m confused by the word of “exceeding thresholds for at least three reference datasets. – regardless of the reference used”. (presumably the period also wasn’t meant to be there).
- Line 519–520. This sentence is a bit confusing to me. I can see that RegCM-CESM2 and RegCM-MIROC6 are below the critical thresholds across almost all reference datasets. But Fig. 10 condenses the spatial correlation into a single number. So when the manuscript says the models fail to capture the “spatial contrast”, I feel like that references Figures 6 and 7, and I’m not sure what’s being highlighted. Please clarify.
- Line 536. Orange cells are very hard to see and too similar to the other colors in the colorbar. Recommend boxing out simulations that don’t meet the benchmarking expectations, as in Figure 10.
- Line 559. Please add how confidence intervals are calculated to the caption.
- Line 587. Please add the meaning of blue highlighted numbers to the caption.
- Lines 627–629. Do high range, mid range and low range correspond to emission scenarios? Please clarify.
- Line 639. Period instead of a comma?
- Line 649. I think this sentence can be simplified to give readers an easier time understanding it. By “spatial contrast” and “temporal mean precipitation distribution” do we mean ‘spatial distribution’ and ‘seasonality’? The first sentence of (2) is very clear and a good template.
- Line 650. “tends to substantially overestimate observed precipitation intensity”.
- Line 654. What is a “process-based metric”? Please clarify.
- Line 658. There’s a space in CRUTS.
- Line 681. Models with 25 km grid spacing will assuredly have difficulty predicting rainfall over mountain terrain. But how does the complex terrain result in an inability to capture the spatial variability? At 25 km, don't the models have sufficient resolution to show where the topography actually is (as we see in the wet biases)? Can you clarify? This makes me wonder again about the ability of the models (GCMs and RCMs) to represent regional SST.
- Line 708. Replace “a range of climate variables” with “temperature and precipitation”.
- Line 712. What case study? A case study is not mentioned anywhere else in the manuscript.
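On the Line 253 comment above, the sign question can be illustrated with a minimal sketch. It assumes the ASM is the area between the empirical cumulative distributions of the model and the observations; the manuscript's exact definition is not reproduced here and may differ. The function name `cdf_area_scores` and the grid size are hypothetical choices made only for this illustration.

```python
import numpy as np


def cdf_area_scores(model, obs, n_grid=2001):
    """Return (absolute_area, signed_area) between two empirical CDFs.

    Assumption: the score is the area between the model and observed
    empirical CDFs. This is an illustrative stand-in, not the
    manuscript's definition of the area score metric.
    """
    model = np.sort(np.asarray(model, dtype=float).ravel())
    obs = np.sort(np.asarray(obs, dtype=float).ravel())

    # Common evaluation grid spanning both samples.
    lo = min(model[0], obs[0])
    hi = max(model[-1], obs[-1])
    grid = np.linspace(lo, hi, n_grid)

    # Empirical CDF: fraction of samples less than or equal to each grid value.
    cdf_model = np.searchsorted(model, grid, side="right") / model.size
    cdf_obs = np.searchsorted(obs, grid, side="right") / obs.size

    diff = cdf_obs - cdf_model
    dx = np.diff(grid)

    def trapezoid(y):
        # Trapezoidal rule written out by hand to avoid NumPy version differences.
        return float(np.sum(0.5 * (y[1:] + y[:-1]) * dx))

    # The absolute area is non-negative by construction; the signed area
    # is positive when the model distribution sits to the right of the
    # observed one (model values larger on average).
    return trapezoid(np.abs(diff)), trapezoid(diff)


if __name__ == "__main__":
    obs = np.linspace(0.0, 10.0, 1001)
    print(cdf_area_scores(obs + 2.0, obs))  # model shifted towards larger values
    print(cdf_area_scores(obs - 2.0, obs))  # model shifted towards smaller values
```

Under this assumed definition the absolute area cannot separate overestimation from underestimation, while the signed area approximates the difference in means (model minus observation) and therefore carries the direction of the bias but not differences in spread.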
Citation: https://doi.org/10.5194/egusphere-2026-1325-RC2
Viewed
| HTML | PDF | XML | Total | Supplement | BibTeX | EndNote |
|---|---|---|---|---|---|---|
| 1,090 | 456 | 82 | 1,628 | 123 | 55 | 73 |
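CEC1: 'Comment on egusphere-2026-1325 - No compliance with the policy of the journal', Juan Antonio Añel, 28 Mar 2026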
Dear authors,
Unfortunately, after checking your manuscript, it has come to our attention that it does not comply with our "Code and Data Policy".
https://www.geoscientific-model-development.net/policies/code_and_data_policy.html
The main problem with your manuscript is that the data corresponding to the simulations described in it are not available. It must be clear that manuscripts can only be submitted to Geosci. Model Dev. and accepted for Discussions and peer review after all the code and data are publicly available. Also, from a strict point of view, we cannot accept the ESGF as a trusted repository for long-term archival, and you state that your data will be hosted in it. While we can grant you an exception to this rule in this case, because the size of your simulations probably means they cannot be hosted adequately elsewhere, you should at least store representative data of the full model outputs in a repository we can accept according to the policy of the journal, and do so before submitting the manuscript.
Also, for your work you have used many models, and do not provide a repository containing the code of each one of them, which our policy requires.
Additionally, in the "Code and data availability" section you cite several sites for other datasets, which are not long-term repositories we can accept because they do not comply with GMD’s requirements for a persistent data archive:
* They do not appear to have a published policy for data preservation over many years or decades (some flexibility exists over the precise length of preservation, but the policy must exist).
* They do not appear to have a published mechanism for preventing authors from unilaterally removing material. Archives must have a policy which makes removal of materials only possible in exceptional circumstances and subject to an independent curatorial decision.
* They do not appear to issue a persistent identifier such as a DOI or Handle for each precise dataset.
If we have missed a published policy which does in fact address this matter satisfactorily, please post a response linking to it. If you have any questions about this issue, please post them in a reply.
The GMD review and publication process depends on reviewers and community commentators being able to access, during the discussion phase, the code and data on which a manuscript depends, and on ensuring the provenance and replicability of the published papers for years after their publication. Please, therefore, publish your code and data in one of the appropriate repositories and reply to this comment with the relevant information (a link and a permanent identifier for it, e.g. a DOI) as soon as possible. We cannot have manuscripts under discussion that do not comply with our policy.
The "Code and Data Availability" section must also be modified to cite the new repository locations, and corresponding references added to the bibliography.
I understand that the work that this involves in the case of your manuscript could be substantial. However, I must note that if you do not reply to this comment and make reasonable efforts to bring your manuscript into compliance with the policy of the journal by fixing the above-mentioned issues, we cannot continue with the peer-review process or accept your manuscript for publication in GMD.
Juan A. Añel
Geosci. Model Dev. Executive Editor