the Creative Commons Attribution 4.0 License.
A synoptic clustering-based definition of South China Sea summer monsoon onset and application to seasonal prediction
Abstract. Accurately predicting the South China Sea summer monsoon (SCSSM) onset date is of crucial importance for effective water resource management, agricultural planning, and disaster prevention across East Asia and the Western North Pacific. However, reliable predictions at seasonal timescales remain challenging. To address this, we propose a synoptic circulation–based approach that defines monsoon onset through persistent large-scale circulation regimes identified by clustering low-level atmospheric fields. Using ECMWF SEAS5 seasonal forecasts, we evaluate onset predictions derived from the proposed regime-based definition against those based on a conventional zonal wind criterion. The regime-based definition yields systematic improvements in deterministic correlations, potential predictability, and categorical and probabilistic skill metrics. Skill gains are evident during both a dependent training period and an independent forecast period, with enhanced performance persisting at long lead times of up to five months. The improved predictability reflects the multi-timescale controls on monsoon onset, wherein slowly varying boundary forcing modulates large-scale circulation and subseasonal variability triggers the onset transition. By emphasizing regime persistence and structural evolution, the proposed framework better isolates the predictable component of onset variability and enhances forecast robustness. These findings demonstrate that circulation-regime–based definitions offer a physically grounded and effective alternative for seasonal prediction of monsoon onset.
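The persistence idea behind a regime-based onset definition can be illustrated with a minimal sketch. It assumes that each time step (for example, a pentad) has already been assigned a cluster label, and that some subset of labels is treated as monsoon-type. The function name, the label values, and the `min_run` length are hypothetical choices for illustration and are not taken from the manuscript.

```python
from typing import Optional, Sequence, Set


def first_persistent_onset(
    labels: Sequence[int],
    monsoon_regimes: Set[int],
    min_run: int = 2,
) -> Optional[int]:
    """Return the index that starts the first run of at least `min_run`
    consecutive monsoon-type labels, or None if no such run exists."""
    run_start = None
    run_length = 0
    for i, label in enumerate(labels):
        if label in monsoon_regimes:
            if run_length == 0:
                run_start = i  # a new candidate run begins here
            run_length += 1
            if run_length >= min_run:
                return run_start
        else:
            run_length = 0  # a non-monsoon step breaks the run
    return None


# Hypothetical label sequence: a brief monsoon-type step at index 2
# is rejected because it does not persist.
labels = [1, 1, 4, 2, 4, 5, 5, 6]
onset = first_persistent_onset(labels, monsoon_regimes={4, 5, 6}, min_run=3)
```

With these made-up labels, the isolated monsoon-type step is skipped and the onset is placed at the start of the first run that lasts three steps.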
Status: open (until 20 May 2026)
RC1: 'Comment on egusphere-2026-358', Anonymous Referee #1, 19 Mar 2026
AC1: 'Reply on RC1', Dzung Nguyen-Le, 21 Apr 2026
We sincerely thank Reviewer #1 for the careful reading of our manuscript and for the constructive and insightful comments. In response, we have revised the manuscript substantially. The main revisions include:
(i) adding OLR and MTG diagnostics to evaluate the thermodynamic and convective consistency of the circulation-based onset definition;
(ii) restructuring Section 3.2 to improve the transition from climatological evolution to interannual forcing;
(iii) adding cross-index validation against the conventional W04 benchmark;
(iv) revising the discussion of extreme years more cautiously and adding representative case analyses; and
(v) softening the interpretation of subseasonal triggering and multi-timescale control.
A detailed point-by-point response is provided in the attached supplement / below.
CC1: 'Comment on egusphere-2026-358', Peng Hu, 30 Mar 2026
The manuscript presents a synoptic circulation–based clustering approach (NL26) for defining the South China Sea summer monsoon onset. Overall, the study is well-conceived and methodologically sound. The proposed approach demonstrates substantial improvements in deterministic, categorical, and probabilistic prediction skill relative to the conventional U850-based onset definition (Wang et al., 2004; W04). The analysis is comprehensive, and the findings are potentially valuable for improving seasonal forecasts. However, several aspects of the manuscript require further clarification and elaboration before the work can be considered for publication.
(1) Abbreviations and notation: This manuscript employs numerous abbreviations (e.g., OD, NL26, W04, RPC, ACC, HSS, RPSS, BSS), which may impede readability. It is strongly recommended that the authors provide a dedicated table summarizing all abbreviations and their definitions for clarity.
(2) Differences between NL26 and W04 onset dates: Figure 2 indicates that the NL26 onset dates differ from W04 in certain years, but the current explanation is rather generic (e.g., “reflecting the persistence and maturity criteria”). The authors should provide a more detailed, year-specific analysis of these discrepancies. For example, are delayed onsets due to brief westerly intrusions being excluded by the persistence criterion? Are earlier onsets associated with an early and coherent transition to monsoon-type circulation? A more mechanistic discussion would enhance the reader’s understanding of the advantages and physical basis of the NL26 definition.
(3) Limitations of the proposed method: The manuscript primarily emphasizes the benefits of the NL26 approach but does not adequately discuss its potential limitations, which may include: (a) applicability in real-time operational forecasting contexts, (b) complexity of implementing the SOM plus K-means workflow, which could pose challenges for replication, (c) sensitivity to extreme events or limited sample years. A discussion of these limitations would provide a more balanced and rigorous assessment of the method.
(4) Weakening relationship between SCSSM onset and ENSO: The manuscript discusses the modulation of SCSSM onset by ENSO. However, recent studies suggest an interdecadal weakening of the ENSO–SCSSM onset relationship (Hu et al. 2022; Hu et al. 2026). The authors should acknowledge this weakening relationship and discuss its potential implications for forecast skill in the independent verification period (2017–2024).
Hu P, Chen W, Chen S, et al. The weakening relationship between ENSO and the South China Sea summer monsoon onset in recent decades. Advances in Atmospheric Sciences, 2022, 39(3): 443-455.
Hu P, Chen W, Cai Q, et al. Delayed tropical Asian summer monsoon onset in recent decades. Geophysical Research Letters, 2026, 53(1): e2025GL120825.
(5) Please delete the phrase “also known as the East Sea in Vietnam”, as this name is not widely recognized or accurate for the international audience.
Citation: https://doi.org/10.5194/egusphere-2026-358-CC1
AC3: 'Reply on CC1', Dzung Nguyen-Le, 21 Apr 2026
We sincerely thank Prof. Peng Hu for the careful reading of our manuscript and for the constructive comments and helpful suggestions. In response, we have revised the manuscript substantially. In particular, we added a table of abbreviations and key forecast verification metrics, expanded the discussion of differences between NL26 and W04, added a more balanced limitations discussion, acknowledged the recently weakened ENSO–SCSSM onset relationship and its implications for forecast skill, and removed the phrase “also known as the East Sea in Vietnam.”
A detailed point-by-point response is provided in the attached supplement / below.
CC2: 'Comment on egusphere-2026-358', Peng Hu, 30 Mar 2026
The manuscript presents a synoptic circulation–based clustering approach (NL26) for defining the South China Sea summer monsoon onset. Overall, the study is well-conceived and methodologically sound. The proposed approach demonstrates substantial improvements in deterministic, categorical, and probabilistic prediction skill relative to the conventional U850-based onset definition (Wang et al., 2004; W04). The analysis is comprehensive, and the findings are potentially valuable for improving seasonal forecasts. However, several aspects of the manuscript require further clarification and elaboration before the work can be considered for publication.
(1) Abbreviations and notation: This manuscript employs numerous abbreviations (e.g., OD, NL26, W04, RPC, ACC, HSS, RPSS, BSS), which may impede readability. It is strongly recommended that the authors provide a dedicated table summarizing all abbreviations and their definitions for clarity.
(2) Differences between NL26 and W04 onset dates: Figure 2 indicates that the NL26 onset dates differ from W04 in certain years, but the current explanation is rather generic (e.g., “reflecting the persistence and maturity criteria”). The authors should provide a more detailed, year-specific analysis of these discrepancies. For example, are delayed onsets due to brief westerly intrusions being excluded by the persistence criterion? Are earlier onsets associated with an early and coherent transition to monsoon-type circulation? A more mechanistic discussion would enhance the reader’s understanding of the advantages and physical basis of the NL26 definition.
(3) Limitations of the proposed method: The manuscript primarily emphasizes the benefits of the NL26 approach but does not adequately discuss its potential limitations, which may include: (a) applicability in real-time operational forecasting contexts, (b) complexity of implementing the SOM plus K-means workflow, which could pose challenges for replication, (c) sensitivity to extreme events or limited sample years. A discussion of these limitations would provide a more balanced and rigorous assessment of the method.
(4) Weakening relationship between SCSSM onset and ENSO: The manuscript discusses the modulation of SCSSM onset by ENSO. However, recent studies suggest an interdecadal weakening of the ENSO–SCSSM onset relationship (Hu et al. 2022; Hu et al. 2026). The authors should acknowledge this weakening relationship and discuss its potential implications for forecast skill in the independent verification period (2017–2024).
Hu P, Chen W, Chen S, et al. The weakening relationship between ENSO and the South China Sea summer monsoon onset in recent decades. Advances in Atmospheric Sciences, 2022, 39(3): 443-455.
Hu P, Chen W, Cai Q, et al. Delayed tropical Asian summer monsoon onset in recent decades. Geophysical Research Letters, 2026, 53(1): e2025GL120825.
(5) Please delete the phrase “also known as the East Sea in Vietnam”, as this name is not widely recognized or accurate for the international audience.
Citation: https://doi.org/10.5194/egusphere-2026-358-CC2
AC4: 'Reply on CC2', Dzung Nguyen-Le, 21 Apr 2026
We sincerely thank Prof. Peng Hu for this additional comment. As this comment overlaps substantially with the points raised in CC1, our detailed response is provided in our reply to CC1. We have revised the manuscript accordingly.
Citation: https://doi.org/10.5194/egusphere-2026-358-AC4
RC2: 'Comment on egusphere-2026-358', Anonymous Referee #2, 15 Apr 2026
The manuscript presents a synoptic circulation–based clustering approach (NL26) for defining the South China Sea summer monsoon onset. Overall, the study is well-conceived and methodologically sound. The proposed approach demonstrates substantial improvements in deterministic, categorical, and probabilistic prediction skill relative to the conventional U850-based onset definition (Wang et al., 2004; W04). The analysis is comprehensive, and the findings are potentially valuable for improving seasonal forecasts. However, several aspects of the manuscript require further clarification and elaboration before the work can be considered for publication.
(1) Abbreviations and notation: This manuscript employs numerous abbreviations (e.g., OD, NL26, W04, RPC, ACC, HSS, RPSS, BSS), which may impede readability. It is strongly recommended that the authors provide a dedicated table summarizing all abbreviations and their definitions for clarity.
(2) Differences between NL26 and W04 onset dates: Figure 2 indicates that the NL26 onset dates differ from W04 in certain years, but the current explanation is rather generic (e.g., “reflecting the persistence and maturity criteria”). The authors should provide a more detailed, year-specific analysis of these discrepancies. For example, are delayed onsets due to brief westerly intrusions being excluded by the persistence criterion? Are earlier onsets associated with an early and coherent transition to monsoon-type circulation? A more mechanistic discussion would enhance the reader’s understanding of the advantages and physical basis of the NL26 definition.
(3) Limitations of the proposed method: The manuscript primarily emphasizes the benefits of the NL26 approach but does not adequately discuss its potential limitations, which may include: (a) applicability in real-time operational forecasting contexts, (b) complexity of implementing the SOM plus K-means workflow, which could pose challenges for replication, (c) sensitivity to extreme events or limited sample years. A discussion of these limitations would provide a more balanced and rigorous assessment of the method.
(4) Weakening relationship between SCSSM onset and ENSO: The manuscript discusses the modulation of SCSSM onset by ENSO. However, recent studies suggest an interdecadal weakening of the ENSO–SCSSM onset relationship (Hu et al. 2022; Hu et al. 2026). The authors should acknowledge this weakening relationship and discuss its potential implications for forecast skill in the independent verification period (2017–2024).
Hu P, Chen W, Chen S, et al. The weakening relationship between ENSO and the South China Sea summer monsoon onset in recent decades. Advances in Atmospheric Sciences, 2022, 39(3): 443-455.
Hu P, Chen W, Cai Q, et al. Delayed tropical Asian summer monsoon onset in recent decades. Geophysical Research Letters, 2026, 53(1): e2025GL120825.
(5) Please delete the phrase “also known as the East Sea in Vietnam”, as this name is not widely recognized or accurate for the international audience.
Citation: https://doi.org/10.5194/egusphere-2026-358-RC2
AC2: 'Reply on RC2', Dzung Nguyen-Le, 21 Apr 2026
We sincerely thank Reviewer #2 for the constructive comments and helpful suggestions. In response, we have revised the manuscript substantially to improve clarity, balance, and presentation. The main changes include:
(i) adding Table 1 summarizing abbreviations and key forecast verification metrics;
(ii) expanding the discussion of differences between NL26 and W04, including representative year-specific examples;
(iii) adding a more balanced discussion of methodological limitations;
(iv) acknowledging the recently weakened ENSO–SCSSM onset relationship and its implications for forecast skill during the independent period; and
(v) removing the phrase “also known as the East Sea in Vietnam.”
A detailed point-by-point response is provided in the attached supplement / below.
RC4: 'Reply on AC2', Anonymous Referee #2, 22 Apr 2026
Thank you very much for the authors’ responses and revisions. I have no further comments, and the manuscript can be accepted for publication in its current form.
Citation: https://doi.org/10.5194/egusphere-2026-358-RC4
RC3: 'Comment on egusphere-2026-358', Anonymous Referee #3, 16 Apr 2026
This study proposed a synoptic circulation-based approach to define the SCS monsoon onset date and found that the onset prediction using regime-based definition yields improvements compared to the conventional zonal wind criterion. The results are important for the seasonal prediction of monsoon onset and the paper is overall well-written. However, some analyses and conclusions need to be elucidated more clearly. Specific comments are listed below.
Major comments:
- Comparisons of the regime-based definition (NL26) and the zonal wind criterion (W04): (1) This paper shows that NL26 outperforms W04 in predicting monsoon onset because W04 is sensitive to synoptic disturbances, while NL26 requires persistence and captures the subseasonal variations. Is W04 defined by the zonal wind on a single day? If a persistence criterion is applied to the zonal wind, does the prediction skill also improve noticeably? If yes, then the simple zonal wind definition seems more easily applicable, so what is the advantage of NL26? (2) Lines 378-381: The composites based on the zonal wind criterion should also show stable differences between the pre-monsoon and monsoon periods. I think the transient wind reversal occurs only in some cases. (3) Line 450: How do you deduce that the local zonal-wind threshold is sensitive to short-lived wind reversals and synoptic noise? (4) Line 466: How do you deduce that the clustering-based definition effectively filters out unpredictable synoptic-scale fluctuations?
- The author emphasized that this study aims to extend the lead time of monsoon onset prediction from three months to four or five months. I think the improvements in prediction skill at the same lead time should also be highlighted.
- Methods: (1) Why is the study domain chosen as 5°S–25°N and 95°E–135°E? Are the results sensitive to the spatial domain? (2) Line 150: How is d=144 obtained? Do you mean that the first 144 EOF modes are extracted, which together explain 99% of the total variance? Please clarify.
- Figure 2: (1) Overlay NL26 on Figure 2a rather than W04, so as to match the transition of circulation regimes. (2) In some years, NL26 and W04 are quite different. What happens in these years? The authors could analyze the circulation evolution in these years specifically to understand the differences. Although a specific day is classified into one cluster by the SOM or K-means methods, the actual circulation on that day might differ from the composite patterns. Case analyses might help explain the markedly different indices.
- Fig. 6: The correlation coefficients in Jan (Mar) are smaller than in Dec (Feb). Why does the prediction skill fluctuate from December to April rather than increase steadily as the lead time shortens?
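The EOF truncation question in the Methods comment can be made concrete with a small sketch of one common way to pick the number of retained modes: keep the fewest leading modes whose cumulative explained variance reaches a threshold such as 99%. Whether the manuscript obtains d=144 in exactly this way is not confirmed here. The function name and the synthetic input are assumptions for illustration only.

```python
import numpy as np


def modes_for_variance(data: np.ndarray, threshold: float = 0.99) -> int:
    """Smallest number of leading EOF modes whose cumulative explained
    variance fraction is at least `threshold`.

    `data` has shape (n_samples, n_features), e.g. time by grid points.
    """
    anomalies = data - data.mean(axis=0)  # remove the time mean per feature
    singular_values = np.linalg.svd(anomalies, compute_uv=False)
    variance = singular_values ** 2  # proportional to the EOF eigenvalues
    cumulative = np.cumsum(variance) / variance.sum()
    # Index of the first cumulative fraction that reaches the threshold.
    count = int(np.searchsorted(cumulative, threshold)) + 1
    return min(count, cumulative.size)  # guard against round-off at 1.0


# Synthetic stand-in for a (time, space) field; not real data.
rng = np.random.default_rng(0)
field = rng.standard_normal((200, 50)) * np.linspace(5.0, 0.1, 50)
d = modes_for_variance(field, threshold=0.99)
```

Reporting the retained-mode count alongside the threshold, as the comment requests, makes the truncation reproducible for a different domain or resolution.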
Minor comments:
- Line 26: “the South China Sea summer monsoon” should not be divided as “the South China Sea, summer monsoon”
- Figure 1: The white color is not contained in the color bar. Can the subplots be reordered by pre-monsoon and monsoon regimes? This would make it easier to remember the six patterns.
- Line 365: “northeasterly” change to “southeasterly”?
- Line 565: “WL04” should be “W04”
Citation: https://doi.org/10.5194/egusphere-2026-358-RC3
AC5: 'Reply on RC3', Dzung Nguyen-Le, 21 Apr 2026
We sincerely thank Reviewer #3 for the careful reading of our manuscript and for the constructive comments. In response, we have further clarified the distinction between the proposed regime-based definition and the conventional zonal-wind criterion, as well as the interpretation of the forecast-skill results. The revisions include:
(i) clarifying that both W04 and NL26 are applied to pentad-mean circulation fields;
(ii) explaining more clearly why NL26 differs from a persistence-filtered zonal-wind index;
(iii) clarifying the domain choice and EOF truncation procedure;
(iv) adding representative case analyses for years when NL26 and W04 differ; and
(v) discussing the non-monotonic lead-time dependence of forecast skill more carefully.
A detailed point-by-point response is provided in the attached supplement / below.
RC5: 'Reply on AC5', Anonymous Referee #3, 22 Apr 2026
The authors have clearly addressed my previous concerns. I further recommend adding some references to illustrate the influences of the atmospheric circulation over the South China Sea and the application of clustering methods.
1. Tang, S., Chen R., Chen Z., 2025: Spatial diversity of the summer marine heat waves in the South China Sea. Journal of Climate, 38, 1051-1065.
2. Chen, R., Wen Z., Lin W. and Qiao Y., 2023: Diverse relationship between the tropical night in South China and the water vapor transport over the South China Sea and the plausible causes. Atmospheric Research, 296, 107080.
3. Li, X., Chen R., and Qiao Y., 2024: Assessing the extended-range forecast skills of the extreme heat events over South China based on three S2S models. Atmospheric Science Letters, 25(9), e1253.
Citation: https://doi.org/10.5194/egusphere-2026-358-RC5
AC6: 'Reply on RC5', Dzung Nguyen-Le, 23 Apr 2026
We thank the reviewer for this helpful suggestion. In the revised manuscript, we added these references in the Introduction to broaden the regional context of the study. Tang et al. (2025) and Chen et al. (2023) are now cited to emphasize that SCS circulation variability is also linked to regional oceanic and thermal extremes, while Li et al. (2024) is cited to highlight the broader relevance of circulation-based prediction for related extended-range forecast applications over South China.
Citation: https://doi.org/10.5194/egusphere-2026-358-AC6
Viewed
| HTML | PDF | XML | Total | Supplement | BibTeX | EndNote |
|---|---|---|---|---|---|---|
| 206 | 100 | 28 | 334 | 41 | 16 | 26 |
This study proposes a novel synoptic clustering-based approach (NL26) using Self-Organizing Maps (SOM) to define the South China Sea Summer Monsoon (SCSSM) onset based on persistent large-scale circulation regimes. Evaluated using the ECMWF SEAS5 seasonal hindcasts, the author concludes that this regime-based definition yields systematic improvements over the conventional zonal wind-based criterion (W04) in deterministic and probabilistic skill metrics up to a 5-month lead time.
The manuscript addresses a highly relevant and challenging topic in seasonal monsoon prediction. The methodology is interesting, and the motivation aligns well with the scope of Weather and Climate Dynamics. However, before the manuscript can be recommended for publication, there are several major scientific concerns that need to be addressed. These primarily relate to the physical consistency of the new index, the validity of the "improved predictability" during extreme delayed years (e.g., 2018), and a lack of diagnostic evidence supporting the claims regarding subseasonal-to-interannual timescale interactions. I recommend a Major Revision.
Major Comments
1. Physical Consistency with Thermodynamic Metrics. Figure 2 shows a correlation of 0.74 between the NL26 and W04 indices. While the SOM clustering is designed to capture large-scale circulation regimes, the abrupt onset of deep convection and precipitation remains the most critical thermodynamic characteristic of the SCSSM onset. Since precipitation was not selected as an input variable for the clustering, it is vital to verify whether this pure circulation-based index remains physically consistent with actual convective activities. The author should include a comparative analysis between the NL26 index and key thermodynamic/convective indicators, specifically the Meridional Temperature Gradient (MTG) and Outgoing Longwave Radiation (OLR), to ensure that the identified circulation regimes robustly correspond to the onset of the monsoon rainfall.
2. Logical Transition from Climatology to Interannual Forcing. There is a noticeable logical gap between the climatological evolution presented in Figure 3 and the interannual Sea Surface Temperature (SST) patterns associated with early/late onset years shown in Figure 4. The transition from mean state characteristics to specific interannual boundary forcings feels abrupt and lacks sufficient diagnostic bridging in the text. Consider moving Figure 4 to the Supplementary Material, or significantly expand the dynamical explanation in the text to justify the transition to SST forcing patterns at this stage of the manuscript.
3. Cross-Validation against Operational Benchmarks. In Section 4.1.1 (Forecast skill assessment), Figure 5 validates the W04 model forecasts against W04 observations, while Figure 6 validates the NL26 forecasts against NL26 observations. Although internally consistent, this comparison is insufficient to demonstrate the practical forecasting value of the new method. Given that W04 is widely applied as a standard benchmark in operational forecasting services, please add an evaluation comparing the NL26 model forecasts directly against the W04 observational values. This cross-index validation is necessary to quantify the actual added value of the NL26 approach in real-world operational contexts.
4. Forecast Skill Evaluation in Extreme Years (e.g., 2018). In Section 4.2.1, the manuscript highlights that the correlation coefficient for the NL26 prediction is significantly higher than that of W04. However, a detailed comparison of Figures 8 and 9, alongside the lead-time performance (December to April), reveals a critical issue. The apparent improvement in NL26 is largely driven by its performance in specific years, such as 2018. In reality, 2018 featured an extremely late SCSSM onset. The NL26 index defines the "true" onset date for 2018 as May 8, which allows the model to register a "hit" at all lead times. However, this entirely misses the actual physical delay of the monsoon onset that year. The author must provide an in-depth discussion on extreme years where traditional ENSO-based seasonal forecasts typically fail (e.g., 2018 and 2019). It is essential to clarify whether the claimed "enhanced forecast robustness" is a genuine improvement in capturing anomalous monsoon dynamics or merely a mathematical artifact resulting from redefining the onset target.
5. Evidence for Subseasonal Variability Claims. The abstract and discussion state that the improved predictability reflects "multi-timescale controls" and that "subseasonal variability triggers the onset transition." The author suggests that the NL26 index better isolates the predictable component when the ENSO-monsoon relationship is weak. However, the main text lacks concrete case studies or dynamical diagnostics to substantiate this. Existing literature shows that predicting the onset is more difficult in years dominated by subseasonal signals. To support the current claims, the author should include specific case studies [such as 2019 or other years with strong intraseasonal oscillation but weak ENSO forcing] to explicitly demonstrate how the NL26 index outperforms traditional indices in capturing the superimposition of subseasonal signals onto the large-scale circulation.
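The cross-index validation requested in comment 3 amounts to scoring one set of forecasts against two different observational targets. A minimal sketch of that comparison follows, using a plain Pearson correlation across years. The function name and all onset values are made up for illustration and are not results from the manuscript or from SEAS5.

```python
import numpy as np


def pearson_correlation(forecast, observed) -> float:
    """Pearson correlation between two equal-length series of onset dates."""
    f = np.asarray(forecast, dtype=float)
    o = np.asarray(observed, dtype=float)
    fa = f - f.mean()  # anomalies relative to each series' own mean
    oa = o - o.mean()
    return float((fa * oa).sum() / np.sqrt((fa ** 2).sum() * (oa ** 2).sum()))


# Hypothetical onset pentads for six years (illustrative numbers only).
forecast_nl26 = [27, 28, 26, 29, 27, 30]
observed_nl26 = [27, 29, 26, 28, 27, 30]
observed_w04 = [26, 29, 27, 28, 25, 31]

# Same-index skill: NL26 forecasts scored against NL26 observations.
same_index_skill = pearson_correlation(forecast_nl26, observed_nl26)
# Cross-index skill: NL26 forecasts scored against W04 observations.
cross_index_skill = pearson_correlation(forecast_nl26, observed_w04)
```

Reporting both numbers side by side separates skill that comes from a more predictable target from skill that carries over to the established benchmark.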
Other suggestion
1. Typographical Error in Figure 2: In the final row of Figure 2, the subplot labels should be corrected to C1, C3, and C6 to align with the SOM clustering nomenclature used throughout the manuscript.