This work is distributed under the Creative Commons Attribution 4.0 License.
A Hybrid Method for Winter Road Surface Temperature Prediction Using Improved LSTMs and Stacking-Based Ensemble Learning
Abstract. Wintertime low temperatures and snow cover usually diminish the friction coefficient of asphalt pavements, thereby elevating accident and congestion risks. Road surface temperature (RST) is an important parameter for maintaining traffic safety under extreme winter weather conditions, as it helps predict road icing events. Aiming to enhance the precision and robustness of RST prediction, this paper introduces a forecasting framework combining optimized Long Short-Term Memory (LSTM) architectures with a stacking-based ensemble strategy. Two improved LSTMs are constructed: (1) KNN-LSTM, integrating K-nearest neighbors to capture local spatiotemporal similarity patterns, and (2) Attention-BiLSTM, employing bidirectional temporal modeling with dynamic attention weighting mechanisms. These models function as base learners in the stacking ensemble, with Bayesian ridge regression utilized as the meta-learner to consolidate their predictions. The proposed hybrid model was trained and validated using minute-resolution winter meteorological data (2020–2024) collected from a station located on the Longhai Railway Bridge in Jiangsu, China. Experimental results show that the KNN-LSTM and Attention-BiLSTM models exhibit complementary advantages in capturing localized and global temporal features. The ensemble model demonstrates superior performance over individual models, achieving a 1-hour MAE of 0.074, MSE of 0.010, and MAPE of 46.7 %, a significant reduction relative to the best-performing single model. Under extended prediction horizons (3-hour and 6-hour), including sub-zero (below 0 °C) conditions and typical weather backgrounds, the ensemble model sustains high prediction accuracy and stability. These findings underscore the efficacy of integrating local pattern extraction with attention-based mechanisms via ensemble learning, thereby enhancing RST prediction.
This study presents a scalable and adaptable framework for intelligent road weather management systems, offering practical insights for operational deployment.
Status: open (until 20 Mar 2026)
CC1: 'Comment on egusphere-2025-3638', Yue Zhou, 27 Nov 2025
AC1: 'Reply on CC1', Li Wanting, 23 Jan 2026
Dear Reviewer,
Thanks for the comments and corrections. Please find our response in the attached pdf.
Best regards,
Wanting Li
CC2: 'Comment on egusphere-2025-3638', Fan Lingli, 08 Dec 2025
This manuscript presents a hybrid framework combining improved LSTM architectures with stacking-based ensemble learning for winter road surface temperature prediction. Validated using high-resolution (minute-level) data from a highway meteorological station in Jiangsu Province (2020–2024), the model demonstrates consistent performance across multiple forecasting horizons (1-hour, 3-hour, 6-hour), sub-zero conditions (<0 °C), and typical synoptic backgrounds. The research exhibits clear engineering value (e.g., road icing warning) and interdisciplinary innovation at the intersection of machine learning and meteorology. However, improvements are needed in data representativeness, physical mechanism integration, and model generalizability, as discussed below.
Major Comments
(1) Insufficient Data Representativeness and Spatial Generalizability
The model is trained and validated exclusively on data from the Longhai Railway Bridge in Jiangsu, a region with a warm temperate semi-humid monsoon climate. This limits conclusions about performance in diverse geographies where RST dynamics differ due to terrain, vegetation, or pavement materials. In addition, the dataset lacks explicit coverage of extreme weather years (e.g., severe cold waves), raising questions about model reliability during rare but critical events.
(2) Integrate Meteorological Physics to Enhance Interpretability
Incorporate key parameters from the road surface energy balance equation (e.g., albedo, thermal conductivity, estimated solar radiation) as inputs or constraints.
(3) Inadequate Benchmarking Against State-of-the-Art Models
The manuscript claims superiority over "individual models" (LSTM, KNN-LSTM, Attention-BiLSTM) but lacks comparisons with recent hybrid methods in RST prediction. Quantitative metrics (e.g., MAE, MSE) against these models are absent, weakening claims of methodological advancement.
Minor Comments
(1) Ambiguities in Figure and Table Presentations
Figure 6 (RST periodicity) lacks confidence intervals, making it impossible to assess the statistical significance of diurnal variations.
(2) Inconsistent Terminology and Citation Errors
The term "Attention-LSTM" is used in Figure 12's caption but not defined in the main text; it should be corrected to "Attention-BiLSTM" for consistency.
In Section 4.2, "STM" is referenced in Figure 12's legend but not defined, causing confusion.
Summary
This study contributes a novel ensemble framework for winter RST prediction, with promising results in reducing short-term prediction errors. However, its scientific impact is limited by narrow data coverage, inadequate physical grounding, and insufficient benchmarking. Addressing the major concerns (particularly data representativeness, physical mechanism integration, and comparative validation) will strengthen the manuscript’s validity and relevance to operational road weather management. I recommend minor revisions prior to reconsideration for publication.
Citation: https://doi.org/10.5194/egusphere-2025-3638-CC2
AC2: 'Reply on CC2', Li Wanting, 23 Jan 2026
Dear Reviewer,
Thanks for the comments and corrections. Please find our response in the attached pdf.
Best regards,
Wanting Li
CC3: 'Comment on egusphere-2025-3638', Yan Ji, 06 Jan 2026
The manuscript is closely aligned with the practical demands associated with wintertime low temperatures and road icing risks. By incorporating a KNN-LSTM model to capture locally similar historical patterns, integrating an Attention-BiLSTM to model global temporal dependencies, and employing Bayesian Ridge Regression for stacking-based fusion, the study constructs a coherent and well-motivated hybrid framework. Moreover, targeted evaluations under sub-zero (below 0 °C) conditions and representative weather scenarios are conducted, effectively demonstrating the robustness and advantages of the proposed model in key operational contexts.
Nevertheless, it should be noted that the analysis of model interpretability remains somewhat limited. Although an attention mechanism is introduced, the physical meaning of the attention weights is not explicitly visualized or discussed. Providing analyses of the importance of critical time steps or influential features would further enhance the physical interpretability and scientific insight of the proposed approach.
Citation: https://doi.org/10.5194/egusphere-2025-3638-CC3
AC3: 'Reply on CC3', Li Wanting, 23 Jan 2026
Dear Reviewer,
Thanks for the comments and corrections. Please find our response in the attached pdf.
Best regards,
Wanting Li
RC1: 'Comment on egusphere-2025-3638', Anonymous Referee #1, 18 Feb 2026
This manuscript proposes a hybrid framework for winter road surface temperature (RST) prediction that uses two improved LSTM variants (KNN-LSTM and Attention-BiLSTM) as base learners and combines them via stacking with Bayesian ridge regression as the meta-learner. The experiments use multi-winter observations from a single road-weather station near the Longhai Railway Bridge (Jiangsu, China) and report improved MAE/MSE/MAPE relative to selected neural baselines, including for longer horizons and sub-zero RST conditions.
The topic is relevant to winter road management and applied meteorology. However, the manuscript’s scientific positioning and primary contribution remain unclear. The current narrative emphasizes metric improvements, while the methodological novelty appears incremental; therefore, the paper would benefit from a sharper problem statement, clearer positioning, and more operationally meaningful interpretation of the reported gains. I hope the comments below will help the authors further improve the manuscript.
MAJOR COMMENTS
- The manuscript’s positioning is currently ambiguous across geoscience, Machine Learning (ML) methodology, and winter road operations. From a geoscientific perspective, the paper does not clearly advance physical understanding of the processes controlling RST. From an information engineering perspective, the modeling framework (improved LSTMs + stacking) appears to extend established components with limited methodological novelty. From a winter road management perspective, the operational meaning of the improved accuracy (e.g., how it would change decision-making or warning performance) is not yet sufficiently articulated, despite claims about operational deployment. Please state clearly what the primary contribution is (geoscientific, methodological, or operational) and align the framing and discussion accordingly.
- Applying established ML components to a new application domain can be valuable, but when novelty is incremental the contribution should be justified primarily by the domain-specific gap it addresses and by operationally meaningful evaluation and discussion. At present, the manuscript focuses on improved error metrics, but it is not yet explicit what unmet need in winter road weather operations the approach resolves. Please clarify whether the main contribution is application-driven, and if so, strengthen the Introduction/related work by explicitly synthesizing prior RST/icing-related prediction approaches already discussed and by stating a concrete gap and objective that this work fills.
- The Introduction includes extensive background (e.g., general LSTM and ensemble learning) but the problem formulation is not defined early enough, so the narrative does not yet flow cleanly as “Background → Research gap → Objective → Contribution.” Please consider restructuring so that the paper quickly specifies what is predicted (RST at 1/3/6-hour horizons at hourly resolution), why it is needed (specific winter road maintenance/icing-warning use-cases), what remains unresolved in prior work, and what this study contributes (novelty and value).
- Section 3.3.3 states that all variables were standardized using Z-score normalization, but it is unclear whether the target RST was standardized, whether predictions were inverse-transformed before evaluation, and therefore whether the reported MAE/MSE/MAPE values are in °C or in standardized units. This matters because the paper reports very small errors (e.g., MAE = 0.074) while figures label MAE in °C. Please clarify the evaluation scale, and also clarify how μ and σ were computed (training only vs. full dataset). If normalization parameters are computed using the full dataset (including the test winter), that introduces information leakage; please confirm normalization is fitted on training only and applied to validation/test.
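To make the leakage concern concrete, here is a minimal sketch (synthetic data; variable names are illustrative, not taken from the manuscript) of Z-score parameters fitted on the training split only, with predictions inverse-transformed to °C before evaluation:

```python
# Sketch: Z-score normalization fitted on the training split ONLY, with
# predictions inverse-transformed to degC before computing MAE/MSE.
# Synthetic data; names are illustrative, not from the manuscript.
import numpy as np

rng = np.random.default_rng(0)
train = rng.normal(loc=2.0, scale=3.0, size=(100, 1))   # training RST (degC)
test = rng.normal(loc=2.0, scale=3.0, size=(30, 1))     # test-winter RST (degC)

mu = train.mean(axis=0)      # fitted on training data only
sigma = train.std(axis=0)

train_z = (train - mu) / sigma
test_z = (test - mu) / sigma            # same parameters reused, never refit

pred_z = train_z[:5]                    # stand-in for model output (z-units)
pred_celsius = pred_z * sigma + mu      # inverse transform before errors in degC
```

If μ and σ were instead computed over the full dataset, the test winter would influence its own normalization, which is the leakage scenario described above.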
- Bayesian ridge regression is presented as a key meta-learner and the manuscript states it provides a probabilistic framework that can model uncertainty, but the Results section reports point-estimate metrics only and does not describe predictive uncertainty. Please provide the mathematical formulation used for Bayesian ridge regression (including prior assumptions and how the posterior is obtained), and describe precisely how the stacking meta-learner is trained in your time-series setting, since Section 2.3 states that cross-validated base-learner outputs are used for supervised learning. Please also clarify whether the base learners output deterministic point predictions; if so, explain what “uncertainty” refers to in this implementation and either report predictive intervals (and how they would be used operationally) or temper the uncertainty-related claims. Finally, please explain why this meta-learner is substantively different from a simple weighted linear combination in this specific implementation, beyond regularization, given that only two base-model predictions appear to be combined.
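For reference, a minimal sketch of how Bayesian ridge regression could combine two base-model predictions and expose a predictive standard deviation (synthetic data; scikit-learn's BayesianRidge, not necessarily the authors' implementation):

```python
# Sketch: Bayesian ridge regression as a stacking meta-learner over two
# base-model predictions, with predictive std via return_std=True.
# Synthetic data; base1/base2 stand in for the two LSTM base learners.
import numpy as np
from sklearn.linear_model import BayesianRidge

rng = np.random.default_rng(1)
y = rng.normal(size=200)                        # "true" RST (standardized)
base1 = y + rng.normal(scale=0.3, size=200)     # base learner 1 output
base2 = y + rng.normal(scale=0.3, size=200)     # base learner 2 output
X_meta = np.column_stack([base1, base2])        # meta-features for stacking

meta = BayesianRidge().fit(X_meta[:150], y[:150])
mean, std = meta.predict(X_meta[150:], return_std=True)
# `mean` is the point forecast; `std` is the predictive standard deviation,
# from which intervals (e.g. mean +/- 2*std) could be reported.
```

Reporting such intervals, and how they would inform icing warnings, would substantiate the uncertainty claims; without them, the meta-learner effectively reduces to a regularized linear combination of the two base outputs.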
- Solar radiation is excluded due to missing data, but the discussion under “representative synoptic conditions” emphasizes strong solar radiation effects. If solar radiation is not an explicit input, please clarify how the model is expected to capture radiation-driven diurnal forcing, for example indirectly via lagged RST and the 24-hour input window that encodes the day–night cycle. The discussion should clearly distinguish between external meteorological interpretation and what is actually represented in the model inputs, and the manuscript could briefly note possible proxy features (e.g., time-of-day) as future work while avoiding overinterpretation.
MINOR COMMENTS
- The Abstract states that the model is trained/validated using “minute-resolution” data, while the Methods describe 5-minute observations that are resampled to hourly intervals for modeling. Please make the Abstract consistent with the actual modeling resolution and forecasting setup.
- The paper states that meteorological factors and RST were resampled at hourly intervals, but the resampling method is not specified (e.g., mean/median/last value; rainfall aggregation). Because this is downsampling, please describe the resampling procedure and any steps taken to avoid introducing artifacts.
- The manuscript mentions cleaning, outlier removal, and missing-value imputation, but the specific criteria and methods are not described in sufficient detail. Please add reproducible information on thresholds/algorithms and the frequency of these operations.
- The manuscript states that Spearman correlation is suitable for assessing “non-linear dependencies,” but Spearman primarily captures monotonic relationships; please rephrase or justify this statement. In addition, wind direction is a circular variable (0° ≡ 360°), so correlation computed directly on degrees can be misleading; even though wind direction is not selected, this limitation should be acknowledged in the feature-selection description.
- The manuscript evaluates 1-hour, 3-hour, and 6-hour forecasting intervals, but the method section does not clearly explain how the 3-hour and 6-hour forecasts are generated without using future observations. For example, Eq. (3) defines the KNN-LSTM output as the predicted RST at t+1, and the Attention-BiLSTM formulation does not explicitly state how the forecast horizon is handled. Please clarify whether separate direct models are trained for each horizon or whether longer-horizon forecasts are generated recursively, and describe what information is available at time t for the 6-hour forecast. From an information-leakage perspective, future observed values (future RST and future observed meteorological variables) should not be used to generate forecasts beyond time t; if future meteorological inputs are needed, please clarify whether weather-forecast/NWP products are used or what alternative assumption is made. For the BiLSTM-based model, please also clarify that the backward pass operates only within the historical input window up to time t and does not access observations beyond time t, to avoid confusion regarding leakage.
- Section 4.3 analyzes two “representative” periods (Feb 8–10, 2024 and Feb 23–25, 2024), but the objective criteria for selecting these periods and their operational representativeness are not clearly described. Please explain how “stable synoptic” and “overcast/rainy” conditions were defined and why these cases were chosen, and clarify how this analysis complements the dedicated sub-zero evaluation in Section 4.2 (i.e., what additional insight it provides for winter road management). In addition, Section 4.3 attributes behavior to strong solar radiation, while solar radiation is excluded from inputs due to missing data; this linkage should be explained carefully (see Major Comment 6).
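Returning to the multi-horizon comment above, a minimal sketch of leakage-free "direct" target construction, in which each input window ends at time t and the target lies strictly after t (illustrative only, not the authors' pipeline):

```python
# Sketch: "direct" multi-horizon supervised framing, one model per
# horizon h, using only information available up to time t.
import numpy as np

def make_supervised(series: np.ndarray, window: int, horizon: int):
    """Pair each length-`window` history ending at t with series[t + horizon]."""
    X, y = [], []
    for t in range(window - 1, len(series) - horizon):
        X.append(series[t - window + 1 : t + 1])  # history up to and including t
        y.append(series[t + horizon])             # target strictly after t
    return np.array(X), np.array(y)

series = np.arange(48, dtype=float)               # toy hourly record
X6, y6 = make_supervised(series, window=24, horizon=6)
# X6[0] covers hours 0..23 (t = 23); y6[0] is hour 29 = t + 6
```

A recursive alternative would instead feed 1-hour predictions back as inputs; either way, no observation after t may enter the input window, which is the point of the leakage question.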
Citation: https://doi.org/10.5194/egusphere-2025-3638-RC1
RC2: 'Comment on egusphere-2025-3638', Anonymous Referee #2, 24 Feb 2026
General Comments
This manuscript presents a stacking-based ensemble framework that combines two improved LSTM variants, KNN-LSTM and Attention-BiLSTM, for winter road surface temperature (RST) prediction, using Bayesian ridge regression as the meta-learner. The paper addresses a practically relevant problem in transportation meteorology. The experimental structure, which evaluates performance across multiple forecast horizons, sub-zero conditions, and contrasting synoptic regimes, is well-organized. The provision of code and data via Zenodo is commendable and supports reproducibility.
However, I have several major concerns regarding the novelty of the approach, the experimental evaluation, and certain methodological aspects that, in my view, require substantial revision before the manuscript can be considered for publication in GMD. I therefore recommend major revision.
Major Comments
1. Limited novelty
The two base learners, KNN-LSTM (Luo et al., 2019) and Attention-BiLSTM (Zhou et al., 2019), are drawn directly from prior work, and the stacking ensemble with Bayesian ridge regression is a standard technique. The paper’s contribution thus rests on combining these existing components and applying them to RST prediction. While application-oriented contributions are valid, the manuscript does not provide a sufficient theoretical or empirical justification for why this specific combination of base learners is expected to be complementary. The claim that KNN-LSTM captures “local” patterns and Attention-BiLSTM captures “global” patterns (e.g., Abstract, Section 4.1) remains largely qualitative. A more rigorous analysis, such as examining residual correlation structure, error decomposition between the two base learners, or an ablation study testing alternative base learner pairings would significantly strengthen the novelty claim.
2. Insufficient baseline comparisons
The manuscript compares the proposed ensemble model only against LSTM and its own two base learners. This is a significant weakness. To establish the practical value and added complexity of the proposed framework, comparisons should include:
- Simple baselines such as persistence forecasting (i.e., assuming RST at time t+h equals RST at time t) and linear regression, which provide a lower-bound reference.
- Established machine learning methods such as Random Forest, XGBoost, or gradient boosting, which are widely used in meteorological prediction tasks and referenced in the manuscript’s own literature review (e.g., Zhang et al., 2024; Dai et al., 2023).
- Physics-based or hybrid models such as METRo (Crevier and Delage, 2001), which the authors cite but do not benchmark against.
Without these comparisons, it is impossible to judge whether the complexity of the stacking ensemble is justified relative to simpler approaches.
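As an illustration of the first bullet, a persistence baseline takes only a few lines (toy sinusoidal data standing in for hourly RST):

```python
# Persistence baseline: forecast RST(t + h) as RST(t).
# Toy "hourly RST" with a 24-hour diurnal cycle; illustrative only.
import numpy as np

def persistence_forecast(series: np.ndarray, horizon: int) -> np.ndarray:
    """Forecast series[t + horizon] as series[t]."""
    return series[:-horizon]

def mae(pred: np.ndarray, obs: np.ndarray) -> float:
    return float(np.mean(np.abs(pred - obs)))

t = np.arange(240)                          # 10 days of hourly samples
rst = 2.0 * np.sin(2 * np.pi * t / 24.0)    # diurnal cycle, amplitude 2 degC

maes = {h: mae(persistence_forecast(rst, h), rst[h:]) for h in (1, 3, 6)}
for h, m in maes.items():
    print(f"{h}-hour persistence MAE: {m:.3f} degC")
# Error grows with horizon; a learned model should beat this at every h.
```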
3. Single-station validation
The entire study relies on data from a single observation station on the Longhai Railway Bridge. While the station provides a useful testbed, the authors repeatedly claim the framework is “scalable and adaptable” (Abstract, Conclusion) without providing any evidence of generalizability across sites, climates, or road surface types. For a paper positioning itself as presenting a “framework,” validation on at least 2–3 stations with differing microclimatic characteristics, surface materials, or geographical settings would be expected. At minimum, the authors should substantially temper their generalizability claims, or better yet, include additional validation sites.
4. MAPE as a performance metric for near-zero data
Mean Absolute Percentage Error (MAPE) is known to be problematic when observed values approach or cross zero, as the denominator in the MAPE formula (Eq. 12) causes extreme inflation. Winter RST data routinely includes values near 0 °C, making MAPE an unreliable metric in this context. Despite this, MAPE is prominently featured in the abstract, all results tables, and the discussion. The authors never acknowledge this well-known limitation. I recommend either (a) replacing MAPE with a more appropriate metric such as symmetric MAPE (sMAPE) or normalized RMSE, or (b) at minimum, providing a thorough discussion of why MAPE values appear inflated and why they should not be interpreted at face value.
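A small numerical example (synthetic values) makes the inflation explicit: an identical 0.1 °C absolute error yields a MAPE below 1 % for warm observations but above 80 % once observations approach 0 °C:

```python
# Synthetic example of MAPE inflation near 0 degC: the absolute error is
# 0.1 degC in both cases; only the observation magnitudes differ.
import numpy as np

def mape(obs: np.ndarray, pred: np.ndarray) -> float:
    return float(np.mean(np.abs((obs - pred) / obs)) * 100)

obs_warm = np.array([10.0, 11.0, 12.0])     # well away from zero
obs_cold = np.array([0.1, -0.1, 0.2])       # RST crossing 0 degC
pred_warm = obs_warm + 0.1                  # +0.1 degC error everywhere
pred_cold = obs_cold + 0.1

print(mape(obs_warm, pred_warm))            # under 1 %
print(mape(obs_cold, pred_cold))            # over 80 % for the same error
```

sMAPE or a normalized RMSE avoids division by near-zero observations, though any percentage metric remains delicate when the truth itself crosses zero.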
5. Potential data leakage in stacking
Stacking ensemble methods are susceptible to data leakage if out-of-fold predictions are not properly generated during training. The meta-learner must be trained on predictions produced by base learners that have not seen the corresponding training samples (typically via K-fold cross-validation). The manuscript’s description of the stacking training procedure (Sections 2.3–2.4) is insufficiently detailed on this point. The authors should explicitly describe how base learner predictions for the meta-learner training set were generated and confirm that proper out-of-fold procedures were followed.
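A minimal sketch of a proper out-of-fold procedure in a time-series setting (Ridge standing in for the LSTM base learners; illustrative only):

```python
# Sketch: out-of-fold (OOF) base-learner predictions for the stacking
# meta-learner, using TimeSeriesSplit so validation folds always follow
# their training data. Ridge stands in for the LSTM base learners.
import numpy as np
from sklearn.linear_model import BayesianRidge, Ridge
from sklearn.model_selection import TimeSeriesSplit

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 4))                       # toy features
y = X @ np.array([0.5, -0.2, 0.1, 0.3]) + rng.normal(scale=0.1, size=300)

oof = np.full(300, np.nan)
for train_idx, val_idx in TimeSeriesSplit(n_splits=5).split(X):
    base = Ridge().fit(X[train_idx], y[train_idx])
    oof[val_idx] = base.predict(X[val_idx])         # rows unseen during fit

mask = ~np.isnan(oof)                               # earliest block has no OOF preds
meta = BayesianRidge().fit(oof[mask].reshape(-1, 1), y[mask])
```

The key property is that every prediction fed to the meta-learner comes from a base model that never saw the corresponding row; training the meta-learner on in-sample base predictions would overstate ensemble skill.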
6. Scope fit for GMD
GMD focuses on the description and evaluation of geoscientific models, including numerical models, analytical models, and model evaluation frameworks. While the RST prediction problem is relevant to the geosciences, the manuscript reads primarily as a machine learning application paper rather than a contribution to geoscientific model development. The discussion of physical processes underlying RST dynamics is minimal, and the model architecture does not incorporate any physics-based constraints or domain knowledge beyond feature selection. The authors should consider strengthening the geoscientific dimension of the manuscript, for instance, by discussing how the model’s learned representations relate to known thermodynamic processes, or by comparing against physics-informed approaches.
Minor Comments
7. R² discrepancy between text and Figure 9
The text (line 334) states the ensemble model maintains “an R² value of 0.766” for the 6-hour interval, but Figure 9 panel (l) clearly shows R² = 0.796. Similarly, the text (line 337) reports the LSTM R² drops to 0.638 at the 6-hour interval, whereas Figure 9 panel (c) displays R² = 0.642. These discrepancies, though relatively small, undermine confidence in the reported results. The authors should verify all quantitative values reported in the text against the corresponding figures and tables, and ensure full consistency throughout the manuscript. If the figures and text were generated from different model runs, this must be clarified.
8. Misrepresentation of cited references
Two citations appear to misrepresent the content of the referenced works:
- Lines 85–86: Yang et al. (2010) is cited for “reducing prediction errors by fusing multiple LSTM sub-models.” However, the referenced work (Yang and Chen, 2010) concerns weighted clustering ensembles for temporal data and predates the widespread use of LSTM in ensemble frameworks. The description does not accurately reflect the cited work.
- Lines 89–90: Guo et al. (2013) is cited as combining “attention mechanisms with stacking in medical image analysis.” The referenced paper concerns multilevel feature selection for pulmonary nodule detection and does not employ attention mechanisms in the modern deep learning sense. This characterization is misleading.
The authors should carefully verify all citation-claim correspondences throughout the manuscript.
9. Figure 9 caption error
The caption for Figure 9 (line 350) states: “The term ‘Ensemble’ in panels (i), (k), and (l).” However, inspecting the 4×3 grid layout, panels (g), (h), and (i) correspond to the Attention-BiLSTM row, while the Ensemble row occupies panels (j), (k), and (l). The correct reference should therefore be “panels (j), (k), and (l).”
10. Figure 12 axis labeling
In Figure 12, the x-axis displays sample index numbers (0–1400) rather than datetime labels. In contrast, Figures 10 and 13 use proper date/time axes. Since Figure 12 specifically examines sub-zero RST conditions, temporal context (time of day, date) would greatly aid interpretation, for example, it would allow the reader to identify whether prediction failures coincide with specific diurnal phases or weather events. The authors should replace the sample index with corresponding datetime labels for consistency with other figures.
11. Variable naming inconsistency in Figure 7
The Spearman correlation heatmap (Figure 7) uses the variable labels “INWindSpeed” and “INWindDirection,” whereas the text and Table 1 refer to these variables as “Wind speed” and “Wind direction.” The “IN” prefix is not defined anywhere. The authors should harmonize variable naming across all figures, tables, and text.
12. Citation formatting inconsistency
In Section 2.2 (line 162), the citation reads “Zhou Wenye et al. (2019)” using the author’s full first name, whereas all other citations use surname only. This should be corrected to “Zhou et al. (2019)” for consistency.
13. K-iteration procedure (lines 153–155)
The description stating that “the value of K is incremented by 1, and the loop continues until K reaches M” is confusing, and the flowchart in Figure 2 confirms this loop structure (K = K + 1 with termination at K ≤ M). If this is indeed part of the inference procedure, it implies running the full KNN-LSTM pipeline M times with progressively increasing K, which would be computationally prohibitive and conceptually unusual. If it is instead a hyperparameter search strategy, it belongs in Section 3.3.2 rather than the model architecture section. The authors must clarify the purpose and computational cost of this loop.
14. Climatological representativeness of the test period
The test set consists of a single winter (December 2023–February 2024). Was this winter climatologically typical for the region? An anomalously warm or cold winter could bias the evaluation. A brief discussion comparing test-period conditions to climatological normals would be valuable.
15. Missing discussion of solar radiation
The authors note that “due to missing data, solar radiation was not considered in this study” (line 239). Solar radiation is a primary driver of RST variability, as the authors themselves implicitly acknowledge when discussing diurnal periodicity (Figure 6) and performance degradation during high-temperature daytime conditions (Section 4.1.1). Its omission should be discussed more thoroughly as a limitation, with consideration of how its inclusion might affect model performance.
16. Equation formatting
Equation 2 contains a stray subscript "i" on the distance term in the denominator (d(X_t, X_i)_i); presumably this should read d(X_t, X_i).
17. Density scatter plot presentation
The density scatter plots in Figure 9 use different color bar scales across panels (e.g., the 1-hour panels have color bars ranging to ~1.75 while 6-hour panels range to ~0.30–0.40). While this is expected given the different density ranges, it makes visual cross-comparison between time horizons difficult. The authors should consider using a consistent color bar range, or at minimum note in the caption that scales differ across panels.
18. Language and terminology
- Line 35: “The intensification of climate change” would read better as “Ongoing climate change” or “The intensification of climate change impacts.”
- Line 76: A period is missing after “meta-learners” and before “In subsequent research.”
- Line 100: “stacking integration” should be “stacking-based integration” or “stacking ensemble” for consistency with the rest of the manuscript.
- Line 398: Double period at end of sentence: “…underestimation of RST values..”
- Throughout: Some sentences are overly long and could benefit from restructuring for clarity (e.g., lines 105–110).
Citation: https://doi.org/10.5194/egusphere-2025-3638-RC2
This manuscript introduces a forecasting framework combining optimized Long Short-Term Memory (LSTM) architectures with a stacking-based ensemble strategy, aimed at predicting road surface temperature (RST). The method can provide useful support for winter RST forecasting. The manuscript has a clear research motivation, and the experimental results show improvements; however, several issues require further revision.
Major Comments:
1. The discussion of the impact of weather conditions on RST is insufficient and needs to be strengthened.
2. In practice, this study uses hourly data for analysis. A detailed description of data quality and characteristics should be provided. In addition, how does the authors' method of converting minute-level data to hourly data differ from that used by meteorological departments?
3. The manuscript devotes substantial space to introducing methodologies. Mature methods should be covered by citations and brief descriptions, with the emphasis placed on the application value and innovative aspects of these methods in this study.
4. A comprehensive introduction to the observation site is required: is it a station on a highway bridge or on a regular road surface? The impact of surface latent heat on RST differs significantly between these two settings, and this should be clearly stated.
5. Almost no references are cited to support the analysis in the main text. Relevant research achievements in the field should be added as theoretical support to enhance the scientific rigor and credibility of the discussion, especially comparative analyses with other RST forecasting methods and results.
6. For RST prediction, forecasting under low-temperature and overcast/rainy conditions is particularly critical, yet the manuscript provides insufficient analysis of prediction results under these conditions. More relevant analyses should be added, along with physical explanations of how weather conditions influence prediction performance.
Minor Comments:
1. The descriptions of the data in Table 1 are of limited value, as they cover only conventional meteorological variables. More attention should be paid to data distribution and quality-control details.