Refining Predictive Models for Sea Surface Currents: A Focus on Variable Configuration and Time Sequence Analysis

Aldini, Ittaka; Permanasari, Adhistya; Hidayat, Risanuri; Ramdhani, Andri

doi:https://doi.org/10.5194/egusphere-2024-3142

Preprints

25 Oct 2024

Abstract. Accurate prediction of sea surface currents is crucial for understanding ocean dynamics, climate variability, and marine ecosystem health. Despite advances in statistical modeling, challenges remain in optimizing model parameters and variable configurations to enhance prediction accuracy. This study uses high-frequency (HF) radar data from the Bali Strait (2018–2021) to develop a statistical modeling approach for sea surface current prediction, with random forest regression (RFR) as the primary machine learning technique. The data were subjected to a rigorous preprocessing pipeline to ensure robustness, including selection, cleaning, and imputation. We define 11 distinct model configurations with various input parameters, such as moving averages (avgh3, avgh6, or avgh12) and previous-day values (h-24, h-48, and h-72). Our analysis focuses on three prediction schemes: seasonal (P1) and monthly (P2 and P3), each with tailored training and testing data allocations. The models are evaluated using the root mean square error (RMSE) and the coefficient of determination (R²). The results indicate that combining moving-average predictors significantly enhances the accuracy of long-term forecasts, whereas short-term predictions benefit from recent data. Our findings highlight specific variable configurations, particularly those incorporating moving averages, that lead to superior performance in sea surface current prediction. Models employing configurations F1, F5, and F8 yield the best results, highlighting the importance of optimizing model variables to achieve high-accuracy predictions.
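
To make the predictor types named in the abstract concrete, the following is a minimal Python sketch of how such inputs could be assembled from an hourly series. It is an illustration under stated assumptions, not the authors' code: the function name `build_features`, the trailing-window definition of the moving average, and the toy series are all hypothetical, and this page does not specify which predictors belong to which configuration (F1 to F11).

```python
from statistics import fmean


def build_features(series, avg_windows=(3,), day_lags=()):
    """Build predictor rows and targets from an hourly series.

    Each row holds the three previous hourly values (h-1, h-2, h-3),
    then one trailing mean per window in `avg_windows` (assumed here to
    be the mean of the `w` hours before the target hour), then one
    lagged value per entry in `day_lags` (lags given in hours).
    """
    # Earliest index at which every requested predictor is available.
    start = max([3, *avg_windows, *day_lags])
    rows, targets = [], []
    for t in range(start, len(series)):
        row = [series[t - k] for k in (1, 2, 3)]
        row += [fmean(series[t - w:t]) for w in avg_windows]
        row += [series[t - lag] for lag in day_lags]
        rows.append(row)
        targets.append(series[t])
    return rows, targets


# Toy hourly series standing in for one velocity component (assumption).
hourly_u = [float(i % 12) for i in range(100)]
rows, targets = build_features(hourly_u, avg_windows=(3, 6), day_lags=(24,))
```

The resulting `rows` and `targets` could then be passed to any regressor, for example a random forest implementation; that step is omitted here to keep the sketch dependency-free.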

Received: 13 Oct 2024 – Discussion started: 25 Oct 2024

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: closed

RC1:
'Comment on egusphere-2024-3142', Anonymous Referee #1, 09 Dec 2024

Major comments:
The study aims to improve the predictability of surface currents remotely sensed by oceanographic HF radars, installed in the Bali Strait on the Indonesian coast. The subject is of great interest. So, this research is amply justified. A statistical modeling approach is used, more specifically Random Forest Regression (RFR) which belongs to machine learning techniques. Various configurations of the prediction model have been tested, and the results delivered by some of the models are very impressive demonstrating the validity of the method chosen by the authors. The paper is well written, well structured, and contains sufficient information to support the conclusions of this study.
However, I don't think it's correct to say that surface currents are being studied. This is my major point of criticism, and I invite the authors to carefully revise the paper and clearly explain what quantity has been analyzed and forecast. I understand that they are not oceanographers and that their knowledge of the technologies used in oceanography is limited. Below I give some information which I hope will be helpful for improving the paper, which is submitted to an oceanographic journal.
1/ The use of the term "U and V component radial current velocity" is misleading. HF radars measure the radial velocity of surface currents, which is a projection of the real current vector onto the radar beams. U and V components are used (not always) to represent the radial velocity vector without using the polar coordinate system (direction and distance from the radar site). In reality, only this quantity, the radial velocity, is estimated from the backscattered power received by the radars. As it is a projection of the current velocity vector, it can reach large values if the radar beam is almost parallel to the current velocity vector, but it can also be zero when the beam direction is normal to the current direction. This means that the radial velocity may be zero (both U and V components are zero) even if the true current velocity is large. Moreover, a region of nearly zero radial velocity can be fairly large while the currents are strong.
Please explain how this was handled in your analysis.
2/ The current vector reconstruction is done by combining the radial velocities from two or more radar sites. I am not sure that this was done in the study. If it was done, please describe how the current vectors were reconstructed. The two techniques most widely used in the radar community are described in Yaremchuk and Sentchev (Continental Shelf Res., 2009). Only in this case are the u and v velocity components of the "total" velocity vector (as opposed to radial velocity) used. If current vectors were not reconstructed, please explain how the analyzed quantity accounts for the surface current variability over a wide range of scales.
3/ In the Introduction and Conclusion sections, the authors indicate that their method is useful for addressing the complexity of sea surface current forecasting issues. I think it would be useful to add a figure showing a) the study region and b) the spatial coverage of the one or two radars used, and to provide a vector map of surface currents (instantaneous velocities, time-averaged velocities, or others) to illustrate the spatial complexity. Again, I insist that the ms is addressed to an oceanographic journal.
4/ Words such as "careful curating" and "pertinent for scientific objectives" (ln 81 and some other lines) should be avoided if not supported by clear indications of how the data selection process was organized and what the targeted objectives are.
5/ The text in ln 96-98 should be completed with a description of the spatial variations in data availability. Usually, far fewer data are available at far ranges and along the side beams. Is this your case? Were the regions with poor coverage (less than 50%) excluded from analysis/training/forecast? As the study region is not shown, it is difficult to get an idea of the spatial extent of the area with a high percentage of data return (~90%, Tab. 1).
6/ Ln 133. It is explained that the model was trained to accurately represent only temporal, seasonal variability. Spatial variability is not mentioned. I assume that it was not accounted for in the present study. This should be clearly indicated. Here again, I would point out that spatial variations in radial velocities alone cannot correctly represent the spatial variability of surface currents. One of the reasons is given in comment 1.
7/ There is some inconsistency in the definition of the model variables avgh3, 6, … In the abstract, a moving average is mentioned, while in section 2.2, ln 148, a simple average is given. Please revise the text.
8/ Ln 135: One of the advantages of RFR is its robustness against missing data. However, the results of the study contradict this and show a large sensitivity of RFR to missing data (ln 420-424). Please provide an explanation.
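
The geometry raised in comments 1/ and 2/ can be illustrated with a minimal Python sketch. It assumes beam bearings measured clockwise from north and a positive radial velocity along the bearing direction (sign conventions differ between systems); the function names are hypothetical, and the two-site solve is only the simplest case of the least-squares combination used operationally, not a description of any particular radar software.

```python
import math


def radial_velocity(u, v, bearing_deg):
    """Project the current (u eastward, v northward) onto a radar beam."""
    b = math.radians(bearing_deg)
    return u * math.sin(b) + v * math.cos(b)


def total_from_two_radials(r1, bearing1_deg, r2, bearing2_deg):
    """Recover (u, v) from radial velocities seen along two beams.

    Solves the 2x2 system r_i = u*sin(b_i) + v*cos(b_i) by Cramer's rule.
    """
    b1, b2 = math.radians(bearing1_deg), math.radians(bearing2_deg)
    det = math.sin(b1) * math.cos(b2) - math.cos(b1) * math.sin(b2)
    if abs(det) < 1e-6:
        # Nearly parallel beams carry the same information twice.
        raise ValueError("beams are nearly parallel; vector not resolvable")
    u = (r1 * math.cos(b2) - r2 * math.cos(b1)) / det
    v = (r2 * math.sin(b1) - r1 * math.sin(b2)) / det
    return u, v


# A purely eastward current of 0.5 m/s (illustrative value).
u_true, v_true = 0.5, 0.0
# A beam pointing north is normal to the flow, so the radial is zero.
blind = radial_velocity(u_true, v_true, 0.0)
# Two beams at 45 and 135 degrees together recover the full vector.
r1 = radial_velocity(u_true, v_true, 45.0)
r2 = radial_velocity(u_true, v_true, 135.0)
u_est, v_est = total_from_two_radials(r1, 45.0, r2, 135.0)
```

The first case shows why a single radial map can read zero under a strong current, and the second why radials from at least two sufficiently different directions are needed before u and v describe the total vector.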
Other comments
The text in ln 44-45 should be revised: Current models (what models in particular?), based on historical data (what kind of data?) may not capture the dynamics … I believe that modern circulation models are sufficiently efficient to represent the multi-scale variability of ocean currents. If it doesn't appear in Zhao et al., use other references.
Check throughout the whole ms the name of R², introduced in ln 175-179. It has no unit (ln 180) and is not exactly equal to the correlation coefficient (ln 330).
Figure captions should be completed with a description of all variables plotted. The description of the results plotted should be more detailed. In Fig. 4, for example, where do the future data start? What does the left panel bring to the understanding of the results?
In Fig. 5 middle and bottom panels, velocity variations of tidal origin clearly appear. The model forecasting score for this variability is high and this is not surprising. It seems that tidal current variability was not discussed. Please include it in discussion.
I've always thought that the Asian monsoon has only two seasons, winter and summer which are not equal to seasons in the mid-latitudes. How can you justify the choice of four monsoon seasons? Do they really exist?
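
The remark above that R² has no unit and is not the same as the correlation coefficient can be checked with a small Python sketch. It uses the standard definitions of RMSE, the coefficient of determination, and Pearson's r; the sample values are made up for illustration and are not taken from the manuscript.

```python
import math


def rmse(obs, pred):
    """Root mean square error, in the same unit as the inputs."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def r2_score(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (dimensionless)."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def pearson_r(obs, pred):
    """Pearson correlation coefficient (dimensionless)."""
    mo, mp = sum(obs) / len(obs), sum(pred) / len(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_p = sum((p - mp) ** 2 for p in pred)
    return cov / math.sqrt(var_o * var_p)


# Made-up observations and a forecast with a constant bias of 0.2.
obs = [0.1, 0.3, 0.2, 0.5, 0.4]
pred = [o + 0.2 for o in obs]

print(pearson_r(obs, pred))  # close to 1: the shape is matched perfectly
print(r2_score(obs, pred))   # close to -1: the bias is penalized
print(rmse(obs, pred))       # close to 0.2, in the unit of the inputs
```

A biased forecast can therefore correlate perfectly with the observations while having a poor R², which is why the two quantities should be named and reported separately.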

Citation: https://doi.org/10.5194/egusphere-2024-3142-RC1
- AC1:
  'Reply 1A on RC1', Ittaka Aldini, 20 Dec 2024
  Dear Expert Reviewer #1,

  We are thankful for your expert review, thoughtful insights, and comments on our manuscript.

  We have divided our responses to your comments into two sections (1 to 7 and 8 to 13).

  Here is our response to each of your valuable comments:

  Response to comments 1 to 7
  
  ================================
  We appreciate your insightful comment regarding our use of terminology related to surface currents measured by HF radar. To clarify, our study used U and V components derived from radial velocities obtained through HF radar observations. The HF radar system internally calculates U (east-west flow) and V (north-south flow) by combining the radial velocities from multiple radar beams using trigonometric transformations based on their respective angles. For the construction of U and V data from radial velocities, we refer to Paduan and Washburn (2013) [1]. Thus, the U and V component data used in this study are projections of the actual velocity captured by our HF radar system. This study uses the final U and V data; we did not perform the preprocessing or the transformation from radial velocity to U and V ourselves. We acknowledge that internal processes in the HFR system, such as this transformation, can produce cases where radial velocities are zero despite significant true currents because of geometric factors. Because we focus only on the existing HF radar data, both for training and testing, we will add a note to our discussion section so that other researchers can examine instances of zero values despite significant true currents, especially in studies that compare prediction results with currents measured by other instruments. We will revise the manuscript and add this explanation to provide a clearer understanding.
  
  We appreciate your comments. We clarify that the U and V data used in our analysis were the direct output of the HF radar system; our study did not perform an explicit vector reconstruction step. The reconstruction was done internally by the HF radar system before the U and V data that we used were generated, as noted in the review by Paduan and Washburn (2013).

  In addition, high-frequency (HF) radar systems have gained recognition for their ability to capture surface current variability across a wide range of spatial and temporal scales. Here are a few supporting statements and references:
  
  Kaplan et al. [2] further illustrate the capability of HF radar to capture surface circulation patterns by presenting a case study off Bodega Bay, California (Kaplan et al., 2005). Their analysis demonstrates how HF radar data can be utilized to assess tidal influences and other dynamic processes affecting surface currents. The findings underscore the effectiveness of HF radar in providing high-resolution current measurements that account for variability over time.
  
  Solabarrieta et al. [3] validate the accuracy of HF radar-derived current measurements by comparing them with in-situ data from drifters (Solabarrieta et al., 2014). Their study reveals that HF radar can reliably capture the variability in surface currents, thus reinforcing the notion that the U and V data derived from these systems are suitable for analyzing current dynamics across different scales.
  
  The work of Ullman et al. [4] also supports the assertion that HF radar data can effectively account for surface current variability. They demonstrate how trajectory predictions based on HF radar surface currents can provide insights into the uncertainties associated with current measurements (Ullman et al., 2006). This research highlights the robustness of HF radar data in capturing the dynamic nature of ocean currents.
  
  Zhu et al. [5] also contribute to this discussion by comparing tidal currents in the Pearl River Estuary using HF radar data and model simulations (Zhu et al., 2022). Their findings illustrate the capability of HF radar to detect and represent current patterns effectively, reinforcing the notion that these systems can account for variability in surface currents across different environmental conditions.
  
  Moreover, the study by Li et al. [6] highlights the advantages of HF radar in monitoring ocean dynamics, particularly its ability to provide continuous observations over large areas (Li et al., 2023). The authors point out that HF radar systems can detect various oceanographic parameters, including wind fields and flow fields, so the HFR system does account for surface current variability.
  
  We will include a figure showing the study region and radar coverage.
  
  Thank you for pointing this out; we will revise the language to provide clearer descriptions of our data selection process.
  
  We appreciate your feedback. The manuscript already describes data availability (Section 2.1.2, line 90), and data availability is indeed above 87%. We will add a statement describing the study area to the manuscript to improve clarity.
  
  We appreciate the reviewer's insightful comments regarding the spatial variability of surface currents for our study. We will clarify that our analysis primarily focused on addressing the temporal, seasonal variability of surface currents, and did not explicitly account for the spatial variability across the study area.
  
  As stated in the manuscript, our study utilized time-series data from a single grid point located along the ship transportation route in the Bali Strait. This approach allowed us to develop robust predictive models for the temporal aspects of surface current variability, which is crucial for applications such as marine operations and ecosystem monitoring. However, we acknowledge that the use of a single grid point may limit our ability to fully capture the spatial variability of surface currents in the region. In future work, we plan to explore the integration of data from multiple grid points to better account for the spatial variability of surface currents.
  
  We apologize if the term was used incorrectly. We used only the moving-average parameter, not the standard average, and we will correct the inconsistent terms.
  
  References of Citations:
  
  [1] Paduan, J. D., & Washburn, L. (2013). High-frequency radar observations of ocean surface currents. Annual Review of Marine Science, 5(1), 115-136. https://www.annualreviews.org/content/journals/10.1146/annurev-marine-121211-172315

  [2] Kaplan, D. M., Largier, J. L., & Botsford, L. W. (2005). HF radar observations of surface circulation off Bodega Bay (northern California, USA). Journal of Geophysical Research: Oceans, 110(C10). https://doi.org/10.1029/2005jc002959

  [3] Solabarrieta, L., Rubio, A., Castanedo, S., Medina, R., Charria, G., & Hernández, C. (2014). Surface water circulation patterns in the southeastern Bay of Biscay: new evidences from HF radar data. Continental Shelf Research, 74, 60-76. https://doi.org/10.1016/j.csr.2013.11.022

  [4] Ullman, D. S., O’Donnell, J., Kohut, J., Fake, T., & Allen, A. A. (2006). Trajectory prediction using HF radar surface currents: Monte Carlo simulations of prediction uncertainties. Journal of Geophysical Research: Oceans, 111(C12). https://doi.org/10.1029/2006jc003715

  [5] Zhu, L., Lu, T., Yang, F., Liu, B., Wu, L., & Wei, J. (2022). Comparisons of tidal currents in the Pearl River Estuary between high-frequency radar data and model simulations. Applied Sciences, 12(13), 6509. https://doi.org/10.3390/app12136509

  [6] Zhu, L., Lu, T., Yang, F., Liu, B., Wu, L., & Wei, J. (2022). Comparisons of tidal currents in the Pearl River Estuary between high-frequency radar data and model simulations. Applied Sciences, 12(13), 6509. https://doi.org/10.3390/app12136509
  
  Citation: https://doi.org/10.5194/egusphere-2024-3142-AC1
- AC2:
  'Reply 1A on RC1', Ittaka Aldini, 20 Dec 2024
  
  Citation: https://doi.org/10.5194/egusphere-2024-3142-AC2
- AC5: 'Reply 1B on RC1', Ittaka Aldini, 24 Dec 2024
  
  9) Thank you for your suggestion; we will revise this section to specify the models and data types discussed.
  10) We will review the manuscript to ensure that the definition and units of R² are correctly presented.
  
  11) We will improve the figure captions to ensure they provide adequate descriptions of all variables and their relevance.
  
  12) We will include a discussion of tidal current variability and its impact on model forecasting scores.
  
  13) Yes, we will justify the classification into four monsoon seasons by citing relevant literature confirming that the Bali Strait experiences DJF, MAM, JJA, and SON seasons.
  
  Citation: https://doi.org/10.5194/egusphere-2024-3142-AC5
- AC6: 'Reply 1C on RC1', Ittaka Aldini, 27 Dec 2024
  
  We appreciate your comment regarding the robustness of random forest regression (RFR) to missing data. Although RFR has been reported to be robust against missing values in numerous other studies, we will check whether our data preprocessing procedures introduced any sources of bias. We will include an explanation of this in the revised manuscript.
  
  Citation: https://doi.org/10.5194/egusphere-2024-3142-AC6
RC2:
'Comment on egusphere-2024-3142', Anonymous Referee #2, 10 Dec 2024

There is a fundamental issue here that needs to be clarified, and it has also been pointed out by the other reviewer. The authors systematically confuse the U, V (u, v) of radial and vector maps in their manuscript. I believe the authors use vector maps: they refer specifically to .tuv files, which in standard SeaSonde nomenclature identify vector maps produced by conventional least-squares fitting of radial velocities from two or more sites onto a regular grid. So, please remove any references to radial velocities, as they are inappropriate in this context.
L95-110: linear data interpolation should be applied carefully and only to short gaps in time. The authors should provide evidence that the interpolation technique introduces no systematic bias into the data set, and should also invest a bit more time and resources in alternative gap-filling methods that are extensively used for HFR data sets.
There is little to no information on the way the HFR data are processed and whether any QC is applied. This is of paramount importance to avoid feeding the forecast models with wrong data and biasing the results. Frequency, bandwidth, resolution, radial-to-vector mapping approach, grid resolution, type of radar calibration and validation, ..., are missing.
No systematic literature review is provided of the different forecasting methods applied to HFR data. I remember reading about ARIMA approaches being used with good results.
The text in Line 221 is covered by Figure 2 and not readable.
The text in Lines 252-255 is confusing. What quantities do the authors refer to when using 'various statistical parameters', how were they used, and in what units are they expressed? 'The U and V components of the ocean surface current data are characterized by various statistical parameters, including the count, minimum, standard deviation, and maximum values, which were categorized based on four distinct monsoon seasons: DJF (December to February), MAM (March to May), JJA (June to August), and SON (September to November). These seasonal statistics are presented in Table A1.'
The manuscript focuses primarily on statistical forecasting methodologies and completely ignores the oceanography of the region, which is a bit disappointing. The ms would definitely benefit from some description of the main signals that the HFR is capturing, their time scales, their deterministic nature (i.e., tides), their relationship to the main drivers, and so forth. That would likely explain why some models lose prediction skill during some seasons of the year (L420-425).
Do the validation metrics and statistics refer to spatially averaged U and V, or to the entire domain? Please clarify.
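
The point above about restricting linear interpolation to short gaps can be sketched in Python as follows. This is a minimal illustration, not the procedure used in the manuscript: the function name `fill_short_gaps` and the default limit of three samples are assumptions chosen for the example, and missing values are assumed to be stored as NaN.

```python
import math


def fill_short_gaps(values, max_gap=3):
    """Linearly interpolate NaN runs no longer than `max_gap` samples.

    Longer runs, and runs touching either end of the series, are left
    as NaN so that they can be handled (or excluded) separately.
    """
    filled = list(values)
    n = len(filled)
    i = 0
    while i < n:
        if not math.isnan(filled[i]):
            i += 1
            continue
        start = i
        while i < n and math.isnan(filled[i]):
            i += 1
        end = i  # index of the first valid sample after the gap, or n
        gap = end - start
        if start == 0 or end == n or gap > max_gap:
            continue  # no bracketing samples, or gap too long
        left, right = filled[start - 1], filled[end]
        for k in range(start, end):
            frac = (k - start + 1) / (gap + 1)
            filled[k] = left + frac * (right - left)
    return filled


nan = float("nan")
series = [1.0, nan, 3.0, nan, nan, nan, nan, 8.0, nan]
# Only the single-sample gap is filled; the 4-sample gap and the
# trailing NaN are left untouched.
print(fill_short_gaps(series, max_gap=2))
```

Comparing model scores with and without the filled samples would be one simple way to check whether the interpolation introduces a systematic bias.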

Citation: https://doi.org/10.5194/egusphere-2024-3142-RC2
- AC3: 'Reply 2A on RC2', Ittaka Aldini, 23 Dec 2024
  
  Dear Expert Reviewer #2,

  Thank you for your careful evaluation and feedback on our study.

  We have divided our responses to your comments into two sections, 2A (1, 2, 3, 4, 5, 6, 8) and 2B (7).

  Here is our response to each of your constructive comments:
  
  1) Thank you for your feedback; we will clarify the distinction between the u and v components derived from HF radar measurements and remove any references to radial velocities to prevent confusion.
  
  2) We previously conducted a preliminary study using several gap-filling techniques and found that some of the more complex interpolation methods can alter the pattern of the original data and thus decrease prediction accuracy. For this reason, we used simpler linear interpolation. In addition, the focus of our current study is to investigate the effect of applying several model variables at different time scales.

  3) Thank you for highlighting this important aspect. We recognize the importance of providing additional details on the HF radar data processing and quality control measures applied in our study.

  The details the reviewer asks about are handled internally by the HF radar system, while the data used in this study are the final product generated by that system. Thus, this manuscript explains only the post-processing procedures. However, we recognize the need to describe the pre-processing performed within the HFR system for better transparency and understanding. We will revise the manuscript accordingly.

  4) As noted before, we have conducted a separate literature review on the prediction of ocean surface currents [1]. However, we agree that this is important, and thus we will enrich the literature review in this manuscript, as you suggested.

  5) Thank you. We will revise the manuscript to ensure that the text in line 221 is clearly visible and not obscured by Figure 2.

  6) We will clarify that the statistical parameters mentioned in lines 252-255 are those listed in Table A1 in Appendix A.

  8) We will clarify that the validation metrics and statistics refer to a specific grid point of U and V, to enhance transparency in our findings.
  
  Citation:
  
  ========
  
  [1] Aldini, I., Permanasari, A. E., Hidayat, R., & Ramdhan, A. (2024). Prediction of ocean surface current: Research status, challenges, and opportunities. A review. Ocean Systems Engineering, 14(1), 85-99. https://koreascience.kr/article/JAKO202411643844960.page
  
  Citation: https://doi.org/10.5194/egusphere-2024-3142-AC3
- AC4:
  'Reply 2B on RC2', Ittaka Aldini, 24 Dec 2024
  7) We appreciate your valuable feedback regarding the need for deeper oceanographic context. We acknowledge that incorporating the regional oceanographic characteristics would strengthen our manuscript's scientific contribution.
  Before conducting this prediction study, we completed two preceding studies: a literature review of ocean current prediction [1] and an identification of the main drivers of ocean current data in our study region, specifically tide and wind [2].
  
  We will incorporate more detailed results, particularly regarding their relation to oceanographic factors. Here is a brief summary of the prediction results:
  
  Relation to Prediction Models
  The prediction models that include only the moving-average variables, F1, F5, and F8 (in addition to the (h-1) to (h-3) data, which serve as the default predictors in all 11 models), produce higher R² and lower RMSE than the other 8 of the 11 models tested. Among F1, F5, and F8, the F1 model consistently yields the highest R² and the lowest RMSE. The better performance of configurations F1, F5, and F8 can be attributed to their ability to capture the dominant oceanographic signals:
  - Moving averages can retain the primary signal patterns.
  - Recent previous values (h-1 to h-3) can capture immediate changes.
  - This combination can represent both deterministic and stochastic components.
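The combination of recent values and a moving average described above can be sketched as a feature-construction step. This is a minimal illustration, not the authors' pipeline: the function name `build_features`, the sample values, and the definition of the moving average as the mean of the samples immediately preceding the target hour are all assumptions, since the exact definition of avgh3, avgh6, and avgh12 is not given in this discussion.

```python
from statistics import fmean


def build_features(series, lags=(1, 2, 3), window=3):
    """Build (X, y) rows for one-step-ahead prediction from an hourly series.

    Each row holds the lagged values series[t-1], series[t-2], ... followed by
    the trailing mean of the `window` samples that precede time t.
    The trailing-mean definition is an assumption made for illustration.
    """
    start = max(max(lags), window)  # first index with a full history
    X, y = [], []
    for t in range(start, len(series)):
        lagged = [series[t - k] for k in lags]
        moving_avg = fmean(series[t - window:t])
        X.append(lagged + [moving_avg])
        y.append(series[t])
    return X, y


# Hypothetical hourly U-component values (m/s), for illustration only.
u = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60]
X, y = build_features(u)
```

Each row of `X` could then be passed to a regressor such as a random forest; widening `window` to 6 or 12 would correspond to the longer averaging periods under the same assumed definition.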
  
  Relation to Time Scale
  Prediction schemes that include more data, particularly the additional data from 2019 and 2020, namely for prediction schemes P1 and P3, exhibit R² and RMSE results with a tighter range (the difference in R² and RMSE results is smaller among the 11 models).
  
  Relation to Main Oceanographic Factors in the Study Region
  As noted in our previous study [2], we conducted a separate analysis of tidal and wind factors in HF radar sea surface current data. We confirm that the Bali Strait is indeed influenced by tide and wind, with the tide having a greater influence than the wind in our study region. We also have a breakdown analysis for each month and season, which we attach in the following image.
  
  In brief, we concluded the following regarding seasonal variability:
  
  The seasonal patterns in prediction accuracy results (Figure 7) reflect the region's oceanographic characteristics:
  Lower RMSE during DJF: Corresponds to periods of stronger tidal dominance
  
  Higher RMSE during MAM: Coincides with increased wind influence and transitional periods
  
  The V-component shows greater seasonal variation, consistent with the stronger influence of monsoon winds on north-south flows
  
  This analysis demonstrates how the prediction models' performance is intrinsically linked to the underlying oceanographic processes, and provides the context for selecting appropriate prediction strategies based on seasonal and monthly oceanographic conditions in the study region.
  
  Citation:
  
  [1] Aldini, I., Permanasari, A. E., Hidayat, R., & Ramdhan, A. (2024). Prediction of ocean surface current: Research status, challenges, and opportunities. A review. Ocean Systems Engineering, 14(1), 85-99. https://koreascience.kr/article/JAKO202411643844960.page
  [2] Aldini, I., Permanasari, A. E., Hidayat, R., & Ramdhani, A. (2024, December). Tidal-induced and wind-driven ocean surface current data analysis derived from HF radar in Bali Strait. In IOP Conference Series: Earth and Environmental Science (Vol. 1412, No. 1, p. 012006). IOP Publishing. https://iopscience.iop.org/article/10.1088/1755-1315/1412/1/012006/pdf
  
  Citation: https://doi.org/10.5194/egusphere-2024-3142-AC4
- AC7: 'Reply 2C on RC2', Ittaka Aldini, 02 Jan 2025
  
  #Additional Clarification#
  Dear Expert Reviewers,
  Thank you for your thoughtful review and attention to the details of our data handling methodology. We appreciate the Expert Reviewers taking the time to provide this feedback - it will really help us improve the clarity and accuracy of our manuscript.
  We want to make an important clarification regarding the description of our interpolation method. The manuscript incorrectly stated that we used linear interpolation. However, upon further review, we realize that no interpolation method was actually implemented in our study. The appearance of linear connections in our visualizations was simply an artifact of the plotting software's default rendering of discontinuous data points, and not a result of any real interpolation on our part. We will correct this mistake in the methodology section so that it accurately reflects our actual data handling approach.
  
  In practice, we used the raw HF radar data with all of its original gaps intact. The Random Forest Regression algorithm handles the missing values naturally, which aligns with best practices in oceanographic time series analysis. This approach avoids introducing artificial patterns that could come from interpolation and preserves the integrity of the data.
  Regarding the apparent sensitivity to missing data that we observed, particularly during the month of May, we believe this is attributable to natural oceanographic phenomena rather than any issues with our data handling. This is evidenced by (PDF Document Attached):
  
  - A significant drop in tidal correlation (U_MIN DCCA = 0.2217), Fig 5.
  
  - Reduced V-component correlation (V_MAX DCCA = 0.6161) Fig 5.
  
  - The residual DCCA analysis (Figure 6) indicates increased wind influence during that time period, and the higher variability in both U and V components.
  
  - Consistent patterns across all prediction schemes (P1, P2, P3), with lower performance during May (Fig 7).
  
  These changes in the underlying driving forces seem to explain the higher variability and lower performance metrics we reported for that month, across all of our prediction schemes.
  We hope this clarification is useful in understanding our approach and the reasons behind the patterns we observed.
  
  Please let us know if you have any other questions; we can discuss this further.
  Thank you again for your thoughtful review. It is greatly appreciated.
  Sincerely,
  Author Team
  
  Citation: https://doi.org/10.5194/egusphere-2024-3142-AC7

Status: closed

RC1:
'Comment on egusphere-2024-3142', Anonymous Referee #1, 09 Dec 2024

Major comments:
The study aims to improve the predictability of surface currents remotely sensed by oceanographic HF radars, installed in the Bali Strait on the Indonesian coast. The subject is of great interest. So, this research is amply justified. A statistical modeling approach is used, more specifically Random Forest Regression (RFR) which belongs to machine learning techniques. Various configurations of the prediction model have been tested, and the results delivered by some of the models are very impressive demonstrating the validity of the method chosen by the authors. The paper is well written, well structured, and contains sufficient information to support the conclusions of this study.
However, I don't think it's correct to say that surface currents are being studied. This is my major point of criticism, and I invite the authors to carefully revise the paper and clearly explain what quantity has been analyzed and forecast. I understand that they are not oceanographers and that their knowledge of the technologies used in oceanography is limited. Below I give some information which I hope will be helpful for improving the paper, which is submitted to an oceanographic journal.
1/ The use of the term "U and V component radial current velocity" is misleading. HF radars measure the radial velocity of surface currents which is a projection of real current vector on radar beams. U and V components are used (not always) to correctly represent the radial velocity vector without using the polar coordinate system (direction and distance from the radar site). In reality, only this quantity, radial velocity, is estimated from the backscattered power received by the radars. As it is a projection of the current velocity vector, it can reach large values if the radar beam is almost parallel to the current velocity vector, but can also have zero values when the beam direction is normal to the current direction. This means that radial velocity may have a zero value (both U and V components are zero) even if the true current velocity is large. Moreover, a region of nearly-zero radial velocity can be fairly large while the currents can be strong.
Please explain how this was handled in your analysis.
2/ The current vector reconstruction is done by combining the radial velocities from two, or more, radar sites. I am not sure that this was done in the study. If it was done, please provide description how the current vectors have been reconstructed. The two techniques most widely used in the radar community are described in (Yaremchuk and Sentchev, Continental Shelf Res. 2009). Only in this case, u and v-velocity components of the "total" velocity vector (as opposed to radial velocity) are used. If current vectors were not reconstructed, please explain how the analyzed quantity accounts for the surface current variability in a wide range of scales.
3/ In the Introduction and Conclusion sections, the authors indicate that their method is useful for addressing the complexity of sea surface current forecasting issues. I think that it could be useful to add a figure showing a) the study region and b) the spatial coverage of the one or two radars used to describe surface currents, and to provide a vector map of surface currents (instantaneous velocities, time-averaged, others) to illustrate spatial complexity. Again, I insist that the ms is addressed to an oceanographic journal.
4/ Words such as "careful curating", "pertinent for scientific objectives" (ln 81 and in some other lines) should be avoided if not supported by clear indications how the data selection process was organized and what are the targeted objectives.
5/ Text in ln 96-98 should be completed by including a description of spatial variations in data availability. Usually, there are far fewer data available at far ranges and along the side beams. Is this your case? Were the regions with poor coverage (less than 50%) excluded from analysis/training/forecast? As the study region is not shown, it's difficult to have an idea of the spatial extent of the area with a high percentage of data return (~90%, Tab. 1).
6/ Ln 133. It is explained that the model was trained to accurately represent only temporal, seasonal variability. Spatial variability is not mentioned. I assume that it was not accounted for in the present study. This should be clearly indicated. Here again, I would point out that space variations in radial velocities alone cannot shape correctly the spatial variability of surface currents. One of the reasons is given in 1.
7/ There is some inconsistency in definition of model variables avgh3, 6, … In the abstract, the moving average is mentioned, while in section 2.2, ln148, a simple average is given. Please revise the text.
8/ Ln 135: One of the advantages of RFR is its robustness against missing data. However, the results of the study are contradictory and showed a large sensitivity of RFR to missing data, ln 420-424. Please, provide explanation.
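The geometric points raised in comments 1/ and 2/ can be illustrated with a short sketch: a radial velocity is the projection of the current vector onto a radar beam, and a total (u, v) vector can be recovered by least-squares fitting of radials from two or more sites. This is a simplified illustration, not the processing used by any particular HF radar system. The function names, the bearing convention (degrees clockwise from north), and the sign convention (positive along the beam bearing) are assumptions.

```python
import math


def radial_velocity(u, v, bearing_deg):
    """Project the current vector (u east, v north) onto a beam bearing."""
    b = math.radians(bearing_deg)
    return u * math.sin(b) + v * math.cos(b)


def combine_radials(radials):
    """Least-squares estimate of (u, v) from (radial_speed, bearing_deg) pairs.

    Solves the 2x2 normal equations of the unweighted least-squares problem.
    """
    sxx = sxy = syy = bx = by = 0.0
    for vr, bearing in radials:
        b = math.radians(bearing)
        s, c = math.sin(b), math.cos(b)
        sxx += s * s
        sxy += s * c
        syy += c * c
        bx += s * vr
        by += c * vr
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("beams are nearly parallel; the vector is not resolvable")
    u = (syy * bx - sxy * by) / det
    v = (sxx * by - sxy * bx) / det
    return u, v


# A purely eastward current of 0.5 m/s seen along two hypothetical beams.
true_u, true_v = 0.5, 0.0
r_north = radial_velocity(true_u, true_v, 0.0)   # beam normal to the current
r_east = radial_velocity(true_u, true_v, 90.0)   # beam parallel to the current
u_hat, v_hat = combine_radials([(r_north, 0.0), (r_east, 90.0)])
```

In this example the northward beam sees an essentially zero radial velocity although the current is strong, which is the situation described in comment 1/; only the combination of both beams recovers the full vector.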
Other comments
The text in ln 44-45 should be revised: Current models (what models in particular?), based on historical data (what kind of data?) may not capture the dynamics … I believe that modern circulation models are sufficiently efficient to represent the multi-scale variability of ocean currents. If it doesn't appear in Zhao et al., use other references.
Check throughout the whole ms the name of R2, introduced in ln 175-179. It doesn't have unit (ln. 180) and is not exactly equal to correlation coefficient (ln 330).
Figure captions should be completed by providing a description of all variables plotted. The description of the results plotted should be more detailed. In Fig. 4, for example, where do the future data start? What does the left figure bring to the understanding of the results?
In Fig. 5 middle and bottom panels, velocity variations of tidal origin clearly appear. The model forecasting score for this variability is high and this is not surprising. It seems that tidal current variability was not discussed. Please include it in discussion.
I've always thought that the Asian monsoon has only two seasons, winter and summer which are not equal to seasons in the mid-latitudes. How can you justify the choice of four monsoon seasons? Do they really exist?
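The remark above that R² is unitless and not exactly equal to the correlation coefficient can be checked with a small numerical sketch. The data are hypothetical, and the definitions used are the standard ones (R² = 1 − SS_res / SS_tot, Pearson r from centred products), which are assumed to match those intended in the manuscript.

```python
import math
from statistics import fmean


def rmse(obs, pred):
    """Root mean square error; carries the units of the data (e.g. m/s)."""
    return math.sqrt(fmean([(o - p) ** 2 for o, p in zip(obs, pred)]))


def r_squared(obs, pred):
    """Coefficient of determination, 1 - SS_res / SS_tot; dimensionless."""
    mean_obs = fmean(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def pearson_r(obs, pred):
    """Pearson correlation coefficient; dimensionless."""
    mo, mp = fmean(obs), fmean(pred)
    sxy = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    sxx = sum((o - mo) ** 2 for o in obs)
    syy = sum((p - mp) ** 2 for p in pred)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical case: predictions follow the observations with a constant bias.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [2.0, 3.0, 4.0, 5.0]
error = rmse(obs, pred)       # 1.0, in the units of the data
r2 = r_squared(obs, pred)     # about 0.2
r = pearson_r(obs, pred)      # 1.0
```

A constant bias leaves the correlation at 1 while lowering R² substantially, so the two quantities should not be used interchangeably.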

Citation: https://doi.org/10.5194/egusphere-2024-3142-RC1
- AC1:
  'Reply 1A on RC1', Ittaka Aldini, 20 Dec 2024
  Dear Expert Reviewers #1,
  
  We are thankful for your expert review and for your thoughtful insights and comments on our manuscript.

  We have divided our responses to your comments into two sections (1 to 7 and 8 to 13).

  Here are our responses to each of your comments:
  
  Response to Comments no 1 to 7
  
  ================================
  We appreciate your insightful comment regarding our use of terminology related to surface currents measured by HF radar. To clarify, our study used U and V components derived from radial velocities obtained through HF radar observations. The internal processing of HF radar systems calculates U (east-west flow) and V (north-south flow) by combining the radial velocities from multiple radar beams using trigonometric transformations based on their respective angles. For the construction of U and V data from radial velocities, we refer to Paduan and Washburn (2013) [1]. Thus, the U and V component data used in this study are projections of the actual velocity captured by our HF radar system. We want to add that this study uses the final U and V data and did not perform the preprocessing or transformation from radial velocity to U and V data. We acknowledge that the internal processes of the HFR system, such as the transformation process, can introduce cases where radial velocities are zero despite significant true currents due to geometric factors. We focus only on the existing HF radar data, both for training and testing; therefore, we will include an additional note in our discussion section inviting other researchers to study instances of zero values despite significant true currents, especially in studies that compare prediction results with currents measured by different instruments. We will revise the manuscript and add this explanation to provide a clearer understanding.
  
  We appreciate your comments. We clarify that the U and V data used in our analysis were the direct output of the HF radar system, without an explicit vector reconstruction step performed in our study. That step was performed internally by the HF radar system before the U and V data that we used were generated, as noted in the review by Paduan and Washburn (2013).
  
  In addition, high-frequency (HF) radar systems have gained recognition for their ability to capture surface current variability across a wide range of spatial and temporal scales. Here are a few supporting statements and references:
  
  Kaplan et al. [2] further illustrate the capability of HF radar to capture surface circulation patterns by presenting a case study off Bodega Bay, California (Kaplan et al., 2005). Their analysis demonstrates how HF radar data can be utilized to assess tidal influences and other dynamic processes affecting surface currents. The findings underscore the effectiveness of HF radar in providing high-resolution current measurements that account for variability over time.
  
  Solabarrieta et al. [3] validate the accuracy of HF radar-derived current measurements by comparing them with in-situ data from drifters (Solabarrieta et al., 2014). Their study reveals that HF radar can reliably capture the variability in surface currents, thus reinforcing the notion that the U and V data derived from these systems are suitable for analyzing current dynamics across different scales.
  
  The work of Ullman et al. [4] also supports the assertion that HF radar data can effectively account for surface current variability. They demonstrate how trajectory predictions based on HF radar surface currents can provide insights into the uncertainties associated with current measurements (Ullman et al., 2006). This research highlights the robustness of HF radar data in capturing the dynamic nature of ocean currents.
  
  Zhu et al. [5] also contribute to this discussion by comparing tidal currents in the Pearl River Estuary using HF radar data and model simulations (Zhu et al., 2022). Their findings illustrate the capability of HF radar to detect and represent current patterns effectively, reinforcing the notion that these systems can account for variability in surface currents across different environmental conditions.
  
  Moreover, the study by Li et al. [6] highlights the advantages of HF radar in monitoring ocean dynamics, particularly its ability to provide continuous observations over large areas (Li et al., 2023). The authors point out that HF radar systems can detect various oceanographic parameters, including wind fields and flow fields, and thus that HFR systems account for surface current variability.
  
  We will include a figure showing the study region and radar coverage.
  
  Thank you for pointing this out; we will revise the language to provide clearer descriptions of our data selection process.
  
  We appreciate your feedback. The manuscript already includes a description of data availability (Section 2.1.2, line 90), and data availability is indeed more than 87%. We will add a description of the study area to the manuscript to enhance the clarity of the study.
  
  We appreciate the reviewer's insightful comments regarding the spatial variability of surface currents for our study. We will clarify that our analysis primarily focused on addressing the temporal, seasonal variability of surface currents, and did not explicitly account for the spatial variability across the study area.
  
  As stated in the manuscript, our study utilized time-series data from a single grid point located along the ship transportation route in the Bali Strait. This approach allowed us to develop robust predictive models for the temporal aspects of surface current variability, which is crucial for applications such as marine operations and ecosystem monitoring. However, we acknowledge that the use of a single grid point may limit our ability to fully capture the spatial variability of surface currents in the region. In future work, we plan to explore the integration of data from multiple grid points to better account for the spatial variability of surface currents.
  
  We apologize for the inconsistent terminology. We used only the moving average, not a simple average, and we will correct the terms accordingly.
  
  References of Citations:
  
  [1] Paduan, J. D., & Washburn, L. (2013). High-frequency radar observations of ocean surface currents. Annual review of marine science, 5(1), 115-136. https://www.annualreviews.org/content/journals/10.1146/annurev-marine-121211-172315
  
  [2] Kaplan, D. M., Largier, J. L., & Botsford, L. W. (2005). HF radar observations of surface circulation off Bodega Bay (northern California, USA). Journal of Geophysical Research: Oceans, 110(C10). https://doi.org/10.1029/2005jc002959
  [3] Solabarrieta, L., Rubio, A., Castanedo, S., Medina, R., Charria, G., & Hernández, C. (2014). Surface water circulation patterns in the southeastern Bay of Biscay: new evidences from HF radar data. Continental Shelf Research, 74, 60-76. https://doi.org/10.1016/j.csr.2013.11.022
  [4] Ullman, D. S., O’Donnell, J., Kohut, J., Fake, T., & Allen, A. A. (2006). Trajectory prediction using HF radar surface currents: Monte Carlo simulations of prediction uncertainties. Journal of Geophysical Research: Oceans, 111(C12). https://doi.org/10.1029/2006jc003715
  
  [5] Zhu, L., Lu, T., Yang, F., Liu, B., Wu, L., & Wei, J. (2022). Comparisons of tidal currents in the pearl river estuary between high-frequency radar data and model simulations. Applied Sciences, 12(13), 6509. https://doi.org/10.3390/app12136509
  [6] Zhu, L., Lu, T., Yang, F., Liu, B., Wu, L., & Wei, J. (2022). Comparisons of tidal currents in the pearl river estuary between high-frequency radar data and model simulations. Applied Sciences, 12(13), 6509. https://doi.org/10.3390/app12136509
  
  Citation: https://doi.org/10.5194/egusphere-2024-3142-AC1
- AC2:
  'Reply 1A on RC1', Ittaka Aldini, 20 Dec 2024
  Citation: https://doi.org/10.5194/egusphere-2024-3142-AC2
- AC5: 'Reply 1B on RC1', Ittaka Aldini, 24 Dec 2024
  
  9) Thank you for your suggestion; we will revise this section to specify the models and data types discussed.
  10) We will review the manuscript to ensure that the definition and units of R² are correctly presented.
  
  11) We will improve the figure captions to ensure they provide adequate descriptions of all variables and their relevance.
  
  12) We will include a discussion of tidal current variability and its impact on model forecasting scores.
  
  13) Yes, we will provide justification for the classification of four monsoon seasons by citing relevant literature, confirming that the Bali Strait experiences DJF, MAM, JJA, and SON seasons.
  
  Citation: https://doi.org/10.5194/egusphere-2024-3142-AC5
- AC6: 'Reply 1C on RC1', Ittaka Aldini, 27 Dec 2024
  
  We appreciate the evaluation regarding the robustness of Random Forest Regression (RFR) in the context of missing data. While RFR is reported to be robust against missing values in numerous other studies, we will investigate whether any sources of bias may have arisen from our data preprocessing procedures. We will include an explanation of this in the manuscript revision.
  
  Citation: https://doi.org/10.5194/egusphere-2024-3142-AC6
RC2:
'Comment on egusphere-2024-3142', Anonymous Referee #2, 10 Dec 2024

There's a fundamental issue here that needs to be clarified, and it has also been pointed out by the other reviewer. The authors systematically confuse the U, V (u, v) components of radial and vector maps in their manuscript. I believe the authors use vector maps: they refer specifically to .tuv files, which in standard SeaSonde nomenclature identify vector maps as produced by conventional least-squares fitting of radial velocities from two or more sites onto a regular grid. So, please, remove any references to radial velocities, as they are inappropriate in this context.
L95-110: linear data interpolation should be carefully applied only to short gaps in time. The authors should provide evidence that no systematic bias is introduced into the data set by the interpolation technique, and should also spend a bit more time and resources on alternative gap-filling methods that are extensively used for HFR data sets.
There is little to no information on the way HFR data are processed and whether any QC is applied. This is of paramount importance to avoid feeding the forecast models with wrong data and biasing the results. Frequency, bandwidth, resolution, radial-to-vector mapping approach, grid resolution, type of radar calibration and validation, ..., are missing.
No systematic literature review is provided in regards to different forecasting methods applied to HFR data. I remember reading somewhere using ARIMA approaches with good results.
Text in Line 221 is shadowed by Figure 2 and not readable.
Text in Lines 252-255 is confusing. What quantities do the authors refer to when using 'various statistical parameters', how were they used, and in what units are they expressed? 'The U and V components of the ocean surface current data are characterized by various statistical parameters, including the count, minimum, standard deviation, and maximum values, which were categorized based on four distinct monsoon seasons: DJF (December to February), MAM (March to May), JJA (June to August), and SON (September to November). These seasonal statistics are presented in Table A1.'
The manuscript focuses primarily on statistical forecasting methodologies and completely ignores the oceanography of the region, which is a bit disappointing. The manuscript would definitely benefit from some description of the main signals that the HFR is capturing, their time scales, their deterministic nature (i.e., tides), their relationship to the main drivers, and so forth. That would likely explain why some models lose prediction skill during some seasons of the year (L420-425).
Do the validation metrics and statistics refer to spatially averaged U and V, or to values across the entire domain? Please clarify.
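
The short-gap restriction raised in the comment on L95-110 can be illustrated with a minimal sketch. It is not the manuscript's procedure: the function name `fill_short_gaps`, the `max_gap` threshold, and the sample values are hypothetical, and the choice to leave long gaps and series edges unfilled is an assumption made for illustration.

```python
import math

def fill_short_gaps(values, max_gap):
    """Linearly interpolate runs of NaN no longer than max_gap samples.

    Longer runs, and runs touching either end of the series, stay NaN.
    """
    filled = list(values)
    n = len(filled)
    i = 0
    while i < n:
        if not math.isnan(filled[i]):
            i += 1
            continue
        start = i                      # first NaN of this run
        while i < n and math.isnan(filled[i]):
            i += 1
        end = i                        # index just past the run
        gap = end - start
        if start == 0 or end == n or gap > max_gap:
            continue                   # no bracketing values, or gap too long
        left, right = filled[start - 1], filled[end]
        step = (right - left) / (gap + 1)
        for k in range(gap):
            filled[start + k] = left + step * (k + 1)
    return filled

# Hypothetical hourly U values with a 1-sample gap and a 4-sample gap.
nan = float("nan")
u = [0.10, nan, 0.30, nan, nan, nan, nan, 0.80]
u_filled = fill_short_gaps(u, max_gap=2)
```

With `max_gap=2`, only the single missing sample is filled and the four-sample gap is left as missing, so any bias check can compare statistics of the filled and unfilled series directly.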

Citation: https://doi.org/10.5194/egusphere-2024-3142-RC2
- AC3: 'Reply 2A on RC2', Ittaka Aldini, 23 Dec 2024
  
  Dear Expert Reviewer #2,
  
  Thank you for your careful evaluation and feedback on our study.
  
  We have written our responses to your comments in two parts: 2A (comments 1, 2, 3, 4, 5, 6, and 8) and 2B (comment 7).
  
  Here are our responses to each of your constructive comments:
  
  1) Thank you for your feedback; we will clarify the distinction between the u and v components derived from HF radar measurements and remove any references to radial velocities to prevent confusion.
  
  2) We previously conducted a preliminary study using several gap-filling techniques and found that some of the more complex interpolation methods can distort the original data and thus decrease the prediction accuracy. For this reason, we used a simpler linear interpolation. In addition, the focus of our current study is to investigate the effect of applying several model variables on different time scales.
  3) Thank you for highlighting this important aspect. We recognize the importance of providing additional details on the HF radar data processing and the quality control measures applied in our study.
  
  We acknowledge that the details the reviewer asks for relate to the internal processing of the HF radar system, while the data used in this study are the final product generated by that system. Thus, the manuscript explains only the post-processing procedures. However, we recognize the need to include some description of the pre-processing performed within the HFR system for better transparency and understanding. We will revise the manuscript accordingly.
  
  4) As we noted before, we have conducted a separate literature review on the prediction of ocean surface currents [1]. However, we recognize that this is pivotal, and thus we will enrich the literature review in this manuscript, as you suggested.
  
  5) Thank you. We will revise the manuscript to ensure that the text in line 221 is clearly visible and not obscured by Figure 2.
  6) We will clarify that the statistical parameters mentioned in lines 252-255 relate to Table A1 in Appendix A.
  
  8) We will clarify that the validation metrics and statistics refer to a specific grid point of U and V, to enhance transparency in our findings.
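
The distinction in point 8 between scoring individual grid points and scoring a domain average can be sketched as follows. This is a minimal illustration using the standard definitions of RMSE and R²; the grid-point names and values are hypothetical and are not taken from the study's data.

```python
import math

def rmse(observed, predicted):
    """Root mean square error, in the units of the input series."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def r2(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical U-component series at two grid points (time runs along each list).
obs = {"cell_a": [1.0, 2.0, 3.0, 4.0], "cell_b": [2.0, 2.0, 4.0, 4.0]}
pred = {"cell_a": [1.0, 2.0, 3.0, 5.0], "cell_b": [2.0, 3.0, 4.0, 3.0]}

# Option 1: score each grid point separately.
per_cell = {k: (rmse(obs[k], pred[k]), r2(obs[k], pred[k])) for k in obs}

def domain_mean(series_by_cell):
    """Average all grid points at each time step."""
    return [sum(vals) / len(vals) for vals in zip(*series_by_cell.values())]

# Option 2: average over the domain first, then score the mean series.
domain = (rmse(domain_mean(obs), domain_mean(pred)),
          r2(domain_mean(obs), domain_mean(pred)))
```

In this toy case the domain-averaged score is better than either per-point score because errors of opposite sign cancel in the spatial mean, which is why stating the convention matters when comparing results.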
  
  Citation:
  
  ========
  
  [1] Aldini, I., Permanasari, A. E., Hidayat, R., & Ramdhan, A. (2024). Prediction of ocean surface current: Research status, challenges, and opportunities. A review. Ocean Systems Engineering, 14(1), 85-99. https://koreascience.kr/article/JAKO202411643844960.page
  
  Citation: https://doi.org/10.5194/egusphere-2024-3142-AC3
- AC4:
  'Reply 2B on RC2', Ittaka Aldini, 24 Dec 2024
  7) We appreciate your valuable feedback regarding the need for deeper oceanographic context. We acknowledge that incorporating the regional oceanographic characteristics would strengthen our manuscript's scientific contribution.
  Before we conducted this prediction study, we had already completed another study that preceded this work, which included a literature review of ocean current prediction [1] and the identification of the main drivers of ocean current data in our study region, specifically tidal and wind [2].
  
  We will incorporate more detailed results, particularly regarding their relation to oceanographic factors. Here is a brief summary of the prediction results:
  
  Relation to Prediction Models
  The prediction models that include only the moving-average variables, F1, F5, and F8 (in addition to the (h-1) to (h-3) data, which are used as default predictors in all 11 models), produce higher prediction accuracy (R²) and lower RMSE than the other 8 models. Among F1, F5, and F8, the F1 model consistently gives the highest R² and the lowest RMSE. The better performance of configurations F1, F5, and F8 can be attributed to their ability to capture the dominant oceanographic signals:
  - Moving averages can retain the primary signal patterns.
  
  - Recent previous values (h-1 to h-3) can capture immediate changes.
  
  This combination can represent both the deterministic and the stochastic components.
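
The combination of recent values and moving averages described above can be sketched as a feature-building step. This is a minimal illustration, not the manuscript's implementation: the function name `build_features` is hypothetical, and the assumption that avgh3, avgh6, and avgh12 are trailing means over the previous 3, 6, and 12 hourly samples is ours, since the exact definitions of configurations F1, F5, and F8 are not given in this reply.

```python
def build_features(series, lags=(1, 2, 3), windows=(3, 6, 12)):
    """Return (rows, targets) for one-step-ahead prediction of an hourly series.

    Each row holds the lagged values h-1..h-3 followed by trailing means
    over the previous 3, 6 and 12 samples. Only past samples are used, so
    the target at hour t never appears among its own predictors.
    """
    history = max(max(lags), max(windows))   # samples needed before the first row
    rows, targets = [], []
    for t in range(history, len(series)):
        lagged = [series[t - k] for k in lags]
        averages = [sum(series[t - w:t]) / w for w in windows]
        rows.append(lagged + averages)
        targets.append(series[t])
    return rows, targets

# Hypothetical hourly U values; a simple ramp keeps the result easy to check.
hourly_u = [float(i) for i in range(15)]
X, y = build_features(hourly_u)
```

The resulting `X` and `y` could then be passed to any regressor that accepts a feature matrix and a target vector, such as a random forest. Dropping the `windows` columns would give a lag-only configuration for comparison.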
  
  Relation to Time Scale
  Prediction schemes trained on more data, namely P1 and P3, which include the additional data from 2019 and 2020, give R² and RMSE results with a tighter range (the differences in R² and RMSE among the 11 models are smaller).
  
  Relation to Main Oceanographic Factors in the Study Region
  As noted in our previous study [2], we separately analysed the tidal and wind factors in the HF radar sea surface current data. We confirm that the Bali Strait is influenced by both tide and wind, with the tide having the greater influence in our study region. We also have a breakdown analysis for each month and season, which is attached in the following image.
  
  In brief, we conclude the following regarding seasonal variability:
  
  The seasonal patterns in prediction accuracy results (Figure 7) reflect the region's oceanographic characteristics:
  - Lower RMSE during DJF: corresponds to periods of stronger tidal dominance.
  
  - Higher RMSE during MAM: coincides with increased wind influence and transitional periods.
  
  - The V-component shows greater seasonal variation, consistent with the stronger influence of monsoon winds on north-south flows.
  
  This analysis demonstrates how the prediction models' performance is intrinsically linked to the underlying oceanographic processes, and provides the context for selecting appropriate prediction strategies based on seasonal and monthly oceanographic conditions in the study region.
  
  Citation:
  
  [1] Aldini, I., Permanasari, A. E., Hidayat, R., & Ramdhan, A. (2024). Prediction of ocean surface current: Research status, challenges, and opportunities. A review. Ocean Systems Engineering, 14(1), 85-99. https://koreascience.kr/article/JAKO202411643844960.page
  [2] Aldini, I., Permanasari, A. E., Hidayat, R., & Ramdhani, A. (2024, December). Tidal-induced and wind-driven ocean surface current data analysis derived from HF radar in Bali Strait. In IOP Conference Series: Earth and Environmental Science (Vol. 1412, No. 1, p. 012006). IOP Publishing. https://iopscience.iop.org/article/10.1088/1755-1315/1412/1/012006/pdf
  
  Citation: https://doi.org/10.5194/egusphere-2024-3142-AC4
- AC7: 'Reply 2C on RC2', Ittaka Aldini, 02 Jan 2025
  
  #Additional Clarification#
  Dear Expert Reviewers,
  Thank you for your thoughtful review and attention to the details of our data handling methodology. We appreciate the Expert Reviewers taking the time to provide this feedback - it will really help us improve the clarity and accuracy of our manuscript.
  We want to make an important clarification regarding the description of our interpolation method. The manuscript incorrectly stated that we used linear interpolation. However, upon further review, we realize that no interpolation method was actually implemented in our study. The appearance of linear connections in our visualizations was simply an artifact of the plotting software's default rendering of discontinuous data points, and not a result of any real interpolation on our part. We will correct this mistake in the methodology section so that it accurately reflects our actual data handling approach.
  
  In practice, we used the raw HF radar data with all of its original gaps intact. The Random Forest Regression algorithm handles the missing values naturally, which aligns with best practices in oceanographic time series analysis. This approach helps us avoid introducing artificial patterns that could come from interpolation, and preserves the integrity of the data.
  Regarding the apparent sensitivity to missing data that we observed, particularly during the month of May, we believe this is attributable to natural oceanographic phenomena rather than any issues with our data handling. This is evidenced by (PDF Document Attached):
  
  - A significant drop in tidal correlation (U_MIN DCCA = 0.2217), Fig 5.
  
  - Reduced V-component correlation (V_MAX DCCA = 0.6161) Fig 5.
  
  - The residual DCCA analysis (Figure 6) indicates increased wind influence during that time period, and the higher variability in both U and V components.
  
  - Consistent patterns across all prediction schemes (P1, P2, P3), with lower performance during May in each (Fig 7).
  
  These changes in the underlying driving forces seem to explain the higher variability and lower performance metrics we reported for that month, across all of our prediction schemes.
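
The DCCA values quoted above come from the attached document, and the exact procedure behind them is not described in this reply. For orientation, the following is a minimal sketch of the standard detrended cross-correlation coefficient, assuming linear detrending and overlapping boxes of n + 1 samples. The function names and the test series are hypothetical.

```python
import math

def _profile(x):
    """Cumulative sum of the mean-removed series."""
    mean = sum(x) / len(x)
    total, out = 0.0, []
    for v in x:
        total += v - mean
        out.append(total)
    return out

def _residuals(segment):
    """Residuals of a least-squares straight-line fit to one box."""
    m = len(segment)
    t_mean = (m - 1) / 2.0
    s_mean = sum(segment) / m
    sxx = sum((t - t_mean) ** 2 for t in range(m))
    sxy = sum((t - t_mean) * (s - s_mean) for t, s in enumerate(segment))
    slope = sxy / sxx
    return [s - (s_mean + slope * (t - t_mean)) for t, s in enumerate(segment)]

def dcca_coefficient(x, y, n):
    """rho_DCCA at scale n: detrended covariance over detrended std devs."""
    if len(x) != len(y) or n < 2 or len(x) <= n:
        raise ValueError("need equal-length series longer than n, with n >= 2")
    px, py = _profile(x), _profile(y)
    f_xy = f_xx = f_yy = 0.0
    for start in range(len(px) - n):           # overlapping boxes
        rx = _residuals(px[start:start + n + 1])
        ry = _residuals(py[start:start + n + 1])
        f_xy += sum(a * b for a, b in zip(rx, ry))
        f_xx += sum(a * a for a in rx)
        f_yy += sum(b * b for b in ry)
    # Box count and box size are identical for all three sums, so the
    # normalisation constants cancel in the ratio.
    return f_xy / math.sqrt(f_xx * f_yy)

# Hypothetical series: a signal compared with itself and with its negative.
signal = [math.sin(0.3 * i) + 0.1 * ((i * 7) % 5) for i in range(60)]
rho_same = dcca_coefficient(signal, signal, n=4)
rho_opposite = dcca_coefficient(signal, [-v for v in signal], n=4)
```

The coefficient lies between -1 and 1, so a value such as 0.22 indicates weak detrended co-variation at the chosen scale, while a value near 1 indicates that the two series move together.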
  We hope this clarification is useful in understanding our approach and the reasons behind the patterns we observed.
  
  Please let us know if you have any other questions; we can discuss this further.
  Thank you again for your thoughtful review. It is greatly appreciated.
  Sincerely,
  Author Team
  
  Citation: https://doi.org/10.5194/egusphere-2024-3142-AC7


Viewed

Total article views: 774 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
417	120	237	774	19	31


Views and downloads (calculated since 25 Oct 2024)

Month	HTML	PDF	XML	Total
Oct 2024	36	8	4	48
Nov 2024	23	9	4	36
Dec 2024	91	31	8	130
Jan 2025	33	10	5	48
Feb 2025	13	1	44	58
Mar 2025	10	10	50	70
Apr 2025	13	7	46	66
May 2025	12	10	48	70
Jun 2025	19	9	21	49
Jul 2025	20	8	0	28
Aug 2025	41	11	6	58
Sep 2025	101	6	1	108
Oct 2025	5	0	0	5

Viewed (geographical distribution)

Total article views: 755 (including HTML, PDF, and XML) Thereof 755 with geography defined and 0 with unknown origin.

Latest update: 09 Oct 2025

Short summary

This study enhances the prediction of sea surface currents using HF radar data, addressing a gap in understanding how seasonal and monthly data segmentation affects accuracy. By applying RF Regression, we developed three prediction schemes that demonstrated larger datasets yield higher correlation coefficients, while tailored models reduce prediction errors. Key findings reveal that selecting the appropriate dataset and integrating moving averages significantly improves predictive performance.

