This work is distributed under the Creative Commons Attribution 4.0 License.
Application of artificial intelligence methods during the processing of spatial data from the hydrographic systems for coastal zone
Abstract. Effective processing of spatial data in coastal zones requires the integration of measurements from various sensors to achieve a more comprehensive picture of dynamic environmental changes. This study proposes a new approach to spatial data analysis, combining information from the LiDAR system and multi-beam echo sounder (MBES). Combining both sources allowed for a more accurate estimation of the topography and bathymetry of the coastal zone. A key element of the study was developing an original data reduction method based on Self-Organizing Maps (SOM) neural networks. Initially used for analysing bathymetric data, this method has been optimised for aquatic data, enabling effective processing of both heights from LiDAR and depths from MBES. Data reduction significantly shortened computation time – interpolation using the Empirical Bayesian Kriging (EBK) method for raw data took over 9 hours, whereas, for the reduced data (those with the highest density), it took just 4 minutes and 51 seconds while maintaining comparable quality of results. The study confirmed that the reduced data meet the requirements of the International Hydrographic Organization (IHO) for shallow water bodies, which indicates the high accuracy of the method employed. The results suggest that data reduction based on artificial intelligence allows for the efficient management of big spatial data, and its integration with classical GIS interpolation methods can find broad applications in hydrography, environmental monitoring, and coastal zone management.
Status: closed
-
CC1: 'Comment on egusphere-2025-904', Ting Zhang, 24 Jul 2025
The paper, titled "Application of artificial intelligence methods during the processing of spatial data from the hydrographic systems for coastal zone", primarily focuses on combining spatial data from LiDAR and MBES systems, utilizing a Self-Organizing Map (SOM) neural network for data dimensionality reduction, and generating terrain and bathymetric models through the Empirical Bayesian Kriging (EBK) interpolation method. The research aims to optimize big-data processing efficiency while maintaining data quality, in order to meet the accuracy requirements of the International Hydrographic Organization (IHO) for shallow-water measurements. The increase in data volume primarily stems from the demand for high-precision data; hence, data simplification while maintaining accuracy is highly meaningful. However, from the results shown in this work, precision decreases as the data are reduced. This is similar to traditional methods such as Kriging, feature-preserving methods, etc. What are the advantages of the method proposed in this work? The example area is very small (100 m × 150 m). Why do you choose such a small area? Is it feasible to handle a larger area? I do not see much innovation in this work.
Citation: https://doi.org/10.5194/egusphere-2025-904-CC1 -
AC1: 'Reply on CC1', Marta Wlodarczyk Sielicka, 28 Jul 2025
Thank you for taking the time to review my article entitled "Application of artificial intelligence methods during the processing of spatial data from the hydrographic systems for coastal zone". I sincerely appreciate your comments and suggestions. Below, I provide responses to the key points raised:
- Advantages of the proposed method:
The main advantage of the proposed method is the significant reduction in data processing time while maintaining a high quality of results. By applying a Self-Organizing Map (SOM) neural network for data reduction, it was possible to decrease the number of spatial data points from 2,168,011 measurements to significantly smaller subsets. This reduction enabled Empirical Bayesian Kriging (EBK) interpolation to be performed in under 5 minutes, compared to over 9 hours for the raw dataset. Importantly, the resulting models still meet the International Hydrographic Organization (IHO) accuracy requirements for shallow water areas, confirming the practical usefulness of the approach.
Additionally, the proposed method offers several other advantages:
- Integration of multi-source data: The method is capable of processing both LiDAR (positive elevation) and MBES (negative depth) data simultaneously. This integrated approach eliminates the need to separate terrestrial and underwater datasets, streamlining workflows and enabling the creation of more comprehensive models of coastal zones.
- Preservation of data quality despite reduction: Even highly reduced datasets (e.g., Data A and B) preserved terrain and bathymetric features within acceptable accuracy thresholds. This confirms that the method effectively balances simplification with spatial precision.
- Flexible resolution control: The SOM-based approach allows adjustable reduction levels (e.g., 0.1 m, 0.2 m, 0.5 m, 1.0 m), which makes it adaptable to different application needs — from high-resolution local surveys to generalized regional analyses.
- Computational efficiency: The method drastically reduces dataset size (by up to 99%), lowering storage requirements and accelerating processing. This is particularly important for big data scenarios in hydrography, where datasets often exceed millions of points.
- Unsupervised learning and reduced subjectivity: Since SOM is an unsupervised algorithm, it does not require manual point selection or feature engineering, unlike traditional decimation or grid-based simplification techniques. This minimizes human error and increases repeatability.
- Potential for real-time or near-real-time use: With proper hardware and optimization, the method could be applied in operational contexts — such as real-time monitoring in ports or during field surveys — offering faster turnaround for data analysis and decision-making.
These benefits collectively highlight the novelty and practical value of the approach, particularly in the context of spatial big data processing in hydrographic and coastal zone applications.
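For readers unfamiliar with SOM-based point-cloud reduction, the idea can be sketched as follows. This is an illustrative toy implementation, not the author's MATLAB/Python code: the 1-D node grid, learning-rate schedule, and Gaussian neighbourhood function are assumptions chosen for brevity. The reduced dataset is the set of node weights that "won" at least one input point.

```python
import numpy as np

def som_reduce(points, n_nodes=16, iters=500, lr0=0.5, seed=0):
    """Train a 1-D SOM on (x, y, z) points and return the weights of the
    nodes that won at least one point, as a compact surrogate of the cloud."""
    rng = np.random.default_rng(seed)
    pts = np.asarray(points, dtype=float)
    # Initialise node weights from randomly chosen input points.
    w = pts[rng.choice(len(pts), n_nodes, replace=False)].copy()
    for t in range(iters):
        p = pts[rng.integers(len(pts))]
        bmu = np.argmin(((w - p) ** 2).sum(axis=1))     # best-matching unit
        lr = lr0 * (1 - t / iters)                      # decaying learning rate
        sigma = max(n_nodes / 2 * (1 - t / iters), 1)   # shrinking neighbourhood
        d = np.abs(np.arange(n_nodes) - bmu)            # distance on the node grid
        h = np.exp(-(d ** 2) / (2 * sigma ** 2))        # Gaussian neighbourhood
        w += (lr * h)[:, None] * (p - w)                # pull nodes toward the sample
    # Keep only nodes that are the best match for at least one input point.
    bmus = np.argmin(((pts[:, None, :] - w[None]) ** 2).sum(-1), axis=1)
    return w[np.unique(bmus)]

# Synthetic cloud: 2000 points, x/y in [0, 100) m, z (depth) in [-5, -2) m.
rng = np.random.default_rng(1)
cloud = np.column_stack([rng.random((2000, 2)) * 100,
                         -5 + rng.random(2000) * 3])
reduced = som_reduce(cloud)
```

In practice the number of nodes, not a raster cell size, controls the output density, which is why this style of reduction is independent of the interpolation grid resolution.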
- Size of the test area:
The selected test area (100 m × 150 m) was deliberately chosen because, despite its small physical dimensions, it contained a very large number of measurement points – over 2 million. Attempts to apply the method to a larger area at the same resolution failed due to hardware limitations – the computer was not able to efficiently process such a large dataset within a reasonable time frame.
- Applicability to larger areas:
The proposed method is fully scalable and can be applied to larger areas. However, doing so would require more advanced hardware resources and support from a qualified IT specialist – for instance, to optimise code, implement parallel processing, or use cloud computing. I plan to explore these directions in future research to allow for efficient large-scale spatial data processing.
I hope these clarifications help to highlight the strengths of the proposed approach and justify the methodological choices made in this study.
Sincerely,
Marta Wlodarczyk-Sielicka
Citation: https://doi.org/10.5194/egusphere-2025-904-AC1 -
CC2: 'Reply on AC1', Ting Zhang, 28 Jul 2025
Thank you for the author's response, but it is still not persuasive. Firstly, existing tools such as ArcGIS can easily resample terrain data to different accuracies without such complex methods. So what are the specific advantages of the method proposed in this article compared to these commonly used methods? Regarding the significant reduction in processing time: when we lower the data resolution from 0.1 m to 1 m or even 10 m, the processing time and the required storage space can both be greatly reduced. In my previous research, I compared the impact of DEM data with resolutions of 2 m, 5 m, and 10 m on flood simulation results. As the resolution decreases, the flood simulation results may differ slightly but still meet the simulation requirements, while the data volume and calculation time are greatly reduced, which is basically consistent with the results of this study. However, I only used a simple resampling method to generate DEMs at different resolutions. Since the accuracy in this article is clearly affected as the number of sampling points decreases, it is difficult to evaluate the advantages of your method. In addition, regarding the choice of only a 100 m × 150 m research area and whether large-scale research can be conducted, I find the author's answer unsatisfactory. Research based on terrain data generally takes watersheds and regions as scales. What are the application scenarios for studying a 100 m × 150 m area? If the work in this article is limited by computing power and time, under what circumstances would it be unrestricted?
Citation: https://doi.org/10.5194/egusphere-2025-904-CC2 -
AC2: 'Reply on CC2', Marta Wlodarczyk Sielicka, 05 Aug 2025
Thank you very much for your renewed and thorough assessment of the manuscript. Your remarks have helped me to refine the value of the proposed approach. Below, I respond to every point you raised.
Although ArcGIS (or other GIS packages) can perform classical raster resampling, hydrographic work requires us to preserve the shallowest soundings (shoals) and the highest quay-wall points strictly within the IHO S-44 total vertical uncertainty (± 0.25 m for Order 1a surveys). Standard gridding or bilinear/bicubic smoothing averages out extremes—acceptable in flood-modelling contexts, yet potentially dangerous in precision nautical charting because critical minima may be lost, compromising navigational safety.
The procedure I propose decimates the point cloud adaptively: in each cluster, it explicitly retains the shallowest (or highest) point before interpolation. Consequently, even after a 76 % reduction in data volume, the minimum depth remains identical to that in the raw set, and the entire product stays within IHO tolerances.
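The extreme-preserving selection rule described above can be illustrated with a minimal sketch. Here a plain square grid stands in for the SOM clusters, and the function name is hypothetical; the point is only the rule itself: within each cluster, keep the largest z, which for negative MBES depths is the shallowest sounding and for positive LiDAR elevations the highest point.

```python
# Minimal sketch of extreme-preserving reduction (illustrative, not the
# paper's implementation). A square grid stands in for the SOM clusters.
def reduce_keep_extremes(points, cell=1.0):
    """For each grid cell, keep the point with the largest z: the shallowest
    sounding (least-negative depth) or the highest elevation, i.e. the
    navigationally critical extreme."""
    best = {}
    for x, y, z in points:
        key = (int(x // cell), int(y // cell))
        if key not in best or z > best[key][2]:
            best[key] = (x, y, z)
    return list(best.values())

# Two soundings fall in the same cell; only the shallower (-2.5 m) survives.
soundings = [(0.2, 0.3, -5.0), (0.4, 0.1, -2.5), (1.2, 0.5, -7.1)]
kept = reduce_keep_extremes(soundings, cell=1.0)
```

By contrast, mean-based resampling of the same cell would report roughly -3.75 m, deeper than the true minimum, which is exactly the failure mode the reply warns against.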
Hydrographic data differ fundamentally from typical flood-plain DEMs:
- Sensor and density: hydrography combines a multibeam echosounder (MBES) below the waterline with airborne LiDAR over land, yielding up to several hundred points per square metre; flood DEMs rely on far sparser topographic LiDAR or photogrammetry.
- Analytical focus: flood models concentrate on average ground elevation, whereas nautical charting hinges on extreme points that determine the safe depth of the fairway.
- Consequences: losing a single shoal in a hydrographic dataset has an immediate safety impact; in flood simulations, minor deviations seldom alter inundation extent by more than a few centimetres.
By consciously preserving extremes, my method therefore meets hydrographic requirements far better than generic resampling.
Traditional workflows shorten computation mainly by coarse gridding (e.g., 1 m or 5 m). I decouple point-count reduction from grid resolution: after the SOM step, I still interpolate on a 0.1 m grid. In the raw dataset, processing a 0.1 m grid required 2,168,011 points; Empirical Bayesian Kriging took 9 hours 12 minutes, and the final model met IHO standards. When those same data were resampled to a 1 m grid, the point count dropped to about 15,000 and the runtime to 7 minutes, but the shallowest point vanished, exceeding IHO limits. Using the proposed Self-Organising Map while retaining 0.1 m resolution reduced the cloud to 510,531 points; interpolation then took only 4 minutes 51 seconds, and the model still satisfied all IHO criteria, preserving the key bathymetric minima. Thus, I achieve more than a hundred-fold speed-up over the raw data without sacrificing resolution or navigational safety.
Areas as small as the 100 m × 150 m test patch are typical for
- monitoring quay-wall deformation,
- dredging acceptance surveys, and
- point-target inspections in port zones or cable landfalls.
Although this is only 1.5 ha, dense MBES acquisition produces more than two million soundings—an ideal big-data testbed. For larger water bodies, the workflow is fully scalable: I process overlapping tiles (e.g., 500 m × 500 m) without loss of continuity.
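The overlapping-tile strategy mentioned above might be sketched like this; the 500 m tile size and the 25 m overlap buffer are illustrative assumptions, not values from the study. Each tile is padded by the overlap so that interpolation near tile edges sees neighbouring points and the merged surface stays continuous.

```python
# Hedged sketch of overlapping-tile processing for large survey areas.
# Tile size and overlap are illustrative parameters, not values from the study.
def make_tiles(xmin, ymin, xmax, ymax, size=500.0, overlap=25.0):
    """Return (x0, y0, x1, y1) tile bounds covering the area, each padded
    by `overlap` so edge interpolation stays continuous between tiles."""
    tiles = []
    x = xmin
    while x < xmax:
        y = ymin
        while y < ymax:
            tiles.append((max(x - overlap, xmin), max(y - overlap, ymin),
                          min(x + size + overlap, xmax), min(y + size + overlap, ymax)))
            y += size
        x += size
    return tiles

# A 1 km x 1 km area split into four 500 m tiles with 25 m buffers.
tiles = make_tiles(0, 0, 1000, 1000, size=500, overlap=25)
```

After per-tile reduction and interpolation, the buffered margins would be trimmed before mosaicking so each output cell is written by exactly one tile.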
I developed the solution myself in MATLAB/Python and possess the complete source code, which I can freely modify, optimise for GPUs, or deploy in the cloud. ArcGIS is a commercial environment with high licence costs and limited algorithm transparency—the user does not influence the internal resampling procedures. With my implementation, I can
- fine-tune SOM parameters to specific bathymetries,
- release the script as an open-source toolbox for the hydrographic community, and
- avoid additional software-licence expenses, which is essential for smaller research units.
The proposed method fuses LiDAR and MBES in a single pass, adaptively reduces the point cloud, preserves shoals, and fulfils IHO rules while markedly shortening computation time. It also eliminates licence costs and gives complete control over the code, making it a valuable alternative to classical resampling in commercial GIS packages.
I trust that these explanations convincingly present the advantages of the proposed approach and address all concerns raised. I will, of course, be happy to accommodate any further suggestions.
Sincerely,
MWS
Citation: https://doi.org/10.5194/egusphere-2025-904-AC2
RC1: 'Comment on egusphere-2025-904', Anonymous Referee #1, 30 Aug 2025
This study proposes an integrated method to generate elevation data from LiDAR and MBES data with the aid of self-organizing map technology. Overall, I could follow the paper, but I did not understand whether the proposed method helps improve data quality. Looking at Figure 5, the mean values of the data increase significantly through the process. In addition, from Figure 6 to Figure 10, the resolution of the data decreases, and no validity check was demonstrated. I am not sure whether the study adds a significant contribution.
Minor comments
P2 L 38 Lidar Detectoin and Ranging -> Light Detection and Ranging
P 2 L 51 uncrewed aerial vehicle -> unmanned aerial vehicle
P 19 Conclusion part is a bit lengthy.
Citation: https://doi.org/10.5194/egusphere-2025-904-RC1 -
AC3: 'Reply on RC1', Marta Wlodarczyk Sielicka, 01 Sep 2025
We sincerely thank the Reviewer for the careful reading of our manuscript and for the constructive comments. Below, we provide a detailed response to the observations.
Our main goal was not only to improve the quality of the data in terms of accuracy, but also to increase the efficiency of processing very large datasets through the use of Self-Organizing Maps (SOM). As shown in the Conclusions and Discussion section (Section 5), the reduced datasets still meet the requirements of the International Hydrographic Organization (IHO) for shallow waters, and thus retain acceptable quality despite the reduction.
The increase in mean values (Fig. 5) is a natural consequence of the disproportionate number of MBES points (negative values) compared to LiDAR points (positive values) in the raw dataset. The reduction process removes redundant underwater points, which shifts the mean toward higher values.
Regarding resolution—we agree that with more substantial reduction the terrain models become more smoothed (Figs. 6–10). However, the surface difference analysis (Table 2, Fig. 11) shows that at lower levels of reduction (datasets A and B), deviations from the raw data remain small and within IHO requirements. In the future, we will add a more explicit validation element to the manuscript in order to emphasize that the method maintains compliance with international hydrographic standards.
We stress that the main contribution of the article is the integration of LiDAR and MBES data with AI-based reduction, which enabled: a significant reduction of computation time (from over 9 hours to less than 5 minutes with EBK interpolation), preservation of quality sufficient for hydrographic applications, and demonstration of a scalable approach to working with spatial “big data.”
We thank the Reviewer for pointing out the need to clarify issues related to data quality, changes in mean values, and model resolution. The revised version of the manuscript will highlight the balance between computational efficiency and quality preservation and will contain clear references to meeting IHO requirements. We also thank the Reviewer for the minor remarks.
Citation: https://doi.org/10.5194/egusphere-2025-904-AC3
RC2: 'Comment on egusphere-2025-904', Anonymous Referee #2, 03 Sep 2025
This manuscript presents an original data reduction method based on Self-Organizing Maps (SOM) neural networks, providing an effective processing tool for spatial data in coastal zones. The authors tested their method in the processing of topography and bathymetry data from LiDAR and multi-beam echo sounder (MBES), and compared their results by interpolating fields with different reductions using the Empirical Bayesian Kriging (EBK) interpolation method. According to the authors, their method provides comparable quality results with only a fraction of computational time, highlighting the efficiency of their machine learning approach.
The manuscript requires considerable work. It contains redundant statements throughout. Parts of the text make claims without references, and the authors need to clarify their methods and conclusions. One of this reviewer's primary concerns is the lack of comparison with other dimensionality-reduction methods and the weak connection of the method to broader applications. Below, I have listed comments, hoping they may help improve the manuscript's quality.
Specific Comments
- There are some clarifications needed in the proposed methodology:
- In multiple places within the manuscript, the authors state that their newly developed approach “was initially developed for analysing bathymetric data but was optimised within this study for processing waterborne data, allowing effective application to both LiDAR and MBES measurement points.” This reviewer might be confused, but could the authors provide clarification on how the extension of processing waterborne data is linked to surface LiDAR information?
- It is not clear to this reviewer what the output of the SOM is. Is it a lower-dimensional set of datapoints with only the relevant information? Does it depend on the user? What is the best approach to select the number of output variables?
- How did the authors arrive at the given configuration of the hyperparameters (topology, neighborhood size, and number of iterations, among others)? Did they explore a set of plausible configurations and use the most effective one? How can future users be certain that this configuration will provide the best estimates in their cases?
- Also, does this method use activation functions for the neural network nodes? If so, what activation functions were used?
- The results show a decrease in the resolution of the interpolated fields, which makes me wonder. Have the authors considered comparing their reduced dimensionality approach to other methods, such as Principal Component Analysis (PCA), Autoencoders, or other simpler resampling techniques? How well does their method perform against these other, more standard practices?
- As the raw data are oversampled, have the authors considered using linear interpolation between the points instead of EBK? Are the results similar? What about the efficacy of the interpolation?
- With respect to the applicability of this method:
- The authors mention that their interpolated fields meet the requirements of the International Hydrographic Organization (IHO). However, these are not explained in the text. Additional clarification should be added.
- Why are the authors aiming to create surfaces with a resolution of less than or equal to 0.1 meters? Did they verify that the instrumental data have such precision? Also, is this high resolution necessary for technical purposes? This reviewer suggests giving additional clarifications.
- The authors show multiple interpolated fields and their differences with respect to the raw data interpolation. However, there is little discussion about which model is best for their specific objective. This reviewer recommends linking the results with potential applications in the field.
- The area that the authors used for their analysis is relatively small. Have the authors considered how their approach will scale with larger regions? Or how transferable their method is to other places? Additionally, can users benefit from their trained model in other coastal settings?
- I encourage the authors to include a section on the potential limitations of their approach and discuss its applicability to other data sources.
Technical Corrections
Besides the comments described above, I have a few technical recommendations for the manuscript.
- The sentence between lines 15 and 16 in the abstract sounds redundant. Consider rephrasing it.
- I suggest a rewrite of the introduction section. It lacks a cohesive story. Currently, it jumps from artificial intelligence to coastal resources to monitoring instrumentation without delving in-depth into any of them, making the paragraphs difficult to follow. Below are some general suggestions that I have found while reading this section.
- I recommend adding a citation that summarizes what is being said in lines 26 to 27.
- Sentences in lines 29 to 31 seem out of place and redundant; consider rewriting them.
- A citation is needed for sentences between lines 34 and 36.
- Why is measuring these variables important? I suggest including some references that show the importance in terms of policymaking, the mitigation of climate change, and the biological processes occurring in these zones. These are just some ideas that could be relevant while discussing the importance of having precise topographic, bathymetric, and environmental information.
- A citation is needed for the sentence in lines 45 to 47. What studies have shown that the development of remote sensing techniques has enabled precise and efficient research in coastal systems?
- The sentences after that (lines 47 to 53) jump from remote sensing to deep learning to other coastal mapping methods, which makes following the idea difficult. Consider rewriting.
- There should be an in-line citation in the sentence starting in line 67. “Specht and Wiśniewska (year).”
- A citation is needed in line 183, where the authors state that “Previous use of the method focuses solely on depth-related data.” Who has used this method previously, and what was their approach?
- A citation is also needed in the sentence “The method employs Self-Organizing Maps (SOM) developed by Teuvo Kohonen” (Lines 185-186).
- This reviewer found Figure 2 on another website with attribution to LatentView Analytics (https://www.latentview.com/blog/self-organizing-maps/). The authors should give proper acknowledgments when using figures that come from different sources.
- The sentences in lines 235-238 are very redundant and not relevant to the current study. Consider rewriting or removing them.
- The presentation of Table 1 and Figures 4 and 5 is redundant. This reviewer recommends removing Figure 5, as it is not adding to the interpretation of the results.
- Labels in Figures 6 through 11 are difficult to read. Consider increasing the font size.
Citation: https://doi.org/10.5194/egusphere-2025-904-RC2 -
AC4: 'Reply on RC2', Marta Wlodarczyk Sielicka, 04 Sep 2025
Dear Reviewer,
We sincerely thank you for your comprehensive and detailed review of our manuscript. Your comments and suggestions will be constructive in further improving the paper. In response to your remarks, we would like to emphasize that in the revised version of the manuscript:
- We will provide a more detailed description of the method modification for LiDAR data. We will explain in detail how the SOM method, initially applied to bathymetric data, was adapted for processing elevation data from LiDAR. We will include the technical aspects of this adaptation and show the connections between waterborne and terrestrial data.
- We will expand the description of the SOM method. The manuscript will be supplemented with a more detailed explanation of SOM functioning, parameter configuration (topology, neighborhood size, number of iterations, activation functions), and the criteria for their selection. We will also add appropriate references, including the author’s previous publications related to hydrographic data reduction.
- We will add a comparison with standard methods. We will include results compared with more commonly used dimensionality reduction techniques. A literature review comparing the effectiveness of these approaches will also be introduced.
- We will extend the analysis to include linear interpolation. We will compare the results of our data reduction method combined with EBK and linear interpolation to highlight similarities and differences in the quality of the results.
- We will describe the IHO requirements in more detail. We will clarify how the obtained results meet the criteria of the International Hydrographic Organization (IHO), adding references to official documents (e.g., IHO C-17, S-44).
- We will provide additional information on data resolution. We will explain why a specific resolution was chosen, verify the precision of the instrumental data, and justify why such resolution is necessary for detailed coastal zone mapping.
- We will link the results with practical applications. The discussion will be expanded to indicate which reduction models are most suitable for various application scenarios (hydrography, marine engineering, environmental monitoring, port management).
- We will refer to other research areas. The analysis will be extended to address issues of scalability and transferability of the method to other water bodies. We will provide examples of potential applications in different coastal environments.
- We will add a section on the limitations of the method. Both technical limitations (choice of SOM parameters, risk of detail loss during reduction) and practical limitations (scalability, hardware requirements) will be included.
- We will address the technical remarks. We will revise the abstract and introduction, remove redundancies, correct citations, and improve the graphical presentation (larger font sizes in figures, removal of unnecessary elements such as Fig. 5). We will also add missing literature references and provide proper acknowledgments for external graphical materials used.
Once again, we thank you for your insightful comments. All suggestions will be incorporated and will undoubtedly contribute to improving the quality and scientific value of the manuscript.
With kind regards,
Marta Wlodarczyk-Sielicka
Citation: https://doi.org/10.5194/egusphere-2025-904-AC4
- There are some clarifications needed in the proposed methodology,
Status: closed
-
CC1: 'Comment on egusphere-2025-904', Ting Zhang, 24 Jul 2025
The paper, titled "Application of artificial intelligence methods during the processing of spatial data from the hydrographic systems for coastal zone", primarily focuses on combining spatial data from LiDAR and MBES systems, utilizing a Self-Organizing Map (SOM) neural network for data dimensionality reduction, and generating terrain and bathymetric models through the Empirical Bayes Kriging (EBK) interpolation method. The research aims to optimize big data processing efficiency while maintaining data quality, in order to meet the accuracy requirements of the International Hydrographic Organization (IHO) for shallow water area measurements. The increase in data volume primarily stems from the demand for high-precision data, hence, data simplification while maintaining data accuracy is highly meaningful. However, From the results shown in this work, with the reduction of data, the precision is reduced. This is similar with the traditional methods, such as Kriging, Reature-Preserving methods, etc. What are the advantages of the method proposed in this work? The example area is very small, which is 100m*150m. Why do you choose such a small area? Is it feasible to handle larger area? I don't think there are much work with innovations in this work.
Citation: https://doi.org/10.5194/egusphere-2025-904-CC1 -
AC1: 'Reply on CC1', Marta Wlodarczyk Sielicka, 28 Jul 2025
Thank you for taking the time to review my article entitled "Application of artificial intelligence methods during the processing of spatial data from the hydrographic systems for coastal zone". I sincerely appreciate your comments and suggestions. Below, I provide responses to the key points raised:
- Advantages of the proposed method:
The main advantage of the proposed method is the significant reduction in data processing time while maintaining a high quality of results. By applying a Self-Organizing Map (SOM) neural network for data reduction, it was possible to decrease the number of spatial data points from over 2,168,011 measurements to significantly smaller subsets. This reduction enabled Empirical Bayesian Kriging (EBK) interpolation to be performed in under 5 minutes, compared to over 9 hours for the raw dataset. Importantly, the resulting models still meet the International Hydrographic Organization (IHO) accuracy requirements for shallow water areas, confirming the practical usefulness of the approach.
Additionally, the proposed method offers several other advantages:
- Integration of multi-source data: The method is capable of processing both LiDAR (positive elevation) and MBES (negative depth) data simultaneously. This integrated approach eliminates the need to separate terrestrial and underwater datasets, streamlining workflows and enabling the creation of more comprehensive models of coastal zones.
- Preservation of data quality despite reduction: Even highly reduced datasets (e.g., Data A and B) preserved terrain and bathymetric features within acceptable accuracy thresholds. This confirms that the method effectively balances simplification with spatial precision.
- Flexible resolution control: The SOM-based approach allows adjustable reduction levels (e.g., 0.1 m, 0.2 m, 0.5 m, 1.0 m), which makes it adaptable to different application needs — from high-resolution local surveys to generalized regional analyses.
- Computational efficiency: The method drastically reduces dataset size (by up to 99%), lowering storage requirements and accelerating processing. This is particularly important for big data scenarios in hydrography, where datasets often exceed millions of points.
- Unsupervised learning and reduced subjectivity: Since SOM is an unsupervised algorithm, it does not require manual point selection or feature engineering, unlike traditional decimation or grid-based simplification techniques. This minimizes human error and increases repeatability.
- Potential for real-time or near-real-time use: With proper hardware and optimization, the method could be applied in operational contexts — such as real-time monitoring in ports or during field surveys — offering faster turnaround for data analysis and decision-making.
These benefits collectively highlight the novelty and practical value of the approach, particularly in the context of spatial big data processing in hydrographic and coastal zone applications.
- Size of the test area:
The selected test area (100 m × 150 m) was deliberately chosen because, despite its small physical dimensions, it contained a very large number of measurement points – over 2 million. Attempts to apply the method to a larger area at the exact resolution failed due to hardware limitations – the computer was not able to efficiently process such a large dataset within a reasonable time frame.
- Applicability to larger areas:
The proposed method is fully scalable and can be applied to larger areas. However, doing so would require more advanced hardware resources and support from a qualified IT specialist – for instance, to optimise code, implement parallel processing, or use cloud computing. I plan to explore these directions in future research to allow for efficient large-scale spatial data processing.
I hope these clarifications help to highlight the strengths of the proposed approach and justify the methodological choices made in this study.
Sincerely,
Marta Wlodarczyk-Sielicka
Citation: https://doi.org/10.5194/egusphere-2025-904-AC1
CC2: 'Reply on AC1', Ting Zhang, 28 Jul 2025
Thank you for the author's response, but it is still not persuasive. First, existing tools such as ArcGIS can easily resample terrain data to different accuracies without such complex methods, so what specific advantages does the method proposed in this article offer over these commonly used approaches? Regarding the claimed reduction in processing time: when the data resolution is lowered from 0.1 m to 1 m or even 10 m, the processing time and the required storage space also drop greatly. In my previous research I compared the impact of DEMs with resolutions of 2 m, 5 m, and 10 m on flood simulation results; as the resolution decreases, the simulation results differ slightly but still meet the simulation requirements, while the data volume and computation time fall sharply, which is basically consistent with the results of this study. However, I used only a simple resampling method to generate the DEMs at different resolutions. As the number of sampling points decreases, the accuracy in this article is clearly affected, making it difficult to evaluate the advantages of your method. In addition, regarding the choice of only a 100 m × 150 m study area and the feasibility of large-scale research, I do not find the author's answer satisfactory. Research based on terrain data generally takes watersheds and regions as its scale, so what are the application scenarios for a 100 m × 150 m area? If the work in this article is limited by computing power and time, under what circumstances would it not be?
Citation: https://doi.org/10.5194/egusphere-2025-904-CC2
AC2: 'Reply on CC2', Marta Wlodarczyk Sielicka, 05 Aug 2025
Thank you very much for your renewed and thorough assessment of the manuscript. Your remarks have helped me to clarify the value of the proposed approach. Below, I respond to each point you raised.
Although ArcGIS (or other GIS packages) can perform classical raster resampling, hydrographic work requires us to preserve the shallowest soundings (shoals) and the highest quay-wall points strictly within the IHO S-44 total vertical uncertainty (± 0.25 m for Order 1a surveys). Standard gridding or bilinear/bicubic smoothing averages out extremes—acceptable in flood-modelling contexts, yet potentially dangerous in precision nautical charting because critical minima may be lost, compromising navigational safety.
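For context, S-44 expresses the maximum allowable total vertical uncertainty (TVU) as a depth-dependent envelope, TVU(d) = sqrt(a^2 + (b·d)^2). A minimal sketch of such a tolerance check follows; the coefficient values are illustrative defaults, as the actual a and b depend on the survey order defined in IHO S-44:

```python
import math

def tvu(depth, a=0.5, b=0.013):
    # Maximum allowable total vertical uncertainty at a given depth,
    # TVU(d) = sqrt(a^2 + (b*d)^2).  The coefficients a and b depend on
    # the survey order in IHO S-44; the defaults here are illustrative.
    return math.sqrt(a * a + (b * depth) ** 2)

def within_tolerance(measured, reference, depth, a=0.5, b=0.013):
    # True if the deviation of a sounding from its reference value stays
    # inside the TVU envelope for that depth.
    return abs(measured - reference) <= tvu(depth, a, b)
```

With these example coefficients, a 10 m sounding may deviate by at most about 0.52 m from its reference value.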
The procedure I propose decimates the point cloud adaptively: in each cluster, it explicitly retains the shallowest (or highest) point before interpolation. Consequently, even after a 76 % reduction in data volume, the minimum depth remains identical to that in the raw set, and the entire product stays within IHO tolerances.
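The retention rule can be sketched as follows. This is a simplified illustration rather than the actual implementation: hypothetical square grid cells stand in for the SOM clusters, and because MBES depths are negative and LiDAR elevations positive, keeping the maximum z per cluster preserves both the shallowest sounding and the highest land point:

```python
from collections import defaultdict

def reduce_cloud(points, cell=0.5):
    # Decimate a point cloud while preserving critical extremes.
    # points: iterable of (x, y, z) with z < 0 for MBES depths and
    # z > 0 for LiDAR elevations.  Each point is assigned to a cluster
    # (here a grid cell of size `cell`, standing in for a SOM neuron).
    clusters = defaultdict(list)
    for x, y, z in points:
        key = (int(x // cell), int(y // cell))
        clusters[key].append((x, y, z))
    # Retain the point with maximum z in each cluster: for negative
    # depths this is the shallowest sounding, for positive elevations
    # the highest point.
    return [max(pts, key=lambda p: p[2]) for pts in clusters.values()]

cloud = [(0.1, 0.1, -5.2), (0.2, 0.3, -4.8), (0.9, 0.1, -6.0),
         (1.1, 0.2, 2.0), (1.3, 0.4, 2.4)]
reduced = reduce_cloud(cloud, cell=1.0)
# the shallowest depth (-4.8) and the highest elevation (2.4) survive
```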
Hydrographic data differ fundamentally from typical flood-plain DEMs:
- Sensor and density: hydrography combines a multibeam echosounder (MBES) below the waterline with airborne LiDAR over land, yielding up to several hundred points per square metre; flood DEMs rely on far sparser topographic LiDAR or photogrammetry.
- Analytical focus: flood models concentrate on average ground elevation, whereas nautical charting hinges on extreme points that determine the safe depth of the fairway.
- Consequences: losing a single shoal in a hydrographic dataset has an immediate safety impact; in flood simulations, minor deviations seldom alter inundation extent by more than a few centimetres.
By consciously preserving extremes, my method therefore meets hydrographic requirements far better than generic resampling.
Traditional workflows shorten computation mainly by coarse gridding (e.g., 1 m or 5 m). I decouple point-count reduction from grid resolution: after the SOM step, I still interpolate on a 0.1 m grid. In the raw dataset, processing a 0.1 m grid required 2,168,011 points; Empirical Bayesian Kriging took 9 hours 12 minutes, and the final model met IHO standards. When those same data were resampled to a 1 m grid, the point count dropped to about 15,000 and the runtime to 7 minutes, but the shallowest point vanished, exceeding IHO limits. Using the proposed Self-Organising Map while retaining 0.1 m resolution reduced the cloud to 510,531 points; interpolation then took only 4 minutes 51 seconds, and the model still satisfied all IHO criteria, preserving the key bathymetric minima. Thus, I achieve more than a hundred-fold speed-up over the raw data without sacrificing resolution or navigational safety.
Areas as small as the 100 m × 150 m test patch are typical for
- monitoring quay-wall deformation,
- dredging acceptance surveys, and
- point-target inspections in port zones or cable landfalls.
Although this is only 1.5 ha, dense MBES acquisition produces more than two million soundings—an ideal big-data testbed. For larger water bodies, the workflow is fully scalable: I process overlapping tiles (e.g., 500 m × 500 m) without loss of continuity.
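A minimal sketch of such a tiling step follows; the tile size and overlap values are illustrative assumptions:

```python
def make_tiles(xmin, ymin, xmax, ymax, size=500.0, overlap=50.0):
    # Generate overlapping tile bounds covering the area of interest.
    # Tiles advance by (size - overlap) so that neighbouring tiles share
    # an overlap strip, which avoids edge artefacts when the per-tile
    # models are mosaicked back together.
    step = size - overlap
    tiles = []
    y = ymin
    while y < ymax:
        x = xmin
        while x < xmax:
            tiles.append((x, y, min(x + size, xmax), min(y + size, ymax)))
            x += step
        y += step
    return tiles

# a 1200 m x 900 m survey area split into overlapping 500 m tiles
bounds = make_tiles(0, 0, 1200, 900)
```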
I developed the solution myself in MATLAB/Python and possess the complete source code, which I can freely modify, optimise for GPUs, or deploy in the cloud. ArcGIS is a commercial environment with high licence costs and limited algorithm transparency—the user does not influence the internal resampling procedures. With my implementation, I can
- fine-tune SOM parameters to specific bathymetries,
- release the script as an open-source toolbox for the hydrographic community, and
- avoid additional software-licence expenses, which is essential for smaller research units.
The proposed method fuses LiDAR and MBES in a single pass, adaptively reduces the point cloud, preserves shoals, and fulfils IHO rules while markedly shortening computation time. It also eliminates licence costs and gives complete control over the code, making it a valuable alternative to classical resampling in commercial GIS packages.
I trust that these explanations convincingly present the advantages of the proposed approach and address all concerns raised. I will, of course, be happy to accommodate any further suggestions.
Sincerely,
MWS
Citation: https://doi.org/10.5194/egusphere-2025-904-AC2
AC1: 'Reply on CC1', Marta Wlodarczyk Sielicka, 28 Jul 2025
RC1: 'Comment on egusphere-2025-904', Anonymous Referee #1, 30 Aug 2025
This study proposes an integrated method to generate elevation data from LiDAR and MBES data with the aid of self-organizing map technology. Overall, I could follow the paper, but I did not understand whether the proposed method helps improve data quality. Looking at Figure 5, the mean values of the data increase significantly through the process. In addition, from Figure 6 to Figure 10, the resolution of the data decreases, and no validity check was demonstrated. I am not sure whether the study adds a significant contribution.
Minor comments
P2 L 38 Lidar Detectoin and Ranging -> Light Detection and Ranging
P 2 L 51 uncrewed aerial vehicle -> unmanned aerial vehicle
P 19 The Conclusion section is a bit lengthy.
Citation: https://doi.org/10.5194/egusphere-2025-904-RC1
AC3: 'Reply on RC1', Marta Wlodarczyk Sielicka, 01 Sep 2025
We sincerely thank the Reviewer for the careful reading of our manuscript and for the constructive comments. Below, we provide a detailed response to the observations.
Our main goal was not only to improve the quality of the data in terms of accuracy, but also to increase the efficiency of processing very large datasets through the use of Self-Organizing Maps (SOM). As shown in the Conclusions and Discussion section (Section 5), the reduced datasets still meet the requirements of the International Hydrographic Organization (IHO) for shallow waters, and thus retain acceptable quality despite the reduction.
The increase in mean values (Fig. 5) is a natural consequence of the disproportionate number of MBES points (negative values) compared to LiDAR points (positive values) in the raw dataset. The reduction process removes redundant underwater points, which shifts the mean toward higher values.
Regarding resolution—we agree that with more substantial reduction the terrain models become more smoothed (Figs. 6–10). However, the surface difference analysis (Table 2, Fig. 11) shows that at lower levels of reduction (datasets A and B), deviations from the raw data remain small and within IHO requirements. In the future, we will add a more explicit validation element to the manuscript in order to emphasize that the method maintains compliance with international hydrographic standards.
We stress that the main contribution of the article is the integration of LiDAR and MBES data with AI-based reduction, which enabled: a significant reduction of computation time (from over 9 hours to less than 5 minutes with EBK interpolation), preservation of quality sufficient for hydrographic applications, and demonstration of a scalable approach to working with spatial “big data.”
We thank the Reviewer for pointing out the need to clarify issues related to data quality, changes in mean values, and model resolution. The revised version of the manuscript will highlight the balance between computational efficiency and quality preservation and will contain clear references to meeting IHO requirements. We also thank the Reviewer for the minor remarks.
Citation: https://doi.org/10.5194/egusphere-2025-904-AC3
-
RC2: 'Comment on egusphere-2025-904', Anonymous Referee #2, 03 Sep 2025
This manuscript presents an original data reduction method based on Self-Organizing Maps (SOM) neural networks, providing an effective processing tool for spatial data in coastal zones. The authors tested their method in the processing of topography and bathymetry data from LiDAR and multi-beam echo sounder (MBES), and compared their results by interpolating fields with different reductions using the Empirical Bayesian Kriging (EBK) interpolation method. According to the authors, their method provides comparable quality results with only a fraction of computational time, highlighting the efficiency of their machine learning approach.
The manuscript requires considerable work. It contains redundant statements throughout, parts of the text make claims without references, and the authors need to clarify their methods and conclusions. One of this reviewer's primary concerns is the lack of comparison with other dimensionality-reduction methods and of a connection between the method and broader applications. Below, I have listed comments in the hope that they may help improve the manuscript's quality.
Specific Comments
- There are some clarifications needed in the proposed methodology:
- In multiple places within the manuscript, the authors state that their newly developed approach “was initially developed for analysing bathymetric data but was optimised within this study for processing waterborne data, allowing effective application to both LiDAR and MBES measurement points.” This reviewer might be confused, but could the authors provide clarification on how the extension of processing waterborne data is linked to surface LiDAR information?
- It is not clear to this reviewer what the output of the SOM is. Is it a lower-dimensional set of datapoints with only the relevant information? Does it depend on the user? What is the best approach to select the number of output variables?
- How did the authors arrive at the given configuration of the hyperparameters (topology, neighborhood size, and number of iterations, among others)? Did they explore a set of plausible configurations and use the most effective one? How can future users be certain that this configuration will provide the best estimates in their cases?
- Also, does this method use activation functions for the neural network nodes? If so, what activation functions were used?
- The results show a decrease in the resolution of the interpolated fields, which makes me wonder. Have the authors considered comparing their reduced dimensionality approach to other methods, such as Principal Component Analysis (PCA), Autoencoders, or other simpler resampling techniques? How well does their method perform against these other, more standard practices?
- As the raw data are oversampled, have the authors considered using linear interpolation between the points instead of EBK? Are the results similar? What about the efficacy of the interpolation?
- With respect to the applicability of this method:
- The authors mention that their interpolated fields meet the requirements of the International Hydrographic Organization (IHO). However, these are not explained in the text. Additional clarification should be added.
- Why are the authors aiming to create surfaces with a resolution of less than or equal to 0.1 meters? Did they verify that the instrumental data have such precision? Also, is this high resolution necessary for technical purposes? This reviewer suggests giving additional clarifications.
- The authors show multiple interpolated fields and their differences with respect to the raw data interpolation. However, there is little discussion about which model is best for their specific objective. This reviewer recommends linking the results with potential applications in the field.
- The area that the authors used for their analysis is relatively small. Have the authors considered how their approach will scale with larger regions? Or how transferable their method is to other places? Additionally, can users benefit from their trained model in other coastal settings?
- I encourage the authors to include a section on the potential limitations of their approach and discuss its applicability to other data sources.
Technical Corrections
Besides the comments described above, I have a few technical recommendations for the manuscript.
- The sentence between lines 15 and 16 in the abstract sounds redundant. Consider rephrasing it.
- I suggest a rewrite of the introduction section. It lacks a cohesive story. Currently, it jumps from artificial intelligence to coastal resources to monitoring instrumentation without delving in-depth into any of them, making the paragraphs difficult to follow. Below are some general suggestions that I have found while reading this section.
- I recommend adding a citation that summarizes what is being said in lines 26 to 27.
- Sentences in lines 29 to 31 seem out of place and redundant; consider rewriting them.
- A citation is needed for the sentences between lines 34 and 36.
- Why is measuring these variables important? I suggest including some references that show the importance in terms of policymaking, the mitigation of climate change, and the biological processes occurring in these zones. These are just some ideas that could be relevant while discussing the importance of having precise topographic, bathymetric, and environmental information.
- A citation is needed for the sentence in lines 45 to 47. What studies have shown that the development of remote sensing techniques has enabled precise and efficient research in coastal systems?
- The sentences after that (lines 47 to 53) jump from remote sensing to deep learning to other coastal mapping methods, which makes following the idea difficult. Consider rewriting.
- There should be an in-line citation in the sentence starting in line 67. “Specht and Wiśniewska (year).”
- A citation is needed in line 183, where the authors state that “Previous use of the method focuses solely on depth-related data.” Who has used this method previously, and what was their approach?
- A citation is also needed in the sentence “The method employs Self-Organizing Maps (SOM) developed by Teuvo Kohonen” (Lines 185-186).
- This reviewer found Figure 2 on another website with attribution to LatentView Analytics (https://www.latentview.com/blog/self-organizing-maps/). The authors should give proper acknowledgment when using figures that come from other sources.
- The sentences in lines 235-238 are very redundant and not relevant to the current study. Consider rewriting or removing them.
- The presentation of Table 1 and Figures 4 and 5 is redundant. This reviewer recommends removing Figure 5, as it is not adding to the interpretation of the results.
- Labels in Figures 6 through 11 are difficult to read. Consider increasing the font size.
Citation: https://doi.org/10.5194/egusphere-2025-904-RC2
AC4: 'Reply on RC2', Marta Wlodarczyk Sielicka, 04 Sep 2025
Dear Reviewer,
We sincerely thank you for your comprehensive and detailed review of our manuscript. Your comments and suggestions will be constructive in further improving the paper. In response to your remarks, we would like to emphasize that in the revised version of the manuscript:
- We will provide a more detailed description of the method modification for LiDAR data. We will explain in detail how the SOM method, initially applied to bathymetric data, was adapted for processing elevation data from LiDAR. We will include the technical aspects of this adaptation and show the connections between waterborne and terrestrial data.
- We will expand the description of the SOM method. The manuscript will be supplemented with a more detailed explanation of SOM functioning, parameter configuration (topology, neighborhood size, number of iterations, activation functions), and the criteria for their selection. We will also add appropriate references, including the author’s previous publications related to hydrographic data reduction.
- We will add a comparison with standard methods. We will include results compared with more commonly used dimensionality reduction techniques. A literature review comparing the effectiveness of these approaches will also be introduced.
- We will extend the analysis to include linear interpolation. We will compare the results of our data reduction method combined with EBK and linear interpolation to highlight similarities and differences in the quality of the results.
- We will describe the IHO requirements in more detail. We will clarify how the obtained results meet the criteria of the International Hydrographic Organization (IHO), adding references to official documents (e.g., IHO C-17, S-44).
- We will provide additional information on data resolution. We will explain why a specific resolution was chosen, verify the precision of the instrumental data, and justify why such resolution is necessary for detailed coastal zone mapping.
- We will link the results with practical applications. The discussion will be expanded to indicate which reduction models are most suitable for various application scenarios (hydrography, marine engineering, environmental monitoring, port management).
- We will refer to other research areas. The analysis will be extended to address issues of scalability and transferability of the method to other water bodies. We will provide examples of potential applications in different coastal environments.
- We will add a section on the limitations of the method. Both technical limitations (choice of SOM parameters, risk of detail loss during reduction) and practical limitations (scalability, hardware requirements) will be included.
- We will address the technical remarks. We will revise the abstract and introduction, remove redundancies, correct citations, and improve the graphical presentation (larger font sizes in figures, removal of unnecessary elements such as Fig. 5). We will also add missing literature references and provide proper acknowledgments for external graphical materials used.
Once again, we thank you for your insightful comments. All suggestions will be incorporated and will undoubtedly contribute to improving the quality and scientific value of the manuscript.
With kind regards,
Marta Wlodarczyk-Sielicka
Citation: https://doi.org/10.5194/egusphere-2025-904-AC4