Status: this preprint is open for discussion and under review for Geoscientific Model Development (GMD).
Evaluation of preCICE (version 3.3.0) in an Earth System Model Regridding Benchmark
Alex Hocks and Benjamin Uekermann
Abstract. In Earth System Modeling (ESM), meshes of different models usually do not match, requiring data mapping algorithms implemented in coupling software. Valcke et al. (2022) recently introduced a benchmark to evaluate such algorithms and compared implementations in four specialized ESM couplers. In this paper, we assess preCICE, a general-purpose coupling library not limited to ESM, using this benchmark and compare our results to the original study. The generality of preCICE with its larger community offers potential benefits to ESM applications, but the software naturally lacks ESM-specific solutions. We describe necessary pre- and postprocessing steps to make the benchmark tangible for preCICE. Overall, preCICE achieves comparable results; using its radial basis function mapping yields significantly lower errors.
Received: 13 Nov 2025 – Discussion started: 12 Jan 2026
Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.
Summary
The manuscript evaluates the coupling library preCICE (v3.3.0) using the Earth system model (ESM) regridding benchmark of Valcke et al. (2022). The authors adapt this benchmark, originally developed for domain-specific ESM coupling software, to preCICE by introducing additional preprocessing and postprocessing steps. Three remapping methods available in preCICE - nearest neighbour, linear interpolation, and a radial basis function (RBF)-based approach - are assessed and compared against published benchmark results. The study shows that the tested preCICE remapping methods can achieve interpolation accuracy comparable to the tools included in the original benchmark for this specific offline regridding task.
Scientific significance for the Earth system modelling community
The manuscript is of interest to the ESM community as it explores whether a general-purpose, multi-physics coupling framework can reproduce results from an established ESM regridding benchmark. It demonstrates that a subset of remapping methods commonly required in ESM workflows is available in preCICE and can yield competitive accuracy. However, the scientific significance is limited by the narrow scope of the evaluation: only offline remapping accuracy for a subset of interpolation methods required by ESM setups is assessed, while other core ESM coupling functionalities and critical performance and scalability characteristics are not examined.
Recommendation for publication
The manuscript presents a technically sound and transparent benchmark study that fits the scope of Geoscientific Model Development. However, the conclusions substantially overgeneralize the demonstrated results. I recommend publication only after major revisions.
General comments
The work presented in this manuscript primarily evaluates the quality of remapping methods available in preCICE in the context of ESM. Several conclusions drawn from this evaluation appear overly general and, in places, exaggerated with respect to preCICE’s overall coupling capabilities for ESM setups. A more critical and focused discussion would be more appropriate. Alternatively, the scope of the paper could be broadened by including an evaluation of additional coupling functionality that is handled differently in preCICE compared to domain-specific ESM couplers.
A significant omission is the lack of performance and scalability analysis. In particular, runtime and scaling data - including measurements of so-called “ping-pong exchanges” - would considerably strengthen the evaluation. For domain-specific coupling software, “ping-pong exchanges” were already assessed in another referenced benchmark [4], which explains their absence in Valcke et al. (2022). For preCICE, especially for the RBF-based mapping, such measurements would be highly relevant given its potential computational cost.
Section 1 (Introduction)
The introduction suggests that preCICE represents a general-purpose coupling solution that could serve as a coupling library for the ESM community if extended by ESM-specific optimizations. This appears to be an oversimplification and does not sufficiently acknowledge the broader range of functionality provided by domain-specific ESM coupling software beyond accuracy and efficiency considerations.
In addition to the benchmark by Valcke et al. (2022), the MIRA protocol [5] constitutes another relevant benchmark for regridding quality and could be mentioned for completeness.
Section 2.2.1 (Library)
Within the ESM community, a clear distinction is made between online and offline generation of remapping weights. It would therefore be useful to clarify which of these approaches preCICE supports in practice and how they would be used in realistic ESM configurations. In the case of online generation, performance and scaling characteristics are of particular interest.
In typical ESM setups, remapping weights are either read from file or computed online during model initialization and remain fixed throughout the simulation, enabling the exchange to be implemented as a distributed sparse matrix–vector multiplication. Based on Chourdakis et al. (2022), this does not appear to be the case for the RBF implementation in preCICE. This is important information and should be stated explicitly. Furthermore, the performance results presented there suggest that the RBF approach may be prohibitively expensive for high-resolution ESM simulations and should be discussed accordingly.
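To make the point about fixed weights concrete, the following is a minimal serial sketch in pure Python. The triplet layout and the function names `build_weight_rows` and `apply_weights` are hypothetical and do not correspond to the API of preCICE or of any coupler named here; a real implementation would distribute the rows across processes.

```python
from collections import defaultdict

def build_weight_rows(triplets):
    """Group (target_index, source_index, weight) triplets by target index."""
    rows = defaultdict(list)
    for tgt, src, w in triplets:
        rows[tgt].append((src, w))
    return dict(rows)

def apply_weights(rows, source_field, n_target):
    """Compute target = W @ source using only the stored non-zero weights."""
    target = [0.0] * n_target
    for tgt, entries in rows.items():
        target[tgt] = sum(w * source_field[src] for src, w in entries)
    return target

# Weights are computed (or read from file) once during initialization ...
triplets = [(0, 0, 1.0), (1, 0, 0.5), (1, 1, 0.5), (2, 1, 0.25), (2, 2, 0.75)]
rows = build_weight_rows(triplets)

# ... and reused unchanged at every coupling step.
step1 = apply_weights(rows, [1.0, 3.0, 5.0], n_target=3)
step2 = apply_weights(rows, [2.0, 2.0, 2.0], n_target=3)
```

In this layout the per-step cost depends only on the number of stored weights, which is why the question of whether the RBF mapping can be reduced to such a fixed operator matters for performance.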
Section 4 (Results)
The comparison of global conservation properties for interpolation methods that are not designed to be conservative is of limited value. For this reason, the original benchmark did not include such measures for these methods. Metrics such as mean and maximum misfit would be more appropriate in this context.
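As an illustration of the suggested metrics, here is a small sketch that assumes the misfit is the pointwise relative error between the mapped field and the analytical reference on the target mesh. This definition is an assumption made for the example; the exact definition should be taken from the benchmark. The function name `misfit_metrics` is hypothetical.

```python
import math

def misfit_metrics(mapped, reference):
    """Return mean, maximum and RMS of the pointwise relative misfit.

    Assumes every reference value is non-zero.
    """
    misfits = [abs(m - r) / abs(r) for m, r in zip(mapped, reference)]
    n = len(misfits)
    mean = sum(misfits) / n
    maximum = max(misfits)
    rms = math.sqrt(sum(e * e for e in misfits) / n)
    return mean, maximum, rms

# Two of four target points deviate by 10 % from the reference.
mean, maximum, rms = misfit_metrics([1.0, 2.2, 3.0, 3.6], [1.0, 2.0, 3.0, 4.0])
```

Reporting the mean and the maximum side by side separates a uniformly small error from a few localized outliers.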
Specific comments
Line 3–4: "a general-purpose coupling library not limited to ESM" The term “general-purpose” already implies not being limited to ESM. Additionally, describing other couplers as “limited” appears unnecessarily negative and suggests functional equivalence that does not currently exist.
Line 7: “yields significantly lower error” Lower than which reference or method?
Line 10: "SEA" The abbreviation “SEA” seems uncommon, at least to me; “OCE” or “OCN” are more typical for ocean components.
Line 12: Consider replacing “dedicated” with “domain-specific”.
Line 13–15: ordering of software Is there any reason for this ordering? Lexicographic ordering would appear more neutral.
Line 16–17: “already used in ESM”, "limited" Claims regarding existing use in ESM and limitations of other couplers overstate the general applicability of preCICE beyond the specific setups considered (Abele et al. 2025).
Line 20: “flexibility of the coupling approach” This is vague and should be specified more precisely.
Line 23–25: Listing these as potential benefits implies that existing ESM coupling software lacks these properties, which is debatable.
Line 25: “advanced numerical [..] methods” Implicit coupling support is largely irrelevant for current ESM runs, as most component models do not support implicit coupling by design.
Line 25: “advanced [..] HPC methods” Large-scale coupled ESM simulations using domain-specific couplers have already demonstrated extreme scalability (e.g. [6], [7]) on some of the largest HPC systems in the world. Comparable demonstrations for preCICE are currently lacking.
Line 26: "the absence of ESM-specific optimizations in terms of accuracy and efficiency" The distinction between generic and domain-specific couplers is not limited to accuracy and efficiency; additional functionality is often crucial.
Line 34: "VTK" Add a reference or footnote for VTK. Since the VTK format is listed as an important dependency, it could be of interest to the reader to justify the choice of this file format in Section 2.4.
Line 46–47: "constant data should remain constant" In addition to conservation, monotonicity (no new extrema) and smoothness (continuous differentiability) are important remapping properties. Depending on the physical property to be exchanged and the grid configuration, you may want your remapping to have one or the other property. The first is often achieved with a "first order" method and the second with a "second order" method. Since remapping weights in ESM are usually not recomputed during a run, these two properties are mutually exclusive most of the time. The RBF implementation in preCICE seems to fulfil both properties; however, as mentioned above, it may have its own significant disadvantage.
Line 49: "In ESM,..." Consider starting a new paragraph?
Line 51: "To this end, meshes need to be enriched by connectivity." This statement would benefit from additional clarification. In the benchmark, a mesh is defined by vertices, edges connecting these vertices, and cells formed by the edges. While most coupling software implicitly assumes that all edges follow great circles, this assumption does not hold for certain grid types commonly used in ESM, such as regular Gaussian grids, where edges follow circles of constant latitude or longitude. Among the coupling solutions considered in the benchmark, only YAC explicitly distinguishes between these edge types, which can affect cell area computations and, consequently, remapping accuracy [2]. Furthermore, although intensive quantities are provided at cell centers in the benchmark, connectivity information is still required by some remapping algorithms used for intensive fields, and is essential for all conservative remapping of extensive quantities. Clarifying these aspects would help readers better understand why and how connectivity information is required in this context.
Line 51: reference for conservative remapping in ESM Add a standard reference for conservative remapping in ESM, i.e. Jones, 1999 [8].
Line 51: "mapping is considered conservative" There are also the notions of local and global conservation. For numerical weather prediction, local conservation ensures high accuracy locally. For long-running climate simulations, however, global conservation is very important. If grid combinations are chosen carefully, local conservation can automatically lead to global conservation as well.
Formula 4: For conservative interpolation, the ideal reference solution would be obtained by analytically integrating the test function over each cell, accounting for the exact cell geometry, and then normalizing by the cell area. In the benchmark, however, the source and reference target fields appear to be constructed by evaluating the analytical test function at the cell center and implicitly assuming this value to represent the cell-average. This approximation is likely a major contributor to the observed differences in global conservation between source and target fields, rather than the remapping procedure itself.
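The argument can be written out as follows. The symbols are introduced here for illustration only and are not taken from the manuscript: $C_i$ is a cell with area $A_i$ and centre $x_i^c$, and $f$ is the analytical test function.

```latex
\bar{f}_i = \frac{1}{A_i} \int_{C_i} f \,\mathrm{d}A
\qquad \text{(exact cell average)},
\qquad
\tilde{f}_i = f\!\left(x_i^{c}\right)
\qquad \text{(cell-centre sample)}.

\sum_i A_i \tilde{f}_i \;-\; \int_{\Omega} f \,\mathrm{d}A
\;=\; \sum_i A_i \left( \tilde{f}_i - \bar{f}_i \right)
```

The right-hand side is a quadrature error of the mesh and the test function alone. It is in general different on the source and target meshes, so a non-zero difference in the global integrals can arise before any remapping is applied.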
Line 110: "we use the WGS84 model" Most ESM components assume a spherical Earth; using WGS84 may not be appropriate here.
Line 115–116: "Cell areas are computed from the nodal-based form." Clarify whether planar or spherical cell areas were computed.
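The size of the difference between the two options can be illustrated with a single triangle on the unit sphere. The sketch below assumes that all edges are great-circle arcs and uses the Van Oosterom-Strackee formula for the spherical area; the function names are hypothetical, and this is not the computation used in the manuscript.

```python
import math

def cross(u, v):
    """Cross product of two 3D vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Dot product of two 3D vectors."""
    return sum(x * y for x, y in zip(u, v))

def spherical_triangle_area(a, b, c):
    """Area on the unit sphere of the triangle with great-circle edges.

    a, b, c must be unit vectors (Van Oosterom-Strackee solid-angle formula).
    """
    numerator = abs(dot(a, cross(b, c)))
    denominator = 1.0 + dot(a, b) + dot(b, c) + dot(c, a)
    return 2.0 * math.atan2(numerator, denominator)

def planar_triangle_area(a, b, c):
    """Area of the flat triangle spanned by the same three vertices."""
    ab = tuple(q - p for p, q in zip(a, b))
    ac = tuple(q - p for p, q in zip(a, c))
    n = cross(ab, ac)
    return 0.5 * math.sqrt(dot(n, n))

# One octant of the unit sphere: exact spherical area is pi/2.
a, b, c = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
spherical = spherical_triangle_area(a, b, c)
planar = planar_triangle_area(a, b, c)
```

For this deliberately coarse cell the planar area is roughly 55 % of the spherical one. The gap shrinks with resolution, but it enters every area-weighted metric.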
Line 122: The sse7 grid is a reduced Gaussian grid composed exclusively of quadrilateral cells. Owing to the grid’s structure, cell connectivity cannot be straightforwardly inferred. Moreover, as most coupling frameworks do not explicitly support edges following circles of constant latitude, a purely quadrilateral representation would lead to unintended gaps in the mesh. To avoid this issue, three additional vertices were introduced into the grid definition.
Line 127-128: “To use preCICE for coupling, we need to convert these into 3D Cartesian representations using x/y/z coordinates.” Remark: This is also commonly done internally by domain-specific implementations (using a unit sphere).
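For reference, the conversion onto the unit sphere is short. This sketch assumes geocentric longitude and latitude in degrees and a perfectly spherical Earth; the function name `lonlat_to_unit_sphere` is hypothetical.

```python
import math

def lonlat_to_unit_sphere(lon_deg, lat_deg):
    """Map longitude/latitude in degrees to x/y/z on the unit sphere."""
    lon = math.radians(lon_deg)
    lat = math.radians(lat_deg)
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))

equator_prime = lonlat_to_unit_sphere(0.0, 0.0)    # on the x axis
equator_east = lonlat_to_unit_sphere(90.0, 0.0)    # on the y axis
north_pole = lonlat_to_unit_sphere(0.0, 90.0)      # on the z axis
```

Multiplying the result by a radius gives coordinates on a sphere of that size, so the choice between the unit sphere and the Earth radius only rescales distances.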
Line 138: "The couplers used in the reference are SCRIP ..." SCRIP is a library for computing remapping weights and not a coupler (XIOS has so far been an I/O server library). How about: "The coupling software used in the reference are...."
Table 1: The naming of the columns is misleading, because it is not obvious that columns three and four refer to the method names in the benchmark.
Table 1 column 4: In the benchmark, these method names were used only in the appendix and primarily reflect the terminology employed by SCRIP for the respective categories. As such, they are potentially misleading, since different coupling software used different algorithms within the same category depending on availability. This is particularly relevant for the “higher-order” category, where the implemented methods differed substantially between tools.
Specifically, “distwgt-1” corresponds to a one-nearest-neighbour approach and was identical across all implementations. The label “distwgt” merely describes the weighting scheme that would be applied if more than one neighbour were used; for the one-nearest-neighbour case, no weighting is actually performed. The “bilinear” category effectively represents first-order interpolation in all cases; for YAC this was combined with inverse-distance weighting, although a different configuration was employed in a later benchmark, leading to significantly improved results (see Fig. 16 in [1]). Finally, the “bicubic” category encompassed fundamentally different methods: bicubic interpolation in SCRIP, patch recovery in ESMF, and HCSBB in YAC.
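The remark that no weighting takes place for a single neighbour can be seen in a small sketch of inverse-distance weighting. It uses Euclidean distances between Cartesian points as a simplifying assumption, whereas the tools discussed here may use great-circle distances; the function name `inverse_distance_weights` is hypothetical.

```python
import math

def inverse_distance_weights(target, sources, k):
    """Return [(source_index, weight)] for the k nearest sources of target."""
    dists = sorted((math.dist(target, s), i) for i, s in enumerate(sources))[:k]
    # A target that coincides with a source point takes that value directly.
    if dists[0][0] == 0.0:
        return [(dists[0][1], 1.0)]
    inv = [(i, 1.0 / d) for d, i in dists]
    total = sum(w for _, w in inv)
    # Normalize so that the weights sum to one.
    return [(i, w / total) for i, w in inv]

sources = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 3.0, 0.0)]
target = (0.25, 0.0, 0.0)

nn = inverse_distance_weights(target, sources, k=1)   # one nearest neighbour
idw = inverse_distance_weights(target, sources, k=2)  # two neighbours
```

With `k=1` the normalization reduces the single weight to exactly 1.0 whatever the distance, so "distwgt-1" is a plain nearest-neighbour copy.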
Line 161-162: “Moreover, similarly to Valcke et al. (2022), we need to transfer land-ocean masks from the SEA to the ATM meshes in a pre-processing step” Maybe: “Moreover, similar to Valcke et al. (2022), we have to generate a land-sea mask for the ATM mesh by computing the overlap between both meshes.”
Line 162-163: "To increase the usability of the benchmark..." Due to differences in cell shape representations and the numerical accuracy of the implemented methods, the various software packages in the benchmark may compute different grid overlaps. Consequently, the corresponding masks are generated separately for each software. The procedure for generating these masks was therefore explicitly described in the benchmark. In the specific case of the torc grid, a fixed mask is usually provided alongside the grid data.
Line 168: "write the results to VTK" Maybe: "write the results to a VTK-formatted file"?
Line 213: "parameter" What parameter? The support radius?
Line 214: "of the Earth radius" Remark: most coupling software actually works on the unit sphere (radius 1.0), which makes a couple of computations much easier.
Line 215-216: "Alternatively, in future studies, we could make use of an automatic parameter optimization, which is currently developed for preCICE." Remark: It may be worth considering whether the support radius could be determined adaptively based on a measure of the average distance between source points in the neighbourhood of a target point. Such an approach could yield more robust results for source grids with spatially varying resolution. A similar strategy is employed in the RBF implementation of YAC.
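One possible reading of this suggestion is sketched below. It is not the scheme used in YAC or planned for preCICE; the parameters `k` and `scale`, the function name, and the use of Euclidean distances are all assumptions made for illustration.

```python
import math
from itertools import combinations

def adaptive_support_radius(target, sources, k=4, scale=3.0):
    """Scale the mean pairwise distance of the k source points nearest to target.

    Requires k >= 2 and at least two source points.
    """
    neighbours = sorted(sources, key=lambda s: math.dist(target, s))[:k]
    pair_dists = [math.dist(p, q) for p, q in combinations(neighbours, 2)]
    return scale * sum(pair_dists) / len(pair_dists)

# Source points with fine spacing (0.1) on the left and coarse spacing (1.0) on the right.
sources = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.2, 0.0, 0.0),
           (5.0, 0.0, 0.0), (6.0, 0.0, 0.0), (7.0, 0.0, 0.0)]

r_fine = adaptive_support_radius((0.05, 0.0, 0.0), sources, k=2, scale=1.0)
r_coarse = adaptive_support_radius((5.5, 0.0, 0.0), sources, k=2, scale=1.0)
```

The radius follows the local source spacing, so a single global value no longer has to fit both the fine and the coarse region of the mesh.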
Section 3.7: In contrast to what is suggested in the manuscript, domain-specific software (SCRIP and YAC) is not able to handle the torc grid due to different error handling. The grid itself is valid and does not suffer from missing or invalid cells; its constructors were just “very creative” and did not cause headaches only for you. ;-) The torc grid is used by an ocean model, hence you can and have to ignore all land cells. Therefore, this grid comes with its specific land-sea mask. Structurally, the grid is curvilinear, with vertices stored in a two-dimensional array and connectivity defined implicitly. To locally increase resolution in regions of interest (e.g. the Mediterranean Sea), unused land cells were relocated, which results in degenerate neighbouring cells. Consequently, this grid can only be processed reliably if all land cells defined by the provided mask are removed prior to further analysis or remapping.
Line 241: "who tested the three couplers" Maybe: "who tested the three couplers for the non-conservative case:" (because actually there were four tools being tested)
Line 248-249: "We presume that this is due to the scaling of the surface areas with the water fraction in post-processing as explained in Section 3.6" Or due to the test fields being too smooth and uniform.
Figure 6: The benchmark demonstrated that, for the one-nearest-neighbour method, all regridding software produced nearly identical results in terms of mean, maximum, and RMS misfit. For this specific case, any substantial deviation would likely indicate an implementation issue. It is therefore unclear how the differences observed in global target conservation arise. (I still do not think that global target conservation is a valid measure for these methods).
Line 257-258: "For gulfstream, we obtain comparable results to the reference." This could also be an indication for "over-smoothing", which could be a problem for fields used in actual simulations. The benchmark contained an additional measurement, which evaluated this and might be interesting to see here.
Line 265: "very similarly than the misfit metric" Do you mean: "very similarly than the mean-misfit metric"? In the benchmark the comparison between mean and max misfit provided quite some interesting insights. For some grid configurations these also differed significantly between the implementations.
Line 277: "demonstrating that it can successfully handle ESM mapping problem" Claims regarding successful handling of ESM mapping problems are overstated given the absence of conservative remapping and performance evaluation.
Line 279: "An example of such transfer is the RBF mapping method" Actually, YAC already contains an RBF-based method (see [3]). However, to my knowledge, it has not yet found a user. The benchmark only evaluated methods that were available in most of the tested software. This is why other methods and configuration options were ignored.
Line 280-281: "it outperformed ESM-specific second-order methods by in average two orders of magnitude for smooth test function" Since the benchmark compared similar methods that were already successfully used in the ESM context, using simplified test functions did not seem to cause issues. However, with the introduction of the RBF method, this may have to be re-evaluated, because good performance for the test functions may not automatically carry over to actual data.
Line 285-286: "as it would require a carefully refined benchmark definition." Performance measurements are important, especially for Numerical Weather Prediction. Depending on the overall performance, it may also be of interest for climate simulations. It would be very interesting to see some basic runtime and scaling measurements, similar to the ones described in the original benchmark (see Section 8).
We tested the general coupling software preCICE for data mapping between atmosphere and ocean simulation meshes in Earth system modeling. In a recent benchmark, preCICE performed on par with specialized tools. Its general design and large user community make it broadly applicable across scientific domains, fostering knowledge transfer and collaboration beyond Earth system research.
Summary
The manuscript evaluates the coupling library preCICE (v3.3.0) using the Earth system model (ESM) regridding benchmark of Valcke et al. (2022). The authors adapt this benchmark, originally developed for domain-specific ESM coupling software, to preCICE by introducing additional preprocessing and postprocessing steps. Three remapping methods available in preCICE - nearest neighbour, linear interpolation, and a radial basis function (RBF)–based approach - are assessed and compared against published benchmark results. The study shows that the tested preCICE remapping methods can achieve interpolation accuracy comparable to the tools included in the original benchmark for this specific offline regridding task.
Scientific significance for the Earth system modelling community
The manuscript is of interest to the ESM community as it explores whether a general-purpose, multi-physics coupling framework can reproduce results from an established ESM regridding benchmark. It demonstrates that a subset of remapping methods commonly required in ESM workflows is available in preCICE and can yield competitive accuracy. However, the scientific significance is limited by the narrow scope of the evaluation: only offline remapping accuracy for a subset of interpolation methods required by ESM setups is assessed, while other core ESM coupling functionalities and critical performance and scalability characteristics are not examined.
Recommendation for publication
The manuscript presents a technically sound and transparent benchmark study that fits the scope of Geoscientific Model Development. However, the conclusions substantially overgeneralize the demonstrated results. I recommend publication only after major revisions.
General comments
The work presented in this manuscript primarily evaluates the quality of remapping methods available in preCICE in the context of ESM. Several conclusions drawn from this evaluation appear overly general and, in places, exaggerated with respect to preCICE’s overall coupling capabilities for ESM setups. A more critical and focused discussion would be more appropriate. Alternatively, the scope of the paper could be broadened by including an evaluation of additional coupling functionality that is handled differently in preCICE compared to domain-specific ESM couplers.
A significant omission is the lack of performance and scalability analysis. In particular, runtime and scaling data - including measurements of so-called “ping-pong exchanges” - would considerably strengthen the evaluation. For domain-specific coupling software, “ping-pong exchanges” were already assessed in another referenced benchmark [4], which explains their absence in Valcke et al. (2022). For preCICE, especially for the RBF-based mapping, such measurements would be highly relevant given its potential computational cost.
Section 1 (Introduction)
The introduction suggests that preCICE represents a general-purpose coupling solution that could serve as a coupling library for the ESM community if extended by ESM-specific optimizations. This appears to be an oversimplification and does not sufficiently acknowledge the broader range of functionality provided by domain-specific ESM coupling software beyond accuracy and efficiency considerations.
In addition to the benchmark by Valcke et al. (2022), the MIRA protocol [5] constitutes another relevant benchmark for regridding quality and could be mentioned for completeness.
Section 2.2.1 (Library)
Within the ESM community, a clear distinction is made between online and offline generation of remapping weights. It would therefore be useful to clarify which of these approaches preCICE supports in practice and how they would be used in realistic ESM configurations. In the case of online generation, performance and scaling characteristics are of particular interest.
In typical ESM setups, remapping weights are either read from file or computed online during model initialization and remain fixed throughout the simulation, enabling the exchange to be implemented as a distributed sparse matrix–vector multiplication. Based on Chourdakis et al. (2022), this does not appear to be the case for the RBF implementation in preCICE. This is important information and should be stated explicitly. Furthermore, the performance results presented there suggest that the RBF approach may be prohibitively expensive for high-resolution ESM simulations and should be discussed accordingly.
Section 4 (Results)
The comparison of global conservation properties for interpolation methods that are not designed to be conservative is of limited value. For this reason, the original benchmark did not include such measures for these methods. Metrics such as mean and maximum misfit would be more appropriate in this context.
Specific comments
Line 3–4: "a general-purpose coupling library not limited to ESM"
The term “general-purpose” already implies not being limited to ESM. Additionally, describing other couplers as “limited” appears unnecessarily negative and suggests functional equivalence that does not currently exist.
Line 7: “yields significantly lower error”
lower than which reference or method?
Line 10: "SEA"
The abbreviation “SEA” seems at least for me uncommon; “OCE” or “OCN” are more typical for ocean components.
Line 12: Consider replacing “dedicated” with “domain-specific”.
Line 13–15: ordering of software
Is there any reason for this ordering? Lexicographic ordering would appear more neutral.
Line 16–17: “already used in ESM”, "limited"
Claims regarding existing use in ESM and limitations of other couplers overstate the general applicability of preCICE beyond the specific setups considered (Abele et al. 2025).
Line 20: “flexibility of the coupling approach”
This is vague and should be specified more precisely.
Line 23–25:
Listing these as potential benefits implies that existing ESM coupling software lacks these properties, which is debatable.
Line 25 “advanced numerical [..] methods”
Implicit coupling support is largely irrelevant for current ESM runs, as most component models do not support implicit coupling by design.
Line 25 “advanced [..] HPC methods”
Large-scale coupled ESM simulations using domain-specific couplers have already demonstrated extreme scalability (e.g. [6], [7]) some some of the largest HPC systems of the world. Comparable demonstrations for preCICE are currently lacking.
Line 26: "the absence of ESM-specific optimizations in terms of accuracy and efficiency"
The distinction between generic and domain-specific couplers is not limited to accuracy and efficiency; additional functionality is often crucial.
Line 34: "VTK"
Add a reference or footnote for VTK. Since the VTK format is listed as an important dependency, it could be of interest to the reader to justify the choice of this file format section 2.4.
Line 46–47: "constant data should remain constant"
In addition to conservation, monotonicity (no new extrema) and smoothness (continuously differentiable) are important remapping properties.
Depending on the physical property to be exchanged and the grid configuration, you may want your remapping to have one or the other property. The first property is often achieved using a "first order" method and the second one with the "second order" method. Since, in ESM remapping weights are usually not recomputed during a run, most of the time these two properties exclude themselves. The RBF implementation in preCICE seem to fulfil both properties, however as mentioned above it may have its own significant disadvantage.
Line 49: "In ESM,..."
Consider starting a new paragraph?
Line 51: "To this end, meshes need to be enriched by connectivity."
This statement would benefit from additional clarification. In the benchmark, a mesh is defined by vertices, edges connecting these vertices, and cells formed by the edges. While most coupling software implicitly assumes that all edges follow great circles, this assumption does not hold for certain grid types commonly used in ESM, such as regular Gaussian grids, where edges follow circles of constant latitude or longitude. Among the coupling solutions considered in the benchmark, only YAC explicitly distinguishes between these edge types, which can affect cell area computations and, consequently, remapping accuracy [2]. Furthermore, although intensive quantities are provided at cell centers in the benchmark, connectivity information is still required by some remapping algorithms used for intensive fields, and is essential for all conservative remapping of extensive quantities. Clarifying these aspects would help readers better understand why and how connectivity information is required in this context.
Line 51: reference for conservative remapping ESM
Add a standard reference for conservative remapping in ESM is. Jones, 1999 [8]
Line 51: "mapping is considered conservative"
There is also the terms of local and global conservation. For numerical weather predication local conservation ensures high accuracy locally. However, for long running climate simulations global conservation is very important. If grid combinations are chosen carefully, local conservation can automatically also lead to global conservation.
Formula 4:
For conservative interpolation, the ideal reference solution would be obtained by analytically integrating the test function over each cell, accounting for the exact cell geometry, and then normalizing by the cell area. In the benchmark, however, the source and reference target fields appear to be constructed by evaluating the analytical test function at the cell center and implicitly assuming this value to represent the cell-average. This approximation is likely a major contributor to the observed differences in global conservation between source and target fields, rather than the remapping procedure itself.
Line 110: "we use the WGS84 model"
Most ESM components assume a spherical Earth; using WGS84 may not be appropriate here.
Line 115–116: "Cell areas are computed from the nodal-based form."
Clarify whether planar or spherical cell areas were computed.
Line 122:
The sse7 grid is a reduced Gaussian grid composed exclusively of quadrilateral cells. Owing to the grid’s structure, cell connectivity cannot be straightforwardly inferred. Moreover, as most coupling frameworks do not explicitly support edges following circles of constant latitude, a purely quadrilateral representation would lead to unintended gaps in the mesh. To avoid this issue, three additional vertices were introduced into the grid definition.
Line 127-128: “To use preCICE for coupling, we need to convert these into 3D Cartesian representations using x/y/z coordinates.”
Remark: This is also commonly done internally by domain-specific implementations (using a unit sphere).
Line 138: "The couplers used in the reference are SCRIP ..."
SCRIP is a library for computing remapping weights and not a coupler (XIOS is until now an IO-server library).
How about: "The coupling software used in the reference are...."
Table 1:
The naming of the columns is misleading, because it is not obvious that columns three and four refer to the method names in the benchmark.
Table 1 column 4:
In the benchmark, these method names were used only in the appendix and primarily reflect the terminology employed by SCRIP for the respective categories. As such, they are potentially misleading, since different coupling software used different algorithms within the same category depending on availability. This is particularly relevant for the “higher-order” category, where the implemented methods differed substantially between tools.
Specifically, “distwgt-1” corresponds to a one-nearest-neighbour approach and was identical across all implementations. The label “distwgt” merely describes the weighting scheme that would be applied if more than one neighbour were used; for the one-nearest-neighbour case, no weighting is actually performed. The “bilinear” category effectively represents first-order interpolation in all cases; for YAC this was combined with inverse-distance weighting, although a different configuration was employed in a later benchmark, leading to significantly improved results (see Fig. 16 in [1]). Finally, the “bicubic” category encompassed fundamentally different methods: bicubic interpolation in SCRIP, patch recovery in ESMF, and HCSBB in YAC.
Line 161-162: “Moreover, similarly to Valcke et al. (2022), we need to transfer land-ocean masks from the SEA to the ATM meshes in a pre-processing step”
Maybe: “Moreover, similar to Valcke et al. (2022), we have to generate a land-sea mask for the ATM mesh by computing the overlap between both meshes.”
Line 162-163: "To increase the usability of the benchmark..."
Owing to differences in cell-shape representations and in the numerical accuracy of the implemented methods, the various software packages in the benchmark may compute different grid overlaps. Consequently, the corresponding masks are generated separately for each software package. For this reason, the procedure for generating these masks was explicitly described in the benchmark. In the specific case of the torc grid, a fixed mask is usually provided alongside the grid data.
Line 168: "write the results to VTK"
Maybe: "write the results to a VTK-formatted file"?
Line 213: "parameter"
What parameter? The support radius?
Line 214: "of the Earth radius"
Remark: most coupling software actually works on the unit sphere (radius 1.0), which simplifies a number of computations.
Line 215-216: "Alternatively, in future studies, we could make use of an automatic parameter optimization, which is currently developed for preCICE."
Remark: It may be worth considering whether the support radius could be determined adaptively based on a measure of the average distance between source points in the neighbourhood of a target point. Such an approach could yield more robust results for source grids with spatially varying resolution. A similar strategy is employed in the RBF implementation of YAC.
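One possible form of such an adaptive choice is sketched below. This is an illustration under stated assumptions only: it is neither the YAC algorithm nor an existing preCICE feature, the function name `adaptive_support_radius` and the parameters `k` and `scale` are hypothetical, and a brute-force neighbour search is used for brevity.

```python
import math


def adaptive_support_radius(target, sources, k=9, scale=3.0):
    """Estimate a local RBF support radius around one target point.

    The radius is `scale` times the mean nearest-neighbour spacing among
    the k source points closest to the target (assumed heuristic).
    """
    if len(sources) < 2:
        raise ValueError("need at least two source points")
    nearest = sorted(sources, key=lambda s: math.dist(target, s))[:max(2, k)]
    spacings = [
        min(math.dist(p, q) for j, q in enumerate(nearest) if j != i)
        for i, p in enumerate(nearest)
    ]
    return scale * sum(spacings) / len(spacings)
```

With a heuristic of this kind, the radius shrinks where the source grid is fine and grows where it is coarse, instead of being fixed globally as a fraction of the Earth radius.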
Section 3.7:
In contrast to what is suggested in the manuscript, domain-specific software (SCRIP and YAC) is not able to handle the torc grid due to different error handling. The grid itself is valid and does not suffer from missing or invalid cells; its constructors were just “very creative”, and you are not the only ones for whom this caused headaches. ;-)
The torc grid is used by an ocean model, so you can, and have to, ignore all land cells. Therefore, this grid comes with its specific land-sea mask. Structurally, the grid is curvilinear, with vertices stored in a two-dimensional array and connectivity defined implicitly. To locally increase resolution in regions of interest (e.g. the Mediterranean Sea), unused land cells were relocated, which results in degenerate neighbouring cells. Consequently, this grid can only be processed reliably if all land cells defined by the provided mask are removed prior to further analysis or remapping.
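The removal of land cells before any further processing could look roughly like this. It is a minimal sketch assuming that the implicit connectivity has already been expanded into explicit per-cell vertex indices; the function name `drop_masked_cells` and the data layout are hypothetical and not taken from the benchmark scripts.

```python
def drop_masked_cells(vertices, cells, is_land):
    """Keep only ocean cells and renumber the vertices they reference.

    vertices: list of coordinate tuples
    cells:    list of vertex-index tuples, one per cell
    is_land:  list of booleans, True where the provided mask marks land
    """
    kept_cells = [c for c, land in zip(cells, is_land) if not land]
    used = sorted({v for c in kept_cells for v in c})
    new_index = {old: new for new, old in enumerate(used)}
    new_vertices = [vertices[v] for v in used]
    new_cells = [tuple(new_index[v] for v in c) for c in kept_cells]
    return new_vertices, new_cells
```

Vertices that belong only to relocated land cells disappear together with those cells, so the degenerate neighbours never reach the mapping step.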
Line 241: "who tested the three couplers"
Maybe: "who tested the three couplers for the non-conservative case:" (because actually there were four tools being tested)
Line 248-249: "We presume that this is due to the scaling of the surface areas with the water fraction in post-processing as explained in Section 3.6"
Or due to the test fields being too smooth and uniform.
Figure 6:
The benchmark demonstrated that, for the one-nearest-neighbour method, all regridding software produced nearly identical results in terms of mean, maximum, and RMS misfit. For this specific case, any substantial deviation would likely indicate an implementation issue. It is therefore unclear how the differences observed in global target conservation arise. (I still do not think that global target conservation is a valid measure for these methods).
Line 257-258: "For gulfstream, we obtain comparable results to the reference."
This could also be an indication of "over-smoothing", which could be a problem for fields used in actual simulations. The benchmark contained an additional metric that evaluated this, which might be interesting to see here.
Line 265: "very similarly than the misfit metric"
Do you mean: "very similarly to the mean-misfit metric"?
In the benchmark, the comparison between mean and max misfit provided some quite interesting insights. For some grid configurations, these also differed significantly between the implementations.
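To make the distinction between the metrics explicit, the following sketch computes mean, max, and RMS misfit side by side. It assumes that the misfit at a target point is the absolute relative error between the remapped and the analytical value, and that points with a zero analytical value are excluded; the exact definition should be taken from the benchmark paper. The function name `misfit_metrics` is hypothetical.

```python
import math


def misfit_metrics(remapped, analytic):
    """Return (mean, max, rms) of the pointwise relative misfit.

    Assumed definition: |remapped - analytic| / |analytic| per target point.
    """
    misfits = [abs(r - a) / abs(a)
               for r, a in zip(remapped, analytic) if a != 0.0]
    if not misfits:
        raise ValueError("no target points with non-zero analytical value")
    mean = sum(misfits) / len(misfits)
    rms = math.sqrt(sum(m * m for m in misfits) / len(misfits))
    return mean, max(misfits), rms
```

A single poorly mapped point barely moves the mean but dominates the maximum, which is why reporting both can reveal localised problems that the mean alone hides.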
Line 277: "demonstrating that it can successfully handle ESM mapping problem"
Claims regarding successful handling of ESM mapping problems are overstated given the absence of conservative remapping and performance evaluation.
Line 279: "An example of such transfer is the RBF mapping method"
Actually, YAC already contains an RBF-based method (see [3]). However, to my knowledge, it has not yet found a user. The benchmark only evaluated methods that were available in most of the tested software. This is why other methods and configuration options were ignored.
Line 280-281: "it outperformed ESM-specific second-order methods by in average two orders of magnitude for smooth test function"
Since the benchmark compared similar methods that were already successfully used in the ESM context, using simplified test functions did not seem to cause issues. However, with the introduction of the RBF method, this may have to be re-evaluated, because good performance for the test functions may not automatically translate into good performance for actual data.
Line 285-286: "as it would require a carefully refined benchmark definition."
Performance measurements are especially important for Numerical Weather Prediction. Depending on the overall performance, they may also be of interest for climate simulations. It would be very interesting to see some basic runtime and scaling measurements, similar to the ones described in the original benchmark (see Section 8).
[1]: https://cerfacs.fr/wp-content/uploads/2024/12/TR-CMGC-24-182_yac_in_oasis.pdf
[2]: https://doi.org/10.5194/gmd-17-415-2024
[3]: https://dkrz-sw.gitlab-pages.dkrz.de/yac/d1/d79/interp_method_rbf.html
[4]: https://raw.githubusercontent.com/IS-ENES3/IS-ENES-Website/main/pdf_documents/IS-ENES2_D10.3_FV.pdf
[5]: https://doi.org/10.5194/gmd-15-6601-2022
[6]: https://doi.org/10.1145/3712285.3771789
[7]: https://awards.acm.org/bell-climate
[8]: https://doi.org/10.1175/1520-0493(1999)127<2204:FASOCR>2.0.CO;2