Improving trajectory calculations by FLEXPART 10.4+ using deep learning inspired single image superresolution

Brecht, Rüdiger; Bakels, Lucie; Bihlo, Alex; Stohl, Andreas

doi:https://doi.org/10.5194/egusphere-2022-441

Preprints

11 Jul 2022

Abstract. Lagrangian trajectory or particle dispersion models as well as semi-Lagrangian advection schemes require meteorological data such as wind, temperature and geopotential at the exact spatio-temporal locations of the particles that move independently from a regular grid. Traditionally, these high-resolution data have been obtained by interpolating the meteorological parameters from the gridded data of a meteorological model or reanalysis, e.g. using linear interpolation in space and time. However, interpolation errors are a large source of error for these models. Reducing them requires meteorological input fields with high space and time resolution, which may not always be available and can cause severe data storage and transfer problems. Here, we interpret this problem as a single image superresolution task. That is, we interpret meteorological fields available at their native resolution as low-resolution images and train deep neural networks to up-scale them to higher resolution, thereby providing more accurate data for Lagrangian models. We train various versions of the state-of-the-art Enhanced Deep Residual Networks for Superresolution (EDSR) on low-resolution ERA5 reanalysis data with the goal of up-scaling these data to arbitrary spatial resolution. We show that the resulting up-scaled wind fields have root-mean-squared errors half the size of the winds obtained with linear spatial interpolation at acceptable computational inference costs. In a test setup using the Lagrangian particle dispersion model FLEXPART and reduced-resolution wind fields, we demonstrate that absolute horizontal transport deviations of calculated trajectories from "ground-truth" trajectories calculated with undegraded 0.5° x 0.5° winds are reduced by at least 49.5 % (21.8 %) after 48 hours relative to trajectories using linear interpolation of the wind data when training on 2° x 2° to 1° x 1° (4° x 4° to 2° x 2°) resolution data.
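
To make the baseline in the abstract concrete, the following is a minimal sketch of linear (bilinear) spatial interpolation of a gridded wind component to a single particle position. The function name `bilinear_wind`, the regular ascending lat-lon grid and the toy field are illustrative assumptions; this is not FLEXPART's own interpolation code.

```python
import numpy as np

def bilinear_wind(u, lats, lons, lat_p, lon_p):
    """Bilinearly interpolate a gridded field u[lat, lon] to one particle position."""
    # Index of the south-west corner of the grid cell containing the particle
    # (assumes regular, ascending coordinate arrays).
    i = np.clip(np.searchsorted(lats, lat_p) - 1, 0, len(lats) - 2)
    j = np.clip(np.searchsorted(lons, lon_p) - 1, 0, len(lons) - 2)
    # Fractional position of the particle inside the cell (0 to 1).
    ty = (lat_p - lats[i]) / (lats[i + 1] - lats[i])
    tx = (lon_p - lons[j]) / (lons[j + 1] - lons[j])
    # Interpolate along longitude on both latitude rows, then along latitude.
    south = (1 - tx) * u[i, j] + tx * u[i, j + 1]
    north = (1 - tx) * u[i + 1, j] + tx * u[i + 1, j + 1]
    return (1 - ty) * south + ty * north

# Toy 2-degree grid with a field that is linear in latitude and longitude.
lats = np.arange(0.0, 10.0, 2.0)
lons = np.arange(0.0, 10.0, 2.0)
u = 2.0 * lats[:, None] + 3.0 * lons[None, :]
value = bilinear_wind(u, lats, lons, 3.0, 5.0)
```

For a field that varies linearly between grid points, this scheme is exact; interpolation errors arise where the wind has curvature within a grid cell, which is where a learned up-scaling could in principle help.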

Received: 04 Jun 2022 – Discussion started: 11 Jul 2022

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Download & links

Preprint (PDF, 1466 KB)

Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Journal article(s) based on this preprint

21 Apr 2023

Improving trajectory calculations by FLEXPART 10.4+ using single-image super-resolution

Rüdiger Brecht, Lucie Bakels, Alex Bihlo, and Andreas Stohl

Geosci. Model Dev., 16, 2181–2192, https://doi.org/10.5194/gmd-16-2181-2023, 2023

Interactive discussion

Status: closed

CEC1: 'Comment on egusphere-2022-441', Juan Antonio Añel, 23 Aug 2022

Dear authors,
Unfortunately, after checking your manuscript, it has come to our attention that it does not comply with our "Code and Data Policy".

https://www.geoscientific-model-development.net/policies/code_and_data_policy.html

You have not published the code, input, and output data used or obtained to perform your study. In this way, unless you fix it, we will have to reject your manuscript for publication in Geosci. Model Dev. Admittedly this has been an oversight from our side, and your manuscript should have never been published in Discussions, given this problem. However, we are now offering you the possibility to fix it.
First, note that the only thing you have published is a Python notebook on GitHub to reproduce your plots. However, GitHub is not a suitable repository. GitHub itself instructs authors to use other long-term archival and publishing alternatives, such as Zenodo. Therefore, please, publish your code in one of the appropriate repositories, and reply to this comment with the relevant information (link and DOI) as soon as possible, as it should be available for the Discussions stage. Also, please include the primary input and output data as mentioned above. In this way, you must include the modified 'Code and Data Availability' section in a potential reviewed version of your manuscript, the DOI of the code (and another DOI for the dataset if necessary).
Please, when uploading your code, include a license for it. If you do not include it, the code continues to be your property and can not be used by others, despite any statement on being free to use. You could want to choose a free software/open-source (FLOSS) license. We recommend the GPLv3. You only need to include the file 'https://www.gnu.org/licenses/gpl-3.0.txt' as LICENSE.txt with your code. Also, you can choose other options that Zenodo provides: GPLv2, Apache License, MIT License, etc.
Regards,
Juan A. Añel
Geosci. Model Dev. Executive Editor

Citation: https://doi.org/10.5194/egusphere-2022-441-CEC1
- AC1: 'Reply on CEC1', Rüdiger Brecht, 01 Sep 2022
  
  Dear Dr. Añel
  
  Geosci. Model Dev. Executive Editor,
  We are working on uploading the code and data and on modifying the 'Code and Data Availability' section. We hope to finish this within the next week.
  
  Regards,
  
  The authors
  
  Citation: https://doi.org/10.5194/egusphere-2022-441-AC1
- AC2: 'Reply on CEC1', Rüdiger Brecht, 13 Sep 2022
  
  Dear Dr. Añel
  
  Geosci. Model Dev. Executive Editor,
  
  Thank you for letting us publish the code and continue with the publishing process. We made the code available to pre-process the data, train the neural network and then post-process the data. The evaluated data is published on Zenodo (see the documentation on GitHub: https://github.com/RudigerBrecht/Improving-trajectory-calculations-using-SISR). Moreover, the code is also available on Zenodo (https://doi.org/10.5281/zenodo.7065139). Furthermore, we updated the Code and Data Availability statement, because the input data is not freely available.
  Please let us know if there is anything else that we can improve.
  
  Regards,
  The authors
  
  Citation: https://doi.org/10.5194/egusphere-2022-441-AC2
  - CEC2: 'Reply on AC2', Juan Antonio Añel, 13 Sep 2022
    
    Dear authors,
    
    Unfortunately, you continue to make the same mistake. The data are made available through GitHub. However, again, we can not accept GitHub to host information related to your manuscript, regardless of whether it is documentation, data or code. Therefore, please, move the contents on GitHub to a suitable repository, and use it in a potential reviewed version of your work.
    Also, you fail to justify not releasing the input data, which I understand is public, as based on ERA5, and nothing prevents you from sharing it. Therefore, we can not accept your statement, and we will have to reject your paper if you do not provide the input data, as the replicability of the work would be compromised. Another option is that you give a justification not to do it (usually a legal regulation that forbids you to do it). If this is the case, please, provide us with the evidence necessary (law, regulation, etc.), so that we can assess it, and store the input data in a trustable repository with a DOI, with the difference that it would not be public.
    Regards,
    Juan A. Añel
    Geosci. Model Dev. Executive Editor
    
    Citation: https://doi.org/10.5194/egusphere-2022-441-CEC2
    
    AC3: 'Reply on CEC2', Rüdiger Brecht, 11 Oct 2022
    
    Dear Dr. Añel
    
    Geosci. Model Dev. Executive Editor,
    
    Thank you for your quick response. Indeed, the ERA5 data set is publicly available, thus we added the scripts to download the data. Here is the link with the updated code: https://zenodo.org/record/7181840. Please let us know if there is anything else that we can improve.
    
    Regards,
    The authors
    
    Citation: https://doi.org/10.5194/egusphere-2022-441-AC3
    
    CEC3: 'Reply on AC3', Juan Antonio Añel, 11 Oct 2022
    
    Dear authors,
    First, thanks for sharing the scripts in Zenodo. However, the main problem persists. I think that you do not quite understand the nature of it. To reproduce your manuscript, currently, readers need to have access to the ERA5 data, and this does not even produce the input files for FLEXPART. The ECMWF servers (and their internet addresses) are not permanent, and they are not long-term repositories for the archival of scientific data. They are a temporary service that, at best, works for a few years; in this case, for example, until the current ERA5 reanalysis is superseded. Also, the reanalysis files could be modified if, for example, an error is found. Therefore, readers should not need to access the ERA5 data to reproduce your manuscript, all the more so because these are not the real input data. You have already produced the interpolated input files, and these are the files that you have to share in the new repository. So, unless they are too big for it (e.g. hundreds of GB), please, share your interpolated input files.
    Best regards,
    Juan A. Añel
    Geosci. Model Dev. Executive Editor
    
    Citation: https://doi.org/10.5194/egusphere-2022-441-CEC3
    
    CC1: 'Reply on CEC3', Andreas Stohl, 11 Oct 2022
    
    Dear editor,
    Unfortunately, there seems to be a misunderstanding.
    1) ECMWF re-analysis data are the best archived data that you can find. You can still get old re-analyses from decades ago, and they are versioned such that if there are multiple data streams, these are all preserved. So ERA5 may be superseded in the future but it will still be available.
    2) The data volume is ~400 gigabytes even for the short time periods we use.
    I have always been a strong advocate for open source, open data, etc., and we would openly share any data where it makes sense. In this case, however, I don't see how to do this, given the large data volumes!
    I hope for your understanding!
    Regards,
    Andreas
    
    Citation: https://doi.org/10.5194/egusphere-2022-441-CC1
    
    CEC4: 'Reply on CC1', Juan Antonio Añel, 25 Oct 2022
    
    Dear Andreas, dear authors,
    First, apologies for taking so long to reply. I admit that it is hard that a change in reanalysis is not recorded and that ECMWF does well on this. Yet, it is true that in the past, there have been problems with others. Beyond that, the reproducibility and replicability of any learning technique are compromised without the exact input and output data; therefore, I am not making a point about open access here but about complying with the scientific method, and what makes something "science".
    Regarding the size, from the editorial point of view, it is hard to make an assessment about what is reasonable or not to ask (this happens all the time) as this information is usually omitted from the text. On the other hand, I would like to point out that nowadays, 400 GB is not such a big problem; it is a reasonable amount of data to share. For example, Zenodo has a limit of 50 GB per archive, so eight to ten Zenodo repositories could do the trick. We have published many papers that have solved the issue in this way.
    Therefore, I would ask you to be more open about the total size of the files that you would need to store to assure the replicability of the work and to assess the possibility of publishing at least a minimum sample. Otherwise, if you have stored the data internally and the dataset is too big to upload and share, I would ask you to issue a DOI for it and provide information about where you are keeping it. And add this information to the manuscript.
    I know that this represents some work, but we would like to comply with the highest standards in open science and scientific reproducibility.
    Best regards,
    Juan A. Añel
    Geosci. Model Dev. Executive Editor
    
    Citation: https://doi.org/10.5194/egusphere-2022-441-CEC4
    
    AC4: 'Reply on CEC4', Rüdiger Brecht, 24 Nov 2022
    
    Dear editor,
    
    As discussed, we uploaded the output data of the neural network which is then used for the trajectory simulation. To minimize the size of the data we compressed the files.
    The repositories are: https://zenodo.org/record/7277854 and https://zenodo.org/record/7318809.
    The repository https://zenodo.org/record/7350568 contains now the code including FLEXPART in the version used for our simulations. We also removed any occurrence of git/github in a reviewed version of the manuscript.
    Then, since we can trust the ECMWF about the storage of ERA5, one can download the data using the download scripts. We indicate the exact information about the fields in a reviewed version of our manuscript (we use the horizontal wind velocities u and v field at 138 levels).
    
    Regards,
    
    The authors
    
    Citation: https://doi.org/10.5194/egusphere-2022-441-AC4
    
    AC5: 'Reply on CEC4', Rüdiger Brecht, 24 Nov 2022
    
    Dear editor,
    
    As discussed, we uploaded the output data of the neural network which is then used for the trajectory simulation. To minimize the size of the data we compressed the files.
    The repositories are: https://zenodo.org/record/7277854 and https://zenodo.org/record/7318809.
    The repository https://zenodo.org/record/7350568 contains now the code including FLEXPART in the version used for our simulations. We also removed any occurrence of git/github in a reviewed version of the manuscript.
    Then, since we can trust the ECMWF about the storage of ERA5, one can download the data using the download scripts. We indicate the exact information about the fields in a reviewed version of our manuscript (we use the horizontal wind velocities u and v field at 138 levels).
    
    Regards,
    
    The authors
    
    Citation: https://doi.org/10.5194/egusphere-2022-441-AC5
    
    AC6: 'Reply on CEC4', Rüdiger Brecht, 24 Nov 2022
    
    Dear editor,
    
    As discussed, we uploaded the output data of the neural network which is then used for the trajectory simulation. To minimize the size of the data we compressed the files.
    The repositories are: https://zenodo.org/record/7277854 and https://zenodo.org/record/7318809.
    The repository https://zenodo.org/record/7350568 contains now the code including FLEXPART in the version used for our simulations. We also removed any occurrence of git/github in a reviewed version of the manuscript.
    Then, since we can trust the ECMWF about the storage of ERA5, one can download the data using the download scripts. We indicate the exact information about the fields in a reviewed version of our manuscript (we use the horizontal wind velocities u and v field at 138 levels).
    
    Regards,
    
    The authors
    
    Citation: https://doi.org/10.5194/egusphere-2022-441-AC6
RC1: 'Comment on egusphere-2022-441', Anonymous Referee #1, 12 Sep 2022

Review of "Improving trajectory calculations by FLEXPART 10.4+ using deep learning inspired single image superresolution" by Brecht et al., submitted to EGUsphere.

The study by Brecht et al. presents an application of a neural network to provide higher-resolution wind information from a coarser resolution. Specifically, the authors analyse the impact of their approach on the accuracy of trajectory calculations, which rely heavily on interpolation operations. The paper describes interesting and relevant new approaches to interpolation in an atmospheric framework. However, there are a number of issues with the methodology, the presentation of the results, the discussion of limitations, and the overall writing style that need to be addressed before the manuscript can be considered publishable.

Major items

1. Currently, the authors sub-sample a higher resolution model field to obtain the coarse resolution wind field. This approach is in my impression inconsistent with what a coarser-resolution model would provide. A coarser-resolution model would provide an average of what is represented by several grid points of a finer-scale model. I think the sub-sampling makes it harder for the linear interpolation to provide good results. That sub-sampling approach is also the origin of the checkerboard patterns apparent in e.g. Fig. 4. I recommend to redo the analysis by averaging the model fields rather than sub-sampling.

2. Is it correct that the neural network works on a 3x3 grid point stencil on a lat-lon grid? It seems that the footprint of the interpolation operator would thus see very different area sizes near the pole than near the equator. How is this affecting the training and application of the neural network?

3. Some sections are poorly written and therefore hardly comprehensible (for example section 3.2). Since the authors introduce new methods into the field of atmospheric science, I recommend to make an extra effort to clearly define all terminology (e.g., "channel attention" is never defined). However, I do not think this is mainly about the content material, but rather about how the sentences and paragraphs are compiled. For example, section 3.2 could start with a short paragraph, providing a break-down of the steps involved in the architecture, before describing each part in following paragraphs. I recommend the authors to take a look at Gopen and Swan (1990) regarding how to write more clearly and effectively.

4. The issue of non-conservation of interpolation algorithms is a major concern in a physical application as presented here. The basic model equations are derived from principles of mass and energy conservation. Therefore, if there are conservation violations induced by this method, this aspect needs to be clearly brought forward throughout the manuscript, to make sure it is not overlooked by readers. This aspect can be mentioned in Sec. 3.4, brought up as part of the results, for example by comparing the kinetic energy and the velocity spectra from the fine-resolution and interpolated fields. The short discussion in L. 234 onwards might be more suitable in a discussion paragraph.

5. The results need to be structured more clearly. Right now, the sequence of results and examples in Sec. 4.1 appears somewhat arbitrary. While there certainly is some reasoning behind the structure and examples, it is not spelled out clearly, and thus the reader is left guessing about how to "connect the dots". Coming from an atmospheric science perspective, I suggest a structure that starts with one specific case as an example, such as one of the frontal bands shown in Fig. 3, where the linear interpolation has clear deficiencies. Thereby, it would be helpful to also show additional atmospheric variables to illustrate the case (for example surface pressure or air temperature). A tropical cyclone or a Rossby wave breaking could be other interesting situations to present. After stepping through the example case, more statistically robust information could be provided, from considering a larger number of days or cases. Finally, you proceed to the application with the trajectory calculations, before considering energy conservation.

6. On many occasions, the results are presented in a qualitative way (closer/larger/etc.). In order to connect the results to the figures, and to make it possible to follow the interpretation and evaluation of the authors, it would be very useful to include concrete numbers alongside the qualitative interpretation, while referring to the respective figure panels. Examples are L. 182 and onward, L. 198 and onward.

7. The writing style of in particular the results section should be more distanced or objective. Now, the authors frequently use expressions like "we demonstrate", "we show" in the start of a paragraph, i.e. before actually having presented the evidence. As a critical reader, one might get the impression you are overselling the results. I strongly recommend changing this unnecessary forceful writing style to a more distanced, objective style. Let the reader see the evidence for themselves, while guiding them through the material, before drawing conclusions. Many paragraphs in the results are currently "upside down" in that way.

8. As another, related aspect, the figures are not properly described. At present, the length of the text describing the results is very much out of balance with the number of figures. For example in Sec. 4.1.1, L. 183, an entire 3 figures are referred to within just 3 sentences, but none of the sentences describes what actually is seen within the figures. Rather than leaving it up to the reader to interpret the figures, use some sentences for each figure to describe what is displayed, and highlight what is important to take away. This applies to all figures in the manuscript.

9. On several occasions (including Fig. 2, 4, 5), the figure captions contain information about the method or results that are not mentioned in the text. Such information must be placed in the main text.

10. What are the limitations of the method in terms of computational effort? In L. 192, it is briefly mentioned that the computation time is a factor 10 larger than linear interpolation. Is there still an advantage of neural network approach compared to for example quadratic interpolation? This could be worth a short section in the discussion. The improved conservation of other properties is also interesting, but unfortunately not shown in more detail.

Detailed comments:

Title: "deep learning inspired": unclear what this expression means, consider to remove/replace. State what aspect of trajectory calculations is improved (accuracy).

L. 20: Can you back up this statement by a reference/example?

L. 21: "where a dense network" rephrase. If the point is that the numerical weather prediction process produces large amounts of gridded data, then it would be sufficient to state just that, without mentioning observations (which are not at all part of this manuscript). Remove "reanalysis model", a reanalysis is generated from regular NWP models.

L. 25 onward: check citation of references, missing brackets.

L. 27: remove "just to name a few"

L. 30: logical gap, what is the connection to the previous paragraph?

L. 34: remove "surprisingly", this entirely depends on the perspective of the writer.

L. 34: briefly define "convolutional neural network".

L. 44: what do you mean by "variable-scale"?

L. 44: what do you mean by "deep" - how deep?

L. 45: Rephrase: "showcase" sounds like snapshots or illustrations, but as a reader I look for reliable evidence.

Section 2: "Related work"

This section does currently not serve a clear purpose, and is somewhat duplicate with the introduction. I recommend to delete this section here, and partly incorporate bits in the introduction, partly into a clearer method description.

Section 3: "Methods"

This section would benefit from a first paragraph that explains your overall approach, followed by a section that discusses the choice of the neural network, based on the range of choices that exist, in an accessible writing style.

Section 3.1: "Training data"

The training data would be more natural to place after sections that describe the actual neural network and approach.

L. 82: Why could this seem little data? How much training is commonly needed?

L. 107: rephrase using more distanced and objective terms. It could provide depth to the study to present a less well-performing approach in an appendix.

L. 114: remove "for testing purposes"

L. 117: 50 or 88 -> 50 and 88

Figure 1: several abbreviations and terms of the operations in the figure are not defined, include in caption or describing text. What do the bracketing lines indicate? The hierarchy between (a), (b) and (c) and between (a) and (d) could be made clearer in the figure, e.g. by lines that indicate "zooming in".

L. 123: Add a statement about the purpose of the error metric, i.e. what is to be assessed.

L. 127: here and elsewhere: ground truth -> truth. (ground truth would only make sense in a remote sensing context)

The notations for RMSE and SSIM could be simplified and clarified, for example using \hat{z} for the interpolated quantity, and using a,b instead of x,y (which is commonly used for spatial coordinates) for the two figures in the SSIM metric. How important is the "perceived similarity of two images" for the given application? This would be a suitable place to mention conservation issues due to interpolation.
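
For orientation, the notation suggested in this comment could look as follows. This is a sketch using the widely used textbook forms of both metrics, with z the reference field, \hat{z} the interpolated field, and a, b the two images compared by SSIM; the constants c_1, c_2 and the windowing are assumptions, and the manuscript's exact formulas may differ.

```latex
\mathrm{RMSE}(z, \hat{z}) = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( z_i - \hat{z}_i \right)^2 }

\mathrm{SSIM}(a, b) =
  \frac{\left( 2 \mu_a \mu_b + c_1 \right) \left( 2 \sigma_{ab} + c_2 \right)}
       {\left( \mu_a^2 + \mu_b^2 + c_1 \right) \left( \sigma_a^2 + \sigma_b^2 + c_2 \right)}
```

Here N is the number of grid points, \mu and \sigma^2 denote the mean and variance of an image (or image window), \sigma_{ab} is the covariance of the two images, and c_1, c_2 are small stabilising constants.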

L. 142, 144: unclear what "this" refers to.

L. 145-149: unclear, please rephrase

L. 152 onward: the emphasised names do not appear to be re-used in the remainder of the manuscript. Maybe rather introduce 3-letter abbreviations, such as REF, LIN, NNI that then can re-appear in the results and figures.

L. 160: place references at the end of sentence

L. 163: clarify whether Xn, Yn are vectors with m elements, or for a specific time along the trajectory

L. 172: This section seems to describe your approach, and would be better placed in the methods.

Figure 2: the lines for lin u and lin v are exactly equal, is this coincidence? x-axis is lacking a unit. RMSE is defined with an index, but given without index here. Please explain in the result text how to interpret this figure.

Table 1: What do the arrows indicate? The caption contains a key result, that should be moved to the main text.

L. 183 to 188: Need to guide the reader through the results. The comparison needs more structure, and quantitative examples from the figure where available to support the qualitative conclusions.

L. 191: Maybe express in relative terms, hardware-dependent?

L. 195: distinguish "evaluate" and "apply" - an objective way to present the results would be to apply the method, display the results, and thereafter evaluate based on the error metrics.

Figure 3: Lacking panel labels. The top row does not seem to give additional information to the bottom row. I recommend using a continuous color scale; the two-color scale gives unjustified importance to errors larger than 5 m/s. Maps are missing coordinates. The RMSE in the title should be part of the text rather than a caption title. It would be useful to present a specific situation with meteorological fields for context.

L. 198: "before": rephrase

L. 199: "relative error reduction": where shown?

L. 200: "This holds...": can this information be presented as part of a more aggregated and thus robust result?

L. 202: unclear, rephrase

Figure 4: See comments about Fig. 3, the color scale gives unjustified emphasis to wind errors above 2.5 m/s. Indicate in Fig. 3 where this zoom is taken. Arrows are difficult to see, take to separate panels, and use meteorological fields (e.g. sea level pressure or potential temperature) as reference in both sets of panels.

Figure 5: This figure needs more explanation. What bins have been used? With only 10 bins, it may be more appropriate to show the lines as step function. What error metric has been used? Can this figure be constructed on more than just one day to make it more robust?

Figure 6: see Fig. 3.

Table 2: see Table 1

Figure 7: consider removing this figure. At this point, quantitative information may be sufficient/more useful than another illustration.

L. 209: likely -> conceivable, provide reference

L. 215: it would be useful to briefly re-cap how these results are obtained. One case, several cases, specific region? How are trajectory errors distributed on a global map, do they mirror the interpolation errors?

L. 221: "smaller" - should this be "larger"?

L. 223: "directly corresponds" - is this a result, your interpretation, or an assumption?

L. 228: These paragraphs would better fit into a discussion section, together with other limitations. If possible, it would be useful to give more details, such that other studies can refer to your work.

L. 242: remove "just to name a few"

L. 245: would be useful to connect to weather phenomena here

L. 250: remove "see Fig. 2"

L. 263: this is an important limitation and should be taken up at different locations in the manuscript, including a discussion section. If non-conservation is an issue here, it would be useful to quantify. This would also give some balance to the study, which now mainly focuses on the advantages.

Citation: https://doi.org/10.5194/egusphere-2022-441-RC1
- AC7: 'Reply on RC1', Rüdiger Brecht, 22 Dec 2022
  We thank the reviewer for providing valuable feedback on the first version of our submitted paper. We have done our best to take into account all remarks raised. In the following we give a detailed list of all the changes made in response to the points raised. Thank you once more for your help in improving our paper.
  Major items
  Currently, the authors sub-sample a higher resolution model field to obtain the coarse resolution wind field. This approach is in my impression inconsistent with what a coarser-resolution model would provide. A coarser-resolution model would provide an average of what is represented by several grid points of a finer-scale model. I think the sub-sampling makes it harder for the linear interpolation to provide good results. That sub-sampling approach is also the origin of the checkerboard patterns apparent in e.g. Fig. 4. I recommend to redo the analysis by averaging the model fields rather than sub-sampling.
  
  The motivation for our choice of degrading the data by sub-sampling, which leaves the retained grid points entirely unchanged, is twofold:
  
  This approach has been used often in the past when studying wind interpolation errors for trajectory models (e.g., Kuo et al., 1985; Stohl et al., 1995). It is appropriate to be consistent with such past approaches.
  
  We agree that it would also be interesting to see how higher-resolution data could be reconstructed from lower-resolution data. This would then be more equivalent to downscaling approaches in weather prediction. However, for comparing the skill of different interpolation methods, this is not ideal. At the points where data are available at both high and low resolution, these data would be different in each case. Reconstruction of points in between would then not only reflect differences in the skill of interpolation but also the data differences at the points from where the interpolation is done. This would thus not allow a "clean" evaluation of different interpolation methods, mixing the effects of interpolation and grid-cell averaging for the coarse-resolution data points.
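
The distinction drawn in this reply can be illustrated with a small sketch that degrades a synthetic field in both ways and compares the coarse values with the original values at the retained grid points. The function names, the synthetic field and the use of simple block averaging as a stand-in for a coarser-resolution model are all illustrative assumptions, not the authors' pre-processing code.

```python
import numpy as np

def degrade_by_subsampling(field, k):
    """Keep every k-th grid point; retained values are left unchanged."""
    return field[::k, ::k]

def degrade_by_averaging(field, k):
    """Replace each k x k block of grid points by its mean value."""
    ny, nx = field.shape
    trimmed = field[: ny - ny % k, : nx - nx % k]
    return trimmed.reshape(ny // k, k, nx // k, k).mean(axis=(1, 3))

# Synthetic "high-resolution" field on a 16 x 16 grid (hypothetical data).
y, x = np.mgrid[0:16, 0:16]
field = np.sin(0.4 * x) * np.cos(0.3 * y)

sub = degrade_by_subsampling(field, 2)
avg = degrade_by_averaging(field, 2)

# Largest difference between the coarse values and the original values
# at the grid points shared by both resolutions.
mismatch_sub = np.abs(sub - field[::2, ::2]).max()
mismatch_avg = np.abs(avg - field[::2, ::2]).max()
```

With sub-sampling the mismatch at the shared points is zero by construction, so any error after up-scaling comes from the interpolation alone. With averaging the coarse values already differ from the fine ones, so an evaluation against the fine field would combine averaging and interpolation effects.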
  
  Is it correct that the neural network works on a 3x3 grid point stencil on a lat-lon grid? It seems that the footprint of the interpolation operator would thus see very different area sizes near the pole than near the equator. How is this affecting the training and application of the neural network?
  
  It is correct that some parts of the neural network work on 3x3 grid point stencils. However, the network architecture is more complicated than considering just 3x3 stencils working on the lat-lon grid. Ideally the nonlinear nature of the neural network learns how to cope with the different area sizes. It would be very interesting to analyze the effect of the area, but this is not the scope of the present work.
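
To give a sense of the size of the effect raised by the referee, the sketch below computes the physical area covered by a 3x3 stencil on a regular lat-lon grid at different latitudes. The spherical-Earth radius, the 1° grid spacing in the example and the function names are assumptions made for illustration; the sketch says nothing about how the trained network actually responds to this variation.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # mean radius, spherical Earth assumed

def latlon_box_area_km2(lat_center_deg, dlat_deg, dlon_deg):
    """Area of a lat-lon box on a sphere, centred at the given latitude."""
    lat_s = np.deg2rad(lat_center_deg - dlat_deg / 2)
    lat_n = np.deg2rad(lat_center_deg + dlat_deg / 2)
    return EARTH_RADIUS_KM**2 * np.deg2rad(dlon_deg) * (np.sin(lat_n) - np.sin(lat_s))

def stencil_footprint_km2(lat_center_deg, spacing_deg, size=3):
    """Area spanned by a size x size stencil of grid cells."""
    return latlon_box_area_km2(lat_center_deg, size * spacing_deg, size * spacing_deg)

# Footprint of a 3x3 stencil on a 1-degree grid, relative to the equator.
equator = stencil_footprint_km2(0.0, 1.0)
ratio_60 = stencil_footprint_km2(60.0, 1.0) / equator
ratio_80 = stencil_footprint_km2(80.0, 1.0) / equator
```

The ratio equals the cosine of the central latitude, so the same stencil covers half the equatorial area at 60° and less than a fifth of it at 80°.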
  
  Some sections are poorly written and therefore hardly comprehensible (for example section 3.2). Since the authors introduce new methods into the field of atmospheric science, I recommend to make an extra effort to clearly define all terminology (e.g., "channel attention" is never defined). However, I do not think this is mainly about the content material, but rather about how the sentences and paragraphs are compiled. For example, section 3.2 could start with a short paragraph, providing a break-down of the steps involved in the architecture, before describing each part in following paragraphs. I recommend the authors to take a look at Gopen and Swan (1990) regarding how to write more clearly and effectively.
  
  To make the neural network architecture comprehensible, we added a less technical description of the overall architecture. We also went through the entire manuscript again to clarify various items.
  
  The issue of non-conservation of interpolation algorithms is a major concern in a physical application as presented here. The basic model equations are derived from principles of mass and energy conservation. Therefore, if there are conservation violations induced by this method, this aspect needs to be clearly brought forward throughout the manuscript, to make sure it is not overlooked by readers. This aspect can be mentioned in Sec. 3.4, brought up as part of the results, for example by comparing the kinetic energy and the velocity spectra from the fine-resolution and interpolated fields. The short discussion in L. 234 onwards might be more suitable in a discussion paragraph.
  
  We admit that mass conservation is a possible issue in interpolation. However, most available interpolation methods do not conserve mass and are not designed for that. The goal of this study was to compare NN against such interpolation methods. We therefore think that this topic is somewhat beyond the scope of the current paper. However, we admit that it would be of great value to develop a dynamically constrained NN interpolation method that also ensures mass conservation. This would be a topic of future research.
  
  The results need to be structured more clearly. Right now, the sequence of results and examples in Sec. 4.1 appears somewhat arbitrary. While there certainly is some reasoning behind the structure and examples, it is not spelled out clearly, and thus the reader is left guessing about how to "connect the dots". Coming from an atmospheric science perspective, I suggest a structure that starts with one specific case as an example, such as one of the frontal bands shown in Fig. 3, where the linear interpolation has clear deficiencies. Thereby, it would be helpful to also show additional atmospheric variables to illustrate the case (for example surface pressure or air temperature). A tropical cyclone or a Rossby wave breaking could be other interesting situations to present. After stepping through the example case, more statistically robust information could be provided, from considering a larger number of days or cases. Finally, you proceed to the application with the trajectory calculations, before considering energy conservation.
  
  Thanks a lot for this remark. We went through the section one more time to better explain our results. In summary, our structure of the results is to show first the interpolation method and then use it in a simulation. In the present work we only consider the horizontal velocity fields. Thus, first we show that the new interpolation method using the neural network outperforms linear interpolation and then we use these interpolated fields to demonstrate that also the trajectory simulation using these interpolated fields is better. The frontal bands shown in Fig. 3 are a coincidence, since the plot shows the error, which is highest at these bands.
  
  On many occasions, the results are presented in a qualitative way (closer/larger/etc.). In order to connect the results to the figures, and to make it possible to follow the interpretation and evaluation of the authors, it would be very useful to include concrete numbers alongside the qualitative interpretation, while referring to the respective figure panels. Examples are L. 182 and onward, L. 198 and onward.
  
  We follow the methodology in presenting our results in tables and figures in a quantitative manner with associated qualitative descriptions being provided in the text.
  
  The writing style of in particular the results section should be more distanced or objective. Now, the authors frequently use expressions like "we demonstrate", "we show" in the start of a paragraph, i.e. before actually having presented the evidence. As a critical reader, one might get the impression you are overselling the results. I strongly recommend changing this unnecessary forceful writing style to a more distanced, objective style. Let the reader see the evidence for themselves, while guiding them through the material, before drawing conclusions. Many paragraphs in the results are currently "upside down" in that way.
  
  We appreciate your remark and went through the entire manuscript again to soften some of our writing style.
  
  As another, related aspect, the figures are not properly described. At present, the length of the text describing the results is very much out of balance with the number of figures. For example in Sec. 4.1.1, L. 183, an entire 3 figures are referred to within just 3 sentences, but none of the sentences describes what actually is seen within the figures. Rather than leaving it up to the author to interpret the figures, use some sentences for each figure to describe what is displayed, and highlight what is important to take away. This applies to all figures in the manuscript.
  
  Thank you for pointing this out. We extended the description of the figures and softened some of our writing style.
  
  On several occasions (including Fig. 2, 4, 5), the figure captions contain information about the method or results that are not mentioned in the text. Such information must be placed in the main text.
  
  Our style of writing the manuscript is based on explaining the content of the figures in the captions and commenting on the observed results where the figures are referenced in the text. We also went through the entire manuscript again to clarify various items.
  
  What are the limitations of the method in terms of computational effort? In L. 192, it is briefly mentioned that the computation time is a factor 10 larger than linear interpolation. Is there still an advantage of neural network approach compared to for example quadratic interpolation? This could be worth a short section in the discussion. The improved conservation of other properties is also interesting, but unfortunately not shown in more detail.
  
  At this stage it is premature to compare the efficiency. Our current implementation of the method is also not the most efficient one. For a true comparison, we would need an optimized implementation within FLEXPART and then compare the execution times.
  
  Detailed comments:
  Title: "deep learning inspired": unclear what this expression means, consider to remove/replace. State what aspect of trajectory calculations is improved (accuracy).
  We changed the title to “Improving trajectory calculations by FLEXPART 10.4+ using single image superresolution”.
  
  20: Can you back up this statement by a reference/example?
  
  We added a reference.
  
  21: "where a dense network" rephrase. If the point is that the numerical weather prediction process produces large amounts of gridded data, then it would be sufficient to state just that, without mentioning observations (which are not at all part of this manuscript). Remove "reanalysis model", a reanalysis is generated from regular NWP models.
  
  We simplified the sentence to say that NWP and observations generate large amounts of gridded data.
  
  25 onward: check citation of references, missing brackets.
  
  We added the missing brackets.
  
  27: remove "just to name a few"
  
  We removed it.
  
  30: logical gap, what is the connection to the previous paragraph?
  
  We removed the logical gap by moving the section “Related work” into the introduction.
  
  34: remove "surprisingly", this entirely depends on the perspective of the writer.
  
  We removed it.
  
  34: briefly define "convolutional neural network".
  
  We added that a CNN is a neural network whose layers are convolutions, which puts the input images through a set of convolutional filters, each of which activates certain features from the input.
  
  44: what do you mean by "variable-scale"?
  
  Here, variable-scale means that the neural network can cope with different resolutions of the wind fields. This way it can be applied multiple times to interpolate a meteorological field to the desired resolution. This is explained in the sentence after L 44.
  
  44: what do you mean by "deep" - how deep?
  
  Here, deep refers to the neural network having multiple layers.
  
  45: Rephrase: "showcase" sounds like snapshots or illustrations, but as a reader I look for reliable evidence.
  
  We replaced “showcase” with “demonstrate”.
  
  Section 2: "Related work". This section does currently not serve a clear purpose, and is somewhat duplicate with the introduction. I recommend deleting this section here, and partly incorporating bits in the introduction, partly into a clearer method description.
  We moved the section “Related work” to the introduction.
  
  Section 3: "Methods". This section would benefit from a first paragraph that explains your overall approach, followed by a section that discusses the choice of the neural network, based on the range of choices that exist, in an accessible writing style.
  We added the overall approach to the “Methods” section. The choice of the neural network is then described at the end of the “architecture” section.
  
  Section 3.1: "Training data". The training data would be more natural to place after sections that describe the actual neural network and approach.
  We swapped the section “Training data” and “neural network architecture”.
  
  82: Why could this seem little data? How much training is commonly needed?
  
  The phrase is misleading. We will just state the number of training files. It is difficult to say how much training data is needed, but at least a few thousand samples are required.
  
  107: rephrase using more distanced and objective terms. It could provide depth to the study to present a less well-performing approach in an appendix.
  
  Indeed, a comparison of different models would be an interesting study. Here, however, the focus is on improving the trajectory simulation.
  
  114: remove "for testing purposes"
  
  We removed it.
  
  117: 50 or 88 -> 50 and 88
  
  We changed “or” to “and”.
  
  Figure 1: several abbreviations and terms of the operations in the figure are not defined, include in caption or describing text. What do the bracketing lines indicate? The hierarchy between (a), (b) and (c) and between (a) and (d) could be made clearer in the figure, e.g. by lines that indicate "zooming in".
  We added an explanation about the dotted line, which just means that the ResBlock is repeated multiple times. It is difficult to indicate “zooming in” by lines in the figure.
  
  123: Add a statement about the purpose of the error metric, i.e. what is to be assessed.
  
  The error metrics evaluate the accuracy of the interpolation and of the trajectory simulation. We added a statement to the revised version.
  
  127: here and elsewhere: ground truth -> truth. (ground truth would only make sense in a remote sensing context)
  
  We replaced “ground truth” with “truth”.
  
  The notations for RMSE and SSIM could be simplified and clarified, for example using \hat{z} for the interpolated quantity, and using a,b instead of x,y (which is commonly used for spatial coordinates) for the two figures in the SSIM metric. How important is the "perceived similarity of two images" for the given application? This would be a suitable place to mention conservation issues due to interpolation.
  Indeed, x and y refer to two images; to avoid confusion, we now use a and b. We include the SSIM metric because we interpret the gridded horizontal velocity fields as images.
  
  142, 144: unclear what "this" refers to.
  
  142: Replaced '...this is outside of the scope..' with '...directly implementing the neural network interpolation into FLEXPART is outside...'
  
  144: Replaced 'This does not make full use...' with 'Using gridded up-sampled testing data does not make full use..'
  
  145-149: unclear, please rephrase
  
  We are not sure what exactly was unclear. However, we replaced this text with the following one and hope it is clearer now: “Using gridded up-sampled testing data does not make full use of the neural network capabilities, since the neural network only produced values at a fixed resolution of $0.5^\circ\times0.5^\circ$ latitude/longitude, while we still use linear interpolation of the wind data to the exact particle position when computing their trajectories. However, the neural network could in principle also determine the wind components almost exactly at the particle positions upon repeatedly using the trained SISR model to increase the resolution high enough to obtain the wind values at the respective particle positions.”
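
  The idea described in the replacement text can be sketched as follows. This is a hypothetical illustration, not the implementation used in the manuscript: `upscale_2x_placeholder` merely stands in for one application of a trained SISR model (it inserts midpoints by linear interpolation), and the factor of two per application, the grid origin at 0° latitude and longitude, the target spacing and all names are assumptions.

```python
import numpy as np


def upscale_2x_placeholder(field):
    """Stand-in for one application of a trained SISR model (hypothetical).

    Doubles the resolution by inserting midpoints with linear interpolation,
    so original grid-point values are kept unchanged. A real model would
    replace this function.
    """
    ny, nx = field.shape
    out = np.empty((2 * ny - 1, 2 * nx - 1), dtype=float)
    out[::2, ::2] = field
    out[1::2, ::2] = 0.5 * (field[:-1, :] + field[1:, :])
    out[:, 1::2] = 0.5 * (out[:, :-2:2] + out[:, 2::2])
    return out


def wind_at_position(field, spacing_deg, lat, lon, target_spacing_deg=0.125):
    """Refine the grid repeatedly, then interpolate bilinearly to (lat, lon).

    The grid is assumed to start at lat = 0, lon = 0 with uniform spacing.
    """
    while spacing_deg > target_spacing_deg:
        field = upscale_2x_placeholder(field)
        spacing_deg /= 2.0
    # Fractional grid indices of the particle position on the refined grid.
    fi, fj = lat / spacing_deg, lon / spacing_deg
    i0 = min(int(fi), field.shape[0] - 2)
    j0 = min(int(fj), field.shape[1] - 2)
    wi, wj = fi - i0, fj - j0
    return ((1 - wi) * (1 - wj) * field[i0, j0]
            + wi * (1 - wj) * field[i0 + 1, j0]
            + (1 - wi) * wj * field[i0, j0 + 1]
            + wi * wj * field[i0 + 1, j0 + 1])


# Synthetic coarse field on a 2 degree grid: u = lat / 2 + lon.
coarse = np.add.outer(np.arange(5.0), 2.0 * np.arange(5.0))
print(wind_at_position(coarse, 2.0, lat=3.3, lon=1.7))
```

  The finer the grid produced by the repeated up-scaling, the shorter the distance over which the final linear interpolation to the particle position has to act.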
  
  152 onward: the emphasized names do not appear to be re-used in the remainder of the manuscript. Maybe rather introduce 3-letter abbreviations, such as REF, LIN, NNI that then can re-appear in the results and figures.
  
  We removed the emphasized names.
  
  160: place references at the end of sentence
  
  Unfortunately, the sentence will become confusing when the references are not placed after the first sub-sentence (before the comma), since the explanation of the equations follows.
  
  163: clarify whether Xn, Yn are vectors with m elements, or for a specific time along the trajectory
  
  We added the time variable to the equation for clarification.
  
  172: This section seems to describe your approach, and would be better placed in the methods.
  
  We added a paragraph to the methods section to better explain our approach. Nevertheless, to remind the reader of the approach we leave the explanation here, too.
  
  Figure 2: the lines for lin u and lin v are exactly equal, is this coincidence? x-axis is lacking a unit. RMSE is defined with an index, but given without index here. Please explain in the result text how to interpret this figure.
  The linear interpolation is not dependent on the data in contrast to the neural network interpolation which is trained on different data. The interpretation of Fig. 2 is explained in L. 176 ff.
  
  Table 1: What do the arrows indicate? The caption contains a key result, that should be moved to the main text.
  The arrows indicate that a lower RMSE and a higher SSIM correspond to a better interpolation. The key result is spelled out in the text in L. 190.
  
  183 to 188: Need to guide the reader through the results. The comparison needs more structure, and quantitative examples from the figure where available to support the qualitative conclusions.
  
  We will extend the interpretation.
  
  191: Maybe express in relative terms, hardware-dependent?
  
  We now state that the linear interpolation is about 10 times faster than the neural network interpolation considering our hardware.
  
  195: distinguish "evaluate" and "apply" - an objective way to present the results would be to apply the method, display the results, and thereafter evaluate based on the error metrics.
  
  We reformulated the sentences to present the results in an objective way.
  
  Figure 3: Lacking panel labels. The top row does not seem to give additional information to the bottom row. I recommend using a continuous color scale; the two-color scale gives unjustified importance to errors larger than 5 m/s. Maps are missing coordinates. The RMSE in the title should be part of the text rather than a caption title. It would be useful to present a specific situation with meteorological fields for context.
  We split the figure in sub-figures with panel labels. The top row shows that high errors occur at fronts, this is then shown in a zoomed-in sub-figure in the bottom row. The color scale emphasizes the high errors, this way we see the strong difference in the error at the fronts.
  
  198: "before": rephrase
  
  Here, “before” referenced the previous section and we changed “before” to “one time up-scaling” to make it clear.
  
  199: "relative error reduction": where shown?
  
  The error reduction is shown in Table 2; we added a reference.
  
  200: "This holds...": can this information be presented as part of a more aggregated and thus robust result?
  
  First we showed an example and then using Table 2 the result is presented in a robust way.
  
  202: unclear, rephrase
  
  We now state that the neural network interpolation is 19% more accurate than the linear interpolation.
  
  Figure 4: See comments about Fig. 3, the color scale gives unjustified emphasis to wind errors above 2.5 m/s. Indicate in Fig. 3 where this zoom is taken. Arrows are difficult to see, take to separate panels, and use meteorological fields (e.g. sea level pressure or potential temperature) as reference in both sets of panels.
  We split the figure into sub-figures. Here, too, we want to emphasize high errors. The arrows for the neural network interpolation almost coincide with the truth; thus, the arrows of the true field are difficult to see.
  
  Figure 5: This figure needs more explanation. What bins have been used? With only 10 bins, it may be more appropriate to show the lines as step function. What error metric has been used? Can this figure be constructed on more than just one day to make it more robust?
  For each pixel we compute the relative error against the truth and increase the count of the corresponding bin (using 20 bins now). We now compute the error frequencies for all 138 levels and over 24 h. This way the result is more robust.
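
  The binning procedure can be illustrated with a short sketch. It assumes that the relative error is |interpolated - truth| / |truth|, that the bins cover the range from 0 to 1, and that larger errors are collected in the last bin; these choices, the small `eps` guard and the function name are assumptions made for illustration only.

```python
import numpy as np


def relative_error_histogram(truth, interpolated, bins=20, max_rel_error=1.0, eps=1e-12):
    """Count pixels per relative-error bin.

    Relative error is taken here as |interpolated - truth| / |truth|; the exact
    definition, the bin range and eps are assumptions of this sketch.
    """
    truth = np.asarray(truth, dtype=float)
    interpolated = np.asarray(interpolated, dtype=float)
    rel_error = np.abs(interpolated - truth) / np.maximum(np.abs(truth), eps)
    # Errors above max_rel_error are clipped into the last bin.
    rel_error = np.clip(rel_error, 0.0, max_rel_error)
    counts, edges = np.histogram(rel_error, bins=bins, range=(0.0, max_rel_error))
    return counts, edges


# Synthetic stand-in data with shape (levels, lat, lon); not ERA5 values.
rng = np.random.default_rng(0)
truth = rng.normal(10.0, 3.0, size=(4, 90, 180))
interp = truth + rng.normal(0.0, 0.5, size=truth.shape)
counts, edges = relative_error_histogram(truth, interp)
print(counts.sum() == truth.size)
```

  Stacking all levels and time steps into one array before calling the function pools the counts, which corresponds to accumulating the frequencies over levels and hours.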
  
  Figure 6: see Fig. 3.
  Table 2: see Table 1
  Figure 7: consider to remove this Figure. At this point, quantitative information may be sufficient/more useful than another illustration
  Quantitative information is given in Table 2. However, we consider it important to also show the error structure at a concrete example, and this is shown in Fig. 7. We combined the previous Fig. 3 and 4, and also Fig. 6 and 7.
  
  209: likely -> conceivable, provide reference
  
  Replaced the sentence with: 'However, this does not necessarily mean that trajectories advanced using the neural network interpolated fields are more accurate. Trajectories are not always equally sensitive to wind interpolation errors,...'
  
  215: it would be useful to briefly re-cap how these results are obtained. One case, several cases, specific region? How are trajectory errors distributed on a global map, do they mirror the interpolation errors?
  
  Since the trajectory errors result from interpolation, trajectory errors (for relatively short trajectory duration) are distributed quite similarly to the interpolation errors. For trajectories of longer duration (say, 10 days or longer), errors would be smeared out over larger areas, since initial errors are propagated along the trajectories. We do not think adding a figure showing the trajectory error distribution would provide meaningful additional information.
  
  We added ‘Here we show the results of the horizontal transport deviation (Eq. \eqref{eq:ahtd}) and standard deviations of particles advanced for 48 hours, using FLEXPART, after being initially globally distributed.’
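
  For readers unfamiliar with the metric, a minimal sketch of how such a transport deviation can be evaluated is given here. It assumes the commonly used definition as the mean horizontal separation between reference and test particles at each output time, measured as great-circle distance on a spherical Earth; the exact form of the equation in the manuscript is not reproduced in this reply, and all names are hypothetical.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # assumed spherical Earth


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between points given in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))


def ahtd(ref_lat, ref_lon, test_lat, test_lon):
    """Mean and standard deviation of horizontal separation per output time.

    Inputs have shape (n_times, n_particles) and are given in degrees.
    """
    dist = great_circle_km(ref_lat, ref_lon, test_lat, test_lon)
    return dist.mean(axis=1), dist.std(axis=1)


# Two particles at two output times (synthetic positions, not FLEXPART output).
ref_lat = np.zeros((2, 2))
ref_lon = np.zeros((2, 2))
test_lat = np.array([[0.0, 0.0], [0.0, 1.0]])
test_lon = np.array([[0.0, 0.0], [1.0, 0.0]])
mean_km, std_km = ahtd(ref_lat, ref_lon, test_lat, test_lon)
print(mean_km, std_km)
```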
  
  221: "smaller" - should this be "larger"?
  
  We reformulated the sentence to avoid confusion.
  
  223: "directly corresponds" - is this a result, your interpretation, or an assumption?
  
  Replaced with: '...interpolated ones, is likely a result of the lower frequency...'
  
  228: These paragraphs would better fit into a discussion section, together with other limitations. If possible, it would be useful to give more details, such that other studies can refer to your work.
  
  We feel that there is not enough material to justify a separate discussion section. In a nutshell, all existing interpolation methods, to the best of our knowledge, are not conservative and if conservation on the level of interpolation is important, then different design choices on the level of interpolation are necessary altogether not only for neural network but also for polynomial interpolation.
  
  242: remove "just to name a few"
  
  We removed it.
  
  245: would be useful to connect to weather phenomena here
  
  Indeed, it is helpful to connect to weather phenomena. Therefore, we have added a more detailed discussion on the interpolation errors along the cold front shown in Figure 4.
  
  250: remove "see Fig. 2"
  
  We removed it.
  
  263: this is an important limitation and should be taken up at different locations in the manuscript, including a discussion section. If non-conservation is an issue here, it would be useful to quantify. This would also give some balance to the study, which now mainly focuses on the advantages.
  
  Citation: https://doi.org/10.5194/egusphere-2022-441-AC7
RC2:
'Comment on egusphere-2022-441', Anonymous Referee #2, 14 Nov 2022

General

The manuscript entitled "Improving trajectory calculations by FLEXPART 10.4+ using deep learning inspired single image superresolution" by Brecht and co-workers presents a very interesting and relevant study on using machine learning to interpolate (down-scale) meteorological wind data as taken from three-dimensional meteorological models in order to use them in advection calculations (semi-Lagrangian and Lagrangian). They apply their approach to a global dataset (ERA5) and show that the ML approach is more successful in restoring the original, higher-resolution data than a simple linear interpolation. Furthermore, they indicate similar improvements when ML interpolation is used in trajectory calculations in comparison to linear interpolation. The presented approach is a first step in the development of improved interpolations for Lagrangian models and semi-Lagrangian advection schemes. As the authors state, additional improvements can be expected when interpolation in time and onto particle positions could be incorporated. As such the study is highly relevant and should be published in GMD. The applied methods are sound, the manuscript is well written, the results presented in a concise and clear manner. Hence, I only have one 'major' comment and few minor suggestions.

Major comment

Construction of the degraded data: On line 85 it is described that the lower resolution data was obtained from sub-sampling the original ERA5 data. I am a bit surprised by this approach, since it does not necessarily reflect the representation in a coarser-resolution model, where the state variables in a larger grid cell should still represent the average in this grid cell and not a sub-sample. Could you please comment on the choice of this degradation strategy.

One direct result of the approach could be the large differences across frontal systems as indicated for the linear interpolation of coarse vs reference data. Likely, these differences would be smaller when average would have been used for degrading.

Minor comments

L40: Higher-order interpolation. It would be interesting to see how higher-order interpolation schemes would compete with the ML approach. Did you give this any try?

L65ff: Largely repeating the same points and references as in introduction. Consider removing/shortening it here or in intro.

L83: I would rather call this a 'vertical model layer' than a 'horizontal layer'.

L87: How much does the exclusive treatment of the horizontal wind components impact the flow's mass budget (continuity)? It is mentioned later (conclusions) that all interpolation methods suffer from potentially breaking conservation laws and that physics-based ML could improve things. Maybe it can already be mentioned here. Why was the vertical wind not included in this study? Are there any fundamental differences that make it impossible to directly train the model for vertical wind?

L115: Original levels are counted from the model top in IFS. So 0 to 50 would be the upper part of the atmosphere. What is the rationale for cutting at level 50? What is the approximate pressure at this level? Does this separate into troposphere vs stratosphere?

Related to training two models for two vertical layers. How about training different models for land and ocean, as these give fundamentally different lower boundary conditions? How much does the performance increase in the ML method differ for land and ocean areas? How much for boundary layer (where turbulence is part of FLEXPART's transport description) vs free troposphere?

L131: Are mu_x and mu_y scalars representing the overall image mean? If yes, I don't quite understand the use of the 11 x 11 Gaussian filter. Furthermore, I think it would be good to argue if and why SSIM should be a useful metric for comparing wind components as opposed to images. I suppose wind components will have a very different pdf from that of images (color channels)?

L132f: What is the motivation for K1 and K2? Why not simply mention C1=1E-4 and C2=9E-4?

L177: There is an exception to this observation! For SSIM linear interpolation in u seems to perform slightly better than model4.

L186, Fig. 5: What would the same figure look like for the relative error? Are these large errors associated with large wind speeds?

L191: It is mentioned elsewhere that FLEXPART was not run on the same compute architecture as the ML model. How comparable are the times given here? Consider adding CPU/GPU specs.

Fig 6: Figure caption wrong? I assume these are similar differences as in Fig. 3

Fig 7: Why do we not see the checkerboard pattern (as mentioned in the caption to Figure 4) here?

L234ff: Other downscaling approaches ingest additional high-resolution predictor variables (like topography or land cover) that have a direct impact on near-surface flow and spatial variability. Could such predictors be integrated into the present method as well?

Technical issues

Citation style: Seems to be wrong. Authors are given outside braces most of the time.

Equation 1: Consider using the same x, y notation as in equation 2.

L188: Additional figures in git repository? Shouldn't they rather be made available as part of a supplemental document/dataset? As git repository is not a permanent link/location, I would suggest to put figures elsewhere.

Citation: https://doi.org/10.5194/egusphere-2022-441-RC2
- AC8:
  'Reply on RC2', Rüdiger Brecht, 22 Dec 2022
  We thank the reviewer for providing valuable feedback on the first version of our submitted paper. We have done our best to take into account all remarks raised. In the following we give a detailed list of all the changes made in response to the points raised. Thank you once more for your help in improving our paper.
  
  Major comment
  Construction of the degraded data: On line 85 it is described that the lower resolution data was obtained from sub-sampling the original ERA5 data. I am a bit surprised by this approach, since it does not necessarily reflect the representation in a coarser-resolution model, where the state variables in a larger grid cell should still represent the average in this grid cell and not a sub-sample. Could you please comment on the choice of this degradation strategy.
  One direct result of the approach could be the large differences across frontal systems as indicated for the linear interpolation of coarse vs reference data. Likely, these differences would be smaller when average would have been used for degrading.
  The motivation for our choice of degrading data by leaving certain grid points entirely unchanged is twofold:
  
  This approach has been used often in the past when studying wind interpolation errors for trajectory models (e.g., Kuo et al., 1985; Stohl et al., 1995). It is appropriate to be consistent with such past approaches.
  
  We agree that it would also be interesting to see how higher-resolution data could be reconstructed from lower-resolution data. This would then be more equivalent to downscaling approaches in weather prediction. However, for comparing the skill of different interpolation methods, this is not ideal. At the points where data are available at both high and low resolution, these data would be different in each case. Reconstruction of points in between would then not only reflect differences in the skill of interpolation but also the data differences at the points from where the interpolation is done. This would thus not allow a "clean" evaluation of different interpolation methods, mixing the effects of interpolation and grid-cell averaging for the coarse-resolution data points.
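
  The difference between the two degradation strategies can be made concrete with a small sketch on synthetic data. It is an illustration under stated assumptions (a random 8 x 8 field, a factor of two, hypothetical function names) and does not use ERA5 data.

```python
import numpy as np


def degrade_by_subsampling(field, factor=2):
    """Keep every factor-th grid point; retained values are unchanged."""
    return field[::factor, ::factor]


def degrade_by_averaging(field, factor=2):
    """Average over factor x factor blocks (grid-cell mean).

    Assumes both dimensions are divisible by factor.
    """
    ny, nx = field.shape
    return field.reshape(ny // factor, factor, nx // factor, factor).mean(axis=(1, 3))


rng = np.random.default_rng(0)
fine = rng.normal(size=(8, 8))  # synthetic stand-in for a high-resolution wind field

sub = degrade_by_subsampling(fine)
avg = degrade_by_averaging(fine)

# Compare both coarse fields with the fine field at the retained grid points.
shared = fine[::2, ::2]
print("max |sub - fine| at shared points:", np.abs(sub - shared).max())
print("max |avg - fine| at shared points:", np.abs(avg - shared).max())
```

  With sub-sampling, the coarse and fine data agree at the retained points, so any reconstruction error stems from the interpolation alone. With block averaging, the coarse values already differ from the fine data at those points, which is the mixing of effects described above.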
  
  Minor comments
  L40: Higher-order interpolation. It would be interesting to see how higher-order interpolation schemes would compete with the ML approach. Did you give this any try?
  At this stage it is premature to compare the efficiency. Our current implementation of the method is also not the most efficient one. For a true comparison, we would need an optimized implementation within FLEXPART and then compare the execution times.
  
  L65ff: Largely repeating the same points and references as in the introduction. Consider removing/shortening it here or in the intro.
  We moved the section “Related work” to the introduction.
  
  L83: I would rather call this a 'vertical model layer' than a 'horizontal layer'.
  The input of the neural network is a horizontal u or v velocity component. Here, “vertical model layer” refers to the horizontal u or v velocity.
  
  L87: How much does the exclusive treatment of the horizontal wind components impact the flow's mass budget (continuity)? It is mentioned later (conclusions) that all interpolation methods suffer from potentially breaking conservation laws and that physics-based ML could improve things. Maybe it can already be mentioned here. Why was the vertical wind not included in this study? Are there any fundamental differences that make it impossible to directly train the model for vertical wind?
  The vertical velocity is fundamentally different from the horizontal velocity, as it varies on much smaller scales. In practice, the vertical velocity will require training a more complicated neural network, as neural networks have a tendency to learn large-scale features first, which is referred to as spectral bias. For this study we did not have the computational resources to experiment with the vertical velocity.
  
  L115: Original levels are counted from the model top in IFS. So 0 to 50 would be the upper part of the atmosphere. What is the rationale for cutting at level 50? What is the approximate pressure at this level? Does this separate into troposphere vs stratosphere?
  We count the levels from bottom to top. Cutting at index 50 (ca. 8000 m) separates the troposphere from the stratosphere and the layers above.
  
  Related to training two models for two vertical layers. How about training different models for land and ocean, as these give fundamentally different lower boundary conditions? How much does the performance increase in the ML method differ for land and ocean areas? How much for boundary layer (where turbulence is part of FLEXPART's transport description) vs free troposphere?
  Indeed, distinguishing land and ocean in the training would be an alternative to our rather simple differentiation by height levels. However, there are also many other potential alternatives, such as developing different training data sets for climatically different regions (e.g., tropics, subtropics, midlatitudes), within or above the boundary layer, or for different meteorological situations. For developing an optimal method, it will be important to explore several of these options but it is beyond the scope of the current exploratory paper. With respect to the boundary layer, it is important to note that turbulence parameterizations have been switched off in FLEXPART for the current paper, as we wanted to study interpolation errors in isolation.
  
  L131: Are mu_x and mu_y scalars representing the overall image mean? If yes, I don't quite understand the use of the 11 x 11 Gaussian filter. Furthermore, I think it would be good to argue if and why SSIM should be a useful metric for comparing wind components as opposed to images. I suppose wind components will have a very different pdf from that of images (color channels)?
  Here, we stated the definition of the SSIM as used in practice, which uses the 11x11 Gaussian filter. We agree that the SSIM is not a traditional error measure. However, since our model is adapted from an image processing task, we felt it was reasonable to also present the SSIM, which is useful for the machine learning community.
  
  L132f: What is the motivation for K1 and K2? Why not simply mention C1=1E-4 and C2=9E-4?
  For the sake of completeness we stated the definition of K1 and K2 as well.
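
For reference, the relation the referee asks about can be sketched in a few lines. In the standard SSIM definition the stabilising constants are C1 = (K1 L)^2 and C2 = (K2 L)^2, where L is the dynamic range of the data; with the customary defaults K1 = 0.01 and K2 = 0.03 and L = 1 this gives the C1 = 1E-4 and C2 = 9E-4 quoted above. The sketch below is illustrative only: it assumes fields scaled to a unit data range and a Gaussian window with standard deviation 1.5, neither of which is confirmed by the manuscript, and all function names are hypothetical. It also shows that the means are local, window-weighted fields rather than scalars for the whole image.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def ssim_constants(k1=0.01, k2=0.03, data_range=1.0):
    """Return (C1, C2) = ((K1*L)^2, (K2*L)^2); defaults are assumptions."""
    return (k1 * data_range) ** 2, (k2 * data_range) ** 2


def gaussian_window(size=11, sigma=1.5):
    """Normalised 2-D Gaussian weights of shape (size, size)."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-0.5 * (ax / sigma) ** 2)
    w = np.outer(g, g)
    return w / w.sum()


def local_mean(field, window):
    """Window-weighted mean at every position fully covered by the window."""
    patches = sliding_window_view(field, window.shape)
    return np.einsum("ijkl,kl->ij", patches, window)


def ssim(x, y, data_range=1.0, size=11, sigma=1.5):
    """Mean SSIM of two 2-D fields using local Gaussian statistics."""
    c1, c2 = ssim_constants(data_range=data_range)
    w = gaussian_window(size, sigma)
    mu_x, mu_y = local_mean(x, w), local_mean(y, w)
    var_x = local_mean(x * x, w) - mu_x ** 2
    var_y = local_mean(y * y, w) - mu_y ** 2
    cov_xy = local_mean(x * y, w) - mu_x * mu_y
    ssim_map = ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    )
    return ssim_map.mean()
```

Comparing a field with itself returns 1 up to rounding, and any difference between the two fields lowers the value.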
  
  L177: There is an exception to this observation! For SSIM linear interpolation in u seems to perform slightly better than model4.
  We changed it to "almost always has better metrics".
  
  L186, Fig.5: What would the same figure look like for the relative error? Are these large errors associated with large wind speeds?
  The largest errors generally occur where wind shears are largest, and this is usually associated with fronts and, generally, higher than average wind speeds. We discuss this now in more detail in the discussion of Fig. 4, which presents a clear example of this.
  
  We updated Fig. 5 to present the relative error.
  
  L191: It is mentioned elsewhere that FLEXPART was not run on the same compute architecture as the ML model. How comparable are the times given here? Consider adding CPU/GPU specs.
  The training of the neural network is done on the mentioned GPU device. The FLEXPART simulations are run on a CPU. Since we updated the interpolated fields before the simulation, all simulation times are the same.
  
  Fig 6: Figure caption wrong? I assume these are similar differences as in Fig. 3
  After reading the caption of Fig. 6 again we could not find an error.
  
  Fig 7: Why do we not see the checkerboard pattern (as mentioned in the caption to Figure 4) here?
  We do see the checkerboard pattern. However, only every fourth pixel stays the same, which makes the pattern less visible.
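
The effect of the degradation factor on the visibility of this pattern can be illustrated with a toy mask. The sketch assumes that the degradation keeps every `factor`-th grid point in both directions and that the interpolated field reproduces the retained values exactly at those points; `retained_mask` is a hypothetical helper, not part of the published code, and the sketch does not assert which factor applies to Fig. 7.

```python
import numpy as np


def retained_mask(shape, factor):
    """True where a high-resolution grid point survives sub-sampling by `factor`."""
    mask = np.zeros(shape, dtype=bool)
    mask[::factor, ::factor] = True
    return mask


for factor in (2, 4):
    mask = retained_mask((8, 8), factor)
    # Fraction of grid points with zero interpolation error by construction.
    print(f"factor {factor}: {mask.mean():.4f} of grid points unchanged")
```

The larger the factor, the sparser the zero-error points become, so the regular pattern stands out less against the surrounding error field.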
  
  L234ff: Other downscaling approaches ingest additional high-resolution predictor variables (like topography or land cover) that have a direct impact on near-surface flow and spatial variability. Could such predictors be integrated into the present method as well?
  This is a very interesting idea, and such predictors could be integrated into the current method. However, this is beyond the scope of the present work and is left for future research.
  
  Citation: https://doi.org/10.5194/egusphere-2022-441-AC8

Interactive discussion

Status: closed

CEC1: 'Comment on egusphere-2022-441', Juan Antonio Añel, 23 Aug 2022

Dear authors,
Unfortunately, after checking your manuscript, it has come to our attention that it does not comply with our "Code and Data Policy".

https://www.geoscientific-model-development.net/policies/code_and_data_policy.html

You have not published the code, input, and output data used or obtained to perform your study. Unless you fix this, we will have to reject your manuscript for publication in Geosci. Model Dev. Admittedly this has been an oversight from our side, and your manuscript should never have been published in Discussions, given this problem. However, we are now offering you the possibility to fix it.
First, note that the only thing you have published is a Python notebook on GitHub to reproduce your plots. However, GitHub is not a suitable repository. GitHub itself instructs authors to use other long-term archival and publishing alternatives, such as Zenodo. Therefore, please publish your code in one of the appropriate repositories, and reply to this comment with the relevant information (link and DOI) as soon as possible, as it should be available for the Discussions stage. Also, please include the primary input and output data as mentioned above. Accordingly, you must include the modified 'Code and Data Availability' section in a potential reviewed version of your manuscript, with the DOI of the code (and another DOI for the dataset if necessary).
Please, when uploading your code, include a license for it. If you do not include it, the code continues to be your property and cannot be used by others, despite any statement on being free to use. You may want to choose a free software/open-source (FLOSS) license. We recommend the GPLv3. You only need to include the file 'https://www.gnu.org/licenses/gpl-3.0.txt' as LICENSE.txt with your code. Also, you can choose other options that Zenodo provides: GPLv2, Apache License, MIT License, etc.
Regards,
Juan A. Añel
Geosci. Model Dev. Executive Editor

Citation: https://doi.org/10.5194/egusphere-2022-441-CEC1
- AC1: 'Reply on CEC1', Rüdiger Brecht, 01 Sep 2022
  
  Dear Dr. Añel
  
  Geosci. Model Dev. Executive Editor,
  We are working on uploading the code and data and on modifying the 'Code and Data Availability' section. We hope to finish this within the next week.
  
  Regards,
  
  The authors
  
  Citation: https://doi.org/10.5194/egusphere-2022-441-AC1
- AC2: 'Reply on CEC1', Rüdiger Brecht, 13 Sep 2022
  
  Dear Dr. Añel
  
  Geosci. Model Dev. Executive Editor,
  
  Thank you for letting us publish the code and continue with the publishing process. We made the code available to pre-process the data, train the neural network and then post-process the data. The evaluated data is published on Zenodo (see the documentation on GitHub: https://github.com/RudigerBrecht/Improving-trajectory-calculations-using-SISR). Moreover, the code is also available on Zenodo (https://doi.org/10.5281/zenodo.7065139). Furthermore, we updated the Code and Data Availability statement, because the input data is not freely available.
  Please let us know if there is anything else that we can improve.
  
  Regards,
  The authors
  
  Citation: https://doi.org/10.5194/egusphere-2022-441-AC2
  - CEC2: 'Reply on AC2', Juan Antonio Añel, 13 Sep 2022
    
    Dear authors,
    
    Unfortunately, you continue to make the same mistake. The data are made available through GitHub. However, again, we cannot accept GitHub to host information related to your manuscript, regardless of whether it is documentation, data or code. Therefore, please move the contents on GitHub to a suitable repository, and use it in a potential reviewed version of your work.
    Also, you fail to justify not releasing the input data, which I understand is public, as it is based on ERA5, and nothing prevents you from sharing it. Therefore, we cannot accept your statement, and we will have to reject your paper if you do not provide the input data, as the replicability of the work would be compromised. Another option is that you give a justification for not doing so (usually a legal regulation that forbids it). If this is the case, please provide us with the necessary evidence (law, regulation, etc.), so that we can assess it, and store the input data in a trustworthy repository with a DOI, with the difference that it would not be public.
    Regards,
    Juan A. Añel
    Geosci. Model Dev. Executive Editor
    
    Citation: https://doi.org/10.5194/egusphere-2022-441-CEC2
    
    AC3: 'Reply on CEC2', Rüdiger Brecht, 11 Oct 2022
    
    Dear Dr. Añel
    
    Geosci. Model Dev. Executive Editor,
    
    Thank you for your quick response. Indeed, the ERA5 data set is publicly available, thus we added the scripts to download the data. Here is the link with the updated code: https://zenodo.org/record/7181840. Please let us know if there is anything else that we can improve.
    
    Regards,
    The authors
    
    Citation: https://doi.org/10.5194/egusphere-2022-441-AC3
    
    CEC3: 'Reply on AC3', Juan Antonio Añel, 11 Oct 2022
    
    Dear authors,
    First, thanks for sharing the scripts in Zenodo. However, the main problem persists. I think that you do not quite understand the nature of it. To reproduce your manuscript, currently, readers need to have access to the ERA5 data, and this does not even produce the input files for FLEXPART. The ECMWF servers (and their internet addresses) are not permanent, and they are not long-term repositories for the archival of scientific data. They are a temporary service that, at best, works for a few years; in this case, for example, until the current ERA5 reanalysis is superseded. Also, the reanalysis files could be modified if, for example, an error is found. Therefore, readers should not need to access the ERA5 data to reproduce your manuscript, all the more so because these are not the real input data. You have already produced the interpolated input files, and these are the files that you have to share in the new repository. So, unless they are too big for it (e.g. hundreds of GB), please share your interpolated input files.
    Best regards,
    Juan A. Añel
    Geosci. Model Dev. Executive Editor
    
    Citation: https://doi.org/10.5194/egusphere-2022-441-CEC3
    
    CC1: 'Reply on CEC3', Andreas Stohl, 11 Oct 2022
    
    Dear editor,
    Unfortunately, there seems to be a misunderstanding.
    1) ECMWF re-analysis data are the best archived data that you can find. You can still get old re-analyses from decades ago, and they are versioned such that if there are multiple data streams, these are all preserved. So ERA5 may be superseded in the future but it will still be available.
    2) The data volume is ~400 gigabytes even for the short time periods we use.
    I have always been a strong advocate for open source, open data, etc., and we would openly share any data where it makes sense. In this case, however, I don't see how to do this, given the large data volumes!
    I hope for your understanding!
    Regards,
    Andreas
    
    Citation: https://doi.org/10.5194/egusphere-2022-441-CC1
    
    CEC4: 'Reply on CC1', Juan Antonio Añel, 25 Oct 2022
    
    Dear Andreas, dear authors,
    First, apologies for taking so long to reply. I admit that it is unlikely that a change in a reanalysis would go unrecorded, and that ECMWF does well on this. Yet, it is true that in the past, there have been problems with others. Beyond that, the reproducibility and replicability of any learning technique are compromised without the exact input and output data; therefore, I am not making a point about open access here but about complying with the scientific method, and what makes something "science".
    Regarding the size, from the editorial point of view, it is hard to assess what is reasonable to ask (this happens all the time), as this information is usually omitted from the text. On the other hand, I would like to point out that nowadays, 400 GB is not such a big problem; it is a reasonable amount of data to share. For example, Zenodo has a limit of 50 GB per archive, so eight to ten Zenodo repositories could do the trick. We have published many papers that have solved the issue in this way.
    Therefore, I would ask you to be more open about the total size of the files that you would need to store to assure the replicability of the work and to assess the possibility of publishing at least a minimum sample. Otherwise, if you have stored the data internally and it is too big a dataset to upload and share, I would ask you to issue a DOI for it, to provide information about where you are keeping it, and to add this information to the manuscript.
    I know that this represents some work, but we would like to comply with the highest standards in open science and scientific reproducibility.
    Best regards,
    Juan A. Añel
    Geosci. Model Dev. Executive Editor
    
    Citation: https://doi.org/10.5194/egusphere-2022-441-CEC4
    
    AC4: 'Reply on CEC4', Rüdiger Brecht, 24 Nov 2022
    
    Dear editor,
    
    As discussed, we uploaded the output data of the neural network which is then used for the trajectory simulation. To minimize the size of the data we compressed the files.
    The repositories are: https://zenodo.org/record/7277854 and https://zenodo.org/record/7318809.
    The repository https://zenodo.org/record/7350568 now contains the code, including FLEXPART in the version used for our simulations. We also removed any occurrence of git/github in a reviewed version of the manuscript.
    Since we can trust ECMWF regarding the storage of ERA5, one can download the data using the download scripts. We indicate the exact information about the fields in a reviewed version of our manuscript (we use the horizontal wind velocity fields u and v at 138 levels).
    
    Regards,
    
    The authors
    
    Citation: https://doi.org/10.5194/egusphere-2022-441-AC4
    
RC1: 'Comment on egusphere-2022-441', Anonymous Referee #1, 12 Sep 2022

Review of "Improving trajectory calculations by FLEXPART 10.4+ using deep learning inspired single image superresolution" by Brecht et al., submitted to EGUsphere.

The study by Brecht et al. presents an application of a neural network to provide higher-resolution wind information from a coarser resolution. Specifically, the authors analyse the impact of their approach on the accuracy of trajectory calculations, which rely heavily on interpolation operations. The paper describes interesting and relevant new approaches to interpolation in an atmospheric framework. However, there are a number of issues with the methodology, the presentation of the results, the discussion of limitations, and the overall writing style that need to be addressed before the manuscript can be considered publishable.

Major items

1. Currently, the authors sub-sample a higher resolution model field to obtain the coarse resolution wind field. This approach is in my impression inconsistent with what a coarser-resolution model would provide. A coarser-resolution model would provide an average of what is represented by several grid points of a finer-scale model. I think the sub-sampling makes it harder for the linear interpolation to provide good results. That sub-sampling approach is also the origin of the checkerboard patterns apparent in e.g. Fig. 4. I recommend to redo the analysis by averaging the model fields rather than sub-sampling.

2. Is it correct that the neural network works on a 3x3 grid point stencil on a lat-lon grid? It seems that the footprint of the interpolation operator would thus see very different area sizes near the pole than near the equator. How is this affecting the training and application of the neural network?

3. Some sections are poorly written and therefore hardly comprehensible (for example section 3.2). Since the authors introduce new methods into the field of atmospheric science, I recommend to make an extra effort to clearly define all terminology (e.g., "channel attention" is never defined). However, I do not think this is mainly about the content material, but rather about how the sentences and paragraphs are compiled. For example, section 3.2 could start with a short paragraph, providing a break-down of the steps involved in the architecture, before describing each part in following paragraphs. I recommend the authors to take a look at Gopen and Swan (1990) regarding how to write more clearly and effectively.

4. The issue of non-conservation of interpolation algorithms is a major concern in a physical application as presented here. The basic model equations are derived from principles of mass and energy conservation. Therefore, if there are conservation violations induced by this method, this aspect needs to be clearly brought forward throughout the manuscript, to make sure it is not overlooked by readers. This aspect can be mentioned in Sec. 3.4, brought up as part of the results, for example by comparing the kinetic energy and the velocity spectra from the fine-resolution and interpolated fields. The short discussion in L. 234 onwards might be more suitable in a discussion paragraph.

5. The results need to be structured more clearly. Right now, the sequence of results and examples in Sec. 4.1 appears somewhat arbitrary. While there certainly is some reasoning behind the structure and examples, it is not spelled out clearly, and thus the reader is left guessing about how to "connect the dots". Coming from an atmospheric science perspective, I suggest a structure that starts with one specific case as an example, such as one of the frontal bands shown in Fig. 3, where the linear interpolation has clear deficiencies. Thereby, it would be helpful to also show additional atmospheric variables to illustrate the case (for example surface pressure or air temperature). A tropical cyclone or a Rossby wave breaking could be other interesting situations to present. After stepping through the example case, more statistically robust information could be provided, from considering a larger number of days or cases. Finally, you proceed to the application with the trajectory calculations, before considering energy conservation.

6. On many occasions, the results are presented in a qualitative way (closer/larger/etc.). In order to connect the results to the figures, and to make it possible to follow the interpretation and evaluation of the authors, it would be very useful to include concrete numbers alongside the qualitative interpretation, while referring to the respective figure panels. Examples are L. 182 and onward, L. 198 and onward.

7. The writing style, in particular of the results section, should be more distanced or objective. Now, the authors frequently use expressions like "we demonstrate", "we show" at the start of a paragraph, i.e. before actually having presented the evidence. As a critical reader, one might get the impression you are overselling the results. I strongly recommend changing this unnecessarily forceful writing style to a more distanced, objective style. Let the reader see the evidence for themselves, while guiding them through the material, before drawing conclusions. Many paragraphs in the results are currently "upside down" in that way.

8. As another, related aspect, the figures are not properly described. At present, the length of the text describing the results is very much out of balance with the number of figures. For example in Sec. 4.1.1, L. 183, three figures are referred to within just three sentences, but none of the sentences describes what is actually seen in the figures. Rather than leaving it up to the reader to interpret the figures, use some sentences for each figure to describe what is displayed, and highlight what is important to take away. This applies to all figures in the manuscript.

9. On several occasions (including Fig. 2, 4, 5), the figure captions contain information about the method or results that are not mentioned in the text. Such information must be placed in the main text.

10. What are the limitations of the method in terms of computational effort? In L. 192, it is briefly mentioned that the computation time is a factor 10 larger than linear interpolation. Is there still an advantage of neural network approach compared to for example quadratic interpolation? This could be worth a short section in the discussion. The improved conservation of other properties is also interesting, but unfortunately not shown in more detail.

Detailed comments:

Title: "deep learning inspired": unclear what this expression means, consider to remove/replace. State what aspect of trajectory calculations is improved (accuracy).

L. 20: Can you back up this statement by a reference/example?

L. 21: "where a dense network" rephrase. If the point is that the numerical weather prediction process produces large amounts of gridded data, then it would be sufficient to state just that, without mentioning observations (which are not at all part of this manuscript). Remove "reanalysis model", a reanalysis is generated from regular NWP models.

L. 25 onward: check citation of references, missing brackets.

L. 27: remove "just to name a few"

L. 30: logical gap, what is the connection to the previous paragraph?

L. 34: remove "surprisingly", this entirely depends on the perspective of the writer.

L. 34: briefly define "convolutional neural network".

L. 44: what do you mean by "variable-scale"?

L. 44: what do you mean by "deep" - how deep?

L. 45: Rephrase: "showcase" sounds like snapshots or illustrations, but as a reader I look for reliable evidence.

Section 2: "Related work"

This section currently does not serve a clear purpose and partly duplicates the introduction. I recommend deleting this section here, and incorporating parts of it into the introduction and parts into a clearer method description.

Section 3: "Methods"

This section would benefit from a first paragraph that explains your overall approach, followed by a section that discusses the choice of the neural network, based on the range of choices that exist, in an accessible writing style.

Section 3.1: "Training data"

The training data would be more natural to place after sections that describe the actual neural network and approach.

L. 82: Why could this seem little data? How much training is commonly needed?

L. 107: rephrase using more distanced and objective terms. It could provide depth to the study to present a less well-performing approach in an appendix.

L. 114: remove "for testing purposes"

L. 117: 50 or 88 -> 50 and 88

Figure 1: several abbreviations and terms of the operations in the figure are not defined, include in caption or describing text. What do the bracketing lines indicate? The hierarchy between (a), (b) and (c) and between (a) and (d) could be made clearer in the figure, e.g. by lines that indicate "zooming in".

L. 123: Add a statement about the purpose of the error metric, i.e. what is to be assessed.

L. 127: here and elsewhere: ground truth -> truth. (ground truth would only make sense in a remote sensing context)

The notations for RMSE and SSIM could be simplified and clarified, for example using \hat{z} for the interpolated quantity, and using a,b instead of x,y (which is commonly used for spatial coordinates) for the two figures in the SSIM metric. How important is the "perceived similarity of two images" for the given application? This would be a suitable place to mention conservation issues due to interpolation.

L. 142, 144: unclear what "this" refers to.

L. 145-149: unclear, please rephrase

L. 152 onward: the emphasised names do not appear to be re-used in the remainder of the manuscript. Maybe rather introduce 3-letter abbreviations, such as REF, LIN, NNI that then can re-appear in the results and figures.

L. 160: place references at the end of sentence

L. 163: clarify whether Xn, Yn are vectors with m elements, or for a specific time along the trajectory

L. 172: This section seems to describe your approach, and would be better placed in the methods.

Figure 2: the lines for lin u and lin v are exactly equal, is this coincidence? x-axis is lacking a unit. RMSE is defined with an index, but given without index here. Please explain in the result text how to interpret this figure.

Table 1: What do the arrows indicate? The caption contains a key result, that should be moved to the main text.

L. 183 to 188: Need to guide the reader through the results. The comparison needs more structure, and quantitative examples from the figure where available to support the qualitative conclusions.

L. 191: Maybe express in relative terms, hardware-dependent?

L. 195: distinguish "evaluate" and "apply" - an objective way to present the results would be to apply the method, display the results, and thereafter evaluate based on the error metrics.

Figure 3: Lacking panel labels. The top row does not seem to give additional information to the bottom row. I recommend using a continuous color scale; the two-color scale gives unjustified importance to errors larger than 5 m/s. Maps are missing coordinates. The RMSE in the title should be part of the text rather than a caption title. It would be useful to present a specific situation with meteorological fields for context.

L. 198: "before": rephrase

L. 199: "relative error reduction": where shown?

L. 200: "This holds...": can this information be presented as part of a more aggregated and thus robust result?

L. 202: unclear, rephrase

Figure 4: See comments about Fig. 3, the color scale gives unjustified emphasis to wind errors above 2.5 m/s. Indicate in Fig. 3 where this zoom is taken. Arrows are difficult to see, take to separate panels, and use meteorological fields (e.g. sea level pressure or potential temperature) as reference in both sets of panels.

Figure 5: This figure needs more explanation. What bins have been used? With only 10 bins, it may be more appropriate to show the lines as step function. What error metric has been used? Can this figure be constructed on more than just one day to make it more robust?

Figure 6: see Fig. 3.

Table 2: see Table 1

Figure 7: consider removing this figure. At this point, quantitative information may be sufficient/more useful than another illustration.

L. 209: likely -> conceivable, provide reference

L. 215: it would be useful to briefly re-cap how these results are obtained. One case, several cases, specific region? How are trajectory errors distributed on a global map, do they mirror the interpolation errors?

L. 221: "smaller" - should this be "larger"?

L. 223: "directly corresponds" - is this a result, your interpretation, or an assumption?

L. 228: These paragraphs would better fit into a discussion section, together with other limitations. If possible, it would be useful to give more details, such that other studies can refer to your work.

L. 242: remove "just to name a few"

L. 245: would be useful to connect to weather phenomena here

L. 250: remove "see Fig. 2"

L. 263: this is an important limitation and should be taken up at different locations in the manuscript, including a discussion section. If non-conservation is an issue here, it would be useful to quantify. This would also give some balance to the study, which now mainly focuses on the advantages.
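
As an aside on the comment about L. 163: the absolute horizontal transport deviation (AHTD) is commonly computed as the mean great-circle distance between corresponding reference and test trajectory positions at a given travel time, so the positions are indexed by trajectory n and evaluated separately for each time. The following sketch assumes this common convention, which may differ in detail from the manuscript's notation, a spherical Earth with radius 6371 km, and position arrays of shape (number of trajectories, number of output times); all names are hypothetical.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # spherical-Earth assumption


def great_circle_km(lon1, lat1, lon2, lat2):
    """Haversine distance in km between points given in degrees."""
    lon1, lat1, lon2, lat2 = map(np.deg2rad, (lon1, lat1, lon2, lat2))
    a = (
        np.sin((lat2 - lat1) / 2) ** 2
        + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))


def ahtd(lon_ref, lat_ref, lon_test, lat_test):
    """Mean horizontal separation over trajectories, one value per output time.

    All inputs have shape (n_trajectories, n_times).
    """
    return great_circle_km(lon_ref, lat_ref, lon_test, lat_test).mean(axis=0)
```

With this layout the result is a time series of length `n_times`, which makes explicit that the average runs over trajectories rather than over time.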

Citation: https://doi.org/10.5194/egusphere-2022-441-RC1
- AC7: 'Reply on RC1', Rüdiger Brecht, 22 Dec 2022
  We thank the reviewer for providing valuable feedback on the first version of our submitted paper. We have done our best to take into account all remarks raised. In the following we give a detailed list of all the changes made in response to the points raised. Thank you once more for your help in improving our paper.
  Major items
  Currently, the authors sub-sample a higher resolution model field to obtain the coarse resolution wind field. This approach is in my impression inconsistent with what a coarser-resolution model would provide. A coarser-resolution model would provide an average of what is represented by several grid points of a finer-scale model. I think the sub-sampling makes it harder for the linear interpolation to provide good results. That sub-sampling approach is also the origin of the checkerboard patterns apparent in e.g. Fig. 4. I recommend to redo the analysis by averaging the model fields rather than sub-sampling.
  
  The motivation for our choice of degrading the data by leaving certain grid points entirely unchanged is twofold:
  
  This approach has been used often in the past when studying wind interpolation errors for trajectory models (e.g., Kuo et al., 1985; Stohl et al., 1995). It is appropriate to be consistent with such past approaches.
  
  We agree that it would also be interesting to see how higher-resolution data could be reconstructed from lower-resolution data. This would then be more equivalent to downscaling approaches in weather prediction. However, for comparing the skill of different interpolation methods, this is not ideal. At the points where data are available at both high and low resolution, these data would be different in each case. Reconstruction of points in between would then not only reflect differences in the skill of interpolation but also the data differences at the points from where the interpolation is done. This would thus not allow a "clean" evaluation of different interpolation methods, mixing the effects of interpolation and grid-cell averaging for the coarse-resolution data points.
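
The difference between the two degradation strategies discussed here can be made concrete with a small synthetic example. The sketch uses random numbers as a stand-in for a wind component; `subsample` and `block_average` are hypothetical helpers and do not reproduce the authors' pre-processing.

```python
import numpy as np


def subsample(field, factor):
    """Keep every `factor`-th grid point; retained values are unchanged."""
    return field[::factor, ::factor]


def block_average(field, factor):
    """Average non-overlapping factor x factor blocks (shape must divide evenly)."""
    ny, nx = field.shape
    blocks = field.reshape(ny // factor, factor, nx // factor, factor)
    return blocks.mean(axis=(1, 3))


rng = np.random.default_rng(0)
truth = rng.normal(size=(8, 8))  # synthetic stand-in for a wind component
coarse_sub = subsample(truth, 2)
coarse_avg = block_average(truth, 2)

# Sub-sampled values coincide with the truth at the shared grid points,
# so any error at the remaining points is due to interpolation alone.
print(np.array_equal(coarse_sub, truth[::2, ::2]))

# Block averages already differ from the truth at those points, so the two
# effects (averaging and interpolation) would be mixed in an evaluation.
print(np.abs(coarse_avg - truth[::2, ::2]).max())
```

Both coarse fields have the same shape, which is why the choice between them only shows up in the values at the shared grid points.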
  
  Is it correct that the neural network works on a 3x3 grid point stencil on a lat-lon grid? It seems that the footprint of the interpolation operator would thus see very different area sizes near the pole than near the equator. How is this affecting the training and application of the neural network?
  
  It is correct that some parts of the neural network work on 3x3 grid point stencils. However, the network architecture is more complicated than a single 3x3 stencil applied to the lat-lon grid. Ideally, the nonlinear neural network learns how to cope with the different area sizes. It would be very interesting to analyze the effect of the area, but this is beyond the scope of the present work.
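
To give a sense of the magnitude of the effect raised by the referee: on a regular latitude-longitude grid the east-west grid spacing shrinks with the cosine of latitude, whereas the north-south spacing stays constant. The sketch below assumes a spherical Earth of radius 6371 km and a 0.5° grid (the resolution of the undegraded winds); the function name and defaults are illustrative.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # spherical-Earth assumption


def stencil_span_km(lat_deg, dlat_deg=0.5, dlon_deg=0.5, n=3):
    """Distance between the outermost points of an n x n stencil centred at lat_deg.

    Returns (north-south span, east-west span) in km.
    """
    dy = (n - 1) * np.deg2rad(dlat_deg) * EARTH_RADIUS_KM
    dx = (n - 1) * np.deg2rad(dlon_deg) * EARTH_RADIUS_KM * np.cos(np.deg2rad(lat_deg))
    return dy, dx


for lat in (0.0, 60.0, 85.0):
    dy, dx = stencil_span_km(lat)
    print(f"lat {lat:4.1f}: north-south {dy:6.1f} km, east-west {dx:6.1f} km")
```

At 60° latitude the east-west span is half of its equatorial value, and it continues to shrink towards the poles while the north-south span is unchanged.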
  
  Some sections are poorly written and therefore hardly comprehensible (for example section 3.2). Since the authors introduce new methods into the field of atmospheric science, I recommend to make an extra effort to clearly define all terminology (e.g., "channel attention" is never defined). However, I do not think this is mainly about the content material, but rather about how the sentences and paragraphs are compiled. For example, section 3.2 could start with a short paragraph, providing a break-down of the steps involved in the architecture, before describing each part in following paragraphs. I recommend the authors to take a look at Gopen and Swan (1990) regarding how to write more clearly and effectively.
  
  To make the neural network architecture comprehensible, we added a less technical description of the overall architecture. We also went through the entire manuscript again to clarify various items.
  
  The issue of non-conservation of interpolation algorithms is a major concern in a physical application as presented here. The basic model equations are derived from principles of mass and energy conservation. Therefore, if there are conservation violations induced by this method, this aspect needs to be clearly brought forward throughout the manuscript, to make sure it is not overlooked by readers. This aspect can be mentioned in Sec. 3.4, brought up as part of the results, for example by comparing the kinetic energy and the velocity spectra from the fine-resolution and interpolated fields. The short discussion in L. 234 onwards might be more suitable in a discussion paragraph.
  
  We admit that mass conservation is a possible issue in interpolation. However, most available interpolation methods do not conserve mass and are not designed for that. The goal of this study was to compare NN against such interpolation methods. We therefore think that this topic is somewhat beyond the scope of the current paper. However, we admit that it would be of great value to develop a dynamically constrained NN interpolation method that also ensures mass conservation. This would be a topic of future research.
  
  The results need to be structured more clearly. Right now, the sequence of results and examples in Sec. 4.1 appears somewhat arbitrary. While there certainly is some reasoning behind the structure and examples, it is not spelled out clearly, and thus the reader is left guessing about how to "connect the dots". Coming from an atmospheric science perspective, I suggest a structure that starts with one specific case as an example, such as one of the frontal bands shown in Fig. 3, where the linear interpolation has clear deficiencies. Thereby, it would be helpful to also show additional atmospheric variables to illustrate the case (for example surface pressure or air temperature). A tropical cyclone or a Rossby wave breaking could be other interesting situations to present. After stepping through the example case, more statistically robust information could be provided, from considering a larger number of days or cases. Finally, you proceed to the application with the trajectory calculations, before considering energy conservation.
  
  Thanks a lot for this remark. We went through the section one more time to better explain our results. In summary, our structure of the results is to show first the interpolation method and then use it in a simulation. In the present work we only consider the horizontal velocity fields. Thus, first we show that the new interpolation method using the neural network outperforms linear interpolation and then we use these interpolated fields to demonstrate that also the trajectory simulation using these interpolated fields is better. The frontal bands shown in Fig. 3 are a coincidence, since the plot shows the error, which is highest at these bands.
  
  On many occasions, the results are presented in a qualitative way (closer/larger/etc.). In order to connect the results to the figures, and to make it possible to follow the interpretation and evaluation of the authors, it would be very useful to include concrete numbers alongside the qualitative interpretation, while referring to the respective figure panels. Examples are L. 182 and onward, L. 198 and onward.
  
  We follow the methodology in presenting our results in tables and figures in a quantitative manner with associated qualitative descriptions being provided in the text.
  
  The writing style, in particular of the results section, should be more distanced or objective. Now, the authors frequently use expressions like "we demonstrate", "we show" at the start of a paragraph, i.e. before actually having presented the evidence. As a critical reader, one might get the impression you are overselling the results. I strongly recommend changing this unnecessarily forceful writing style to a more distanced, objective style. Let the reader see the evidence for themselves, while guiding them through the material, before drawing conclusions. Many paragraphs in the results are currently "upside down" in that way.
  
  We appreciate your remark and went through the entire manuscript again to soften some of our writing style.
  
  As another, related aspect, the figures are not properly described. At present, the length of the text describing the results is very much out of balance with the number of figures. For example in Sec. 4.1.1, L. 183, an entire 3 figures are referred to within just 3 sentences, but none of the sentences describes what actually is seen within the figures. Rather than leaving it up to the reader to interpret the figures, use some sentences for each figure to describe what is displayed, and highlight what is important to take away. This applies to all figures in the manuscript.
  
  Thank you for pointing this out. We extended the description of the figures and softened some of our writing style.
  
  On several occasions (including Fig. 2, 4, 5), the figure captions contain information about the method or results that are not mentioned in the text. Such information must be placed in the main text.
  
  Our style of writing the manuscript is to explain the content of the figures in the captions and, when referencing the figures in the text, to comment on the results observed. We also went through the entire manuscript again to clarify various items.
  
  What are the limitations of the method in terms of computational effort? In L. 192, it is briefly mentioned that the computation time is a factor 10 larger than linear interpolation. Is there still an advantage of neural network approach compared to for example quadratic interpolation? This could be worth a short section in the discussion. The improved conservation of other properties is also interesting, but unfortunately not shown in more detail.
  
  At this stage it is premature to compare the efficiency. Our current implementation of the method is also not the most efficient one. For a true comparison we would need to implement it in the most efficient way within FLEXPART as well and then compare the execution times.
  
  Detailed comments:
  Title: "deep learning inspired": unclear what this expression means, consider to remove/replace. State what aspect of trajectory calculations is improved (accuracy).
  We changed the title to Improving trajectory calculations by FLEXPART 10.4+ using single image superresolution.
  
  20: Can you back up this statement by a reference/example?
  
  We added a reference.
  
  21: "where a dense network" rephrase. If the point is that the numerical weather prediction process produces large amounts of gridded data, then it would be sufficient to state just that, without mentioning observations (which are not at all part of this manuscript). Remove "reanalysis model", a reanalysis is generated from regular NWP models.
  
  We simplified the sentence to say that NWP and observations generate large amounts of gridded data.
  
  25 onward: check citation of references, missing brackets.
  
  We added the missing brackets.
  
  27: remove "just to name a few"
  
  We removed it.
  
  30: logical gap, what is the connection to the previous paragraph?
  
  We removed the logical gap by moving the section “Related work” into the introduction.
  
  34: remove "surprisingly", this entirely depends on the perspective of the writer.
  
  We removed it.
  
  34: briefly define "convolutional neural network".
  
  We added that a CNN is a neural network whose layers are convolutions, which pass the input images through a set of convolutional filters, each of which activates certain features from the input.
  
  44: what do you mean by "variable-scale"?
  
  Here, variable-scale means that the neural network can cope with different resolutions of the wind fields. This way it can be applied multiple times to interpolate a meteorological field to the desired resolution. This is explained in the sentence after L 44.
  
  44: what do you mean by "deep" - how deep?
  
  Here, deep refers to the neural network having multiple layers.
  
  45: Rephrase: "showcase" sounds like snapshots or illustrations, but as a reader I look for reliable evidence.
  
  We rephrased “showcase” with “demonstrate”.
  
  Section 2: "Related work". This section does currently not serve a clear purpose, and is somewhat duplicate with the introduction. I recommend deleting this section here, and partly incorporating bits in the introduction, partly into a clearer method description.
  We moved the section “Related work” to the introduction.
  
  Section 3: "Methods". This section would benefit from a first paragraph that explains your overall approach, followed by a section that discusses the choice of the neural network, based on the range of choices that exist, in an accessible writing style.
  We added the overall approach to the “Methods” section. The choice of the neural network is then described at the end of the “architecture” section.
  
  Section 3.1: "Training data". The training data would be more natural to place after sections that describe the actual neural network and approach.
  We swapped the section “Training data” and “neural network architecture”.
  
  82: Why could this seem little data? How much training is commonly needed?
  
  The phrase is misleading. We will just state the number of training files. It is difficult to say how much training data is needed, at least a few thousand samples.
  
  107: rephrase using more distanced and objective terms. It could provide depth to the study to present a less well-performing approach in an appendix.
  
  Indeed, a comparison of different models would be an interesting study. Here, however, the focus is on improving the trajectory simulation.
  
  114: remove "for testing purposes"
  
  We removed it.
  
  117: 50 or 88 -> 50 and 88
  
  We changed “or” to “and”.
  
  Figure 1: several abbreviations and terms of the operations in the figure are not defined, include in caption or describing text. What do the bracketing lines indicate? The hierarchy between (a), (b) and (c) and between (a) and (d) could be made clearer in the figure, e.g. by lines that indicate "zooming in".
  We added an explanation about the dotted line, which just means that the ResBlock is repeated multiple times. It is difficult to indicate “zooming in” by lines in the figure.
  
  123: Add a statement about the purpose of the error metric, i.e. what is to be assessed.
  
  The error metrics are evaluating the accuracy of the interpolation and trajectory simulation. We add a statement to the revised version.
  
  127: here and elsewhere: ground truth -> truth. (ground truth would only make sense in a remote sensing context)
  
  We replaced “ground truth” with “truth”.
  
  The notations for RMSE and SSIM could be simplified and clarified, for example using \hat{z} for the interpolated quantity, and using a,b instead of x,y (which is commonly used for spatial coordinates) for the two figures in the SSIM metric. How important is the "perceived similarity of two images" for the given application? This would be a suitable place to mention conservation issues due to interpolation.
  Indeed, x and y refer to two images; to avoid confusion we now use a and b. We include the SSIM metric because we interpret the gridded horizontal velocity fields as images.
  
  142, 144: unclear what "this" refers to.
  
  142: Replaced '...this is outside of the scope..' with '...directly implementing the neural network interpolation into FLEXPART is outside...'
  
  144: Replaced 'This does not make full use...' with 'Using gridded up-sampled testing data does not make full use..'
  
  145-149: unclear, please rephrase
  
  We are not sure what exactly was unclear. However, we replaced this text with the following one and hope it is clearer now: “Using gridded up-sampled testing data does not make full use of the neural network capabilities, since the neural network only produced values at a fixed resolution of $0.5^\circ\times0.5^\circ$ latitude/longitude, while we still use linear interpolation of the wind data to the exact particle position when computing their trajectories. However, the neural network could in principle also determine the wind components almost exactly at the particle positions upon repeatedly using the trained SISR model to increase the resolution high enough to obtain the wind values at the respective particle positions.”
  
  152 onward: the emphasized names do not appear to be re-used in the remainder of the manuscript. Maybe rather introduce 3-letter abbreviations, such as REF, LIN, NNI that then can re-appear in the results and figures.
  
  We removed the emphasized names.
  
  160: place references at the end of sentence
  
  Unfortunately, the sentence will become confusing when the references are not placed after the first sub-sentence (before the comma), since the explanation of the equations follows.
  
  163: clarify whether Xn, Yn are vectors with m elements, or for a specific time along the trajectory
  
  We added the time variable to the equation for clarification.
  
  172: This section seems to describe your approach, and would be better placed in the methods.
  
  We added a paragraph to the methods section to better explain our approach. Nevertheless, to remind the reader of the approach we leave the explanation here, too.
  
  Figure 2: the lines for lin u and lin v are exactly equal, is this coincidence? x-axis is lacking a unit. RMSE is defined with an index, but given without index here. Please explain in the result text how to interpret this figure.
  The linear interpolation is not dependent on the data, in contrast to the neural network interpolation, which is trained on different data. The interpretation of Fig. 2 is explained in L. 176 ff.
  
  Table 1: What do the arrows indicate? The caption contains a key result, that should be moved to the main text.
  The arrows indicate that a lower number for the RMSE and a higher number for the SSIM correspond to a better interpolation. The key result is spelled out in the text in L. 190.
  
  183 to 188: Need to guide the reader through the results. The comparison needs more structure, and quantitative examples from the figure where available to support the qualitative conclusions.
  
  We will extend the interpretation.
  
  191: Maybe express in relative terms, hardware-dependent?
  
  We now state that the linear interpolation is about 10 times faster than the neural network interpolation considering our hardware.
  
  195: distinguish "evaluate" and "apply" - an objective way to present the results would be to apply the method, display the results, and thereafter evaluate based on the error metrics.
  
  We reformulated the sentences to present the results in an objective way.
  
  Figure 3: Lacking panel labels. The top row does not seem to give additional information to the bottom row. I recommend using a continuous color scale; the two-color scale gives unjustified importance to errors larger than 5 m/s. Maps are missing coordinates. The RMSE in the title should be part of the text rather than a caption title. It would be useful to present a specific situation with meteorological fields for context.
  We split the figure in sub-figures with panel labels. The top row shows that high errors occur at fronts, this is then shown in a zoomed-in sub-figure in the bottom row. The color scale emphasizes the high errors, this way we see the strong difference in the error at the fronts.
  
  198: "before": rephrase
  
  Here, “before” referenced the previous section and we changed “before” to “one time up-scaling” to make it clear.
  
  199: "relative error reduction": where shown?
  
  The error reduction is shown in Table 2; we added a reference.
  
  200: "This holds...": can this information be presented as part of a more aggregated and thus robust result?
  
  First we showed an example and then using Table 2 the result is presented in a robust way.
  
  202: unclear, rephrase
  
  We now state that the neural network interpolation is 19% more accurate than the linear interpolation.
  
  Figure 4: See comments about Fig. 3, the color scale gives unjustified emphasis to wind errors above 2.5 m/s. Indicate in Fig. 3 where this zoom is taken. Arrows are difficult to see, take to separate panels, and use meteorological fields (e.g. sea level pressure or potential temperature) as reference in both sets of panels.
  We split the figure into sub-figures. Here, too, we want to emphasize high errors. The arrows for the neural network interpolation almost coincide with the truth, so the arrows of the true field are difficult to see.
  
  Figure 5: This figure needs more explanation. What bins have been used? With only 10 bins, it may be more appropriate to show the lines as step function. What error metric has been used? Can this figure be constructed on more than just one day to make it more robust?
  For each pixel we compute the relative error against the truth and increase the count of the corresponding bin (using 20 bins now). We now compute the error frequencies for all 138 levels and over 24 h. This way the result is more robust.
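
The binning described in this reply can be sketched as follows. This is only an illustration under stated assumptions, not the authors' script: the function name `relative_error_histogram`, the small `eps` that guards against division by zero, the histogram range `max_rel_err`, and the synthetic arrays are all hypothetical; only the use of 20 bins and of a pixelwise relative error against the truth comes from the reply.

```python
import numpy as np

def relative_error_histogram(interp, truth, n_bins=20, max_rel_err=1.0, eps=1e-6):
    """Count pixels per relative-error bin (hypothetical helper).

    The relative error is |interp - truth| / (|truth| + eps). Values above
    `max_rel_err` are clipped so that they fall into the last bin.
    """
    rel_err = np.abs(interp - truth) / (np.abs(truth) + eps)
    rel_err = np.clip(rel_err, 0.0, max_rel_err)
    counts, edges = np.histogram(rel_err.ravel(), bins=n_bins,
                                 range=(0.0, max_rel_err))
    return counts, edges

# Synthetic stand-in for a stack of wind fields, shape (levels, lat, lon).
rng = np.random.default_rng(0)
truth = rng.normal(10.0, 3.0, size=(4, 32, 64))
interp = truth + rng.normal(0.0, 0.5, size=truth.shape)

counts, edges = relative_error_histogram(interp, truth)
```

Stacking all model levels and time steps along the leading axes before calling the function would give one aggregated histogram, which is the kind of aggregation the reply refers to.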
  
  Figure 6: see Fig. 3.
  Table 2: see Table 1
  Figure 7: consider to remove this Figure. At this point, quantitative information may be sufficient/more useful than another illustration
  Quantitative information is given in Table 2. However, we consider it important to also show the error structure at a concrete example, and this is shown in Fig. 7. We combined the previous Fig. 3 and 4, and also Fig. 6 and 7.
  
  209: likely -> conceivable, provide reference
  
  Replaced the sentence with: 'However, this does not necessarily mean that trajectories advanced using the neural network interpolated fields are more accurate. Trajectories are not always equally sensitive to wind interpolation errors,...'
  
  215: it would be useful to briefly re-cap how these results are obtained. One case, several cases, specific region? How are trajectory errors distributed on a global map, do they mirror the interpolation errors?
  
  Since the trajectory errors result from interpolation, trajectory errors (for relatively short trajectory duration) are distributed quite similarly to the interpolation errors. For trajectories of longer duration (say, 10 days or longer), errors would be smeared out over larger areas, since initial errors are propagated along the trajectories. We do not think adding a figure showing the trajectory error distribution would provide meaningful additional information.
  
  We added ‘Here we show the results of the horizontal transport deviation (Eq. \eqref{eq:ahtd}) and standard deviations of particles advanced for 48 hours, using FLEXPART, after being initially globally distributed.’
  
  221: "smaller" - should this be "larger"?
  
  We reformulated the sentence to avoid confusion.
  
  223: "directly corresponds" - is this a result, your interpretation, or an assumption?
  
  Replaced with: '...interpolated ones, is likely a result of the lower frequency...'
  
  228: These paragraphs would better fit into a discussion section, together with other limitations. If possible, it would be useful to give more details, such that other studies can refer to your work.
  
  We feel that there is not enough material to justify a separate discussion section. In a nutshell, all existing interpolation methods, to the best of our knowledge, are not conservative and if conservation on the level of interpolation is important, then different design choices on the level of interpolation are necessary altogether not only for neural network but also for polynomial interpolation.
  
  242: remove "just to name a few"
  
  We removed it.
  
  245: would be useful to connect to weather phenomena here
  
  Indeed, it is helpful to connect to weather phenomena. Therefore, we have added a more detailed discussion on the interpolation errors along the cold front shown in Figure 4.
  
  250: remove "see Fig. 2"
  
  We removed it.
  
  263: this is an important limitation and should be taken up at different locations in the manuscript, including a discussion section. If non-conservation is an issue here, it would be useful to quantify. This would also give some balance to the study, which now mainly focuses on the advantages.
  
  Citation: https://doi.org/10.5194/egusphere-2022-441-AC7
RC2:
'Comment on egusphere-2022-441', Anonymous Referee #2, 14 Nov 2022

General

The manuscript entitled "Improving trajectory calculations by FLEXPART 10.4+ using deep learning inspired single image superresolution" by Brecht and co-workers presents a very interesting and relevant study on using machine learning to interpolate (down-scale) meteorological wind data as taken from three-dimensional meteorological models in order to use them in advection calculations (semi-Lagrangian and Lagrangian). They apply their approach to a global dataset (ERA5) and show that the ML approach is more successful in restoring the original, higher-resolution data than a simple linear interpolation. Furthermore, they indicate similar improvements when ML interpolation is used in trajectory calculations in comparison to linear interpolation. The presented approach is a first step in the development of improved interpolations for Lagrangian models and semi-Lagrangian advection schemes. As the authors state, additional improvements can be expected when interpolation in time and onto particle positions could be incorporated. As such the study is highly relevant and should be published in GMD. The applied methods are sound, the manuscript is well written, and the results are presented in a concise and clear manner. Hence, I only have one 'major' comment and a few minor suggestions.

Major comment

Construction of the degraded data: On line 85 it is described that the lower resolution data was obtained from sub-sampling the original ERA5 data. I am a bit surprised by this approach, since it does not necessarily reflect the representation in a coarser-resolution model, where the state variables in a larger grid cell should still represent the average in this grid cell and not a sub-sample. Could you please comment on the choice of this degradation strategy.

One direct result of the approach could be the large differences across frontal systems as indicated for the linear interpolation of coarse vs reference data. Likely, these differences would be smaller when average would have been used for degrading.
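
The two degradation strategies contrasted in this comment can be illustrated with a short sketch. It is not taken from the manuscript: the helper names `subsample` and `block_average`, the factor of 2, and the synthetic front-like field are assumptions made for illustration only.

```python
import numpy as np

def subsample(field, factor=2):
    """Keep every `factor`-th grid point; retained values stay unchanged."""
    return field[::factor, ::factor]

def block_average(field, factor=2):
    """Replace each factor x factor block by its mean (grid-cell averaging)."""
    ny, nx = field.shape
    ny_c, nx_c = ny // factor, nx // factor
    trimmed = field[:ny_c * factor, :nx_c * factor]
    return trimmed.reshape(ny_c, factor, nx_c, factor).mean(axis=(1, 3))

# Synthetic wind component with a sharp, front-like jump (illustrative only).
lat = np.linspace(-1.0, 1.0, 8)
lon = np.linspace(-1.0, 1.0, 8)
u_fine = np.where(lon[None, :] + 0.3 * lat[:, None] > 0.0, 10.0, -10.0)

u_sub = subsample(u_fine)
u_avg = block_average(u_fine)

# Sub-sampled values coincide with the fine field at the retained points,
same_at_retained = np.array_equal(u_sub, u_fine[::2, ::2])
# whereas block averages smooth the jump and differ from the fine field there.
max_diff_avg = np.abs(u_avg - u_fine[::2, ::2]).max()
```

In this toy case the averaged field is smoother across the jump, which is the effect the comment anticipates, but its values no longer match the fine-resolution field at any shared grid point.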

Minor comments

L40: Higher-order interpolation. It would be interesting to see how higher-order interpolation schemes would compete with the ML approach. Did you give this any try?

L65ff: Largely repeating the same points and references as in introduction. Consider removing/shortening it here or in intro.

L83: I would rather call this a 'vertical model layer' than a 'horizontal layer'.

L87: How much does the exclusive treatment of the horizontal wind components impact the flow's mass budget (continuity)? It is mentioned later (conclusions) that all interpolation methods suffer from potentially breaking conservation laws and that physics-based ML could improve things. Maybe it can already be mentioned here. Why was the vertical wind not included in this study? Are there any fundamental differences that make it impossible to directly train the model for vertical wind?

L115: Original levels are counted from the model top in IFS. So 0 to 50 would be the upper part of the atmosphere. What is the rationale for cutting at level 50? What is the approximate pressure at this level? Does this separate into troposphere vs stratosphere?

Related to training two models for two vertical layers. How about training different models for land and ocean, as these give fundamentally different lower boundary conditions? How much does the performance increase in the ML method differ for land and ocean areas? How much for boundary layer (where turbulence is part of FLEXPART's transport description) vs free troposphere?

L131: Are mu_x and mu_y scalars representing the overall image mean? If yes, I don't quite understand the use of the 11 x 11 Gaussian filter. Furthermore, I think it would be good to argue if and why SSIM should be a useful metric for comparing wind components as opposed to images. I suppose wind components will have a very different pdf from that of images (color channels)?

L132f: What is the motivation for K1 and K2? Why not simply mention C1=1E-4 and C2=9E-4?
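
For orientation, in the standard SSIM definition the constants are C1 = (K1 L)^2 and C2 = (K2 L)^2, where L is the dynamic range of the data; the values quoted in this comment follow from the common defaults K1 = 0.01 and K2 = 0.03 with L = 1. The sketch below shows this arithmetic together with a simplified, global SSIM and an RMSE. It rests on assumptions: the function names are hypothetical, L = 1 is inferred from the quoted constants, and the statistics are taken over the whole field instead of the 11 x 11 Gaussian window used in practice.

```python
import numpy as np

def ssim_constants(data_range=1.0, k1=0.01, k2=0.03):
    """Stabilising constants C1 = (K1 L)^2 and C2 = (K2 L)^2."""
    return (k1 * data_range) ** 2, (k2 * data_range) ** 2

def ssim_global(a, b, data_range=1.0):
    """Simplified SSIM using whole-field means and variances (no local window)."""
    c1, c2 = ssim_constants(data_range)
    mu_a, mu_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov_ab = ((a - mu_a) * (b - mu_b)).mean()
    numerator = (2.0 * mu_a * mu_b + c1) * (2.0 * cov_ab + c2)
    denominator = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return numerator / denominator

def rmse(a, b):
    """Root-mean-squared error between two fields of equal shape."""
    return np.sqrt(np.mean((a - b) ** 2))

c1, c2 = ssim_constants()

# Synthetic fields scaled to [0, 1], so that a data range of 1 is appropriate.
rng = np.random.default_rng(1)
truth = rng.random((16, 16))
noisy = np.clip(truth + rng.normal(0.0, 0.05, size=truth.shape), 0.0, 1.0)

ssim_same = ssim_global(truth, truth)
ssim_noisy = ssim_global(truth, noisy)
rmse_noisy = rmse(truth, noisy)
```

Identical fields give an SSIM of 1 and an RMSE of 0, while a perturbed field lowers the SSIM and raises the RMSE.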

L177: There is an exception to this observation! For SSIM linear interpolation in u seems to perform slightly better than model4.

L186, Fig. 5: What would the same figure look like for the relative error? Are these large errors associated with large wind speeds?

L191: It is mentioned elsewhere that FLEXPART was not run on the same compute architecture as the ML model. How comparable are the times given here? Consider adding CPU/GPU specs.

Fig 6: Figure caption wrong? I assume these are similar differences as in Fig. 3

Fig 7: Why do we not see the checkerboard pattern (as mentioned in the caption to Figure 4) here?

L234ff: Other downscaling approaches ingest additional high-resolution predictor variables (like topography or land cover) that have a direct impact on near-surface flow and spatial variability. Could such predictors be integrated into the present method as well?

Technical issues

Citation style: Seems to be wrong. Authors are given outside braces most of the time.

Equation 1: Consider using the same x, y notation as in equation 2.

L188: Additional figures in git repository? Shouldn't they rather be made available as part of a supplemental document/dataset? As git repository is not a permanent link/location, I would suggest to put figures elsewhere.

Citation: https://doi.org/10.5194/egusphere-2022-441-RC2
- AC8:
  'Reply on RC2', Rüdiger Brecht, 22 Dec 2022
  We thank the reviewer for providing valuable feedback on the first version of our submitted paper. We have done our best to take into account all remarks raised. In the following we give a detailed list of all the changes made in response to the points raised. Thank you once more for your help in improving our paper.
  
  Major comment
  Construction of the degraded data: On line 85 it is described that the lower resolution data was obtained from sub-sampling the original ERA5 data. I am a bit surprised by this approach, since it does not necessarily reflect the representation in a coarser-resolution model, where the state variables in a larger grid cell should still represent the average in this grid cell and not a sub-sample. Could you please comment on the choice of this degradation strategy.
  One direct result of the approach could be the large differences across frontal systems as indicated for the linear interpolation of coarse vs reference data. Likely, these differences would be smaller when average would have been used for degrading.
  The motivation for our choice of degrading data by leaving certain grid points entirely unchanged, is twofold:
  
  This approach has been used often in the past when studying wind interpolation errors for trajectory models (e.g., Kuo et al., 1985; Stohl et al., 1995). It is appropriate to be consistent with such past approaches.
  
  We agree that it would also be interesting to see how higher-resolution data could be reconstructed from lower-resolution data. This would then be more equivalent to downscaling approaches in weather prediction. However, for comparing the skill of different interpolation methods, this is not ideal. At the points where data are available at both high and low resolution, these data would be different in each case. Reconstruction of points in between would then not only reflect differences in the skill of interpolation but also the data differences at the points from where the interpolation is done. This would thus not allow a "clean" evaluation of different interpolation methods, mixing the effects of interpolation and grid-cell averaging for the coarse-resolution data points.
  
  Minor comments
  L40: Higher-order interpolation. It would be interesting to see how higher-order interpolation schemes would compete with the ML approach. Did you give this any try?
  At this stage it is premature to compare the efficiency. Our current implementation of the method is also not the most efficient one. For a true comparison we would need to implement it in the most efficient way within FLEXPART as well and then compare the execution times.
  
  L65ff: Largely repeating the same points and references as in the introduction. Consider removing/shortening it here or in the intro.
  We moved the section “Related work” to the introduction.
  
  L83: I would rather call this a 'vertical model layer' than a 'horizontal layer'.
  The input of the neural network is a horizontal u or v velocity component. Here, “vertical model layer” refers to the horizontal u or v velocity.
  
  L87: How much does the exclusive treatment of the horizontal wind components impact the flow's mass budget (continuity)? It is mentioned later (conclusions) that all interpolation methods suffer from potentially breaking conservation laws and that physics-based ML could improve things. Maybe it can already be mentioned here. Why was the vertical wind not included in this study? Are there any fundamental differences that make it impossible to directly train the model for vertical wind?
  The vertical velocity is fundamentally different from the horizontal velocity as it is much more small scale. In practice the vertical velocity will require training of a more complicated neural network as neural networks have a tendency to learn large scale features first, which is referred to as a spectral bias. For this study we did not have the computational resources to experiment with the vertical velocity.
  
  L115: Original levels are counted from the model top in IFS. So 0 to 50 would be the upper part of the atmosphere. What is the rationale for cutting at level 50? What is the approximate pressure at this level? Does this separate into troposphere vs stratosphere?
  We count the levels from bottom to top. Cutting at index 50 (ca. 8000 m) separates the troposphere from the stratosphere and the levels above.
  
  Related to training two models for two vertical layers. How about training different models for land and ocean, as these give fundamentally different lower boundary conditions? How much does the performance increase in the ML method differ for land and ocean areas? How much for boundary layer (where turbulence is part of FLEXPART's transport description) vs free troposphere?
  Indeed, distinguishing land and ocean in the training would be an alternative to our rather simple differentiation by height levels. However, there are also many other potential alternatives, such as developing different training data sets for climatically different regions (e.g., tropics, subtropics, midlatitudes), within or above the boundary layer, or for different meteorological situations. For developing an optimal method, it will be important to explore several of these options but it is beyond the scope of the current exploratory paper. With respect to the boundary layer, it is important to note that turbulence parameterizations have been switched off in FLEXPART for the current paper, as we wanted to study interpolation errors in isolation.
  
  L131: Are mu_x and mu_y scalars representing the overall image mean? If yes, I don't quite understand the use of the 11 x 11 Gaussian filter. Furthermore, I think it would be good to argue if and why SSIM should be a useful metric for comparing wind components as opposed to images. I suppose wind components will have a very different pdf from that of images (color channels)?
  Here, we stated the definition of the SSIM as used in practice, which uses the 11x11 Gaussian filter. We agree that the SSIM is not a traditional error measure. However, since our model is an adaptation from an image processing task, we felt it was reasonable to also present the SSIM measure, as it is useful for the machine learning community.
  
  L132f: What is the motivation for K1 and K2? Why not simply mention C1=1E-4 and C2=9E-4?
  For the sake of completeness we stated the definition of K1 and K2 as well.
  
  L177: There is an exception to this observation! For SSIM linear interpolation in u seems to perform slightly better than model4.
  We changed it to “almost always has better metrics”.
  
  L186, Fig. 5: What would the same figure look like for the relative error? Are these large errors associated with large wind speeds?
  The largest errors generally occur where wind shears are largest, and this is usually associated with fronts and, generally, higher than average wind speeds. We discuss this now in more detail in the discussion of Fig. 4, which presents a clear example of this.
  
  We updated figure 5 to present the relative error.
  
  L191: It is mentioned elsewhere that FLEXPART was not run on the same compute architecture as the ML model. How comparable are the times given here? Consider adding CPU/GPU specs.
  The training of the neural network is done on the mentioned GPU device. The FLEXPART simulations are run on a CPU. Since we updated the interpolated fields before the simulation, all simulation times are the same.
  
  Fig 6: Figure caption wrong? I assume these are similar differences as in Fig. 3
  After reading the caption of Fig. 6 again we could not find an error.
  
  Fig 7: Why do we not see the checkerboard pattern (as mentioned in the caption to Figure 4) here?
  We do see the checkerboard pattern. However, only every fourth pixel stays the same here, which makes the pattern less visible.
  
  L234ff: Other downscaling approaches ingest additional high-resolution predictor variables (like topography or land cover) that have a direct impact on near-surface flow and spatial variability. Could such predictors be integrated into the present method as well?
  This is a very interesting idea, and such predictors could be integrated into the current method. However, doing so is beyond the scope of this paper and is left for future research.
  
  Citation: https://doi.org/10.5194/egusphere-2022-441-AC8

Peer review completion

AR: Author's response | RR: Referee report | ED: Editor decision | EF: Editorial file upload

AR by Rüdiger Brecht on behalf of the Authors (22 Dec 2022) Author's response Author's tracked changes Manuscript

ED: Referee Nomination & Report Request started (04 Jan 2023) by Christopher Horvat

RR by Anonymous Referee #1 (03 Feb 2023)

Suggestions for revision or reasons for rejection

**Second review of "Improving trajectory calculations by FLEXPART 10.4+ using single image superresolution" by Brecht et al., submitted to GMD**

The authors have made revisions to their manuscript in response to my earlier comments. However, I feel that several points have not been properly addressed. Below are my comments.

**Main comments**

1. Both reviewers have commented that the selection of every other grid point etc. to degrade the wind field is not consistent with what the information on the model grid represents. Even if the authors find studies that have used the same approach before, the problem remains that there is a spatial sub-sampling of the discretized fluid, rather than a coarser representation in terms of averaged properties. Sub-sampling at a fixed interval violates the view of the fluid as a continuum, which matters for interpolation. Many other members of the target audience of geophysical models will have the same reaction as the two reviewers, and will immediately be sceptical of your method and results.

Therefore, averaging 2x2 and 4x4 grid points to obtain the coarser versions of your training and test grids seems to me a moderate but necessary adjustment to your study.

Even if this is the first study of this kind in atmospheric science, it should get the community interested in a new methodology, rather than raise scepticism. I therefore strongly recommend that such a basic aspect of the fluid not be overlooked.
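
The two degradation strategies contrasted in this comment can be sketched as follows. This is a minimal NumPy illustration, not the authors' preprocessing code. The function names are hypothetical, and the block average is unweighted, which ignores the latitude-dependent cell area of a regular latitude-longitude grid.

```python
import numpy as np

def subsample(field, factor):
    """Point sampling: keep every `factor`-th grid point in both directions."""
    return field[::factor, ::factor]

def block_average(field, factor):
    """Coarse-graining: unweighted mean over non-overlapping factor x factor blocks.
    Assumption: no cos(latitude) area weighting is applied."""
    ny, nx = field.shape
    if ny % factor or nx % factor:
        raise ValueError("grid dimensions must be divisible by factor")
    blocks = field.reshape(ny // factor, factor, nx // factor, factor)
    return blocks.mean(axis=(1, 3))

# Toy 4 x 4 "wind field" degraded by a factor of 2 with each method.
field = np.arange(16.0).reshape(4, 4)
coarse_sampled = subsample(field, 2)       # [[0, 2], [8, 10]]
coarse_averaged = block_average(field, 2)  # [[2.5, 4.5], [10.5, 12.5]]
```

With point sampling the retained values are unchanged copies of the original grid points, whereas block averaging yields values that represent the mean state of each coarse cell.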

2. The argumentation about performance is somewhat contradictory or uneven. The introduction heavily emphasizes how surprising it is that the "simple" linear regression is still being used. However, this is by no means a surprise. Rather, previous studies have shown that the cost-benefit ratio of higher-order interpolation did not justify other interpolation methods. This is actually stated in L. 59.

Towards the outlook section, you provide an estimate of 1 order of magnitude increase in computation time from the single image superresolution approach to obtain 20-50% lower AHTD. These are quite the same numbers you cite for higher-order schemes in L. 59 (1 order of magnitude, 30%).

I do not find the conclusions balanced in light of these facts. There is quite some overhead with implementing GPU-enabled model code, training, etc. If the same gains can be achieved with simple higher-order interpolation, why is it worth exploring your methods further? I am sure the authors can come up with an answer to this question, but it would be nice to see this properly stated.
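
For reference, the AHTD figures compared in this comment are means of horizontal distances between test and reference trajectories. The sketch below is one plausible way to compute such a metric, assuming great-circle (haversine) distances on a spherical Earth of radius 6371 km and position arrays of shape (n_particles, n_times). The manuscript's exact definition is not reproduced here, and all names are hypothetical.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between points given in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))

def ahtd(ref_lat, ref_lon, test_lat, test_lon):
    """Mean horizontal distance between test and reference trajectories,
    averaged over particles (axis 0) for each output time (axis 1)."""
    return great_circle_km(ref_lat, ref_lon, test_lat, test_lon).mean(axis=0)

def relative_reduction(ahtd_baseline, ahtd_new):
    """Fractional AHTD reduction of a new method relative to a baseline."""
    return 1.0 - ahtd_new / ahtd_baseline
```

A reduction of 0.3 from `relative_reduction` would correspond to the "30 %" figure quoted for higher-order schemes, so both approaches can be compared on the same footing once their costs are known.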

3. There are still numerous hard-to-read sections in the manuscript. I make some recommendations in the minor comments below. I recommend that the authors read some guidance on improving the clarity of scientific writing (Gopen and Swan, 1990; Schultz, 2009).

Detailed comments:

L. 13: we demonstrate -> we find

L. 34: rephrase in light of major comment #2

L. 45-49: This paragraph is very similar to L. 24, shorten/rephrase

L. 50, 53, 56: simple/surprising: rephrase in light of major comment #2

L. 70: project -> study

L. 71: variety -> range. Please back up this statement with a reference. Maybe it would be more correct to state that this can be the case, but that there are, for example, differences between small-scale turbulence and horizontal turbulence.

L. 97: most impressive: state objectively, e.g. results with smallest AHTD

L. 97: greatest ease of training -> most straightforward training. It is not clear what this means in practice.

L. 97: exclusively -> only

L. 102: ...levels are counted... -> level indexes increase upward, contrary to ECMWF

L. 106: this choice of method is not consistent with the concept of a continuous fluid, which matters for interpolation, see major comment #1.

L. 120: how is the structure different, and how has this been quantified?

L. 176: compute trajectories: this should be part of the methods, rather than the results. Maybe does not need to be mentioned here.

L. 177: we demonstrate -> we compare the accuracy

L. 180: we demonstrate -> we investigate

L. 182: we demonstrate -> we proceed with

L. 184: this sentence needs to be expanded to a full description of what is seen in Fig. 2. Deciphering the meaning of this figure cannot be left to the reader.

L. 188: we consider -> we first consider

L. 211-219: It was not possible for the reviewer to comprehend what is described here, a figure or table?

Figure 2: the caption needs to be rephrased to describe panel contents. Methodological statements need to be moved to the main text.

Figure 4: Methodological statements need to be moved to the main text.

L. 229: this is indeed the case -> restate what is "this"

L. 243: we have also checked -> how has this been done

L. 245: slightly better: how has this been quantified?

L. 251: restate what "this" refers to

L. 285: see the papers -> see the studies

**References**

Gopen, G. D. and Swan, J.: The Science of Scientific Writing, American Scientist, 78(6), 550–558, 1990.

Schultz, D. M.: Eloquent Science: A Practical Guide to Becoming a Better Writer, Speaker, and Atmospheric Scientist, American Meteorological Society, 2009.

ED: Reconsider after major revisions (08 Feb 2023) by Christopher Horvat

AR by Rüdiger Brecht on behalf of the Authors (01 Mar 2023) Author's response Author's tracked changes Manuscript

ED: Publish as is (13 Mar 2023) by Christopher Horvat

AR by Rüdiger Brecht on behalf of the Authors (14 Mar 2023)

Journal article(s) based on this preprint

21 Apr 2023

Improving trajectory calculations by FLEXPART 10.4+ using single-image super-resolution

Rüdiger Brecht, Lucie Bakels, Alex Bihlo, and Andreas Stohl

Geosci. Model Dev., 16, 2181–2192, https://doi.org/10.5194/gmd-16-2181-2023, 2023

Viewed

Total article views: 1,171 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
880	246	45	1,171	11	11

Views and downloads (calculated since 11 Jul 2022)

Month	HTML	PDF	XML	Total
Jul 2022	135	38	7	180
Aug 2022	72	25	3	100
Sep 2022	143	34	7	184
Oct 2022	114	28	9	151
Nov 2022	125	33	8	166
Dec 2022	73	18	4	95
Jan 2023	48	15	0	63
Feb 2023	80	24	1	105
Mar 2023	57	23	5	85
Apr 2023	33	8	1	42
May 2023	0
Jun 2023	0
Jul 2023	0
Aug 2023	0
Sep 2023	0
Oct 2023	0
Nov 2023	0
Dec 2023	0
Jan 2024	0
Feb 2024	0
Mar 2024	0
Apr 2024	0
May 2024	0
Jun 2024	0
Jul 2024	0
Aug 2024	0
Sep 2024	0

Viewed (geographical distribution)

Total article views: 1,101 (including HTML, PDF, and XML) Thereof 1,101 with geography defined and 0 with unknown origin.

Latest update: 04 Sep 2024

The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.


Short summary

We use neural network based single image super resolution to improve upscaling of meteorological wind fields to be used for particle dispersion models. This deep learning based methodology improves the standard linear interpolation typically used in particle dispersion models. The improvement of wind fields leads to substantial improvement of the computed trajectories of the particles.

