This work is distributed under the Creative Commons Attribution 4.0 License.
Quantifying Variability in Lagrangian Particle Dispersal in Ocean Ensemble Simulations: an Information Theory Approach
Abstract. Ensemble Lagrangian simulations aim to capture the full range of possible outcomes for particle dispersal. However, single-member Lagrangian simulations are most commonly available and provide only a subset of the possible particle dispersal outcomes. This study explores how to reproduce the variability inherent in Lagrangian ensemble simulations by generating variability within a single-member simulation. To obtain a reference for comparison, we performed ensemble Lagrangian simulations by advecting particles from the surface of the Gulf Stream, around 35.61° N, 73.61° W, in each member of the NATL025-CJMCYC3 ensemble, obtaining trajectories that capture the full ensemble variability. Subsequently, we performed single-member simulations with spatially and temporally varying release strategies to generate comparable trajectory variability and dispersal. We studied how these strategies affected the number of surface particles connecting the Gulf Stream with the eastern side of the subtropical gyre.
We used an information theory approach to define and compare the variability in the ensemble and in the single-member strategies. We defined the variability as the marginal entropy, or average information content, of the probability distributions of the particle positions. We calculated the relative entropy to quantify the uncertainty of representing the full-ensemble variability with single-member simulations. We found that release periods of 12 to 30 weeks most effectively captured the full ensemble variability, while spatial releases with a 2.0° radius resulted in the closest match at timescales shorter than 10 days. Our findings provide insights to improve the representation of variability in particle trajectories and define a framework for uncertainty quantification in Lagrangian ocean analysis.
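For concreteness, here is a minimal sketch of the two information-theoretic quantities named in the abstract, marginal (Shannon) entropy and relative entropy (KL divergence), computed from binned particle positions. The bin counts, function names, and base-2 logarithm below are illustrative assumptions, not the authors' code:

```python
import numpy as np

def marginal_entropy(counts):
    """Shannon entropy (bits) of a binned particle-position distribution."""
    p = counts / counts.sum()
    p = p[p > 0]                      # empty bins contribute 0 (0 log 0 = 0)
    return -np.sum(p * np.log2(p))

def relative_entropy(counts_p, counts_q, eps=1e-12):
    """KL divergence D(P || Q) in bits, with P a single-member and Q the
    full-ensemble distribution; eps regularizes empty reference bins."""
    p = counts_p / counts_p.sum()
    q = counts_q / counts_q.sum() + eps
    mask = p > 0
    return np.sum(p[mask] * np.log2(p[mask] / q[mask]))

# Illustrative particle counts per spatial bin at a fixed particle age
ensemble_counts = np.array([40, 30, 20, 10])
single_counts   = np.array([55, 25, 15, 5])
print(marginal_entropy(ensemble_counts))                 # variability
print(relative_entropy(single_counts, ensemble_counts))  # mismatch
```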
Status: final response (author comments only)
RC1: 'Comment on egusphere-2024-3847', Anonymous Referee #1, 25 Jan 2025
This article seeks to obtain the variability in particle dispersion based on Eulerian velocity data by generating variability in single (one initial condition) trajectories, choosing ensembles whose initial conditions are (i) close to the initial condition, and (ii) at the same location as the initial condition but at nearby times. Information-theoretic measures are used to quantify the variability of these simulations in the Gulf Stream region.
Being able to gain insights into variability and uncertainty of simulations based on Eulerian velocity data is an important issue. To my mind, the crucial thing here would be in the ability to parameterize the input uncertainties (in the measurements of the Eulerian velocity, in using various forms of diffusivity in the advection process, in interpolation methods used to estimate data at subgrid scales where no data is available, etc), as well as give careful assessments and/or sensitivity analyses of the other variables (time-of-flow, radii used for initial seeding, release date, etc) in the process. Any quantifier of Lagrangian dispersal uncertainty obtained will of course be strongly dependent on the quantifiers of, and the perspectives on modeling the impact of, these input uncertainties. My main comments below are related to this main theme, on which any conclusions obtained from this study are crucially dependent.
1. The authors use spatial radii ranging from 0.1 to 2 degrees (9 to 180 km) for their initial clouds of particles in the spatial uncertainty situation. They quote a computed average decorrelation lengthscale of 0.41 degrees. It is not clear to me how long the particles were advected; one would of course expect nearby fluid parcels to spread apart more and more as time progresses (reflected in Figure 4A, for example, where the entropy is used rather than decorrelation). To my mind, these decorrelation numbers are therefore not "useful." What is the connection to this time-of-flow, and how does this decorrelation change in relation to it? Is the fact that 0.41 degrees is comparable in size to the 0.25-degree resolution of the NATL025-CJMCYC3 model, i.e., that the Eulerian velocity is not resolved below this lengthscale, relevant? What is the physical motivation for using distances of 9-180 km to seed particles in assessing the dispersal over a certain time for a single initial condition? Not highlighting the time-of-flow (I don't know whether it's buried somewhere; I couldn't find it) is a serious limitation, because this will have a profound effect on any results obtained.
2. It seems that all calculations were done with 2 January as the release date. Given the unsteadiness (time-dependence) of the Eulerian velocity field, one will of course get different results if using a different release date. From an oceanographer's viewpoint, why is 2 January so important? Alternatively, there need to be convincing results showing that "similar" results are obtained for different choices of release date. Also, one needs some understanding of the sensitivity of the results to the time-of-flow chosen.
3. For the temporally varying release, the authors have used time windows of "4, 12 and 20 weeks, all starting from 2 January 2010." Once again, the time-of-flow is not clear, and it is moreover not clear whether the time-of-flow for each of these releases is the same. For example, if the time-of-flow were 2 weeks uniformly across all of these simulations, the "4 week time window" would, in my understanding, correspond to having some particles released on 2 January and travelling until 16 January, but then also particles released on 30 January and travelling until 13 February, and presumably particles on intermediate days. All of this has to be made absolutely clear to enable any assessment of the results obtained. Moreover, as above, there needs to be some rationale for selecting these time windows.
4. If my above interpretation of the particle release is correct, I think there are some issues which interfere with the interpretation of this process as giving an indicator of Lagrangian dispersal. Notice that in the example the particles travelling from 2 January to 16 January are driven by velocity field data which is *completely* different from that driving the particles released on 30 January and travelling to 13 February. From the dynamical systems perspective, this means that one is comparing things which are driven by completely different dynamical systems -- so what does the usage of these different final locations in a dispersal assessment actually mean? On the other hand, if the above interpretation is wrong and there is some other way of thinking about this, what would the interpretation of that be? For example, if all particles are flowed until 13 February despite being released on different days, then again the dynamical systems are different, as is the time-of-flow (which strongly impacts correlation and any measure of dispersal). This sort of issue is relevant in the later calculations in the article as well, for example when computing the time series of histograms described in Section 2.4 -- exactly how "time" is interpreted is crucial in an unsteady flow such as this.
5. The authors say "to ensure particles released on the same day followed different trajectories, we added small random perturbations to their release locations using uniform noise with an amplitude of 0.1 degrees." Does this mean that one chooses uniformly from a ball of radius 0.1 degrees centered at the initial location? (Two possible readings are sketched after this list of points.) What this actually means is that both temporal and spatial variability in the initial condition are used here, and not just temporal variability, does it not?
6. Using temporal variability but from a set location is the concept of "streaklines" in fluid mechanics. This has seen importance in assessing transport due to Lagrangian motion principally when a streakline can sensibly be used as some sort of "barrier" between coherent structures, notably when the position from which the streakline emanates is fixed (on a boundary), for example see Haller (J Fluid Mech 2004), Zhang (SIAM Review, 2013), Karrasch (SIAM J Applied Dyn Sys, 2016), Balasuriya (SIAM J Applied Dyn Sys, 2017). For genuinely unsteady flows, since the velocity field around any fixed Eulerian location changes with time, the dispersal from a streakline approach (as the authors do here in their temporal release strategy) will provide a curve (a tube in this case since a small cloud is released at each time) at any given final instance in time -- but exactly how to interpret this tube from a Lagrangian dispersal perspective is unclear. It will consist of particles which have flowed for different times, and (for example) particles in a particular cross-section of the tube will not necessarily have flowed for the same time. If instead the interpretation I have in point 3 above is used, what this means is that a particular subsample of the points in this "streaktube" will be obtained for each day of release -- and how one uses the collection of these subsamples to quantify Lagrangian dispersal appears ambiguous.
7. The authors do not seem to have used any randomness in the Lagrangian advection process (only randomness in the initialization), which to my mind does not take into account fundamental contributors to Lagrangian dispersal in the ocean: effects of eddy diffusivity and uncertainties in the driving velocity fields. Eddy diffusivity modeling is a vast and important area (Berner et al., Bull Amer Meteorol Soc, 2017), for which many different models exist (e.g., Griffa, in Stochastic Modeling in Physical Oceanography, Birkhauser, 1996). Here, though, dispersal seems to happen through the taking of nearby initial conditions, both spatially and temporally. This seems artificial when there is a "natural" way of including the primary physical issues via advecting a stochastic differential equation model with small noise (see the sketch after this list of points), or using the alternative representation via the Fokker-Planck equation (e.g., Chandrasekhar, Rev Modern Phys, 1943), which explicitly governs a probability density of particles, or using more sophisticated eddy diffusivity models. The results from the Fokker-Planck equation, for example, would be an explicit quantifier of Lagrangian dispersal. There is a substantial literature on these methods, also in use on oceanic data. Of course, running stochastic simulations of this nature, or attempting to solve the Fokker-Planck equation, may incur substantial computational costs, but these would seem more compelling approaches to Lagrangian dispersal.
8. There are emerging tools for avoiding stochastic simulation and explicitly obtaining quantifiers related to dispersal, in particular the idea of stochastic sensitivity (Balasuriya, SIAM Review, 2020), which has been shown to be robust when applied to oceanographic data (Badza, Physica D, 2023). The authors' approach here of seeking dispersal measures for each initial condition (as opposed to a large ensemble) at first glance appears similar to finding the stochastic sensitivity (a scaled variance of the final distribution around the deterministic final location, when the Lagrangian evolution is subject to ongoing model noise) at each member (initial) location. Another approach along these lines which appears relevant is that of Branicki and Uda (SIAM J Appl Dyn Sys, 2023).
9. The authors use several quantifiers for dispersal based on their Lagrangian simulations: mixture probability distributions, connectivity, entropy, and KL divergence. Computing each of these requires the final Lagrangian distributions. So, interpreting any of these depends strongly on whether those Lagrangian distributions were obtained in a physically reasonable fashion (see my earlier points on this). Because of this, it is hard for me to interpret any of these information-theoretic results.
10. It appears that a major point being made is that the full-ensemble variability is approximated by single-member simulations. This is stated in many places, but I am a little uncertain as to whether my interpretation of this statement is correct. My understanding is that "single-member" means one particular initial condition is chosen. By choosing "50 members" (line 158), Lagrangian simulations associated with 50 different initial conditions are chosen. Then the probability distributions associated with each of these 50 are combined in a mixture model to get the "full" distribution (a sketch of this reading appears after this list of points). Is this understanding correct? If so, it would appear that the 50 simulations collectively constitute the "full ensemble". If that is so, the claim that the "full ensemble" is obtained from "single-member" simulations, presumably demonstrating some advantage, isn't too different from simply saying that one is choosing a full ensemble comprising 50 members, presumably chosen to cover a region of interest at the initial time sufficiently well. If this is not true, things have not been expressed clearly. Basically, I think some clarification of this claim is necessary, explaining exactly what is meant and how it is achieved (and exactly what a "single-member simulation" means).
11. And finally to return to my preface to the numbered points: to interpret any of the results on dispersal, one needs to be comfortable that the strategies for computationally determining dispersal here have something to do with the physical issues which lead to dispersal. Many of the points above are related to this, asking for clarification as to why the actions used in this article to generate dispersed trajectories are meaningful, whether they can be parameterized in terms of something physical, effects of time, why diffusivity/stochasticity in the evolution is ignored, why other techniques which explicitly capture dispersion are not used, etc.
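Returning to point 5: here is a minimal sketch of the two readings of "uniform noise with an amplitude of 0.1 degrees" that I can see, with the release point taken from the abstract. Both variants are my assumptions about the wording, not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(42)
lon0, lat0, amp, n = -73.61, 35.61, 0.1, 100   # release point, amplitude, count

# Reading 1: independent uniform noise per coordinate (a square patch)
lons_sq = lon0 + rng.uniform(-amp, amp, n)
lats_sq = lat0 + rng.uniform(-amp, amp, n)

# Reading 2: uniform over a disc ("ball") of radius 0.1 degrees
r = amp * np.sqrt(rng.uniform(0.0, 1.0, n))    # sqrt gives uniform area density
theta = rng.uniform(0.0, 2.0 * np.pi, n)
lons_disc = lon0 + r * np.cos(theta)
lats_disc = lat0 + r * np.sin(theta)
```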
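Returning to point 7: the stochastic alternative can be sketched in a few lines as Euler-Maruyama integration of dX = u dt + sqrt(2K) dW with a constant, isotropic eddy diffusivity K. The velocity field and every parameter value below are toy placeholders, not anything from the manuscript:

```python
import numpy as np

def advect_sde(x0, y0, u, v, K, dt, n_steps, rng):
    """Euler-Maruyama for dX = u dt + sqrt(2K) dW; returns (n_steps+1, 2)."""
    xy = np.empty((n_steps + 1, 2))
    xy[0] = x0, y0
    sigma = np.sqrt(2.0 * K * dt)               # noise increment per step
    for k in range(n_steps):
        x, y = xy[k]
        t = k * dt
        xy[k + 1, 0] = x + u(x, y, t) * dt + sigma * rng.standard_normal()
        xy[k + 1, 1] = y + v(x, y, t) * dt + sigma * rng.standard_normal()
    return xy

# Toy unsteady velocity field, purely illustrative
u = lambda x, y, t: -y * (1.0 + 0.1 * np.sin(t))
v = lambda x, y, t: x
traj = advect_sde(1.0, 0.0, u, v, K=1e-3, dt=0.01, n_steps=1000,
                  rng=np.random.default_rng(0))
```

An ensemble of such trajectories with different noise realizations yields a particle cloud whose spread quantifies dispersal directly; solving the associated Fokker-Planck equation would give the same density without sampling.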
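Returning to point 10: if my reading is correct, the construction amounts to an equal-weight mixture of the 50 per-member binned distributions, as in the following sketch (the array shapes and weights are my assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
member_counts = rng.integers(0, 100, size=(50, 200))  # 50 members x 200 bins
member_pdfs = member_counts / member_counts.sum(axis=1, keepdims=True)

weights = np.full(50, 1.0 / 50.0)        # equal member weights w_m
mixture_pdf = weights @ member_pdfs      # P(x) = sum_m w_m P_m(x)
assert np.isclose(mixture_pdf.sum(), 1.0)
```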
Based on my comments above, I feel that a major revision would be necessary for this article to be acceptable for publication in Nonlinear Processes in Geophysics.
Citation: https://doi.org/10.5194/egusphere-2024-3847-RC1
RC2: 'Comment on egusphere-2024-3847', Anonymous Referee #2, 01 Mar 2025
The manuscript concerns ensembles of trajectories of tracers generated by ocean models. The authors wish to generate the "variability" of multiple ensembles by manipulating a single trajectory. The manipulations are primarily perturbing an initial condition in space, or starting the trajectory at a different time. In order for such a study to be useful, pinning down exactly what "variability" means is crucial. The word "variability" is used extensively in the first several pages of the manuscript without stating what it actually is; I think perhaps on page 7, in relation to "connectivity", the reader begins to see what might be meant. As far as I could tell, according to this manuscript, variability means either "connectivity" or various types of entropy of coarse-grained future distributions of trajectories. A case is not clearly made for why these quantities are useful for ocean dynamicists or oceanographers. That is, why are these quantities the gold standard by which oceanographers should assess "sameness" of (collections of) trajectories?
The manuscript does not mention models where subgrid-scale dynamics is simulated e.g. stochastically. This would appear to be a very relevant set of comparators. It is also well known that trajectories are influenced by the resolution of the model grid, and that very different dynamics can arise from the same model with different resolutions. This aspect is also not addressed; as far as I understand, only a single 1/4 degree model is used.
Line 90: "The first strategy varies the release locations". Isn't this exactly part of what one does with ensemble generation? What is the difference?
Section 2.2: I believe it is a poor choice to put the first results in an appendix. In fact, the manuscript reads as though it was recently chopped up and rearranged: the constant referral to the Appendices makes it impossible to read from start to finish.
The Appendices contain definitions and details that the reader has not yet come across when reading from start to finish. I would strongly recommend removing the appendices and putting the material in the body of the paper to help with the narrative flow.
Section 2.4: I could not understand how the probability distributions were being formed. The description is wordy, vague, and a bit sloppy. It needs precision and some formulas wouldn't hurt. Is a hexagon a bin? I could not find it stated.
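For example, a definition of the form below (my guess at the intended construction, with hexagonal cells as the bins) would remove the ambiguity:

$$ P_t(x_i) = \frac{n_i(t)}{\sum_j n_j(t)}, $$

where $n_i(t)$ is the number of particles inside hexagonal cell $x_i$ at particle age $t$, so that $\sum_i P_t(x_i) = 1$.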
Section 2.5: I could not understand what a mixture probability distribution is. This seems to be a crucial object in the manuscript, but the description was brief and ambiguous. Again, some formulas may help. There is a discussion about the optimal number of particles. In what sense optimal? Again the reader is referred to an Appendix that is too brief and does not provide any insight.
Section 2.6: Connectivity is not defined, and it is not explained in the Appendix. What is it?
Section 2.7: Similar to section 2.4, the section is written verbosely and somewhat sloppily, to the extent that I could not understand what the various definitions were.
Line 202: what does "ensemble of bins" mean?
Line 203: "t is the particle age of the distribution". This may make sense if the distribution was unambiguously defined at some point.
Line 210: The authors write "P_A(X) = (1/2, 1/4, 1/8, 1/8)". As far as I understood from e.g. line 201, P(x_i) should be the probability of event x_i occurring, i.e. a number. Therefore, since X = (x_1, x_2, x_3, x_4) -- see line 211 -- P_A(X) should equal 1, not a string of probabilities. This is just one example of the vague writing. If the authors really want P_A(X) to be a string of probabilities, that is fine: define some suitable object, and ensure that the writing is clear and consistent.
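For reference, under the string-of-probabilities reading, the Shannon entropy of the quoted example evaluates to

$$ H(P_A) = -\sum_{i=1}^{4} p_i \log_2 p_i = \tfrac{1}{2}\cdot 1 + \tfrac{1}{4}\cdot 2 + \tfrac{1}{8}\cdot 3 + \tfrac{1}{8}\cdot 3 = 1.75\ \text{bits}, $$

which is presumably the computation the manuscript intends; the writing should make that explicit.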
In summary, on the basis of both the writing and the scientific impact, I do not recommend publication.
The manuscript could be improved by going back to the drawing board and asking what exactly are the properties by which it is *most meaningful* to oceanographers to compare trajectories or ensembles of trajectories. Strong justifications and illustrations would need to be provided. Comparing with ocean models of different types, including stochastic components, and across different grid resolutions would add to the robustness and generalization of the subsequent results. A linear narrative and a much more precise presentation would also be required.
Citation: https://doi.org/10.5194/egusphere-2024-3847-RC2
Model code and software
Model code C. M. Pierard https://github.com/OceanParcels/NEMO_Ensemble_Lagrangian_Analysis.git
Viewed
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 209 | 34 | 12 | 255 | 9 | 8 |
Viewed (geographical distribution)
| Country | Rank | Views | % |
|---|---|---|---|
| United States of America | 1 | 59 | 25 |
| United Kingdom | 2 | 52 | 22 |
| China | 3 | 16 | 6 |
| France | 4 | 15 | 6 |
| Netherlands | 5 | 10 | 4 |