the Creative Commons Attribution 4.0 License.
CHONK 1.0: landscape evolution framework: cellular automata meets graph theory
Boris Gailleton
Luca Malatesta
Guillaume Cordonnier
Jean Braun
Abstract. Landscape Evolution Models (LEMs) are prime tools to simulate the evolution of source-to-sink systems through ranges of spatial and temporal scales. A plethora of different empirical laws have been successfully applied to describe the different parts of these systems: fluvial erosion, sediment transport and deposition, hillslope diffusion, or hydrology. Numerical frameworks exist to facilitate the combination of different subsets of laws, mostly by superposing grids of fluxes calculated independently. However, the exercise becomes increasingly challenging when the different laws are interconnected: for example when a lake breaks the upstream-downstream continuum of the amount of sediment and water it receives and transmits; or when erosional efficiency depends on the composition of a sediment flux affected by multiple processes. In this contribution, we present a method mixing the advantages of cellular automata and graph theory to address such cases. We demonstrate how the former guarantees finite knowledge of all fluxes independently from the process law implemented in the model while the latter offers a wide range of tools to process numerical landscapes, including landscapes with closed basins. We provide three scenarios largely benefiting from our method: i) one where lake systems are primary controls on landscape evolution, ii) one where sediment provenance is closely monitored through the stratigraphy and iii) one where heterogeneous provenance influences fluvial incision dynamically. We finally outline the way forward to make this method more generic and flexible.

Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Journal article(s) based on this preprint
Boris Gailleton et al.
Interactive discussion
Status: closed

RC1: 'Comment on egusphere-2022-1394', Kerry Callaghan, 01 Feb 2023
In this paper, the authors present a new framework for landscape evolution simulation, dubbed CHONK 1.0, that combines aspects of both graph theory and cellular automata methods. CHONK 1.0 uses D8 flow directions to construct a directed acyclic graph for the landscape, then uses this to break the landscape into depressions that are linked in a binary tree structure. Both water and sediment fluxes into and across lakes are simulated. The authors show examples of several different cases that this framework could be applied to.
I am glad to see more acknowledgment of the impacts of landscape complexity – specifically, closed lake depressions – in this work. I am excited to see the range of potential applications that the authors cover here. However, there are a few issues with this iteration of the paper, including:

A lack of clarity around the differences between the methods presented here and the works by Barnes (2019, 2021). This is of particular importance given the focus of GMD on model development.

More information about model input data and general use is needed.

There is a lack of clarity in the explanations in some of the application sections.
Line by line comments:
Line 89: “Cellular automata models are reduced complexity models”
followed by
line 94: “Cellular automata methods are not restrained to reduced complexity models”
My best guess here is that the authors meant that cellular automata models are often reduced complexity, but not always; but as it currently reads these are just two conflicting statements.
Line 99: “One limitation of a purely cellular automata model is that cells are processed at the same time at each time step, which is not compatible with quantities that are informed by spatial integration like the downstream accumulation of drainage area or sediment flux”
I found this statement unclear. Do you mean that all cells in a cellular automata model are processed at the same time? Or do you mean that the processing order is always the same?
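The distinction raised in this question can be made concrete with a minimal sketch. It assumes a single-receiver (D8-like) flow graph stored as a flat list; the function name `accumulate_area` and the `receivers` list are illustrative and are not taken from CHONK. Drainage area is an upstream-integrated quantity, so each cell has to be visited after all of its donors, here by sorting cells by decreasing elevation.

```python
def accumulate_area(elevation, receivers, cell_area=1.0):
    """Accumulate drainage area along a single-receiver flow graph.

    elevation : list of cell elevations
    receivers : list of ints, receivers[i] is the cell that i drains to
                (receivers[i] == i marks an outlet or a pit)
    """
    area = [cell_area] * len(elevation)
    # Visiting cells from highest to lowest ensures every donor is
    # processed before its receiver, because flow only goes downhill.
    order = sorted(range(len(elevation)), key=lambda i: elevation[i], reverse=True)
    for i in order:
        r = receivers[i]
        if r != i:
            area[r] += area[i]
    return area


# Four cells in a row draining 0 -> 1 -> 2 -> 3, with cell 3 as the outlet.
elevation = [4.0, 3.0, 2.0, 1.0]
receivers = [1, 2, 3, 3]
area = accumulate_area(elevation, receivers)
```

In a strictly synchronous update, where every cell only reads values from the previous sweep, the contribution of a cell would travel one step downstream per sweep, which is presumably the incompatibility the quoted sentence refers to.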
Line 104: “but every nodes on the grid is treated as a cell” and also
line 176: “Each individual node of the matrix becomes a cell, noted i”
Can you define a cell and a node and/or clarify the difference between them?
Lines 143-147: “It is worth noting that some algorithms have been specifically develop to explicitly process, calculate and fill depressions with arbitrarily given amount of water (e.g. L. Callaghan and D. Wickert, 2019; Barnes et al., 2019, 2021). However, these methods are only designed to fill pits with water and would require significant amount of modifications to be utilised as cellular automata processor, or even to any other purpose than what they are designed for.”
I will note that I am an author on the papers cited here. I think there may be some misrepresentation or misunderstanding of these works.
Callaghan & Wickert (2019) is basically a reduced complexity cellular automata method already (see section 5.2 in that paper), though it is true that it is only set up to move water into pits in that paper.
Barnes et al (2019) describes a graph structure that can be used to link depressions in a landscape to one another (essentially, the same or very similar structure as used in this work). While the focus of subsequent work has been on filling these depressions with water, the actual data structure has other potential applications, including topographic modification and statistical analysis of landscapes.
Barnes et al (2021) focuses on using Barnes et al (2019) to fill pits with water. However, it incorporates an option that would activate a cellular automata method: the option to include infiltration of water during downslope flow when moving water to fill depressions. This functionality already exists and, in contrast to the statement made in this work, it would not be that big of a leap to incorporate additional complexity into the cellular automaton portion.
Line 205: “Finally an algorithm takes advantage of the modified DAG to calculate the depression-aware topological order.”
Did you mean finally ‘the’ algorithm?
Line 248: “For example if the current cell is already labelled with another depression index in which case both are labelled as twins and this cell represent their tipping node”
Line 250: “If one of the checked neighbours has a lower elevation, the cell is labeled as outlet and this depression will be the top one if not later labeled as twin.”
I find these explanations to be unclear.
Line 258: “the minimum volume of a depression (0 if base depression),”
How does this differ from the volume of the depression listed on the line above?
Lines 255-263: “Our implementation only slightly differs from Barnes et al. (2019). The original algorithm is tuned for computing speed, we compute more information about each depression useful for the model: – the volume of the depression Vtotal, note that it includes the volume of their children if any, – the minimum volume of a depression (0 if base depression), – a depression level, which represents the maximum distance in the tree from a base depression. Each base depression is at level 0, and each parent’s level is equal to the maximum level of their children plus 1, – the tipping node of the depression, which represents either the outlet of the whole subsystem, or the node joining two twins – the maximum elevation of the depression if filled.”
More clarity is needed around the difference between this implementation and Barnes et al (2019) including why and how the differences are important for this work and why a new framework was needed to achieve this. Note that Barnes et al (2019) does also compute depression volume, tipping node, maximum elevation, and other related information about the depressions. This structure is a major part of the work presented and so it is important to clarify this and to focus on the aspects of the work that are new.
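As a reading aid for the quoted list, the level rule and the inclusive total volume can be sketched with a post-order traversal of the depression tree. This is a minimal illustration under stated assumptions, not the CHONK or Barnes et al. (2019) implementation: the `Depression` class, the `own_volume` field (the volume not already counted in the children) and the `annotate` function are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Depression:
    own_volume: float  # assumed: volume not already counted in the children
    children: List["Depression"] = field(default_factory=list)
    level: int = 0
    total_volume: float = 0.0


def annotate(dep):
    """Post-order traversal: children are annotated before their parent."""
    for child in dep.children:
        annotate(child)
    # Base depressions sit at level 0; a parent is one above its highest child.
    dep.level = (1 + max(c.level for c in dep.children)) if dep.children else 0
    # The total volume includes the volume of all children, as in the quote.
    dep.total_volume = dep.own_volume + sum(c.total_volume for c in dep.children)
    return dep


# Two base depressions merge into a parent, which merges with a third one.
a, b, c = Depression(2.0), Depression(3.0), Depression(1.0)
parent = Depression(1.5, [a, b])
root = annotate(Depression(4.0, [parent, c]))
```

With this toy tree, the base depressions stay at level 0, `parent` is at level 1 and `root` at level 2, and each total volume is the sum over the whole subtree.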
Line 266: “based on matrices Evaporation rates”
Here a possible input matrix, evaporation, is mentioned. It would be helpful if there were information regarding what inputs are required, or possible, to use CHONK 1.0.
Line 290: “Some of these properties, like erosion or water for example, are reinitialised at each time step while others like topography or sediment thickness gets updated at each time step”
It is not clear what the difference is between a reinitialisation and an update.
Lines 292-297: “Three kinds of parameter inputs are currently available. First, external parameters which can be single values (e.g. dx, dy, dt), global arrays (e.g. 2D matrices of precipitation or uplift), spatially varying or even varying through time. Second, parameters that are label-dependent: a 2D matrix of labels defines discrete spatial areas and each label has a set of distinct parameters, for example different rock-type can be associated with different erodibility and diffusivity (Gailleton, 2021). And third parameters that are fully dynamic: they are interdependent of each other and define by a function rather than a given value. Example of the latter are detailed in section 4.4.”
I found it difficult to imagine which types of parameters might match up with which of these three classes listed. This is another place where information about the required and/or possible inputs would be helpful.
Lines 325-346 discuss the method for filling lakes with water and computing the water elevation.
Although I can see that the methods are not identical to those used in Barnes et al (2021), they are similar and the aim – to compute lake water levels – seems to be the same. Can you clarify how this differs from Barnes et al (2021) and why this was necessary to achieve the aims of the rest of the paper?
Lines 476-478: “It quickly becomes disconnected from the foreland as the blue linessets as described below must be submitted to specialized repositories. Please consider providing at least preliminary links to such assets for the period of minimum elevation shows (fig. 5C).”
I’m struggling to make sense of these sentences and wondering if they were in an incomplete state or a copy/paste gone wrong?
Pages 18-23:
These examples are great, but I did find myself wanting some more information. For example, what were the time steps used? What method was used to simulate the tectonic change? Are there more figures that might be informative? (for example, in application 1, is a graph of water flux out of the lake informative? Can a viewer learn something from seeing the result at more than one point in time?)
Within this, pages 21-23:
I found the explanations of these applications to be significantly less clear. I found section 4.3 particularly confusing, both in the text describing this and in Figure 7 which I found difficult to interpret. More explanation of the colours used and description of what is being viewed in each panel would go a long way.
Line 545: “With hard tool enhancing incision, the area downstream of the harder rocks lowers its base level which will propagate knickpoints up tributaries, regardless of their lithology” (and the rest of this section)
How do you deal with mixed sediments, i.e. in the case where water moves over hard, then soft, then hard again? The sediments should be a mix of hard and soft. Are they assumed to mix evenly?
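One possible reading of "mix evenly" is a flux-weighted average of the sediment composition wherever fluxes merge. The sketch below only illustrates that assumption; it is not taken from the CHONK source, and the function name `mix_fluxes` and the labels `"hard"` and `"soft"` are hypothetical.

```python
def mix_fluxes(incoming):
    """Combine sediment fluxes, each given as (flux, {label: fraction}).

    Returns (total_flux, {label: fraction}), assuming perfect mixing so
    that each label is weighted by the flux that carries it.
    """
    total = sum(flux for flux, _ in incoming)
    if total == 0:
        return 0.0, {}
    mixed = {}
    for flux, fractions in incoming:
        for label, frac in fractions.items():
            mixed[label] = mixed.get(label, 0.0) + flux * frac / total
    return total, mixed


# Three units of sediment from hard rock meet one unit from soft rock.
total, composition = mix_fluxes([(3.0, {"hard": 1.0}), (1.0, {"soft": 1.0})])
```

Under this assumption the merged flux would be three quarters hard and one quarter soft, and a later contribution would be mixed into that composition in the same way.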
Citation: https://doi.org/10.5194/egusphere-2022-1394-RC1

RC2: 'Comment on egusphere-2022-1394', Sebastien Carretier, 03 Feb 2023
This manuscript presents a new landform evolution model, CHONK, with three major advances, that of filling lakes dynamically, of taking into account the tool effect in the abrasion of river bedrock as a function of the lithology of the transported sediments and that of being able to trace several properties of the sources of eroded rocks upstream. This opens up many possibilities for source-to-sink studies. After several readings, however, I found it difficult to understand why this new model was a different philosophical approach to other LEMs. I have the impression that it is more a matter of assembling relevant elements from previous models to allow for sediment and water tracking rather than a real change in philosophy (cellular automata + graph theory). I think that part of my misunderstanding comes from a language that uses implicit formulas and shortcuts that, for me, do not simplify the discourse but rather make it more obscure. I provide an annotated pdf pointing to all these elements of language that I did not fully understand. I quote one here as an example to illustrate my misunderstanding (line 320): "When the implicit lake solver is activated, lakes are not processed differently than the rest of the landscape, but cells in claulated areas affected by flow rerouting have reduced topographic gradient and less direct connection to the rest of the landscape, effectively simulating a "passive" landscape." What is "flow rerouting"? Why is the topographic gradient reduced? What is a passive landscape?
To justify this new model, the authors point out that the previous models deal separately with water discharge, lake resolution, sediment transport, etc. But here I do not see where the fundamental difference lies. LEMs that solve the topographic evolution explicitly (in terms of numerical scheme) start by propagating the water discharge, possibly by filling the lakes, then once the water flow field is known, these LEMs compute the erosion-sedimentation balance on each cell, in cascade (See review in Tucker and Hancock, 2010, ESPL). The order in which the grid cells are processed can vary but is always from upstream to downstream, possibly distinguishing between catchments (as in Fastscape). The fact of classifying the cells in order of decreasing altitude, which was already present in the first models such as GOLEM or DRAINAL, determines a graph. The calculation of the water and sediment balance in these models follows the logic of cellular automata. There is nonlocality in some of these models, associated with a transport length, so that these LEMs have long since combined graphs, cellular automata and the consideration of the upstream-downstream connection for water and sediment flows. Moreover, the only LEM to my knowledge that can really solve both water and sediment balances at the same time is EROS (now River.lab) with its purely Lagrangian approach using "precipitons". In short, while acknowledging the significant advances of this new model regarding lakes, the tool effect and source tracing (see also the Badlands model in Petit et al., Esurf 2023), I think the manuscript would benefit from a better explanation of the fundamental difference between the approach implemented in CHONK and other LEMs. This remark may seem unfair because a large introductory part of the manuscript is dedicated to justifying this difference.
I suggest giving more precise examples in this part and improving the discussion to explain why the numerical scheme of the existing models would not allow the simulations presented in this manuscript to be performed. The necessary corrections are therefore essentially a rewriting of certain paragraphs.
As this is a manuscript submitted in a journal that presents the algorithm of the models rather than their applications around a scientific question, I think a pseudo code of this model should be presented, to first order, with at least the order of operations. At present, if I were to recode CHONK, I am not sure that I have the main elements to do so in what is presented in this manuscript.
The simulations presented as illustrations are quite relevant but no parameter values are given. A table should be added. It would also be interesting to have some indication of the calculation time of these simulations.
I am available to interact with authors to give more details if needed.
See my specific comments in the annotated manuscript.
Good luck with the revision.
Sebastien Carretier

EC1: 'Comment on egusphere-2022-1394', Andrew Wickert, 08 Feb 2023
I encourage the authors to respond to the constructive criticism from the referees and to begin preparing a revised manuscript.
Citation: https://doi.org/10.5194/egusphere-2022-1394-EC1

AC1: 'Response to Referees', Boris Gailleton, 23 Jun 2023
Model code and software
CHONK 1.0: prototype Boris Gailleton https://github.com/bgailleton/CHONK