the Creative Commons Attribution 4.0 License.
MultilayerPy (v1.0): A Python-based framework for building, running and optimising kinetic multi-layer models of aerosols and films
Abstract. Kinetic multi-layer models of aerosols and films have become the state-of-the-art method of describing complex aerosol processes at particle and film level. We present MultilayerPy: an open-source framework for building, running and optimising kinetic multi-layer models – namely the kinetic multi-layer model of aerosol surface and bulk chemistry (KM-SUB), and the kinetic multi-layer model of gas-particle interactions in aerosols and clouds (KM-GAP). The modular nature of this package allows the user to iterate through various reaction schemes, diffusion regimes and experimental conditions in a systematic way. Models can therefore be customised at the building-block level, and the raw model code, which MultilayerPy produces in a readable form, can also be edited directly. Optimisation to experimental data using local or global optimisation algorithms is included in the package along with the option to carry out statistical sampling and Bayesian inference of model parameters with a Markov Chain Monte Carlo (MCMC) sampler (via the emcee Python package). MultilayerPy abstracts the model building process into separate building blocks, increasing the reproducibility of results and minimising human error. This paper describes the general functionality of MultilayerPy and demonstrates this with use cases based on the oleic acid-ozone heterogeneous reaction system. The tutorials in the source code (written as Jupyter notebooks) and the documentation aim to encourage users to take advantage of this tool, which is intended to be developed in conjunction with the user base.
Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Interactive discussion
Status: closed
RC1: 'Comment on egusphere-2022-143', Anonymous Referee #1, 30 May 2022
This work develops an open-source framework, MultilayerPy, to build and run kinetic multi-layer models. Researchers can use this framework to choose a reaction scheme, diffusion regime, and model components to suit particular aims. It is a reproducible process. Local and global optimization are applied and further tested on the oleic acid-ozone heterogeneous reaction system. The work is well-presented, and the framework makes comparing aerosol models with experimental data easier. I have one major comment and several minor comments.
Major comments:
Overall, the number of case studies is insufficient to assess the framework's potential applications. For example, the current case studies are not convincing in testing the multi-layer modeling ability of MultilayerPy. As shown in Figure 3, there is no oleic acid concentration gradient between the layers. Could the authors provide more cases to test the ability to represent differences between layers? Also, will it be easy to find other reaction systems to test the differences between KM-SUB and KM-GAP, and how will the optimization algorithms address those differences?
Minor comments:
Line 29--33: Consider including more references about the specific scientific questions KM-SUB and KM-GAP have resolved. That can be the potential application of MultilayerPy.
Line 54: typo ``readable''
Line 62: I think the flexibility of adding unique processes to the framework is an important feature. However, the steps to achieve this are not well-documented in the rest of the paper. I recommend adding some descriptions about this at the steps listed in lines 93--100.
Figure 3: Would it be better to combine (b) and (c) together? It will make the differences more evident.
Line 252: In addition to the code, it would be easier for readers to follow if the steps to create the cases were briefly documented.
Line 280: How did you get the optimized value of $\alpha_{\rm s,0, ozone}$? Did you use the same model data used for the MCMC sampling procedure? I am thinking of the distinction between training and validation data in machine learning.
Citation: https://doi.org/10.5194/egusphere-2022-143-RC1
RC2: 'Comment on egusphere-2022-143', Anonymous Referee #2, 07 Jun 2022
General Comments
This manuscript develops an open-source package which can build kinetic multi-layer models. These models are useful in studying bulk-surface interactions for particles and the interplay between chemistry and thermodynamics. The ozonolysis of oleic acid system, which has been studied with both experimental work and kinetic modeling, is used as the most extensive example. Optimization of model parameters with experimental data is also explored via MCMC sampling techniques. Overall, this manuscript is easy to follow and this code base is going to be an extremely valuable tool for the community, though I believe a few additions to the notebooks could be helpful for those less familiar with code development (which is where this manuscript has the potential to have the biggest impact).
Major Comments
This manuscript intends to reduce the workload for those attempting to use kinetics modeling, but there are parts of the notebooks which I believe to be a bit difficult for novice users (regardless of their knowledge of the Python language). I spent quite a bit of time playing with the documents, and first would like to thank the authors, as I believe this tool is going to be quite useful for many in the field moving forward. I have a few major notes to make about the code base:
- What about the case where a user doesn’t want to consider a reaction? A good example for using this model could be in the case of water condensation, which is explored in the KM-GAP case studies (3.1). It seems that all chemical components need to be defined via the reaction scheme, but in the case of water diffusing through an organic aerosol, perhaps you are not interested in any further chemistry. One could just create a set of reactions with rate constants all equal to 0 so the chemistry doesn’t occur, but this doesn’t seem like a ‘best practice’ way to code this up. Perhaps I missed how to do this more elegantly, though, in which case it would be good to highlight this in the code or manuscript.
- In general, the units are difficult to follow. While Table A1 provides the unit description for a KM-SUB application, Table S2 provides unit values. There is a challenge in the code in identifying what the true input values are. Take parameter `delta_1`: it is reported as 0.8 nm but you have to input the unit as cm. If you take both tables together then you can decipher what’s happening, but I think it might be useful just to comment off the units after the parameters in the dictionary. It was just difficult for me to change parameters on a whim to representative values.
- Consider adding a ‘docs’ section; it would be useful beyond the provided notebooks for understanding this code base.
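On the units point above, the suggestion of annotating each dictionary entry with its unit can be sketched as follows. This is a minimal illustration only: the helper `nm_to_cm` and the dictionary layout are hypothetical and are not taken from MultilayerPy; only the `delta_1` value of 0.8 nm and the requirement to supply it in cm come from the comment above.

```python
# Hypothetical helper and dictionary layout (not MultilayerPy's actual API).
NM_TO_CM = 1e-7  # 1 nm = 1e-7 cm


def nm_to_cm(value_nm):
    """Convert a length in nanometres to centimetres."""
    return value_nm * NM_TO_CM


# Each entry carries its input unit in a trailing comment, so a value quoted
# in nm in a table can be entered without converting it by hand.
params = {
    "delta_1": nm_to_cm(0.8),  # cm (reported as 0.8 nm)
}

print(f"delta_1 = {params['delta_1']:.1e} cm")
```

Keeping the conversion in one named function means the value quoted in the tables (0.8 nm) stays visible in the code next to the unit the model expects.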
Minor Comments:
Line 200: You comment that MCMC cannot be used to compare two different models. What exactly do you imply by this? MCMC in the context of this paper is used to sample the posterior distribution and could for different reaction systems extract the values of, for instance, accommodation coefficients. These coefficients are themselves comparable to the extent that the model represents the physical system. So, in that sense you can compare ‘between models’. Essentially, I am just wondering what you originally intended to say.
Further, I have a question about the emcee package: are the final extracted parameters based on the lowest value of the cost function in the Markov chain? Or is a separate optimizer run on that cost function to optimize the parameters? While MCMC itself tends toward the best-fit parameters, it does not create the best-fit parameters.
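The distinction raised in this question can be illustrated with a toy example. The sketch below does not use emcee or MultilayerPy: it runs a plain random-walk Metropolis sampler (a stand-in for the ensemble sampler) on an invented one-parameter Gaussian model, then reports two different summaries of the same chain, namely the single sample with the highest log-probability (equivalently, the lowest cost) and the posterior median. All names and data values are assumptions made for illustration.

```python
import math
import random
import statistics


def log_prob(theta, data, sigma=1.0):
    """Gaussian log-likelihood with a flat prior (illustrative toy model)."""
    return -0.5 * sum((y - theta) ** 2 for y in data) / sigma ** 2


def metropolis(data, n_steps=5000, start=0.0, step=0.5, seed=1):
    """Random-walk Metropolis; returns the chain and its log-probabilities."""
    rng = random.Random(seed)
    theta, lp = start, log_prob(start, data)
    chain, log_probs = [], []
    for _ in range(n_steps):
        proposal = theta + rng.gauss(0.0, step)
        lp_new = log_prob(proposal, data)
        # Accept with probability min(1, exp(lp_new - lp)).
        if lp_new >= lp or rng.random() < math.exp(lp_new - lp):
            theta, lp = proposal, lp_new
        chain.append(theta)
        log_probs.append(lp)
    return chain, log_probs


data = [1.8, 2.1, 2.4, 1.9, 2.3]  # invented observations, sample mean 2.1
chain, log_probs = metropolis(data)

# Summary 1: the visited sample with the highest log-probability (lowest cost).
best_index = max(range(len(chain)), key=log_probs.__getitem__)
max_prob_sample = chain[best_index]

# Summary 2: the median of the sampled posterior.
posterior_median = statistics.median(chain)

print("highest-probability sample:", max_prob_sample)
print("posterior median:", posterior_median)
```

For a symmetric, single-peaked posterior such as this one the two summaries nearly coincide; for skewed or multi-modal posteriors they can differ, which is why it matters which one is reported as the "final" parameter value.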
Line 262: The authors state that differences in the model performance result from a lack of time-resolved particle size information after 17 min. While this point is subtle, the differences after 17 min are because of the physical differences in the model parameters, which in part result from the optimization on only 17 min of data.
Line 301: You reference the strong correlation observed in Fig. S1. I am wondering whether the burn-in period is removed in the 2D histogram; it appears you can see the walk, as evidenced in the extremities of both the 1D and 2D histograms. It is not atypical in many applications to remove the MCMC burn-in when visualizing the posterior. With more restricted bounds it may be easier to view the correlation.
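The effect of burn-in on the histogram extremities can be sketched with a toy chain. The example below is an assumption-laden illustration, not the authors' workflow: a random-walk Metropolis sampler on a standard normal target is deliberately started far from the mode, and the first steps are then discarded. In emcee, the `discard` argument of `EnsembleSampler.get_chain` serves the same purpose; the function names and the burn-in length of 1000 steps here are hypothetical.

```python
import math
import random


def sample_chain(n_steps=4000, start=10.0, step=0.3, seed=2):
    """Random-walk Metropolis on a standard normal, started far from the mode."""
    rng = random.Random(seed)
    x = start
    chain = []
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step)
        # Log of the target density ratio p(proposal) / p(x).
        log_ratio = 0.5 * (x * x - proposal * proposal)
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = proposal
        chain.append(x)
    return chain


def discard_burn_in(chain, n_burn):
    """Drop the first n_burn steps, during which the walker approaches the mode."""
    return chain[n_burn:]


chain = sample_chain()
kept = discard_burn_in(chain, n_burn=1000)

# The full chain still contains the walk in from the starting point, which
# would appear as a tail at the edge of a histogram; the trimmed chain does not.
print("max of full chain:", max(chain))
print("max after discarding burn-in:", max(kept))
```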
Table S2: check the labeling of [3]nonanal and [4]products and the consistency in the subscripts of the variables in the table. I think they may be switched.
In the simulate code (`simulate.py`), if you turn the save figs on, the figures save on a 0-based indexing rather than a 1-based indexing. The problem here is that for someone not privy to coding, this may be confusing in interpreting which reaction component is which (because the components are listed into the model with a 1-based indexing).
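One way to avoid the mismatch is to number saved figures from 1 so that the file names line up with the component numbering. The sketch below is hypothetical: the function name, file-name pattern and component names are illustrative and are not taken from `simulate.py`.

```python
def figure_filenames(component_names, prefix="component"):
    """Build one file name per component, numbered from 1 rather than 0."""
    return [
        f"{prefix}_{number}_{name.replace(' ', '_')}.png"
        for number, name in enumerate(component_names, start=1)
    ]


# Illustrative component list; the order here is an assumption.
names = ["oleic acid", "ozone"]
filenames = figure_filenames(names)

for filename in filenames:
    print(filename)
```

Including the component name in the file name as well as its number removes any remaining ambiguity for users who are not familiar with indexing conventions.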
Editorial
Line 11: insert ‘the’ before ‘particle and film level’
Line 33: replace ‘were’ with ‘are’
Line 306: Should be ‘encourage’ (drop the s).
Citation: https://doi.org/10.5194/egusphere-2022-143-RC2
AC1: 'Comment on egusphere-2022-143', Christian Pfrang, 16 Jul 2022
Adam Milsom
Amy Lees
Adam M. Squires