GECKO-A v1.0: Exploring VOC Oxidation Trajectories Through Comparison with the Master Chemical Mechanism
Abstract. Numerical models are crucial tools for understanding complex chemical systems such as the atmosphere, and their sensitivity across a range of conditions. In atmospheric chemistry models, reaction mechanisms are used to represent chemical transformations and define the underlying system of equations. Building highly explicit mechanisms that capture the full complexity of organic oxidation occurring in the atmosphere remains challenging owing to the large number of intermediates involved, the breadth of reaction pathways, and the limited availability of reliable kinetic and thermodynamic data. The Generator for Explicit Chemistry and Kinetics of Organics in the Atmosphere (GECKO-A) was developed to address these limitations by enabling the systematic construction of near-explicit mechanisms. Here, we present its first open-source release (v1.0), which incorporates updated chemical protocols and structure-activity relationships, together with its companion box model for mechanism integration. GECKO-A's performance is evaluated through systematic comparisons with the Master Chemical Mechanism (MCM v3.3.1), based on simulations of the oxidation of five representative hydrocarbons (butane, octane, dodecane, toluene, and α-pinene) under environmental conditions ranging from urban to remote. The two approaches yield similar oxidation pathways for small and structurally simple compounds. However, differences increase with the size and complexity of the carbon backbone. In particular, the simplifications inherent to the MCM tend to limit the formation of multifunctional products and promote earlier fragmentation, resulting in notable discrepancies in the predicted volatility of secondary organic carbon and, consequently, in secondary organic aerosol yields.
This paper describes the first open-source release of the GECKO-A mechanism generation system and illustrates its capabilities by comparing its predictions for several representative compounds against those of the current version of the widely used Master Chemical Mechanism (MCM). It is classified as a model description paper, but does not do a very good job in this regard, for reasons discussed below. Its main contribution to the literature is showing that mechanisms must adequately represent multi-generation chemistry (i.e., reactions of products of products of products, etc.) in order to predict formation of secondary organic aerosol (SOA) and water-soluble organic products. GECKO-A is capable of generating detailed mechanisms representing all generations of reactions leading to CO2 or nonvolatile products, while the MCM, though almost as detailed in its representation of first- and second-generation chemistry, only approximates the chemistry after the first two generations. This work shows that this approximation causes gross underpredictions of SOA and soluble products and also affects the predicted distributions of the types of organic products formed.
This demonstration of the unsuitability of mechanisms that do not treat 3+ generation reactions adequately for predicting SOA or condensable products is important and needs to be published. This inadequate treatment of multi-generation chemistry is not unique to MCM; it is a characteristic of every mechanism, however detailed, that is used in current atmospheric models that attempt to simulate realistic atmospheric mixtures. Although multi-generation mechanisms such as those derived using GECKO-A are much too large for practical atmospheric modeling applications, they are essential for evaluating the predictive capability of mechanisms used in such models, and for eventually developing more predictive mechanisms of a more practical size. This paper is important in clearly demonstrating the need for this work.
GECKO-A is not a new model; it was first described in 2005 and has been used in a number of published studies since then. Other than some updates to the chemistry and the types of compounds that can be handled, it is unclear how its capabilities and scientific utility now differ from those of the version described 20 years ago. However, the fact that it is now archived and open source, with presumably improved and more modular software, will make this important model more sustainable for the future, and describing its capabilities in a model description paper in GMD is entirely appropriate.
Unfortunately, this manuscript has several deficiencies as a model description paper. It gives only a brief description of how GECKO-A operates and what it does, though this is probably sufficient because these aspects have already been well described in the literature. It does describe the various user-configurable parameters that control the reductions needed to limit the size of the mechanisms produced, but it does not give enough examples of how the parameters affect the output other than mechanism size. Users of the model need to know how these parameters affect model predictions in order to understand the effects of the approximation each represents and to decide which parameter choices are appropriate for their applications. In addition, although the paper provides links where one can obtain the software, and there are links to run a limited version of the model online, no user manual or instructions on how to install and use the software can be found in this manuscript or any of its citations. Although the comparisons with MCM, which take up the bulk of the paper, are characterized as an evaluation of GECKO-A, this is in fact more of an evaluation of MCM.
It is up to the editors of GMD to decide whether this manuscript can be accepted in this form as a model description paper. If I were in their position, I would require at least a link to instructions for installation and use, and much more information on how varying the parameters affects model predictions. Failing that, the manuscript could be recast as an evaluation of MCM, and essentially all the results of the comparison could be published in an appropriate journal on that basis. This would certainly be publishable and a substantial contribution to modeling science.
If changes are made to make this more suitable as a model description paper, my suggestion is that the MCM comparisons, which certainly stand alone, be made a separate paper, with the comparisons in the model description paper restricted to comparing MCM with GECKO-A mechanisms limited to the two generations of chemistry that the MCM is designed to represent (i.e., using "maxgen"=2). Otherwise, this is not really a fair comparison, and the multi-generation effects, which can be investigated more cleanly by comparing GECKO-A predictions with different generation limitations, dominate the results. Showing the effects of varying generation limitations and other user-modifiable parameters, with all else held the same, should be a part of the model description paper. The second paper, or the second part of a two-part paper, can then use the more comprehensive comparison with MCM to emphasize the importance of representing multi-generation effects in atmospheric models, and its implications for the development of mechanisms for models in general.
Given below is a discussion of various parts of the manuscript, with more detail given concerning some of my concerns, and some recommendations for improvements that the authors should consider.
Title, Abstract, and Introduction
I'm not sure that the title is optimal for this manuscript as written, though if it were split into two papers, or if the model description aspects were de-emphasized and it were submitted to another journal as an MCM comparison, then the title would change anyway. My comments below assume that the overall structure of the paper is retained.
The abstract states that GECKO-A's performance is being evaluated against MCM, but the results and discussion sections are more like an evaluation of MCM. The abstract needs to state that the results indicate the importance of representing multi-generation reactions in predicting SOA and water solubility. This is a major result of this work, and it dominates much of the discussion of results in the existing manuscript.
The introduction should have more discussion of the history of GECKO-A since its first publication in 2005 and give a timeline of major changes in the types of compounds covered, chemical estimation methods (e.g., SARs), algorithms, and software. It would be useful for the introduction to include a summary of the major publications that used GECKO-A, and to indicate the changes that have been made since then and whether they may have affected the results.
Model Description
As indicated above, there needs to be a user manual, or at least information about where user and installation instructions can be obtained. This information could be included in the SI.
The manuscript has a relatively large section describing the box model that can be used to carry out simulations using the large mechanisms that are generated. This is useful for those who may consider using the box model and is needed so results presented concerning phase changes can be duplicated, but is not central to the scientific issues being studied, and I wonder if much of the detail here could be moved to the SI.
Mechanism Construction
It might be appropriate to give more information about the sources of manually assigned rate constants and mechanisms in GECKO-A, as well as the SARs used in the estimates. There is some short discussion of this, but it is restricted to summarizing the major SARs and does not cover all of the types of reactions that are considered where SARs are not employed.
The mechanism construction section should also give a reference documenting the current inorganic and C1 mechanism used by GECKO-A (and for MCM in the comparisons). The reference should include a complete listing of the mechanism so people can duplicate the results presented here and if it is not available in a published paper it should be in the SI.
Mechanism Configuration
Table 1 is an important summary of the parameters controlling the use of GECKO-A, which need to be understood by users of the model and also by readers in order to interpret some of the results discussed in this paper. Even though they are discussed in the text, the table should include a few words briefly indicating what each is for, so it can serve as a convenient reference for the reader or user. It would also be helpful if they were listed in the same order as discussed in the text (or vice versa).
This section, or the SI, should clarify how Ymax is calculated when yields of intermediates leading to the subject product are affected by bimolecular reactions of the intermediates (e.g., peroxy + NO, HO2, RO2), whose relative importance depends on atmospheric conditions. Are the competing bimolecular reactions all assumed to be equally important for the calculation of Ymax? I presume that the parameters "highnoxfg" and "rx_ro2_multiclass" are relevant to this. If so, they should be discussed at the same time Ymax is first discussed. If not, the discussions of all these parameters need to be clarified. If extensive discussion is required to clarify this, it could be given in the SI.
More information about the priority scheme for using surrogate species when "isomerfg" is selected should be given, at least in the SI. It seems to me that some efficiency might be gained if this were applied to peroxy radicals as well as stable products.
It would also be helpful to indicate which of these parameters are new to this version or significantly changed, which were used in the 2005 version, and which settings were used in most of the previously published papers that used GECKO-A.
Website and Documentation
As indicated above, there is no mention of or reference to a user manual or instructions on how to install and use this model. There is an online version of GECKO-A that is relatively easy to use, but its capability is limited to only one generation and it uses other restrictive options, which is necessary because this is very computer-resource-intensive software. A Google search for a manual or instructions was unsuccessful, suggesting that such manuals do not exist, at least not online. The journal's description of model description papers, given online as a guide to authors, states that they "should" include a user manual. Is this not a strict requirement for model description papers in GMD?
The link given on Line 255, "(www.GECKO-Aa.lisa.u-pec.fr, last access January 2026)", did not work. I found "GECKO-Aa.lisa.u-pec.fr/" (no "www") using Google, which is presumably what they meant.
MCM Description and Comparison with GECKO-A
The description of MCM should summarize which of the chemical estimates or SARs used when MCM 3.3.1 was constructed are different from those now used by GECKO-A, especially those that were found to affect the results of the comparison.
The most important difference between MCM and GECKO-A is that MCM only attempts to represent chemical detail for the first two generations of reactions, and uses highly approximate methods to represent reactions of subsequent generations. This is necessary since otherwise it would be impossible to derive the mechanism manually. Because of this, the section describing MCM should give more detail on the approximate methods MCM uses to represent higher generation processes. There is some discussion of this when it affected the results, but it is better that this discussion be in this section, which can be referred to in cases where it affects the results.
The comparison with MCM is appropriate because this is a widely used mechanism that is also intended to incorporate near-explicit chemical detail and that employs similar if not the same estimation methods in most cases, while being developed using a totally different approach. However, to be useful as a fair comparison, the parameters used when deriving the GECKO-A mechanisms for comparison with MCM should be consistent with those used (or considered in a qualitative sense) when MCM was developed. The effects of the reduction parameters on the GECKO-A simulations should be demonstrated separately, so the comparison with MCM should employ comparable parameters. The limitation of generations in MCM makes it most directly comparable to GECKO-A mechanisms derived for no more than two generations, i.e., using "maxgen=2". Comparisons using 2-generation GECKO-A mechanisms would allow differences between MCM and GECKO-A in the representation of the detailed chemistry to be studied less ambiguously. Since GECKO-A uses similar estimation methods and chemical assignments in many (but not all) cases, the comparison can also provide evidence that the GECKO-A software is implementing the chemistry as intended. As it is, the results of the comparison, which uses GECKO-A mechanisms derived for multiple generations and with the least reduction, are dominated by the fact that MCM was not designed to represent the higher generation reactions with chemical detail.
If MCM is compared with 2-generation GECKO-A mechanisms, then the method GECKO-A uses to represent higher generation products becomes important, so its difference compared to the MCM approach should also be discussed. It looks to me like the GECKO-A method is biased towards underestimating the effects of gas-phase reactivities and yields of higher generation products, while the MCM method is biased towards overestimating these because many of these products are not ultimately consumed. This should be pointed out and discussed if so.
The other parameter that affects the comparison with MCM is "brcut", whose effective value in MCM varies with compound but, I believe, is never lower than 5% and is often much higher. On the other hand, Table 1 indicates that the GECKO-A mechanisms derived here used a cutoff that was an order of magnitude lower. For the best comparison with MCM, they should use "brcut"=5% and select the "isomerfg" option, which allows the lumping of isomers that is often used in MCM. The other reduction parameters are mainly important in multi-generation derivations, so would not be important in this comparison of limited-generation mechanisms.
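To make the point about the cutoff concrete, the following is a minimal sketch of what a branching-ratio cutoff does in principle. It is not GECKO-A's or MCM's actual implementation: the function name, the pathway labels, the branching ratios, and the assumption that the surviving pathways are renormalized to sum to one are all hypothetical and are used only for illustration.

```python
def apply_branching_cutoff(branching, cutoff):
    """Drop pathways below `cutoff` and renormalize the rest (assumed behaviour)."""
    kept = {name: ratio for name, ratio in branching.items() if ratio >= cutoff}
    if not kept:
        raise ValueError("cutoff removes every pathway")
    total = sum(kept.values())
    return {name: ratio / total for name, ratio in kept.items()}


# Hypothetical branching ratios for the reactions of a single radical.
pathways = {"path_a": 0.62, "path_b": 0.30, "path_c": 0.06, "path_d": 0.02}

# A cutoff an order of magnitude below 5% keeps every pathway here,
# whereas a 5% cutoff removes the minor channel "path_d".
loose = apply_branching_cutoff(pathways, 0.005)
strict = apply_branching_cutoff(pathways, 0.05)

print(sorted(loose))
print(sorted(strict))
```

In this toy case the 5% cutoff removes one of four channels for a single radical. Because each retained channel generates its own downstream products, such differences compound over successive reactions, which is why the cutoffs should be comparable in a like-for-like comparison.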
Effects of Choices of GECKO-A Parameters
As indicated above, a major omission of this paper is the lack of sufficient information showing how the GECKO-A-derived mechanisms are affected by the choice of the user-modifiable parameters listed in Table 1. The only information regarding this is given in Figure 1, which has plots showing how the numbers of species in the mechanisms are affected by the maximum number of generations, by whether isomers with the same substituents are lumped, and by the choice of the low-volatility vapor pressure cutoff. There is no information on the effects of these and other parameters on the actual predictions of the mechanism, such as SOA and soluble product formation, reactivity metrics, or organic product distributions. This is at least as important as the effects of the parameters on mechanism size, if not more so.
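The kind of sensitivity study requested here, in which one parameter is varied at a time with all else held the same, could be organized along the following lines. This is only a sketch of the experimental design: the parameter names are taken from the manuscript, but the baseline and alternative values, the dictionary layout, and the function name are assumptions made for illustration and do not correspond to GECKO-A's actual input format.

```python
def one_at_a_time(baseline, alternatives):
    """Return (label, config) pairs: the baseline, then each single-parameter change."""
    runs = [("baseline", dict(baseline))]
    for name, values in alternatives.items():
        for value in values:
            if value == baseline[name]:
                continue  # skip values identical to the baseline setting
            config = dict(baseline)
            config[name] = value
            runs.append((f"{name}={value}", config))
    return runs


# Hypothetical baseline and alternative settings (illustrative values only).
baseline = {"maxgen": 10, "brcut": 0.005, "isomerfg": False}
alternatives = {"maxgen": [2, 4, 6], "brcut": [0.05], "isomerfg": [True]}

runs = one_at_a_time(baseline, alternatives)
for label, config in runs:
    print(label, config)
```

Each configuration would then be used to generate a mechanism and run the same box-model scenario, so that differences in predicted SOA, soluble products, reactivity metrics, or product distributions can be attributed to a single parameter.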
Results
The results section is dominated by comparisons of MCM predictions with those of GECKO-A mechanisms produced with very low levels of reduction and no effective limitation on generations. Although not optimal for documenting GECKO-A, for the reasons discussed above, these results are important and need to be published, since they clearly show that mechanisms, however detailed, that do not represent the higher generations are totally unsuitable for predicting SOA or soluble products. Overall, I think the present manuscript presents the comparison results reasonably well and uses a reasonably comprehensive set of metrics for this purpose. The main omission, besides not showing how varying the GECKO-A parameters affects model predictions, is not showing how much SOA, solubility, and groups of various kinds can be attributed to each generation.
Conclusions
The conclusion section is useful in pointing out areas where the chemistry in GECKO-A needs to be improved, and the need for work on reduction approaches so computer generated mechanisms can be used to improve mechanisms in models. Although not presently usable in practical models, multi-generation mechanisms produced by systems such as GECKO-A are necessary to improve and evaluate such models.
As discussed above, I think the greatest contribution of this paper is the demonstration of the importance of representing higher generation reactions when predicting SOA or water soluble products. This has implications for the predictive capabilities of all mechanisms currently used in atmospheric models. Because it is not practical to use million-reaction mechanisms in such models, other approaches are needed to predict SOA and soluble products based on the actual chemistry involved. I do not think these conclusions are adequately pointed out in this section.