the Creative Commons Attribution 4.0 License.
The first application of a numerically-exact, higher-order sensitivity analysis approach for atmospheric modelling: implementation of the hyperdual-step method in the Community Multiscale Air Quality Model (CMAQ) version 5.3.2
Abstract. Sensitivity analysis in chemical transport models quantifies the response of output variables to changes in input parameters. This information is valuable for researchers in data assimilation and model development. Additionally, environmental decision-makers depend upon these expected responses of concentrations to emissions when designing and justifying air pollution control strategies. Existing sensitivity analysis methods include the finite-difference method, the direct decoupled method (DDM), the complex variable method, and the adjoint method. These methods are either prone to significant numerical errors when applied to non-linear models with complex components (e.g., finite-difference and complex-step methods) or difficult to maintain when the original model is updated (e.g., direct decoupled and adjoint methods). Here, we present the implementation of the hyperdual-step method in the Community Multiscale Air Quality Model (CMAQ) version 5.3.2 as CMAQ-hyd. CMAQ-hyd can be applied to compute numerically exact first- and second-order sensitivities of species concentrations with respect to emissions or concentrations. Compared to CMAQ-DDM and CMAQ-adjoint, CMAQ-hyd is more straightforward to update and maintain, while remaining as free of numerical errors as those augmented models are. To evaluate the accuracy of the implementation, the sensitivities computed by CMAQ-hyd are compared with those calculated with other traditional methods or a hybrid of the traditional and advanced methods. We demonstrate the capability of CMAQ-hyd with the newly implemented gas-phase chemistry and biogenic aerosol formation mechanism in CMAQ. We also explore the cross-sensitivity of monoterpene nitrate aerosol formation to its anthropogenic and biogenic precursors to show the additional sensitivity information computed by CMAQ-hyd.
Compared with the traditional finite difference method, CMAQ-hyd consumes fewer computational resources when the same sensitivity coefficients are calculated. This novel method implemented in CMAQ is also computationally competitive with other existing methods and could be further optimised to reduce memory and computational time overheads.
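As an illustration of the idea summarised in the abstract, the following is a minimal Python sketch of hyperdual arithmetic. It is not the CMAQ-hyd Fortran implementation: the class `HyperDual`, the helper `hd_exp`, and the toy function `response` (a stand-in for a concentration that depends on two emission inputs) are hypothetical names introduced here. The sketch assumes the standard hyperdual algebra, in which the two non-real units each square to zero, so that seeding one input in each unit yields both first-order sensitivities and the cross-sensitivity in a single evaluation.

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class HyperDual:
    """Hyperdual number a0 + a1*e1 + a2*e2 + a12*e1*e2, with e1^2 = e2^2 = 0."""
    a0: float
    a1: float = 0.0
    a2: float = 0.0
    a12: float = 0.0

    def __add__(self, other):
        o = _lift(other)
        return HyperDual(self.a0 + o.a0, self.a1 + o.a1,
                         self.a2 + o.a2, self.a12 + o.a12)

    __radd__ = __add__

    def __mul__(self, other):
        o = _lift(other)
        return HyperDual(
            self.a0 * o.a0,
            self.a0 * o.a1 + self.a1 * o.a0,
            self.a0 * o.a2 + self.a2 * o.a0,
            self.a0 * o.a12 + self.a1 * o.a2 + self.a2 * o.a1 + self.a12 * o.a0,
        )

    __rmul__ = __mul__


def _lift(x):
    """Promote a plain real number to a hyperdual number."""
    return x if isinstance(x, HyperDual) else HyperDual(float(x))


def hd_exp(x):
    """exp() extended to hyperdual numbers via its terminating Taylor series."""
    e = math.exp(x.a0)
    return HyperDual(e, e * x.a1, e * x.a2, e * (x.a12 + x.a1 * x.a2))


def response(e_anthro, e_bio):
    """Hypothetical non-linear toy response to two inputs (not a CMAQ process)."""
    return e_anthro * e_anthro * e_bio + hd_exp(e_anthro * e_bio)


x0, y0 = 1.5, 0.5
# Seed the first input in e1 and the second in e2.
out = response(HyperDual(x0, a1=1.0), HyperDual(y0, a2=1.0))
print("value:", out.a0)
print("d/dx:", out.a1, " d/dy:", out.a2, " d2/dxdy:", out.a12)

# Seeding the same input in both e1 and e2 gives the pure second derivative.
same = response(HyperDual(x0, a1=1.0, a2=1.0), HyperDual(y0))
print("d2/dx2:", same.a12)
```

No perturbation size appears anywhere in the sketch: the derivative components are propagated by the arithmetic rules themselves, which is why the method is free of the truncation and subtractive cancellation errors that affect finite differences.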
Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Interactive discussion
Status: closed
RC1: 'Comment on egusphere-2023-1017', Anonymous Referee #1, 18 Jul 2023
In this paper, Liu et al. have applied the hyperdual sensitivity analysis approach to a chemical transport model (CMAQ). They find the method to be both accurate and computationally relatively efficient for calculating first- and second-order sensitivities. In general, the manuscript is scientifically sound and well written.
“DDM-3D” was developed by Yang et al., in 1997.
The development of (5) needs to be further explained. Given that it is a Taylor series expansion, about what value is the expansion taken?
In (16) and (17) is ENOx a function of space or time? If yes, the derivatives calculated are very complex, and indeed, it would be good for the authors to explain exactly how they are taking those derivatives and how the set of mathematical operations are being done. If ENOx is not space or time dependent, what is it (how is it mathematically defined)? I think I know what they are trying to do, but the current representation needs to be clarified and made mathematically more precise. They should indicate the spatial and temporal dependencies in the variables.
It would be good to know the specific cause of the instability from a mathematical viewpoint. Can you derive specifically how the instability grows? This is particularly of interest if the hyd code is truly exact as this would seem to imply some level of inexactness.
Table 1: The caption needs to state what is being compared. That information can also go directly on the graphs in Fig. 2, so Table 1 is not needed. It would be more effective that way as well.
Why not compare the results to another sensitivity analysis method implemented in a CTM, e.g., DDM-3D. This would seem to be much more in line with demonstrating the potential advantages of the method.
Given the description of what was involved, it is not apparent how much re-coding effort the hyd approach saves relative to the others. Maybe add a bit more on the relative effort, with more specifics.
Did they validate or evaluate the hyperdual module? The two words have rather different meanings.
One of the more interesting findings of the paper is the computational efficiency found in the hyd method applied to CMAQ vs. other applications. They discuss this a bit, but more analysis would be of interest. For example, for the case of four or eight nodes, say, provide the module-by-module ratio of computational times.
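For context on the referee's question about the expansion point, the standard hyperdual Taylor expansion is written out below. Whether it matches Eq. (5) of the manuscript term by term is an assumption; the symbols $h_1$, $h_2$, $\epsilon_1$, and $\epsilon_2$ are the conventional ones and may differ from the manuscript's notation. In this formulation the expansion is taken about the real part $x$ of the hyperdual argument, and the series terminates exactly because every higher-order product of the non-real units vanishes.

```latex
% Hyperdual units: \epsilon_1^2 = \epsilon_2^2 = (\epsilon_1\epsilon_2)^2 = 0,
% with \epsilon_1\epsilon_2 = \epsilon_2\epsilon_1 \neq 0.
\begin{align}
f(x + h_1\epsilon_1 + h_2\epsilon_2)
  &= f(x) + h_1 f'(x)\,\epsilon_1 + h_2 f'(x)\,\epsilon_2
     + h_1 h_2 f''(x)\,\epsilon_1\epsilon_2, \\
f'(x)  &= \frac{\epsilon_1\text{-part of } f(x + h_1\epsilon_1 + h_2\epsilon_2)}{h_1}, \\
f''(x) &= \frac{\epsilon_1\epsilon_2\text{-part of } f(x + h_1\epsilon_1 + h_2\epsilon_2)}{h_1 h_2}.
\end{align}
```

Because no terms are discarded and no nearly equal quantities are subtracted, the step sizes $h_1$ and $h_2$ can be chosen freely (for example, both equal to 1) without affecting the accuracy of the extracted derivatives.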
Citation: https://doi.org/10.5194/egusphere-2023-1017-RC1
- AC1: 'Reply on RC1', Shannon Capps, 26 Aug 2023
RC2: 'Comment on egusphere-2023-1017', Jixiang Li, 21 Jul 2023
Thank you for sharing your research. This work marks the initial application of HYD in CMAQ, enabling the computation of second-order sensitivities of species concentrations with respect to emissions or concentrations. Moreover, it demonstrates computational competitiveness when compared to other existing methods, such as FDM. The research holds significant importance and stands as an excellent paper.
Citation: https://doi.org/10.5194/egusphere-2023-1017-RC2
AC2: 'Reply on RC2', Shannon Capps, 26 Aug 2023
Thank you very much for providing an accurate and succinct summary of the work represented in this manuscript and for assessing it to be valuable to the field.
Citation: https://doi.org/10.5194/egusphere-2023-1017-AC2
RC3: 'Comment on egusphere-2023-1017', Anonymous Referee #3, 04 Oct 2023
This manuscript describes the theoretical basis and implementation of the new CMAQ-hyd model for the calculation of first- and second-order sensitivities. This model is based on the hyperdual-step method implemented in the widely used CTM CMAQ (US-EPA). The manuscript also describes the evaluation and testing of the model, including performance tests. The work brings significant new scientific results in the area of air quality modelling. The hyperdual-step method was already known, but its use in CTMs is novel and very beneficial for the scientific community. The paper is well structured and clearly written. The findings are generally well described; I have a few specific comments, given below. All corresponding materials (model source code and testing data) are available, which allows the experiments to be reproduced. The topic of the paper fits the scope of GMD very well, and I recommend the manuscript for publication in GMD after a minor revision (see specific comments below).
Specific comments:
I. Comments to the manuscript:
l. 149, Abbreviation SI is not defined. You probably meant Supplementary Material. Please, use the full name in this first occurrence of the abbreviation.
l. 151-153, 162: The relation of a1, a2, a12 and Hh is not clearly formulated. Please, try to reformulate to make this step more comprehensible to the reader.
l. 214-232: This paragraph is not well formulated and the explanations are somewhat disorganised. E.g. the use of the forward and reverse modes in CMAQ is not explained clearly. Similarly, the adjustments made in CMAQ itself, CMAQ-adjoint, and CMAQ-hyd are not clearly distinguished, and the whole paragraph needs multiple readings to comprehend its meaning. Please try to reformulate it in a more straightforward way so that it is easy to understand for a reader who is not fully familiar with the details of the CMAQ internals.
l. 242-243: The definition of Cpm2.5 and Enox should be moved forward somewhere to line 238 before their utilisation.
l. 252: Superscripts inc, dec, and orig: an explicit description of the meaning of these abbreviations might be beneficial, even though the following example gives a hint.
l. 261-262: The last sentence partially repeats the statement of the sentence on lines 257-258. Moreover, Section 2.1 discusses the hyperdual method and its errors, while the errors of the central FDM are discussed in Section 1 (l. 69-88).
l.269-270: The formulation "The second-order sensitivity evaluation is between a hybrid hyperdual-finite-difference method (HYD-FDM) and the hyperdual-step method." seems to be unclear. The approach is explained in the following text but the sentence can confuse the reader at the beginning. Please, reformulate.
l. 286-287: You state here that the evaluation was done with a 50% perturbation, but in l. 254-255 you state that the perturbations used were 125% and 75%, i.e. 25%. I may be overlooking something, and it may represent a different perturbation. Please either correct these numbers (if they are a mistake) or add a better explanation of them (if they are correct for some reason).
l. 293-316: Fig. 3 seems to be poorly arranged, as the overlapping points do not allow comparison of the individual results. My suggestion is to rearrange Fig. 3 so that the reader can better study the behaviour of the FDM subtraction and truncation errors as the perturbations decrease, and assess the "convergence" of FDM to the hyperdual results. Four separate graphs might work better than the current unified graph. You may have a better idea of how to deal with it.
l. 393-394: According to Figs. 6g and 6h, the second-order sensitivities of PM2.5 to TERP (Fig. 6g) are mostly negative, while those to APIN (Fig. 6h) are mostly positive.
l. 445-446: The implementation of CVM is not higher. You probably meant the computational cost of this CVM based model.
l. 448-449: The sentence does not make sense. Probably words were left out after "are shown for the".
l. 450: You do not tell the reader the number of processor cores/MPI processes. This is much more important than the number of nodes. The distribution of the MPI processes across individual nodes can only influence the InfiniBand (or other transport layer) overhead, which is usually small in modern HPC systems for this type of task. Also, the extent of your configuration is important mainly for assessing the parallelisation efficiency with a growing number of MPI processes, as well as the memory demands of the model. Please give the reader full information about your testing configuration. You can add this information to Section 4 of the Supplement and give a reference to it here.
l. 483: I have slight doubts that you can use the word "validated" here; I would suggest "evaluated" or a similar word. My reasoning is methodological. The purpose of the tests in Section 3.1 is to evaluate the correctness and accuracy of the hyperdual-based implementation. You compare the results of the new, more precise method with an established, less precise method (according to the theory) and you get some differences. You attribute these discrepancies between the FDM and hyperdual results only to the non-linearity of the model, but how can you be sure that they are not also caused by something else, e.g. some problem in the hyperdual implementation? Yes, I am also convinced that they result from FDM errors due to the non-linearity of the model, but that is not a formal proof. You give good supporting arguments in the following parts of Section 3.1, and they support trust in the correctness of the CMAQ-hyd implementation very well. But I would still be careful about formally calling it validation. The thorough evaluation of the behaviour of CMAQ-hyd and its comparison with FDM in Section 3.1 shows that the differences have the expected properties, which allows this new model to be trusted.
l. 507: "..free from numerical noise.." - Only specific types of numerical errors (truncation and subtractive cancellation errors) are eliminated by this method. Even though they are the most important ones, I would suggest a more careful formulation here.
II. Comments to the model source code:
1. The source files of HDMod and HDMod_cplx do not contain a license, the authors, or a reference to the original author of the C++ module on which they are based. Please fill these in.
2. The changes in the source code contain almost no comments. It would be beneficial to mark and comment your adaptations, as this could help users to orient themselves quickly in your changes and would also be useful when adapting CMAQ-hyd to new versions of the CMAQ model. I do not insist on this being done before the publication of this paper, but I consider it beneficial to do it as soon as possible.
Citation: https://doi.org/10.5194/egusphere-2023-1017-RC3
- AC3: 'Reply on RC3', Shannon Capps, 13 Nov 2023
- AC4: 'Reply on RC3', Shannon Capps, 14 Nov 2023
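The error types raised in the discussion above (truncation and subtractive cancellation errors of the finite-difference method, and the complex variable method as an alternative) can be illustrated on a toy problem. The sketch below is an assumption-laden illustration, not CMAQ code: `toy_response` is a hypothetical smooth non-linear function, and the step sizes are chosen only to make the effects visible. The largest step mirrors a ±25% perturbation of the input, as in the 125%/75% runs mentioned by Referee #3.

```python
import cmath
import math


def toy_response(x):
    """Hypothetical smooth, non-linear function; accepts real or complex input."""
    return cmath.exp(x) / cmath.sqrt(x)


def exact_derivative(x):
    """Analytical derivative of exp(x)/sqrt(x), used as the reference."""
    return math.exp(x) / math.sqrt(x) * (1.0 - 1.0 / (2.0 * x))


def central_difference(func, x, h):
    """Central finite difference: truncation error shrinks with h,
    but subtractive cancellation grows as h becomes very small."""
    return (func(x + h) - func(x - h)).real / (2.0 * h)


def complex_step(func, x, h=1e-30):
    """Complex-step derivative: no subtraction, so h can be tiny."""
    return func(complex(x, h)).imag / h


x0 = 1.5
truth = exact_derivative(x0)

fd_errors = {}
for h in (0.25 * x0, 1e-2, 1e-4, 1e-8, 1e-13):
    fd_errors[h] = abs(central_difference(toy_response, x0, h) - truth)
    print(f"central FD, h={h:.3g}: abs error = {fd_errors[h]:.3e}")

cs_error = abs(complex_step(toy_response, x0) - truth)
print(f"complex step, h=1e-30: abs error = {cs_error:.3e}")
```

On a function like this, the finite-difference error first falls as the step shrinks and then rises again once round-off dominates, whereas the complex-step result does not depend on a carefully tuned step. The complex-step approach gives only first derivatives in this form, which is one motivation for hyperdual numbers when second-order sensitivities are needed.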
Model code and software
CMAQ-hyd Jiachen Liu, Eric Chen, Shannon Capps https://zenodo.org/record/7938726