the Creative Commons Attribution 4.0 License.
Enabling High Performance Cloud Computing for the Community Multiscale Air Quality Model (CMAQ) version 5.3.3: Performance Evaluation and Benefits for the User Community
Abstract. The Community Multiscale Air Quality (CMAQ) Model is a local-to-hemispheric scale numerical air quality modeling system developed by the U.S. Environmental Protection Agency (USEPA) and supported by the Center for Community Modeling and Analysis System (CMAS). CMAQ is used for regulatory purposes by the USEPA program offices and state and local air agencies. It is also widely used by the broader global research community to simulate and understand complex air quality processes, and for computational environmental fate and transport studies and climate and health impact studies. Leveraging state-of-the-science cloud computing resources for high performance computing (HPC) applications, CMAQ is now available as a fully tested, publicly available technology stack (HPC cluster and software stack) for two major cloud service providers (CSPs). Specifically, CMAQ configurations and supporting materials have been developed for use on their HPC clusters, including extensive online documentation, tutorials, and guidelines to scale and optimize air quality simulations using their services. These resources allow modelers to rapidly bring together CMAQ, cloud-hosted datasets, and visualization and evaluation tools on ephemeral clusters that can be deployed quickly and reliably worldwide. Described here are considerations in CMAQ v5.3.3 cloud use and the supported resources for each CSP, presented through a benchmark application suite that was developed as an example of a typical simulation for testing and verifying components of the modeling system. The aims of this effort are to report findings from performing CMAQ simulations on the cloud using popular vendor-provided resources, to enable the user community to adapt this work to their own needs, and to identify specific areas of potential optimization with respect to storage and compute architectures.
Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Interactive discussion
Status: closed
RC1: 'Comment on egusphere-2023-3045', Anonymous Referee #1, 21 Apr 2024
The manuscript introduces the adaptation and optimization of CMAQ 5.3.3 for high-performance cloud computing environments. This paper fits the scope of GMD and can serve as a detailed reference showing how cloud deployment of the CMAQ model can enhance computational efficiency and accessibility for diverse modeling tasks. Here are some minor suggestions that could be addressed to further improve the paper.
Line 115-125: How about illustrating the CMAQ workflow using a figure? It would help readers better understand how CMAQ works.
Line 130-150: This is lengthy and somewhat difficult to follow. Please break it into multiple paragraphs to enhance readability.
Line 160: Figure 1 only shows the rectangle of CONUS but lacks grid representation. I suggest exemplifying the grids over an area of interest with a zoom-in minimap.
Line 165-290: Section 3 offers valuable insights into CMAQ deployment from an engineering perspective. However, to align more closely with the conventions of a scientific paper, consider pivoting towards system or experiment design to elucidate the methodology behind this work, while relocating detailed technical tutorials to an appendix.
Line 335-410: How about combining Figure 6/7, 8/9/10, 11/12/13/14/15/16? It is a little bit hard for readers to compare the results across multiple figures.
Line 495-560: The current discussion could be streamlined and organized into subtopics, such as the strengths of the proposed cloud-based implementation, scalability/reusability, limitations, and future research recommendations. Meanwhile, a conclusion section is recommended to summarize the research findings from this work.
Citation: https://doi.org/10.5194/egusphere-2023-3045-RC1
AC1: 'Reply on RC1', Saravanan Arunachalam, 20 Jun 2024
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2024/egusphere-2023-3045/egusphere-2023-3045-AC1-supplement.pdf
RC2: 'Comment on egusphere-2023-3045', Anonymous Referee #2, 23 Apr 2024
This manuscript presents a research effort to enable CMAQ modeling and data analysis on high performance cloud computing. CMAQ is a popular air quality model and has been widely applied for numerous regulatory and research purposes. The application of CMAQ, however, is still somewhat limited since it requires preparing all inputs and run scripts on a single server. Enabling CMAQ on cloud servers would make it more convenient to run the model and would probably promote its application to a broader community. The study is well worth the effort. The manuscript provides clear descriptions of the flow chart and sufficient details of each section of the modeling system, and also demonstrates the changes in cost and efficiency clearly. Therefore, I would recommend it be accepted for publication, if the following minor comments are properly addressed.
comment#1: Computational efficiency for traditional CMAQ does not increase linearly with more CPU cores, and data I/O is a big reason. But for cloud-based CMAQ it seems horizontal advection is the most time-consuming process, which is a little surprising. Please provide a brief discussion regarding this change.
comment#2: I guess the current cloud version doesn't support two-way mode WRF-CMAQ; please clarify whether this is correct. Also, does it support online modules for MEGAN and dust emission?
comment#3: Fig.3 and Fig.4 are not mentioned in the main text. It's necessary to briefly explain the flowchart, although the figure is largely self-explanatory.
comment#4: Fig.8 and Fig.9: It's interesting to notice that pinning on Azure speeds up vertical diffusion but on AWS slows it. Please provide a brief discussion to explain the difference.
comment#5: It's important to notice that only a few variables are saved to the 1-layer conc file during the test in section 4.2.2, while in real applications the number of variables and layers may be much larger. Please provide a brief discussion to justify whether the test runs shown in this study are representative of typical CMAQ applications.
comment#6: Fig.18, Fig.19, and Fig.21: Showing screenshots is straightforward but a little improper for journal publication; it's better to summarize the important numbers in a concise figure or table.
Citation: https://doi.org/10.5194/egusphere-2023-3045-RC2
AC2: 'Reply on RC2', Saravanan Arunachalam, 20 Jun 2024
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2024/egusphere-2023-3045/egusphere-2023-3045-AC2-supplement.pdf
Viewed
HTML | PDF | XML | Total | BibTeX | EndNote |
---|---|---|---|---|---|
329 | 102 | 26 | 457 | 14 | 17 |
Christos I. Efstathiou
Elizabeth Adams
Carlie J. Coats
Robert Zelt
Mark Reed
John McGee
Kristen M. Foley
Fahim I. Sidi
David C. Wong
Steven Fine
Saravanan Arunachalam