This work is distributed under the Creative Commons Attribution 4.0 License.
Enhancing the advection module performance in the EPICC-Model V1.0 via GPU-HADVPPM4HIP V1.0 coupling and GPU-optimized strategies
Abstract. The rapid development of Graphics Processing Units (GPUs) has established new computational paradigms for enhancing air quality modeling efficiency. In this study, the heterogeneous-compute interface for portability (HIP) was used to implement parallel computing of the piecewise parabolic method (PPM) advection solver (HADVPPM) on China’s domestic GPU-like accelerators (GPU-HADVPPM4HIP V1.0). Computational performance was enhanced through three strategic optimizations: reducing the data transfer frequency between the central processing unit (CPU) and the GPU (CPU-GPU), thread-block coordinated indexing, and Message Passing Interface and HIP (“MPI+HIP”) hybrid parallelization across heterogeneous computing clusters. Following validation of the GPU-HADVPPM4HIP V1.0 program’s offline computational consistency and the pollutant simulation performance of the Emission and atmospheric Processes Integrated and Coupled Community model version 1.0 (EPICC-Model V1.0) on the Earth System Numerical Simulation Facility (EarthLab), comprehensive performance testing was conducted. Offline benchmark results demonstrated that GPU-HADVPPM4HIP V1.0 achieved a maximum speedup of 556.5x on a domestic GPU-like accelerator with compiler optimization. Integrating GPU-HADVPPM4HIP V1.0 into EPICC-Model V1.0, combined with the optimized CPU-GPU communication frequency and thread-block coordinated indexing strategies, yielded model-level computational efficiency improvements of 17.0x and 1.5x, respectively. At the module level, GPU-HADVPPM4HIP V1.0 exhibited a 39.3 % computational efficiency gain when accounting for CPU-GPU data transfer overhead, which rose to a 20.5x acceleration when communication costs were excluded. This coupling establishes a foundational framework for adapting air quality models to China’s domestic GPU-like architectures and identifies critical optimization pathways. Moreover, the methodology provides essential technical support for achieving a full-model GPU implementation of the EPICC-Model, addressing both current computational constraints and future demands for high-resolution air quality simulations.
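For readers unfamiliar with the thread-block coordinated indexing named above, the following minimal HIP sketch illustrates the general pattern: each (block, thread) pair is mapped to one (row, column) of the horizontal grid, so a single kernel launch covers a whole horizontal slab. This is an illustration only, not the authors’ GPU-HADVPPM4HIP code; the kernel body uses a first-order upwind update (assuming positive wind) as a stand-in for the full PPM solver, and all names and dimensions are hypothetical.

```cpp
#include <hip/hip_runtime.h>

// Hypothetical advection kernel: one block per grid row, one thread per
// column ("thread-block coordinated indexing"), so a single launch covers
// the whole horizontal slab.
__global__ void advect_row(const float* c_in, float* c_out,
                           const float* u, float dtdx, int nx, int ny)
{
    int j = blockIdx.x;                 // one block per grid row
    int i = threadIdx.x + 1;            // interior columns only
    if (j >= ny || i >= nx - 1) return;
    int idx = j * nx + i;
    // First-order upwind fluxes (positive wind assumed) as a stand-in
    // for the full PPM reconstruction.
    float flux_l = u[idx]     * c_in[idx - 1];
    float flux_r = u[idx + 1] * c_in[idx];
    c_out[idx] = c_in[idx] - dtdx * (flux_r - flux_l);
}

int main()
{
    const int nx = 256, ny = 128;       // hypothetical domain size
    const size_t bytes = (size_t)nx * ny * sizeof(float);
    float *c_in, *c_out, *u;
    hipMalloc((void**)&c_in, bytes);
    hipMalloc((void**)&c_out, bytes);
    hipMalloc((void**)&u, bytes);
    // ... copy host concentration and wind fields in with hipMemcpy ...
    hipLaunchKernelGGL(advect_row, dim3(ny), dim3(nx), 0, 0,
                       c_in, c_out, u, 0.1f, nx, ny);
    hipDeviceSynchronize();
    hipFree(c_in); hipFree(c_out); hipFree(u);
    return 0;
}
```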
Status: open (until 25 Nov 2025)
CEC1: 'No compliance with the policy of the journal', Juan Antonio Añel, 07 Oct 2025
Dear authors,
Unfortunately, after checking your manuscript, it has come to our attention that it does not comply with our "Code and Data Policy" (https://www.geoscientific-model-development.net/policies/code_and_data_policy.html).
In your "Code and Data Availability" statement and the restricted Zenodo repository that you have linked for the code of the EPICC-Model V1.0, you say that the model code is available upon request and after signing an agreement. I am sorry, but we cannot accept this, which is forbidden by our policy. Because of this lack of compliance with our policy, your manuscript should have never been accepted for Discussions. Our policy clearly states that all the code and data necessary to replicate a manuscript must be published openly and freely to anyone before submission.
Therefore, we are granting you a short time to solve this situation. You must open the repository for the EPICC-Model V1.0 as soon as possible, making it available publicly to anyone, and reply to this comment when you have done it.
I must note that if you do not fix this problem, we cannot continue with the peer-review process or accept your manuscript for publication in our journal.
Juan A. Añel
Geosci. Model Dev. Executive Editor
CC1: 'Reply on CEC1', Kai Cao, 09 Oct 2025
Dear Juan A. Añel,
As noted, our previous article titled “GPU-HADVPPM4HIP V1.0: using the heterogeneous-compute interface for portability (HIP) to speed up the piecewise parabolic method in the CAMx (v6.10) air quality model on China’s domestic GPU-like accelerator” also made use of the open-source air quality model CAMx. The CAMx source code is listed on the official website (https://www.camx.com/download/source/) as available for download upon user registration and request. Moreover, prior versions of the CMAQ source code were also hosted on the official EPA website (https://www.epa.gov/cmaq), which similarly required users to register and submit a request for download. Similarly, in our present work, users may register and request access to the model code via the IAP-provided website (https://earthlab.iap.ac.cn/resdown/info_388.html) and the Zenodo repository (https://doi.org/10.5281/zenodo.17071574). We believe this procedure aligns with GMD’s code and data policy, as outlined at https://www.geoscientific-model-development.net/policies/code_and_data_policy.html.
The source code of the EPICC-Model is provided for download on the IAP website (https://earthlab.iap.ac.cn/resdown/info_388.html), which features a Chinese-language interface. Many readers may find it difficult to navigate the page because of language barriers. Consequently, we contacted the EPICC-Model working group and proposed uploading the EPICC-Model source code to Zenodo. After consideration, the working group adopted our suggestion. Users can now access the EPICC-Model source code by registering and submitting a request at the Zenodo address (https://doi.org/10.5281/zenodo.17071574). It is important to emphasize that we also downloaded the EPICC-Model source code from the IAP website (https://earthlab.iap.ac.cn/resdown/info_388.html), and our contribution constitutes only a part of the overall EPICC-Model framework. The code we developed was previously uploaded to Zenodo in an earlier version. The remainder of the EPICC-Model was developed by the EPICC-Model working group or other research institutions, and we are not authorized to distribute code whose development we were not involved in. We have only released the portions to which we hold copyright and have appropriately cited all relevant work by others. The accessibility of the code has been verified and remains effective: anyone can download the whole EPICC-Model code through an email application to Working-Group@EPICC-Model.cn, according to the "Terms of Use" on the Zenodo page (https://doi.org/10.5281/zenodo.17071574), as described in the "Code and Data Availability" section of our manuscript.
By clearly indicating how the full model code can be obtained, we are confident that our manuscript meets the requirements set forth in GMD’s policy.
Kai Cao
Citation: https://doi.org/10.5194/egusphere-2025-2918-CC1
CEC2: 'Reply on CC1', Juan Antonio Añel, 10 Oct 2025
For the records, and after this comment that seems to be from the first author, I have contacted the authors of the EPICC-Model v1.0 through the address provided in the Zenodo repository to check if it is actually possible to get access to the code and that it is stored in such repository. The corresponding authors and the Topical Editor have been cc'ed.
In the future, to be sure that comments are actually posted by authors, it would be good if they were submitted as "Author comments" and not "Community comments". In this regard, we would appreciate confirmation from one of the corresponding authors that all the code developed for this manuscript is included in the shared Zenodo repository, as well as a copy of the license that you obtained when you got your copy of the EPICC-Model v1.0, to verify that you cannot redistribute it.
Juan A. Añel
Geosci. Model Dev. Executive Editor
Citation: https://doi.org/10.5194/egusphere-2025-2918-CEC2
AC1: 'Reply on CEC2', Qizhong Wu, 27 Oct 2025
Dear Prof. Juan A. Anel,
My name is Qizhong Wu, from Beijing Normal University, one of the corresponding authors of the manuscripts "egusphere-2025-2918" and "egusphere-2025-4441". Thank you very much for your hard work. We have noticed that the EPICC working group has replied to you in another email containing the attachment "EPICC-Model_User_Agreement.pdf". You can reply to the EPICC working group with the signed PDF file, and the working group will then provide you with a download link to the EPICC model codes, which are stored on Zenodo.
In the user agreement, there is a section "4. Prohibition of Transfer and Redistribution": "Users are strictly prohibited from transferring, selling, or sharing the EPICC-Model software or Zenodo account credentials, whether for profit or free of charge, to any third party". Therefore, users must obtain the EPICC model codes directly from the working group; we cannot redistribute them.
Actually, the model description of the EPICC-Model can be found at https://earthlab.iap.ac.cn/resdown/info_388.html, and the model codes and test data files are also provided on that website. Model users can register, download the user agreement, upload the signed user agreement PDF, and then download the model codes. Because the website is in Chinese, the working group provides another link on Zenodo, https://doi.org/10.5281/zenodo.17071574, for international users. Before this article was accepted for open discussion, we contacted the EPICC-Model working group, which added a “Terms of Use” section to the Zenodo page and provided a contact email for requesting the model codes. We also verified that the model codes can be downloaded from Zenodo after replying to the email (Working-Group@EPICC-Model.cn) with the signed user agreement PDF attached.
Thanks again for your hard work. Regards,
Qizhong Wu
Beijing Normal University
2025/10/27
CEC3: 'Reply on AC1', Juan Antonio Añel, 27 Oct 2025
Dear authors,
Many thanks for your reply. As I mentioned in the email correspondence with you days ago, we first need to clarify whether any of the authors of this manuscript is also a member of the developing team for the EPICC-Model. Depending on the answer, we will explore the possibilities to determine whether your manuscript complies with our policy. In the meantime, the review process and Discussion for your manuscript should be stalled.
Juan A. Añel
Geosci. Model Dev. Executive Editor
Citation: https://doi.org/10.5194/egusphere-2025-2918-CEC3
AC2: 'Reply on CEC3', Qizhong Wu, 27 Oct 2025
Dear Prof. Juan A. Anel,
In our manuscript, we cited the model description of the EPICC-Model as “EPICC-Model Working Group: Description and evaluation of the Emission and atmospheric Processes Integrated and Coupled Community (EPICC) Model version 1.0. Adv. Atmos. Sci., http://www.iapjournals.ac.cn/aas/en/article/doi/10.1007/s00376-025-4384-y, 2025.” In that paper, the only listed author is the “EPICC-Model Working Group”, with a corresponding email rather than an author list or a list of model developers. Perhaps the working group wants to develop a new collaborative mechanism for the model.
The model module “GPU-HADVPPM4HIP V1.0” that we contributed has been uploaded to Zenodo (https://zenodo.org/records/16916413), which includes the codes and test datasets and can be downloaded without restrictions. The remainder of the EPICC-Model, CAMx, or CMAQ was developed by the EPICC-Model working group or other research institutions, and we are not authorized to distribute model codes whose development we were not involved in. Therefore, we provided the official download link for the corresponding model. We believe this procedure aligns with GMD’s code and data policy, as outlined at https://www.geoscientific-model-development.net/policies/code_and_data_policy.html.
Sorry for the delay of several days in responding, as I had other matters to complete over the past week.
Best Regards,
Qizhong Wu
Beijing Normal University
2025/10/27
Citation: https://doi.org/10.5194/egusphere-2025-2918-AC2
CEC4: 'Reply on AC2', Juan Antonio Añel, 27 Oct 2025
Dear authors,
I do not understand why you continue to avoid replying to our question. It is quite easy, and we need an answer to be able to move on with the assessment of your manuscript. I am making it clear, and we would expect a simple "no" or "yes".
Is any of the authors of this submitted manuscript involved in the development of EPICC-Model V1.0 or a member of its development team?
Thanks,
Juan A. Añel
Geosci. Model Dev. Executive Editor
Citation: https://doi.org/10.5194/egusphere-2025-2918-CEC4
AC3: 'Reply on CEC4', Qizhong Wu, 27 Oct 2025
Dear Prof. Juan A. Anel,
None of the authors of our submitted manuscript is named “EPICC-Model Working Group”, and the "EPICC-Model Working Group" does not provide the names of its members. At least I could not find them.
Best regards,
Qizhong Wu
Beijing Normal University
2025/10/27
Citation: https://doi.org/10.5194/egusphere-2025-2918-AC3
CEC5: 'Reply on AC3', Juan Antonio Añel, 28 Oct 2025
Dear authors,
Given that you have failed to provide a public repository for the EPICC-Model V1.0 and to clarify whether any of the authors of the manuscript is a developer of it, we are sorry, but we cannot consider your manuscript for peer review or publish it in Geoscientific Model Development due to its non-compliance with the policy of the journal.
At this point, you may want to consider withdrawing the submission of your manuscript.
Juan A. Añel
Geosci. Model Dev. Executive Editor
Citation: https://doi.org/10.5194/egusphere-2025-2918-CEC5
RC1: 'Comment on egusphere-2025-2918', Anonymous Referee #1, 19 Nov 2025
I have read the conflicting positions of the authors and GMD on the open-source requirements. I hope this issue will be resolved soon.
The manuscript presents a method to accelerate the EPICC air quality model (EPICC-Model v1.0) using China’s domestic GPU-like accelerators. The authors port the advection module, one of the most computationally expensive components, onto the GPU to reduce the computational burden. They demonstrate that the offline module achieves a speedup of several orders of magnitude relative to the CPU-only version and a 1.5x gain when integrating GPU-HADVPPM4HIP into the EPICC model. GPU acceleration of numerical models is at an early stage of development, and no mature GPU-enabled models are yet widely used in air quality forecasting. I highly appreciate the authors’ technical effort, including rewriting the model from Fortran to C and then to a GPU language.
Major comments
- Run the original model (CPU-only version) and EPICC-GPU with the same inputs and configurations. Compare the outputs (e.g., NOx, NH3, O3, PM2.5, and PM10) and calculate the relative differences between the two versions.
- Figure 5 shows that each CPU process and GPU handles a specific number of rows and columns. This raises concerns that the model may be hardcoded to a fixed domain. How can the current design be generalized to different grid definitions?
- My main concern is the practical performance of the model. While the GPU-ported version clearly shows gains in offline and coupled tests, its real-world usefulness is questionable. In this paper, the authors use 10 CPU processes with 10 GPUs and report an overall performance improvement of only 1.5x. The cost of one GPU can exceed that of a 64-core CPU. Why not increase the number of CPUs instead, rather than pairing them with GPUs for such a limited gain?
- Currently there seems to be no straightforward solution.
- The authors could load more computation (by porting more code) onto the GPU for a single memory allocation and copy (a minimal sketch of this single-copy pattern is given after this list). However, this approach introduces serially dependent computation within the kernel, reducing overall performance.
- The authors could share one GPU across multiple processes, but this creates contention for host-device data transfers, which becomes a bottleneck.
Significant benefits from GPU computing for numerical models may only be realized once PCIe bandwidth is substantially improved or unified memory becomes more widely supported.
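To make the trade-off concrete, the following hedged HIP sketch shows the single-allocation, single-copy pattern referred to in the first bullet above: all species concentrations are packed into one device buffer and moved with two bulk transfers per time step instead of one small copy per species per advection call. It is not the authors’ implementation; the routine name, arguments, and packing layout are hypothetical.

```cpp
#include <hip/hip_runtime.h>

// Hypothetical host-side routine: h_conc holds all species concentrations
// packed contiguously (n_cells * n_species floats), so a single bulk
// transfer replaces many small per-species copies.
void advect_all_species(float* h_conc, float* d_conc,
                        size_t n_cells, int n_species, hipStream_t stream)
{
    const size_t bytes = n_cells * (size_t)n_species * sizeof(float);
    // One bulk host-to-device copy instead of n_species small ones.
    hipMemcpyAsync(d_conc, h_conc, bytes, hipMemcpyHostToDevice, stream);
    // ... launch one kernel that iterates over species on the device ...
    // One bulk device-to-host copy of the updated fields.
    hipMemcpyAsync(h_conc, d_conc, bytes, hipMemcpyDeviceToHost, stream);
    hipStreamSynchronize(stream);  // results valid on the host after this
}
```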
General comments
- The authors list many performance metrics in the abstract, such as 556.5x, 17.0x, 1.5x, 20.5x, and 39.3 %, but it is unclear what each is being compared against. This is confusing for readers.
- Line 146: Typo “initial condition” should be IC.
- A critical requirement for porting modules onto GPUs is that loops must be vectorizable and independent, with each iteration not depending on previous ones (see the loop sketch after this list). What methods do the authors use to verify that the loops in the advection module meet this requirement?
- Line 192: The text “as shown in Figure 2, the implementation of parallel computing ...”, does not align with the content of the figure.
- Line 360: The authors state that WRF is a state-of-the-art mesoscale numerical weather prediction model. Note that NCAR has developed MPAS as the next-generation model. Also, WRF v3.9.1, used in this study, is 8 years old; the latest version is 4.7.1.
- Which variables are evaluated in Table 4?
- The term “data scale” requires clarification.
- Lines 614-618: The authors state that eliminating non-critical variables such as sea salt aerosols could accelerate the model. A well-designed GPU port should retain the integrity of the original model. Common chemical mechanisms involve hundreds of species and reaction pathways; omitting components may compromise scientific completeness.
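As a concrete illustration of the loop-independence question raised above, the sketch below (hypothetical field names, not the authors’ code) contrasts a loop whose iterations can map one-to-one onto GPU threads with a loop that carries a dependence between iterations.

```cpp
#include <vector>

// First loop: each iteration writes a distinct element and reads only
// read-only inputs, so iterations are independent and can map one-to-one
// onto GPU threads.
void update_fields(std::vector<float>& c_new, const std::vector<float>& c_old,
                   const std::vector<float>& flux, float dtdx)
{
    const int nx = static_cast<int>(c_old.size());
    for (int i = 1; i < nx - 1; ++i)                // independent iterations
        c_new[i] = c_old[i] - dtdx * (flux[i + 1] - flux[i]);

    // Second loop: iteration i reads the value written at i - 1, a
    // loop-carried dependence that cannot be mapped directly to concurrent
    // GPU threads without restructuring (e.g., as a parallel prefix scan).
    for (int i = 1; i < nx - 1; ++i)                // loop-carried dependence
        c_new[i] = c_new[i - 1] + c_old[i];
}
```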
Citation: https://doi.org/10.5194/egusphere-2025-2918-RC1
RC2: 'Comment on egusphere-2025-2918', Anonymous Referee #2, 21 Nov 2025
The authors present a heterogeneous CPU-GPU implementation of a horizontal advection module in the EPICC air quality model. I commend the authors’ efforts to increase air quality modeling efficiency towards timely high-resolution forecasting to protect human health and wellbeing. While generally comprehensive, the presentation at times drifts away from the work’s novelty and omits evaluating the horizontal advection module when it is coupled back to the full model.
General comments
- A 1:1 CPU:GPU ratio with only 10 CPU cores does not necessarily reflect operational runtime configurations. Considering heterogeneous compute environments and hardware availability, would CPU:GPU ratios greater than 1 lead to resource competition and/or serialized kernel calls, compromising the efficiency gains? (A minimal rank-to-GPU binding sketch follows this list.)
- Considering the comparison between implementations summarized in Table 4, Section 4.2.2 seems somewhat out-of-scope, more in line with evaluation of the EPICC model itself than the performance of the accelerated horizontal advection implementation, which is this work’s novelty.
- The statements claiming a 25.0x efficiency gain seem to refer to a baseline early heterogeneous implementation that was less efficient than the standard CPU implementation. This seems to inflate the actual efficiency gain, which is confusing or potentially misleading.
- I would be interested in an accuracy evaluation of the fully coupled implementation. I wonder if the different hardware's handling of floating-point operations could lead to greater discrepancies than presented in the offline evaluation.
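As a sketch of what a CPU:GPU ratio greater than 1 could look like in practice (per the first general comment above), the following minimal MPI+HIP snippet binds each rank to device rank % n_devices; ranks sharing a device then contend for its transfer and execution resources, which is the serialization risk raised above. The snippet is illustrative only and assumes nothing about the authors' actual launch configuration.

```cpp
#include <hip/hip_runtime.h>
#include <mpi.h>

int main(int argc, char** argv)
{
    MPI_Init(&argc, &argv);
    int rank = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    // Round-robin rank-to-GPU binding: with more ranks than devices,
    // several ranks share one device and their kernel launches and
    // host-device transfers are serialized by that device.
    int n_devices = 0;
    hipGetDeviceCount(&n_devices);
    hipSetDevice(rank % n_devices);

    // ... per-rank advection work on the selected device ...

    MPI_Finalize();
    return 0;
}
```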
Specific comments
- Line 390: Mean over what dimensions?
- Line 481: Units on 10^7?
Technical corrections
- Lines 145-146: “Initial conditions (BC)” should be “initial conditions (IC)”
- Line 447: Correlation coefficient should be lower-case “r”.
- For flow, “China’s domestic GPU-like accelerators” could be simplified to just “accelerator” or similar after the first occurrence.
Citation: https://doi.org/10.5194/egusphere-2025-2918-RC2
Viewed
| HTML | PDF | XML | Total | Supplement | BibTeX | EndNote |
|---|---|---|---|---|---|---|
| 867 | 86 | 36 | 989 | 27 | 13 | 14 |