the Creative Commons Attribution 4.0 License.
Parflow 3.9: development of lightweight embedded DSLs for geoscientific models
Zbigniew P. Piotrowski
Jaro Hokkanen
Daniel Caviedes-Voullieme
Olaf Stein
Stefan Kollet
Abstract. Recognizing the leap in high-performance computing with accelerated co-processors, we propose a lightweight approach to adapt legacy codes to next-generation hardware and efficiently achieve a high degree of performance portability. We focus on abstracting the computing kernels at the loop level, based on the lightweight, preprocessor-based embedded Domain Specific Language (eDSL) concept in conjunction with Unified Memory management. We outline a set of code pre-adaptations that facilitate the proposed abstraction. In two geophysical code applications programmed in C and Fortran, we demonstrate the efficiency of the eDSL approach in adaptation to NVIDIA GPUs with native CUDA and Kokkos eDSL backends, achieving up to 10–30 fold speedup. Our experience suggests that the proposed lightweight eDSL code adaptation is less expensive in terms of full-time-equivalent effort than adaptation based on complex DSL approaches, even if no earlier GPU competence exists.
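To make the idea of a loop-level, preprocessor-based abstraction concrete, the following is a minimal sketch in C with a CPU backend only. All macro and function names (`EDSL_ALLOC`, `EDSL_FREE`, `EDSL_FOR_1D`, `axpy`, `axpy_demo`) are hypothetical and chosen for illustration; they are not taken from ParFlow or from the preprint. The assumption is that a GPU build would redefine the same macros to use unified-memory allocation and a device kernel launch, while the kernel source stays unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical loop-level eDSL. The EDSL_* names are illustrative
   assumptions, not the macros used in ParFlow or the preprint. */

/* Memory abstraction: this CPU backend uses the C standard library.
   A GPU backend could redefine these to unified-memory calls. */
#define EDSL_ALLOC(type, count) ((type *)calloc((size_t)(count), sizeof(type)))
#define EDSL_FREE(ptr) free(ptr)

/* Loop abstraction: the loop body is passed as the trailing macro
   arguments (C99 variadic macro), so a backend can decide how the
   iteration space is executed. Here it is a plain serial loop. */
#define EDSL_FOR_1D(i, n, ...)                   \
  do {                                           \
    for (size_t i = 0; i < (size_t)(n); ++i) {   \
      __VA_ARGS__;                               \
    }                                            \
  } while (0)

/* y <- a*x + y, written once against the eDSL macros. */
static void axpy(size_t n, double a, const double *x, double *y)
{
  assert(x != NULL && y != NULL);
  EDSL_FOR_1D(i, n, y[i] = a * x[i] + y[i]);
}

/* Small driver: fills x with 1 and y with 2, applies axpy with a = 3,
   and returns the sum of y. Returns -1.0 if allocation fails. */
static double axpy_demo(size_t n)
{
  double *x = EDSL_ALLOC(double, n);
  double *y = EDSL_ALLOC(double, n);
  if (x == NULL || y == NULL) {
    EDSL_FREE(x);
    EDSL_FREE(y);
    return -1.0;
  }

  EDSL_FOR_1D(i, n, x[i] = 1.0; y[i] = 2.0);
  axpy(n, 3.0, x, y);

  double sum = 0.0;
  for (size_t i = 0; i < n; ++i) {
    sum += y[i];
  }

  EDSL_FREE(x);
  EDSL_FREE(y);
  return sum;
}
```

Calling `axpy_demo(4)` from a `main` function exercises both macros. In this sketch the backend is selected purely by which macro definitions are compiled in, which is what keeps the approach lightweight. It also means the compiler sees the loop body only after textual substitution.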
Zbigniew P. Piotrowski et al.
Status: open (until 25 Oct 2023)
RC1: 'Comment on egusphere-2023-1079', Anonymous Referee #1, 22 Sep 2023
The paper claims to propose a novel approach for adapting legacy codes to next-generation hardware by using an embedded Domain Specific Language (eDSL) concept. It also presents two application examples.
The main questions the paper would need to answer are what the novelty of the approach is, how it differs from existing approaches, and why it is preferable. If it is a general approach, it should be applicable to very different kinds of existing code and achieve performance portability, i.e. the program should run with acceptable efficiency on different hardware platforms (not just run at all).
Unfortunately, the approach presented by the authors is simply the use of preprocessor macros to encapsulate rather basic constructs such as memory allocation or loops. This is a programming technique and hardly a novelty. Preprocessor macros were used intensively in the last century. However, programming experts do not recommend the use of macros, as they can circumvent major syntax checks of the compiler. When the constructs become more complicated than in the examples presented by the authors, they are also hard to read, and the code is often hard to debug. Calling something as old-fashioned as precompiler macros "eDSL" does not make them more modern. I doubt that many programmers would regard memory allocation or loops as part of a "kernel".
The authors then present two examples of the application of their "eDSL": the hydrology code ParFlow and the flow solver MPDATA. The presented graphs demonstrate that the codes run on both GPUs and CPUs and that they are considerably faster on GPUs. However, based on the given information, it is hard to assess how valid this result is. GPUs usually require a different organisation of memory and program code than CPUs for optimal performance, and some algorithms are more easily transferred to GPUs than others. It would therefore be necessary to know more about the numerical algorithms used to solve the problems. Was an explicit or an implicit time stepping scheme used? If an implicit scheme, which linear solver? Some solvers operate well on GPUs but require many more iterations than better solvers, which are not easily transferable to the simplified architecture of a GPU. It would also be interesting to know whether the codes achieve a significant fraction of the peak performance on both architectures. However, this information is missing.
As modern simulation codes are complex pieces of software consisting of components such as a grid manager, matrix assembly, nonlinear and linear solvers, etc., it is not clear how this macro-based eDSL should be applied. For many problems, the solution of the linear equation systems is the most expensive part of the software. Usually, highly optimized libraries like Hypre, PETSc... are used to perform this task. How should the eDSL of the authors be generalized to software like this? The approach seems most suitable for rather simple stencil-based problems on regular grids. However, these kinds of problems are easily rewritten in more powerful DSLs, which do not just produce different loop commands but performance-optimized code for different platforms.
The article is written in a rather vague and imprecise style. The introduction reads more like a short history of the development of high-performance computing (with too few citations), and the chapter on application-agnostic eDSL for accelerators is also not very concrete. The title of the paper is not really fitting. According to the text, the authors want to present a general approach for geoscience models, not a new version of Parflow.
Overall, the authors present precompiler macros, a decades-old programming technique, as a new approach to modernizing legacy codes and demonstrate performance gains that cannot really be put in perspective. To me this looks like old wine in new skins. As I see neither the novelty nor the added scientific value of this approach, I recommend rejecting the paper.
Citation: https://doi.org/10.5194/egusphere-2023-1079-RC1
Viewed
HTML | PDF | XML | Total | BibTeX | EndNote |
---|---|---|---|---|---|
209 | 60 | 8 | 277 | 3 | 3 |