Preprints
https://doi.org/10.5194/egusphere-2025-1733
02 Jun 2025

Best practices in software development for robust and reproducible geoscientific models based on insights from the Global Carbon Project models

Konstantin Gregor, Benjamin F. Meyer, Tillmann Gaida, Victor Justo Vasquez, Karina Bett-Williams, Matthew Forrest, João P. Darela-Filho, Sam Rabin, Marcos Longo, Joe R. Melton, Johan Nord, Peter Anthoni, Vladislav Bastrikov, Thomas Colligan, Christine Delire, Michael C. Dietze, George Hurtt, Akihiko Ito, Lasse T. Keetz, Jürgen Knauer, Johannes Köster, Tzu-Shun Lin, Lei Ma, Marie Minvielle, Stefan Olin, Sebastian Ostberg, Hao Shi, Reiner Schnur, Urs Schönenberger, Qing Sun, Peter E. Thornton, and Anja Rammig

Abstract. Computational models play an increasingly vital role in scientific research by numerically simulating processes that cannot be solved analytically. Such models are fundamental in geosciences and offer critical insights into the impacts of global change on the Earth system today and in the future. Beyond their value as research tools, models are also software products and should therefore adhere to established software engineering standards. However, scientists are rarely trained as software developers, which can lead to deficiencies in software quality such as unreadable, inefficient, or erroneous code. The complexity of these models, coupled with their integration into broader workflows, also often makes reproducing results, evaluating processes, and building upon them highly challenging.

In this paper, we review the current practices within the development processes of the state-of-the-art land surface models used by the Global Carbon Project. By combining the experience of modelers from the respective research groups with the expertise of professional software engineers, we bridge the gap between software development and scientific modeling to outline key principles and tools for improving software quality in research. We explore four main areas: 1) model testing and validation, 2) scientific, technical, and user documentation, 3) version control, continuous integration, and code review, and 4) the portability and reproducibility of workflows.

Our review of current models reveals that while modeling communities are incorporating many of the suggested practices, significant room for improvement remains in areas such as automated testing, documentation, and reproducible workflows. For instance, there is limited adoption of automated documentation and testing, and provision of reproducible workflow pipelines remains an exception. This highlights the need to identify and promote essential software engineering practices within the scientific community. Nonetheless, we also discuss numerous examples of practices within the community that can serve as guidelines for other models and could even help streamline processes within the entire community.

We conclude with an open-source example implementation of these principles built around the LPJ-GUESS model, showcasing portable and reproducible data flows, a continuous integration setup, and web-based visualizations. This example may serve as a practical resource for model developers, users, and all scientists engaged in scientific programming.

Competing interests: At least one of the (co-)authors is a member of the editorial board of Geoscientific Model Development. The peer-review process was guided by an independent editor, and the authors also have no other competing interests to declare.

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Journal article(s) based on this preprint

25 Mar 2026 | Review and perspective paper
Best practices in software development for robust and reproducible geoscientific models based on insights from the Global Carbon Budget's dynamic vegetation models
Konstantin Gregor, Benjamin F. Meyer, Tillmann Gaida, Victor Justo Vasquez, Karina Bett-Williams, Matthew Forrest, João P. Darela-Filho, Sam Rabin, Marcos Longo, Joe R. Melton, Johan Nord, Peter Anthoni, Vladislav Bastrikov, Thomas Colligan, Christine Delire, Michael C. Dietze, George Hurtt, Akihiko Ito, Lasse T. Keetz, Jürgen Knauer, Johannes Köster, Tzu-Shun Lin, Lei Ma, Marie Minvielle, Stefan Olin, Sebastian Ostberg, Hao Shi, Reiner Schnur, Qing Sun, Peter E. Thornton, and Anja Rammig
Geosci. Model Dev., 19, 2407–2436, https://doi.org/10.5194/gmd-19-2407-2026, 2026

Interactive discussion

Status: closed

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • RC1: 'Comment on egusphere-2025-1733', Anonymous Referee #1, 23 Aug 2025
    • RC2: 'Reply on RC1', Anonymous Referee #1, 23 Aug 2025
      • AC3: 'Reply on RC2', Konstantin Gregor, 22 Jan 2026
    • RC3: 'Reply on RC1', Anonymous Referee #1, 23 Aug 2025
      • AC4: 'Reply on RC3', Konstantin Gregor, 22 Jan 2026
    • AC1: 'Reply on RC1', Konstantin Gregor, 20 Jan 2026
      • AC5: 'Short addition regarding the revised manuscript', Konstantin Gregor, 25 Feb 2026
  • RC4: 'Comment on egusphere-2025-1733', Anonymous Referee #2, 23 Dec 2025
    • AC2: 'Reply on RC4', Konstantin Gregor, 20 Jan 2026

Peer review completion

AR – Author's response | RR – Referee report | ED – Editor decision | EF – Editorial file upload
AR by Konstantin Gregor on behalf of the Authors (19 Feb 2026)  Author's response   Author's tracked changes   Manuscript 
ED: Referee Nomination & Report Request started (20 Feb 2026) by Juan Antonio Añel
RR by Anonymous Referee #2 (25 Feb 2026)
ED: Publish subject to minor revisions (review by editor) (26 Feb 2026) by Juan Antonio Añel
AR by Konstantin Gregor on behalf of the Authors (26 Feb 2026)  Author's response   Author's tracked changes   Manuscript 
ED: Publish as is (02 Mar 2026) by Juan Antonio Añel
ED: Publish as is (02 Mar 2026) by Juan Antonio Añel (Executive editor)
AR by Konstantin Gregor on behalf of the Authors (10 Mar 2026)  Manuscript 

Model code and software

Model workflow showcase Konstantin Gregor https://doi.org/10.5281/zenodo.15191116

Viewed

Total article views: 3,153 (including HTML, PDF, and XML)
  • HTML: 2,685
  • PDF: 409
  • XML: 59
  • Total: 3,153
  • BibTeX: 56
  • EndNote: 74
Views and downloads (calculated since 02 Jun 2025)

Viewed (geographical distribution)

Total article views: 2,987 (including HTML, PDF, and XML) Thereof 2,987 with geography defined and 0 with unknown origin.
Latest update: 11 Apr 2026

The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Short summary
Geoscientific models are crucial for understanding Earth’s processes. However, they sometimes do not adhere to the highest software quality standards, and scientific results are often hard to reproduce due to the complexity of the workflows. Here we gather the expertise of 20 modeling groups and software engineers to define best practices for making geoscientific models maintainable, usable, and reproducible. We conclude with an open-source example serving as a reference for modeling communities.