Short Communication: The need for open-source hardware, software, and data-sharing specifications in geomorphology

Moodie, Andrew J.; Barefoot, Eric; Hutton, Eric; Nguyen, Charles; Wickert, Andrew D.; Marr, Jeffrey

doi:10.5194/egusphere-2025-4770

17 Oct 2025

Abstract. Geomorphologists have more data and computational resources available than ever before. Collaboration between researchers specializing in different modes of inquiry (e.g., numerical, experimental, and field-based) often accelerates impactful scientific insights, but tools to facilitate these collaborations are lacking. In this article, we present four challenges to collaboration in the geomorphology community, and provide a framework that addresses these challenges to enable research utilizing the full extent of data and computational resources available today. We report a component of this framework, a newly developed specification for a shareable data schema called sandsuet. The schema is designed to accommodate most kinds of rasterized geomorphology data, and makes it easy to package, publish, and share those data. Finally, we present possibilities for community development of resources to address other challenges to collaboration in geomorphology.

Received: 10 Oct 2025 – Discussion started: 17 Oct 2025

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: final response (author comments only)

RC1: 'Comment on egusphere-2025-4770', Stuart Grieve, 14 Nov 2025

I'd like to thank the authors for this manuscript. Work to improve reproducibility, openness and data sharing within geomorphology research feels increasingly important in a time when data volumes are increasing and funding in many countries is contracting. This manuscript describes the outcomes of a recent workshop where geomorphologists came together to discuss key challenges within the geomorphology community, which were identified as: data archival and sharing; modular software; open hardware; and academic credit. The authors propose a new framework, sandpiper, to begin to address these challenges and the remainder of the manuscript describes a data specification, sandsuet, which addresses the first of these concerns. The manuscript is clear and well written and strikes a good balance between articulating the challenges we as a community face, while moving us towards potential solutions.
General comments
Given the nature of this manuscript I do not have significant requests for additional work or analysis, but rather have some questions and observations that I hope can help the authors to strengthen the work.
1) The first line of the abstract talks about geomorphologists having more data and compute than ever before. Do the authors believe that this is a problem that is unique to geomorphology, or that given the nature of our work, presents in a unique manner relative to other disciplines? I think there is scope to expand the introduction by giving some context from other fields outside the geosciences, identifying solutions that may exist and be appropriate for us to adopt, and highlighting where our disciplinary context makes that difficult. We wrote about data, code and reproducibility a few years back (Grieve et al., 2020) and you might find some useful references within that work to help frame these ideas.
2) I am really pleased that you are building this solution on top of NetCDF rather than developing a new standard from scratch. But when reading the manuscript I got really far through before this became apparent. Some foregrounding of NetCDF, and its wide adoption and use in other disciplines would help get people up to speed more quickly.
3) The manuscript talks throughout about building community, which I agree is vital for any effort such as this. There has been a lot of work done in the Research Software Engineering community around how to build and sustain communities around software projects. One nice example comes from the R community (Boettiger et al., 2015) and there is the work of Katz et al. (2018) taking a broader view of things. There are also a lot of more general resources on the Software Sustainability Institute's site (https://www.software.ac.uk/resource-hub). These might help the authors frame this need within the context of what can be achieved, and what resources are needed to achieve it.
4) Similarly, in section 2.4 the authors discuss scientific credit, which is indeed very important in ensuring that everyone who contributes to a project is being recognised. There is a big body of research on software citation, both understanding how and why people cite or don't cite software, but also looking at practical solutions that make it easier to cite software or other "non-traditional" outputs. One example of this would be the citation file format and its integration into platforms like GitHub (https://citation-file-format.github.io/), as well as the broader work of the FORCE11 Software Citation Working Group (Smith et al., 2016; https://force11.org/group/software-citation-working-group/). Some other relevant recent work on this topic includes Katz et al. (2018, 2021).
Line 43: "and needs in a standards" I struggled to parse this sentence.
References
Boettiger, C., Chamberlain, S., Hart, E., and Ram, K. (2015). Building software, building community: lessons from the rOpenSci project. Journal of Open Research Software, 3(1), e8.
Katz, D. S., McInnes, L. C., Bernholdt, D. E., Mayes, A. C., Hong, N. P. C., Duckles, J., Gesing, S., Heroux, M. A., Hettrick, S., Jimenez, R. C., and Pierce, M. (2018). Community organizations: Changing the culture in which research software is developed and sustained. Computing in Science & Engineering, 21(2), 8-24.
Katz, D. S., Hong, N. P. C., Clark, T., Muench, A., Stall, S., Bouquin, D., Cannon, M., Edmunds, S., Faez, T., Feeney, P., and Fenner, M. (2021). Recognizing the value of software: a software citation guide. F1000Research, 9, 1257.
Katz, D. S. and Chue Hong, N. P. (2018). Software citation in theory and practice. In International Congress on Mathematical Software (pp. 289-296). Cham: Springer International Publishing.
Smith, A. M., Katz, D. S., Niemeyer, K. E., and FORCE11 Software Citation Working Group (2016). Software citation principles. PeerJ Computer Science, 2, e86. https://doi.org/10.7717/peerj-cs.86
-- Stuart Grieve

Citation: https://doi.org/10.5194/egusphere-2025-4770-RC1
RC2: 'Comment on egusphere-2025-4770', Anonymous Referee #2, 18 Nov 2025

This paper describes sandsuet, a standard for saving geomorphic raster data sets. The authors distinguish between data sharing and data archiving. My understanding of the difference is that data for sharing has the minimum possible information and is for quick dissemination of data. Data archiving is for storing important data sets which one may want to preserve for a long time, so more information would be provided than would be provided with shared data. I don't really know when one gets to the archival stage - maybe after publication? To be honest I had never recognized the two data needs as separate, but after reading it makes sense why a more flexible data sharing standard is needed.

I think this paper represents the best of our community. I am grateful to the authors and workshop participants for developing sandsuet and presenting it to the community. These types of contributions are generally thankless, as the authors indirectly point out by stating that we need better ways to give credit to open source developers. Ultimately, working on things like this means less time working on the science that is typically recognized when it comes to raise time. But this is an important contribution!
If this data sharing standard catches on it will mean that 1) I can use sandsuet without thinking about how I might want to save my data in the short term. 2) If I write some code to analyze my data saved in sandsuet format, then I can easily apply my code to analyze other people's data that are saved in sandsuet format. 3) If I share my data analysis tools/code, that is a double win for the community. If sandsuet catches on, there would likely be less recreation of analysis code, assuming I/others share my analysis code in a way that is actually useable to others. But that is a different topic not addressed here.
So this is great and please publish it. I have some minor comments. Hopefully they will help the authors see the stumbles that a newbie might have when attempting to use sandsuet. Also, what does sandsuet mean? I wondered why they chose this name.
Everything about sharing of raster data, what absolutely needs to be provided, and the definition of auxiliary variables all made sense to me. What confused me a bit is the statement that sandsuet is agnostic to the file storage format (L 151). I have never used NetCDF, but I know a tiny bit of coding. I found Figure 1 very helpful and imagined that elevation was an instance of the sandsuet class. Later I went and looked at the demo to try and better understand, and I realized that (I think) elevation would be of type netcdf.dataset. I guess I could create my own data class if I really wanted to. I guess if I didn't use Python I could use Fortran or whatever language. But the whole time I was reading I was hung up on the file storage format. Would this really work if someone saved their data in Excel? What if someone saved their data in text files (e.g., Esri ASCII format)? Maybe it is beyond the scope of this paper, but I thought it would be helpful to address how this might work when different scientists share data in different file storage formats.
I also wondered how one researcher might access the sandsuet data of another researcher. I don't think this was addressed. Do I need to know that Jane Doe has data that I want? Will Jane Doe post this on a listserv? Do I need to email Jane Doe? I know this is a tricky issue. I just thought it might be addressed.
I guess this paper must be the first paper coming out of the described conference. In that way, it served a dual purpose - to describe the point of the conference and to present their solution to address one aspect of the conference, data sharing. Personally, I found the discussion about open source hardware, software, and credit to open source developers slightly out-of-place because really this paper is about data sharing. I was excited to learn about shared experimental facilities, but that is not what this paper is about. I would have preferred more discussion of file storage formats and how these data would be shared, rather than a tease about community-designed experimental hardware and shared facilities. I also understand that authors may need to use this paper to meet multiple goals.

Some tiny comments.

Figure 1e: I think, but I'm not sure, that if grid location x,y is purple at the top, it does not need to be purple at depth. Because the exposed depth side shows no variation with depth, it might incorrectly imply that age cannot vary in z. If it can vary with z, then maybe show that in the example map?

Line 154 - Does time have multiple dimensions? Is this something everyone knows about but me?

Line 221 - I guess the designers are trying to be as flexible as possible, but why not impose the ordering of the spatial dimensions? It would be easier if everyone ordered in the same way, and it's not a huge ask. Maybe I don't understand what that means.

Line 225 - The statement that "dimension names are arbitrary" was confusing to me. They aren't exactly arbitrary. They should be descriptive enough that others understand what they represent, right?

Citation: https://doi.org/10.5194/egusphere-2025-4770-RC2
AC1: 'response to reviewer comments', Andrew Moodie, 11 Dec 2025

The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2025/egusphere-2025-4770/egusphere-2025-4770-AC1-supplement.pdf

Citation: https://doi.org/10.5194/egusphere-2025-4770-AC1

Viewed

Total article views: 391 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
255	107	29	391	20	13

Views and downloads (calculated since 17 Oct 2025)

Month	HTML	PDF	XML	Total
Oct 2025	123	17	8	148
Nov 2025	72	31	9	112
Dec 2025	49	56	12	117
Jan 2026	11	3	0	14

Latest update: 09 Jan 2026

Short summary

Geomorphologists have more data and computational resources available than ever before, but lack tools to facilitate collaborations needed to integrate data from different modes of study (e.g., field, experimental, modeling). In this article, we discuss challenges to collaboration in geomorphology, and report a new schema for sharing data. The schema is designed to accommodate most kinds of rasterized geomorphology data, and makes it easy to package, publish, and share those data.

