This work is distributed under the Creative Commons Attribution 4.0 License.
ImageGrains 2.0: Improved precision and generalization for grain segmentation
Abstract. Recent advances in deep-learning–based image segmentation have enabled the development of automated approaches to detect individual grains and measure them for geoscientific applications. These methods facilitate the creation of much larger and more precise datasets than traditional manual grain measurements. However, they typically perform best as specialized models trained on homogeneous, task-specific datasets, and often show reduced accuracy when generalizing to other data types.
Here, we present an updated framework, ImageGrains 2.0, which leverages Cellpose-SAM, a recently published next-generation deep-learning model originally developed for cell segmentation in biomedical research. It currently represents the state of the art for dense segmentation in 2D and 3D biomedical datasets, yields robust results, and is capable of generalizing across distinctly different image datasets. These properties allow us to re-train the model with a geoscientific dataset comprising annotated images of fluvial gravel, coarse pro-glacial deposits, and X-ray computed tomography scans of glacial till and marine sand. We benchmark the segmentation performance of the method against ground-truth annotations, compare it to other segmentation methods, and evaluate measurement accuracy. Our results indicate that this approach outperforms existing methods and confirm that the outstanding performance of Cellpose-SAM transfers to segmenting sediment grains. We analyze the size and shape of the segmented grains and find that higher grain segmentation accuracy leads to more precise and realistic morphometric results, e.g., more accurate grain size distributions. Additionally, we introduce an interactive graphical user interface for image annotation and correction of model predictions, facilitating the use of the framework in a broader range of image settings. Furthermore, this study underscores the importance of curating more publicly available datasets, which could pave the way towards a foundation model for segmenting granular particles in geoscientific imagery.
Status: open (until 03 Mar 2026)
RC1: 'Comment on egusphere-2025-6346', Laure Guerit, 18 Feb 2026
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2026/egusphere-2025-6346/egusphere-2025-6346-RC1-supplement.pdf
Citation: https://doi.org/10.5194/egusphere-2025-6346-RC1
RC2: 'Comment on egusphere-2025-6346', Pauline Delorme, 23 Feb 2026
Review of "ImageGrains 2.0: improved precision and generalization for grain segmentation"
by Mair et al., submitted to Earth Surface Dynamics.
Summary
The manuscript presents ImageGrains 2.0, an updated workflow for automated grain segmentation in geoscientific imagery based on a fine‑tuned Cellpose‑SAM deep‑learning architecture. The authors adapt a biomedical segmentation model to detect sediment grains across a wide range of environments, including fluvial gravel, conglomerates, pro‑glacial deposits, and XR‑CT scans. The model is trained on an expanded and diversified dataset (IG2). Benchmarking against previous approaches demonstrates that the new model substantially improves segmentation accuracy and the reliability of grain morphometric measurements. The manuscript is generally well written and supported by clear illustrations. The proposed method represents a meaningful advance for automated grain‑size and shape analysis.
Major comments
The introduction would benefit from a clearer explanation of why grain‑size and grain‑shape data matter for geomorphology, sediment transport, and landscape evolution. The authors cite relevant literature but do not articulate the broader implications (e.g., hydraulic roughness, abrasion, sediment sorting, transport thresholds, paleo‑environmental reconstruction).
I do not fully understand what the main improvement is compared to Mair et al. (2024). Figures 2 in Mair et al. (2024) and in this paper are very similar, and the text only briefly mentions the ViT backbone. I would appreciate if the authors provided more detail on the differences between Cellpose and Cellpose‑SAM.
One of the major novelties of the paper appears to be the enriched dataset (IG2). In line 129, the authors claim that the dataset includes “the occurrence of vegetation…”, however Figure 1 does not show any vegetation in the selected images. It would be useful to know what density or type of vegetation was included in the training data to better assess the model’s ability to detect grains in vegetated channels.
For non‑specialists in deep‑learning methods, the paper is somewhat difficult to read. Section 2.3 (Cellpose‑SAM: re‑training and inference) is highly technical and hard to follow. The authors describe several modifications to the initial model but do not explain the relevance or motivation behind these changes.
In Section 2.5 (Evaluating segmentation performance), lines 275–280, the authors state that they exclude grains smaller than 8 pixels. This seems reasonable to avoid misidentification, but in Figure 1 (subset NZ1), it appears that all the small particles composing the river bed are not detected and therefore not included in the statistics. This could be problematic for users who wish to apply the model output to physical modelling of river morphology.
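The effect of this 8-pixel cutoff on grain-size statistics can be illustrated directly. The sketch below (not part of the ImageGrains code base; the function name, the pixel scale, and the use of an area-equivalent diameter are illustrative assumptions) uses only the Python standard library to filter small grains out of a labelled mask and report percentiles of the resulting grain-size distribution:

```python
import math
import statistics
from collections import Counter

def grain_size_distribution(label_mask, min_area_px=8, px_size_mm=1.0):
    """Grain-size percentiles from a labelled mask (0 = background).

    min_area_px mirrors the 8-pixel exclusion discussed above; grains
    below this pixel count are dropped before computing statistics.
    px_size_mm is a hypothetical pixel scale (mm per pixel).
    """
    # Pixel count (area) per grain label, ignoring background.
    areas = Counter(v for row in label_mask for v in row if v != 0)
    # Area-equivalent diameter: d = 2 * sqrt(A / pi), scaled to mm.
    diam = [2.0 * math.sqrt(a / math.pi) * px_size_mm
            for a in areas.values() if a >= min_area_px]
    if len(diam) < 2:
        return {}
    # 99 cut points at 1%..99%; pick the common D16/D50/D84 metrics.
    q = statistics.quantiles(diam, n=100, method="inclusive")
    return {"D16": q[15], "D50": q[49], "D84": q[83]}
```

Because excluded grains are simply absent from `diam`, the cutoff truncates the lower tail of the distribution, which is exactly the concern raised here for fine-grained river beds such as the NZ1 subset.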
Figure 6 shows a comparison of many metrics, but some of them are not discussed in the main text (e.g., ΔIRn and Δeccentricity). The figure contains too many metrics that are not properly addressed, making it difficult to assess the improvement in size and shape accuracy.
Several figure captions are too short or incomplete. For example, Figure 2b is not described, and some figures lack information about what is being shown or how to interpret the metrics.
Section 4.2.2 (3D Segmentation) is promising but underdeveloped. XR‑CT is expensive and not widely accessible. A comparison with more common 3D methods (SfM, point clouds, laser scanning) would strengthen the discussion. This limitation is acknowledged later (lines 546–550), but it should be integrated earlier and more explicitly.
Minor comments
Table S1 shows a strong imbalance between subsets (e.g., APF_2 has 59 tiles, CT only 6). This may bias the model toward fluvial gravel. The authors should discuss how this affects generalization.
Line 218: “if we left these out these data” → remove the extra “these”.
Regarding the generalization capability (lines 319–320), the authors justify excluding S1_2 and PR for generalization testing, but the rationale is unclear.
Lines 349–350: “Most notably, mean differences (including the ± one sigma standard deviation range) in grain size are below 12% for both the a‑ and b‑axes…” This is incorrect when looking at Figure 6 (Δ mean b‑axis).
Line 356: “For our default model, mean delta values are consistently below 2%…” This is incorrect for mean Δeccentricity and mean Δazimuth.
Figure S2: it would be helpful to add a ±10% shaded area for clarity.
A discussion of the environmental impact of such methods would have been appreciated.
Lines 585–587: “in a broad range of image datasets and applications” — please elaborate. Which types of applications? How might the limitations of the method affect these applications?
Finally, as highlighted in the "future directions" section, one of the critical limitations of this method is the need for a large dataset of manually annotated images (a substantial amount of work was required to create the IG2 dataset). This represents a considerable effort, and in many research contexts it is difficult to envision such extensive manual annotation as a feasible or scalable requirement.
This manuscript addresses an important methodological bottleneck in quantitative geomorphology. The dataset is rich, the benchmarking is solid, and the proposed workflow is a real improvement. However, the paper would benefit from clearer explanations of the model architecture, a more transparent comparison with previous work, and a deeper discussion of limitations, especially regarding small grains, dataset imbalance, and the accessibility of 3D methods.
I am confident the authors can address the comments, and I look forward to reading a revised version.
Sincerely,
Pauline Delorme (ENS-PSL, Department of Geosciences)
Citation: https://doi.org/10.5194/egusphere-2025-6346-RC2
Data sets
ImageGrains 2.0 dataset D. Mair et al. https://doi.org/10.5281/zenodo.17866826
Model code and software
ImageGrains 2.0 models D. Mair et al. https://doi.org/10.5281/zenodo.15309323