the Creative Commons Attribution 4.0 License.
Dataset variability and carbonate concentration influence the performance of local visible-near infrared spectral models
Abstract. Visible and near-infrared (vis–NIR) soil spectroscopy is an easy and cost-efficient way to gain a wide variety of soil information at high spatial and temporal resolution, both in large-scale soil surveys and in local field-scale studies. However, unlike for conventional methods, the prediction accuracy of vis–NIR spectral models cannot yet be estimated before data collection, which hampers its application at the local scale, where high precision is often required (e.g., field experiments). In this study we used soil data from six agricultural fields in Eastern Switzerland and calibrated i) field-specific (local) models and ii) general models (combining all fields) for organic carbon, total carbon, total nitrogen, permanganate oxidizable carbon and pH using partial least squares regression. Of the 30 local models, 24 showed an accurate or even excellent performance (ratio of performance to deviation (RPD) > 2), and the root mean square errors (RMSE) of prediction were, except for pH, at most five times higher than the lab measurement error. The variability of a specific soil property and the mean carbonate concentration in the dataset were the two factors influencing the performance of the local models. We found a significant relationship between the coefficient of variation in the dataset and the metrics for model performance (R², percent RMSE and RPD). Starting from a tolerable prediction error for the spectral measurements, the regressions can be used to develop a sampling design that matches the corresponding target variability. The five inaccurately performing local models with RPD < 2 were on the two fields with the highest carbonate content, raising the question of whether local vis–NIR models are suitable for soils with high carbonate concentration. General models combining the datasets from all six fields showed an accurate overall performance, but the RMSE at the field level was higher than for the local models.
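The performance metrics named in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, R² is assumed to be the coefficient of determination, RPD is assumed to be the sample standard deviation of the observed values divided by the RMSE, and the percent RMSE and coefficient of variation are assumed to be normalized by the mean of the observed values. The preprint itself may define these quantities differently.

```python
import math
import statistics


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(observed, predicted):
    """Coefficient of determination (assumed definition: 1 - SS_res / SS_tot)."""
    mean_obs = statistics.fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def rpd(observed, predicted):
    """Ratio of performance to deviation (assumed: sample SD / RMSE)."""
    return statistics.stdev(observed) / rmse(observed, predicted)


def cv_percent(observed):
    """Coefficient of variation in percent (assumed: 100 * SD / mean)."""
    return 100 * statistics.stdev(observed) / statistics.fmean(observed)


def percent_rmse(observed, predicted):
    """RMSE in percent of the observed mean (assumed normalization)."""
    return 100 * rmse(observed, predicted) / statistics.fmean(observed)
```

Under these assumed definitions, RPD equals the coefficient of variation divided by the percent RMSE, so the three quantities are algebraically linked.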
Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Interactive discussion
Status: closed
- RC1: 'Comment on egusphere-2023-1087', Anonymous Referee #1, 26 Jul 2023
Please find my comments in the attached pdf file.
- AC1: 'Reply on RC1', Simon Oberholzer, 12 Sep 2023
- RC2: 'Comment on egusphere-2023-1087', Anonymous Referee #2, 18 Aug 2023
The paper is well written and addresses a topical issue, i.e., the scale of the data sets for optimizing spectral models. However, the authors rely on quality performance parameters that are not independent: RMSE, R² and RPD. Minasny et al. already demonstrated some years ago that there is an overlap between these parameters and that in recent literature the RPD is mostly used for global comparison between studies. Bellon-Maurel proposed to use the PIQ, which is better suited for non-normally distributed data sets.
Section 2.7. Regressions between RPD and CV are spurious, as the standard deviation appears in both terms. These relations should not be used (e.g. Fig. 5 and 7). The same applies to the relation between PRMSE and CV. Such relations always give a positive trend.
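One way to see the concern about shared terms is a small simulation, sketched here under stated assumptions. It assumes RPD = SD / RMSE and CV = 100 · SD / mean, and draws the standard deviation and the RMSE independently for a set of hypothetical datasets with a fixed mean. All numeric ranges, the seed, and the function names are illustrative choices, not values from the manuscript.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate(n_datasets=200, seed=0):
    """Return (corr(SD, RMSE), corr(CV, RPD)) for synthetic datasets."""
    rng = random.Random(seed)
    mean_value = 2.0  # hypothetical fixed mean of the soil property
    # SD and RMSE are drawn independently, so model error is unrelated
    # to dataset variability by construction.
    sds = [rng.uniform(0.5, 2.0) for _ in range(n_datasets)]
    rmses = [rng.uniform(0.2, 0.4) for _ in range(n_datasets)]
    cvs = [100 * sd / mean_value for sd in sds]
    rpds = [sd / err for sd, err in zip(sds, rmses)]
    return pearson(sds, rmses), pearson(cvs, rpds)
```

In this setup, any correlation between CV and RPD comes from the standard deviation appearing in both quantities, because the RMSE was generated independently of the variability.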
Finally, the authors conclude that local models perform better than general models, but do not really give a recommendation on a sampling strategy for fields/regions with no prior knowledge of the soil properties.
Throughout the manuscript
I am not a native speaker, but I would avoid using ‘high’ and ‘low’, e.g. small and large errors. Please check with a native speaker.
Generally, ‘concentration’ refers to solutions, while ‘content’ is a more general term that can also refer to a solid in another solid. You use both terms, but I would advise to systematically use ‘content’.
Citation: https://doi.org/10.5194/egusphere-2023-1087-RC2
- AC2: 'Reply on RC2', Simon Oberholzer, 12 Sep 2023
Viewed
HTML | PDF | XML | Total | Supplement | BibTeX | EndNote |
---|---|---|---|---|---|---|
331 | 100 | 21 | 452 | 35 | 11 | 14 |
Simon Oberholzer
Laura Summerauer
Markus Steffens
Chinwe Ifejika Speranza