Dataset variability and carbonate concentration influence the performance of local visible-near infrared spectral models

Oberholzer, Simon; Summerauer, Laura; Steffens, Markus; Ifejika Speranza, Chinwe

doi:https://doi.org/10.5194/egusphere-2023-1087

Preprints

30 Jun 2023

Abstract. Visible and near-infrared (vis–NIR) soil spectroscopy is an easy and cost-efficient way to obtain a wide variety of soil information at high spatial and temporal resolution, both in large-scale soil surveys and in local field-scale studies. However, unlike for conventional methods, the prediction accuracy of vis–NIR spectral models cannot yet be estimated before data collection, which hampers their application at the local scale, where high precision is often required (e.g., field experiments). In this study, we used soil data from six agricultural fields in Eastern Switzerland and calibrated (i) field-specific (local) models and (ii) general models (combining all fields) for organic carbon, total carbon, total nitrogen, permanganate-oxidizable carbon and pH using partial least squares regression. Of the 30 local models, 24 showed accurate or even excellent performance (ratio of performance to deviation (RPD) > 2), and the root mean square errors (RMSE) of prediction were, except for pH, at most five times higher than the lab measurement error. The variability of a specific soil property and the mean carbonate concentration in the dataset were the two factors influencing the performance of the local models. We found a significant relationship between the coefficient of variation in the dataset and the metrics for model performance (R², percentage RMSE and RPD). Starting from a tolerable prediction error for the spectral measurements, the regressions can be used to develop a sampling design that matches the corresponding target variability. The five inaccurately performing local models with RPD < 2 were on the two fields with the highest carbonate content, raising the question of whether local vis–NIR models are suitable for soils with high carbonate concentration. General models combining the datasets from all six fields showed accurate overall performance, but the RMSEs at the field level were higher than those of the local models.
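The performance metrics named in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the page does not give the exact definitions used in the study, so the sketch assumes RPD is the sample standard deviation of the observed values divided by the RMSE, percentage RMSE is the RMSE relative to the observed mean, and R² is the coefficient of determination. The function name `performance_metrics` and the example values are hypothetical.

```python
import numpy as np


def performance_metrics(observed, predicted):
    """Return common validation metrics for a spectral model.

    Assumed definitions (not confirmed by the source page):
      RMSE  = sqrt(mean((observed - predicted)^2))
      RPD   = SD(observed) / RMSE, with the sample SD (ddof=1)
      R2    = 1 - SS_res / SS_tot
      CV    = 100 * SD(observed) / mean(observed)
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)

    residuals = observed - predicted
    rmse = float(np.sqrt(np.mean(residuals ** 2)))
    sd = float(np.std(observed, ddof=1))
    mean = float(np.mean(observed))

    ss_res = float(np.sum(residuals ** 2))
    ss_tot = float(np.sum((observed - mean) ** 2))

    return {
        "rmse": rmse,
        "rmse_percent": 100.0 * rmse / mean,
        "rpd": sd / rmse,
        "r2": 1.0 - ss_res / ss_tot,
        "cv_percent": 100.0 * sd / mean,
    }


# Made-up reference and predicted values, for illustration only.
obs = [1.0, 2.0, 3.0, 4.0, 5.0]
pred = [1.5, 1.5, 3.0, 4.5, 4.5]
metrics = performance_metrics(obs, pred)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these definitions, the abstract's threshold of RPD > 2 corresponds to an RMSE smaller than half the standard deviation of the measured values.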

Received: 23 May 2023 – Discussion started: 30 Jun 2023

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Download & links

Preprint (PDF, 1811 KB)

Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Supplement (268 KB)

Journal article(s) based on this preprint

10 Apr 2024

Best performances of visible–near-infrared models in soils with little carbonate – a field study in Switzerland

Simon Oberholzer, Laura Summerauer, Markus Steffens, and Chinwe Ifejika Speranza

SOIL, 10, 231–249, https://doi.org/10.5194/soil-10-231-2024, 2024

Interactive discussion

Status: closed

RC1:
'Comment on egusphere-2023-1087', Anonymous Referee #1, 26 Jul 2023

Please find my comments in the attached pdf file.

Citation: https://doi.org/10.5194/egusphere-2023-1087-RC1
- AC1: 'Reply on RC1', Simon Oberholzer, 12 Sep 2023
  
  Dear Editor and Reviewers,
  Please find our answers to Review 1 in the attached document.
  We hope we addressed all comments to your satisfaction.
  On behalf of all co-authors,
  Simon Oberholzer
  
  Citation: https://doi.org/10.5194/egusphere-2023-1087-AC1
RC2:
'Comment on egusphere-2023-1087', Anonymous Referee #2, 18 Aug 2023

The paper is well-written and addresses a topical issue i.e. the scale of the data sets for optimizing spectral models. However, the authors rely on quality performance parameters that are not independent: RMSE, R² and RPD. Minasny et al. already demonstrated some years ago that there is an overlap between these parameters and that in recent literature the RPD is mostly used for global comparison between studies. Bellon Maurel proposed to use the PIQ which is better suited for non-normally distributed data sets.
Section 2.7. Regressions between RPD and CV are spurious, as the standard deviation appears in both terms. These relations should not be used (e.g. Fig. 5 and 7). The same applies to the relation between PRMSE and CV. Such relations always give a positive trend.
Finally, the authors conclude that local models perform better than general models, but do not really give a recommendation on a sampling strategy for fields/regions with no prior knowledge of the soil properties.
Throughout the manuscript
I am not a native speaker, but I would avoid using ‘high’ and ‘low’, e.g. small and large errors. Please check with a native speaker.
Generally, ‘concentration’ refers to solutions, while ‘content’ is a more general term that can also refer to a solid in another solid. You use both terms, but I would advise to systematically use ‘content’.
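The referee's point about shared terms can be illustrated with a small simulation. This is only a sketch of the general statistical argument, not an analysis of the study's data: it assumes the standard deviation, RMSE and mean of a dataset are drawn independently from log-normal distributions with equal spread, and it assumes RPD = SD / RMSE, CV = SD / mean and PRMSE = RMSE / mean. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Independent draws on the log scale: by construction there is no real
# relationship between variability, prediction error and mean level.
log_sd = rng.normal(0.0, 0.5, n)
log_rmse = rng.normal(0.0, 0.5, n)
log_mean = rng.normal(0.0, 0.5, n)

# Ratios become differences on the log scale.
log_rpd = log_sd - log_rmse      # RPD   = SD / RMSE
log_cv = log_sd - log_mean       # CV    = SD / mean
log_prmse = log_rmse - log_mean  # PRMSE = RMSE / mean

# Pearson correlations between the log-transformed ratios.
r_rpd_cv = float(np.corrcoef(log_rpd, log_cv)[0, 1])
r_prmse_cv = float(np.corrcoef(log_prmse, log_cv)[0, 1])

print(f"corr(log RPD, log CV)   = {r_rpd_cv:.2f}")
print(f"corr(log PRMSE, log CV) = {r_prmse_cv:.2f}")
```

Under these assumptions both correlations have a theoretical value of 0.5, because each pair of ratios shares one component (SD in the first pair, the mean in the second). A positive trend therefore appears even though the underlying quantities are unrelated, which is the concern raised in the comment. Whether this affects the regressions in the manuscript depends on the actual data and is not something this sketch can decide.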

Citation: https://doi.org/10.5194/egusphere-2023-1087-RC2
- AC2: 'Reply on RC2', Simon Oberholzer, 12 Sep 2023
  
  Dear Editor and Reviewers,
  Please find our answers to the second review in the attached document.
  We hope we addressed all comments to your satisfaction.
  On behalf of all co-authors,
  Simon Oberholzer
  
  Citation: https://doi.org/10.5194/egusphere-2023-1087-AC2

Peer review completion

AR: Author's response | RR: Referee report | ED: Editor decision | EF: Editorial file upload

ED: Reconsider after major revisions (04 Oct 2023) by Kristof Van Oost

AR by Simon Oberholzer on behalf of the Authors (15 Nov 2023) Author's response Author's tracked changes Manuscript

ED: Publish subject to minor revisions (review by editor) (29 Nov 2023) by Kristof Van Oost

ED: Referee Nomination & Report Request started (12 Dec 2023) by Kristof Van Oost

RR by Anonymous Referee #1 (29 Dec 2023)

ED: Publish subject to minor revisions (review by editor) (09 Jan 2024) by Kristof Van Oost

AR by Simon Oberholzer on behalf of the Authors (18 Jan 2024) Author's response Author's tracked changes Manuscript

ED: Publish as is (06 Feb 2024) by Kristof Van Oost

ED: Publish as is (21 Feb 2024) by Kristof Van Oost (Executive editor)

AR by Simon Oberholzer on behalf of the Authors (26 Feb 2024) Author's response Manuscript

Supplement

https://doi.org/10.5194/egusphere-2023-1087-supplement

Viewed

Total article views: 452 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	Supplement	BibTeX	EndNote
331	100	21	452	35	11	14

Views and downloads (calculated since 30 Jun 2023)

Month	HTML	PDF	XML	Total
Jun 2023	11	2	1	14
Jul 2023	77	18	6	101
Aug 2023	70	8	2	80
Sep 2023	49	13	5	67
Oct 2023	46	10	2	58
Nov 2023	16	13	1	30
Dec 2023	17	6	0	23
Jan 2024	16	3	2	21
Feb 2024	13	8	1	22
Mar 2024	14	15	0	29
Apr 2024	2	4	1	7
May 2024	0
Jun 2024	0
Jul 2024	0
Aug 2024	0

Viewed (geographical distribution)

Total article views: 450 (including HTML, PDF, and XML) Thereof 450 with geography defined and 0 with unknown origin.

Latest update: 30 Aug 2024

Short summary

This study evaluates the suitability of visible–near-infrared spectroscopy for soil research projects of local extent. The prediction error of local spectral models was linearly correlated with the variability of the soil property. Additionally, a high carbonate content in the soil led to lower model performance. These findings contribute to a better understanding of soil spectroscopy in projects of local extent and help facilitate the establishment and implementation of new studies.

