the Creative Commons Attribution 4.0 License.
Large errors in common soil carbon measurements due to sample processing
Abstract. To build confidence in the efficacy of soil carbon (C) crediting programs, precise quantification of soil organic carbon (SOC) is critical. Detecting a true change in SOC after a management shift has occurred, specifically in agricultural lands, is difficult as it requires robust soil sampling and soil processing procedures. Informative and meaningful comparisons across spatial and temporal scales can only be made with reliable soil C measurements and estimates, which begin on the ground and in soil testing facilities. To gauge soil C measurement inter-variability, we conducted a blind external service laboratory comparison across eight laboratories selected based on status and involvement in SOC quantification for C markets. To better understand how soil processing procedures and quantification methods commonly used in soil testing laboratories affect soil C concentration measurements, we designed an internal experiment assessing the individual effect of several alternative procedures (i.e., sieving, fine grinding, and drying) and quantification methods on total (TC), inorganic (SIC), and organic (SOC) soil C concentration estimates. We analyzed 12 different agricultural soils using 11 procedures that varied either in the sieving, fine grinding, drying, or quantification step. We found that a mechanical grinder, the most commonly used method for sieving in service laboratories, did not effectively remove coarse materials (i.e., roots and rocks), and thus resulted in higher variability and significantly different C concentration measurements from the other sieving procedures (i.e., 8 + 2 mm, 4 mm, and 2 mm with rolling pin). A finer grind generally resulted in a lower coefficient of variation: the finest grind to < 125 µm had the lowest coefficient of variation, followed by the < 250 µm grind, and lastly the < 2000 µm grind. Not drying soils in an oven (at 105 °C) prior to elemental analysis resulted, on average, in a relative difference of 3.5 % lower TC and 5 % lower SOC due to inadequate removal of moisture. Compared to the reference method used in our study, where % TC was quantified by dry combustion on an elemental analyzer, % SIC was measured using a pressure transducer, and % SOC was calculated as the difference between % TC and % SIC, predictions of all three soil properties (% TC, % SIC, % SOC) using Fourier transform infrared spectroscopy were in high agreement (R² = 0.97, 0.99, 0.90, respectively). For % SOC, quantification by loss on ignition had a low coefficient of variation (5.42 ± 3.06 %) but the least agreement (R² = 0.83) with the reference method.
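The quantities named in the abstract (% SOC by difference, coefficient of variation, and relative difference from a reference) can be illustrated with a minimal Python sketch. The function names and the replicate values below are hypothetical and chosen only for illustration; they are not data or code from the study.

```python
from statistics import mean, stdev


def soc_by_difference(tc_pct, sic_pct):
    """% SOC computed as % TC minus % SIC."""
    return tc_pct - sic_pct


def coefficient_of_variation(values):
    """Sample standard deviation expressed as a percentage of the mean."""
    return 100.0 * stdev(values) / mean(values)


def relative_difference(value, reference):
    """Difference from the reference, as a percentage of the reference."""
    return 100.0 * (value - reference) / reference


# Hypothetical analytical replicates for one soil (illustrative values only)
tc_reps = [2.10, 2.05, 2.15, 2.08, 2.12]   # % TC
sic_reps = [0.40, 0.42, 0.38, 0.41, 0.39]  # % SIC
soc_reps = [soc_by_difference(tc, sic) for tc, sic in zip(tc_reps, sic_reps)]

print("mean % SOC:", round(mean(soc_reps), 3))
print("CV of % SOC:", round(coefficient_of_variation(soc_reps), 2), "%")
```

Because % SOC is obtained by subtraction, scatter in both % TC and % SIC carries into the % SOC replicates, which is one reason the coefficient of variation is reported per soil property.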
Status: final response (author comments only)
-
RC1: 'Comment on egusphere-2024-1470', Anonymous Referee #1, 24 Jun 2024
In their manuscript the authors present the uncertainty of total carbon, soil inorganic carbon and soil organic carbon measurements depending on sample processing and measurement. The authors show substantial differences that are mainly driven by sieving and measuring methods, with LOI being highly variable. It is of great importance to have such comparisons and critical assessments. The need for accurate soil C measurements is becoming more important for an evolving C market. A substantial overestimation of C changes would be bad for the actual climate effect, and a substantial underestimation of soil C would reduce the economic benefit of the C market. The experimental setup using 11 procedures and the comparison with 8 commercial laboratories is an important approach to reach a better homogenization of analytical approaches and make soil C measurements more consistent.
I have two main concerns:
- The authors need to elaborate their discussion on the application of chemometric approaches that combine MIR and predictive modelling (e.g. Lines 505-512, 523-526 and 567-570). It is true that such approaches can work well, as reported in the cited literature. However, it needs to be clear that this all depends on the availability of a representative soil spectral library that is large enough to develop models for prediction. The good prediction in this study is expected and biases the generalized conclusion. The model was trained on the KSSL and thus covers the spectral variability of the soils used here. Additionally, the sample pre-treatment was very similar between the P0 method here and the initial data for the model presented in Seybold et al. (2019). Seybold et al. (2019) measured TC by dry combustion, used the pressure transducer method for SIC and determined SOC by difference. Thus, it is a good model for the soils selected here. However, the transferability of such models is difficult and a major challenge to overcome. For example, sample grinding is important for the transferability. Grinding was also an aspect that the authors were motivated to test. It is true that the differences in grinding are not so significant when all samples are similarly prepared and the model trained for the corresponding grinding is applied, but transferring a model trained on finely milled samples to samples that are coarser, and vice versa, brings uncertainty and challenges (Sanderman et al., 2023). Recently, Safanelli et al. (2023) reported that even combining spectra obtained from different devices can be difficult and requires important pre-processing. More importantly, the authors report that sample processing resulted in larger uncertainty of the predictions. Therefore, the authors need to constrain their conclusion here: such approaches only work when a good regional model is available. Otherwise, the model error (e.g. RMSE) will be too large to detect changes in TC, SOC and SIC.
- As far as I understand, P0 is the reference method here but also the method used in the authors' research lab. It is not clear why the authors are so certain that this method is the most rigorous. For example, in Lines 161-163 the authors just argue with their “expert opinion”. Many labs use ball mills that are more efficient in grinding (e.g. < 50 µm), oven drying at 105 °C might cause losses of OC in some high-C soils (this is only briefly touched on at the end), and the pressure transducer method requires the direct addition of acid to the soil, which can alter the organic matter (fumigation is less harsh). The authors need a reference method here to compare to, but they also need to critically discuss the constraints of P0. It is even more important to have a good justification given the conflict of interest that exists between the research lab and the commercial lab that the authors are part of at the same time.
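The first concern turns on whether the model error is small relative to the change one hopes to detect. As a minimal sketch, assuming paired predicted and reference values are available, RMSE and a coefficient of determination could be computed as follows. The function names and numbers are hypothetical, and the R² here is defined as 1 - SS_res / SS_tot, which may differ from how the manuscript computed its R² values.

```python
from math import sqrt


def rmse(predicted, reference):
    """Root mean square error between predictions and reference values."""
    n = len(reference)
    return sqrt(sum((p - r) ** 2 for p, r in zip(predicted, reference)) / n)


def r_squared(predicted, reference):
    """Coefficient of determination, 1 - SS_res / SS_tot (an assumed definition)."""
    ref_mean = sum(reference) / len(reference)
    ss_res = sum((r - p) ** 2 for p, r in zip(predicted, reference))
    ss_tot = sum((r - ref_mean) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot


# Hypothetical % SOC values: reference method vs. spectral prediction
reference_soc = [0.8, 1.2, 1.9, 2.4, 3.1]
predicted_soc = [0.9, 1.1, 2.1, 2.3, 3.0]

model_rmse = rmse(predicted_soc, reference_soc)
print("RMSE (% SOC):", round(model_rmse, 3))
print("R2:", round(r_squared(predicted_soc, reference_soc), 3))
```

A high R² across soils that span a wide % SOC range can coexist with an RMSE that is large relative to the small within-field changes a C market would need to resolve, so both statistics are worth reporting.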
Specific comments:
Line 14: Please specify what "involvement in SOC quantification for C markets" means
The abstract contains many details but no conclusion of the study.
Line 53: Please specify if the authors mean "quality assurance and quality control"
Line 54: Please spell out NAPT for readers who are not familiar with US organisations. This holds true for all other abbreviations that are not explained.
Line 60: Root and rock fragments are not considered as part of the fine soil that is important for the biogeochemical processes. However, rocks and roots are still components of soils.
Line 63-65: Do the authors have any reference showing that commercial labs do not remove coarse fragments? In my experience, research labs apply sieving and, in general, the same sample preparation for agricultural and non-agricultural soils. Also, soil inventories prepare the fine soil prior to C measurements.
Line 65-67: It is not clear to me why regenerative agriculture results in more coarse fragments in deeper soil. Also, the authors refer here to conservation land management rather than regenerative land management, which is a very broad and not well-defined term.
Line 77: Also here, the authors should be specific since it is considered as "fine soil"
Line 97: The authors should specify whether they mean the near-infrared or mid-infrared region.
Also, such approaches require a well-trained model based on large enough soil spectral library. This is a critical step for the quantification of soil C using chemometric approaches. Therefore, it follows a different concept compared to the other more direct methods.
Line 121: I would rather expect that the dual homogenisation by sieving to 8 followed by 2 mm would result in lower variability.
Table 1: Were pH, % SOC and % SIC measured with analytical replicates? The authors should add errors to the values.
Table 1 caption: How was pH measured and which texture classes were applied? It is not clear what "Colorado State University following procedure P0" is. Please provide details or refer to Table 2 here.
Line 218-219: This is not very precise. It is not clear which model was used and on which data it was trained.
Line 223-224: This is most likely attributed to the fact that the used model for the prediction based on the KSSL is developed with samples of similar degree of grinding. In the cited paper, Sanderman et al (2023) conclude that the model trained on fine milled samples was not well transferable on the coarser samples.
Therefore, the authors used a model that was trained for a certain milling. This makes the testing of grinding here not very useful. In comparison, Sanderman et al. (2023) developed separate models for roughly 2400 samples of the KSSL. They conclude that a model that was trained with coarse samples and predicted coarse samples performed similarly to a model that was trained with fine soils and predicted fine soils. However, the transfer of models was not satisfactory.
Line 231: Before it was mentioned that acid fumigation was only performed for P9. Here it reads like every sample was fumigated. Please clarify.
Line 241-243: How were CO2 and H2O interferences corrected?
Line 245-248: The authors should add more details regarding the predictions. Seybold et al. (2019) developed PLSR models based on the NSSC-KSSL. Is this also used here? What do the authors mean by "respective geographical region"? Were the models local? Was there any spectral pre-processing like re-sampling, filtering, normalization or baseline correction?
Line 270: "External service labs provided values for % TC, % SIC, and % SOC." can be removed.
Line 271: Looking at Table S3, it seems not fair to just select the extremes here. Most differences are rather lower. It is hard to tell from the table. Maybe boxplots per soil with different symbols for the labs would be easier to read. Anyway, the authors should also mention the range of differences and not only the extremes.
Line 284: Yes, it is an astonishing range of measured values between labs. It is also surprising that the reference measurement (CSU lab) shows a large variability for soils B and H. These are two soils with high pH. I wonder if this could be an effect of the carbonate removal. What is your explanation for the large differences between the five analytical replicates? Additionally, did the external labs not measure in replicates?
Line 288: Why are there no coarse materials at all in soil L for P3?
Line 305-307: This relationship seems to be mainly driven by the one P3 point at 0.8 difference in plant material and 1 STD % SOC (right corner). This is in general a very weak correlation and might not add much when that one point is considered an outlier.
Line 323-324: Do the authors mean a relationship to SOC, similar to the plant material in Fig. 3?
Figure 4 and results section: This paper is mainly about errors that are important for SOC because this will be of interest for the C markets. Therefore, I wonder if Figure S3, with the SOC differences between soils and methods, should be the main figure in the manuscript and the current Figure 4 could move to the SI. This might need a restructuring of the section as well.
Line 339: Significances are shown in Table S6?
Figure 6: X axis label, colour and legend are redundant.
Line 396-397: Here the focus is on SOC for the C market.
Line 398-399: Please see my comment regarding the variability on CSU lab for some soils. This is also concerning. Here it would be good to have replicates from the individual labs.
Line 460: This would be a very interesting aspect of the manuscript. The C market needs stocks of C and not concentrations alone. Therefore, the effect of removed or not removed coarse fragments would be most significant. Even the calculation of SOC stocks includes large uncertainties, and this would add up with the method used (e.g. Poeplau et al. 2017). The authors do back-of-the-envelope calculations later in the implications section. Would it be possible to discuss the stock effects even more by estimating the stock differences here for all methods using the soils' bulk densities?
Line 465-468: Yes, plant material would be low in mass but might be important in volume and thus could have an impact on stocks as well.
Line 567: This should be Fig. S9
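The stock question raised in the comments on Lines 460 and 465-468 can be made concrete with a short sketch. It assumes a fine-soil-stock style calculation in the spirit of Poeplau et al. (2017), in which only the mass of fine soil per sampled volume is multiplied by the SOC concentration of the fine soil; the exact formulation should be checked against that paper. The function names and the core dimensions and masses are hypothetical.

```python
def fine_soil_stock(fine_soil_mass_g, sample_volume_cm3, depth_cm):
    """Fine-soil mass per area (Mg ha^-1) for a layer of the given thickness.

    g cm^-3 * cm gives g cm^-2, and 1 g cm^-2 equals 100 Mg ha^-1.
    """
    return fine_soil_mass_g / sample_volume_cm3 * depth_cm * 100.0


def soc_stock(soc_pct, fine_soil_mass_g, sample_volume_cm3, depth_cm):
    """SOC stock (Mg C ha^-1) from the % SOC of the fine soil."""
    return soc_pct / 100.0 * fine_soil_stock(
        fine_soil_mass_g, sample_volume_cm3, depth_cm
    )


# Hypothetical 0-15 cm core: 1000 cm^3, 1300 g oven-dry, 100 g of it coarse fragments
volume_cm3, depth_cm, soc_pct = 1000.0, 15.0, 2.0
total_mass_g, coarse_mass_g = 1300.0, 100.0

# Coarse fragments removed before weighing the fine soil
stock_fine = soc_stock(soc_pct, total_mass_g - coarse_mass_g, volume_cm3, depth_cm)
# Coarse fragments wrongly counted as fine soil
stock_unsieved = soc_stock(soc_pct, total_mass_g, volume_cm3, depth_cm)

print(round(stock_fine, 1), "vs", round(stock_unsieved, 1), "Mg C ha^-1")
```

The sketch holds % SOC constant to isolate the mass effect. In practice, leaving rocks or roots in the analysed material would also shift the measured concentration, so the two errors can compound or partly offset.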
References:
Safanelli, J.L., Sanderman, J., Bloom, D., Todd-Brown, K., Parente, L.L., Hengl, T., Adam, S., Albinet, F., Ben-Dor, E., Boot, C.M., Bridson, J.H., Chabrillat, S., Deiss, L., Demattê, J.A.M., Scott Demyan, M., Dercon, G., Doetterl, S., Van Egmond, F., Ferguson, R., Garrett, L.G., Haddix, M.L., Haefele, S.M., Heiling, M., Hernandez-Allica, J., Huang, J., Jastrow, J.D., Karyotis, K., Machmuller, M.B., Khesuoe, M., Margenot, A., Matamala, R., Miesel, J.R., Mouazen, A.M., Nagel, P., Patel, S., Qaswar, M., Ramakhanna, S., Resch, C., Robertson, J., Roudier, P., Sabetizade, M., Shabtai, I., Sherif, F., Sinha, N., Six, J., Summerauer, L., Thomas, C.L., Toloza, A., Tomczyk-Wójtowicz, B., Tsakiridis, N.L., Van Wesemael, B., Woodings, F., Zalidis, G.C., Żelazny, W.R., 2023. An interlaboratory comparison of mid-infrared spectra acquisition: Instruments and procedures matter. Geoderma 440, 116724. doi:10.1016/j.geoderma.2023.116724
Sanderman, J., Gholizadeh, A., Pittaki‐Chrysodonta, Z., Huang, J., Safanelli, J.L., Ferguson, R., 2023. Transferability of a large mid‐infrared soil spectral library between two Fourier‐transform infrared spectrometers. Soil Science Society of America Journal 87, 586–599. doi:10.1002/saj2.20513
Sanderman, J., Smith, C., Safanelli, J.L., Morgan, C.L.S., Ackerman, J., Looker, N., Mathers, C., Keating, R., Kumar, A.A., 2023. Diffuse reflectance mid-infrared spectroscopy is viable without fine milling. Soil Security 13, 100104. doi:10.1016/j.soisec.2023.100104
Poeplau, C., Vos, C., Don, A., 2017. Soil organic carbon stocks are systematically overestimated by misuse of the parameters bulk density and rock fragment content. Soil 3, 61–66. doi:10.5194/soil-3-61-2017
Citation: https://doi.org/10.5194/egusphere-2024-1470-RC1 -
AC2: 'Reply on RC1', Rebecca Even, 14 Aug 2024
Dear Editor and Referee 1,
We thank you for your time and attention to this manuscript and appreciate the feedback and suggestions provided. We have addressed each comment and included further information with our proposed revisions in the attached document.
Sincerely,
Rebecca Even and co-authors
-
RC2: 'Comment on egusphere-2024-1470', Jörg Matschullat, 12 Jul 2024
Review of ‘Large errors in common soil carbon measurements due to sample processing’ by Rebecca J Even and others (egusphere-2024-1470)
The manuscript submitted to ‘Soil’ touches a highly relevant topic, namely the correct quantification of soil organic carbon (SOC) plus other carbon species to realistically represent soil carbon (not only) in sequestration claims.
The presented work is based on some kind of round-robin analysis of aliquoted soil material which had been prepared by the authors and shipped to various laboratories for subsequent quantitative C-analysis.
While I consider the motivation and overarching idea certainly worthy of a SOIL contribution, the quality of the manuscript in its present state does not permit acceptance. In the following, I go through the manuscript from beginning to end and point out the present weaknesses, regardless of whether each is a very minor issue or a bigger one.
Abstract
Line 19/20: A mechanical grinder is no instrument for sieving.
Lines 22/23: That finer grind leads to lower variance is nothing new and can easily be explained.
Lines 23/25: That not drying soil samples prior to further processing leads to errors is similarly nothing new.
Introduction
Lines 45/46: “sample preparation is considered the first step…”. This perception of the authors underlies various expressions of this manuscript, although they do refer (towards the end) to Minasny et al. (2017), where it is correctly argued that the field sampling design is by far the largest source of error. The, in my eyes, slightly distorted relevance of all subsequent steps (independent of the fact that these are relevant, too) reverberates throughout this manuscript and may lead to misperceptions among inexperienced readers and people who prefer to seek the mistakes in the laboratory work and not in their own field work.
Lines 68ff: The discussion on sieving here and later again appears somewhat odd to me. It is known that soil material must be dried (minimum air-dried, better 40°C) prior to sieving and that optimum sieving results also demand humidity control in the sieving lab to avoid badly reproducible results.
Lines 84ff: After dry-sieving (2 mm), plus checking for possible remaining fine root material, which will have to be removed by handpicking, the soil samples must be ground to analytical grade. The best results with the lowest standard deviation are obtained with a grain size smaller than 63 micrometers. This is of particular relevance if methods like elemental analysis (EA) with very low inweights are being used, demanding maximum analytical sample homogeneity (the authors refer to a machine by Elementar that is specifically designed to serve isotopic work; the standard machine, e.g., the EL Cube by Elementar, takes maximum inweights of 20 to 50 mg).
Lines 98ff: The statement relating to neutral or basic pH soils is incorrect. Even soils with highly acidic pH (3.5 to 4.5) can show significant amounts (percentages) of inorganic as well as of organic carbon. Ferralsols/oxisols from the inner wet tropics serve as example.
Line 100: must read ‘Soil Survey Staff’
Line 111: check year of McCarty et al (2010) in reference list
Materials and Methods
Line 136: I suggest splitting the very long table caption into a concise header and to move the details into a table footer to make the table more appealing. Instead of ‘soil identification number’, it should read e.g., ‘code’ since no numbers are being used. The sequence of the table column headers should be repeated in the table header – no different sequence.
Line 145ff: The initially stated criticism of the authors' bias towards the lab parts emerges here once again. Taking one single sample from a 50 x 50 cm x 15 cm deep soil pit is radically insufficient to represent, e.g., a hectare. I suggest simply rephrasing the experimental setup from the outset and clearly and unmistakably explaining that, while the biggest mistakes occur in inappropriate sampling, this paper focuses on all subsequent steps and uses homogenized soil samples to test sample preparation and analysis steps.
Line 149: ‘Soil was collected from different places on the butcher paper…’. A) What is butcher paper made of? Does it contain any carbon like all other papers? If so, discuss. B) The sub-sampling description here does not suffice to allow others to judge the procedure. We generally use multi-step quartering or mechanical sample dividers to obtain true aliquots.
Line 166: I read that the soil sample was homogenized as field moist material. This would certainly introduce possible errors since even smaller differences in soil humidity make homogenizing differ between samples or different humidities.
Line 169ff: all SI units must be set with a space between number and unit. This is valid throughout and should be corrected in the entire manuscript.
Line 180ff: Similar to Table 1, the header should be split into a concise header and detailed explanation below the table as a footer. The table itself prints badly in my copy. Please check.
Line 230: Instrumentation nomenclature needs to be homogenized throughout (compare with line 243).
Line 231: To reduce possible misunderstandings, introduce a comma between ‘% TC’ and ‘and % SOC’…
Line 237: A unit is missing after ‘0.04’. (Personal remark: Our lab regularly obtains a lower limit of determination of 0.04 wt-% in standard application for C and a related SD between 0.02 and 0.04 wt-% on an EL Cube.)
Line 251: Check publication year for R Core Team in reference list.
Results
Page 10, Figure 1. Any figure or table should never directly follow a chapter or section header. The three sub-figures display three different scales (Y-axis). That is certainly not ideal to allow for an unbiased understanding of the figure’s message.
Line 271: Unit is missing after ‘and 1.45’
Page 12, Figure 2: The procedures (x-axis) should display horizontal indicators (here P0, P1, etc.). To simplify, and since the term ‘Procedure’ is printed below, the number would suffice. Again, to avoid perception bias, the legend should explicitly point out that there is a factor of 10 between the y-axes of a) and b).
Page 13, Figure 3: This figure prints badly. The symbols need to be bigger, and the axis formatting with black lettering and slightly larger and horizontal (x-axis) lettering.
Page 16, Figure 5: Same as with fig. 2, including homogenized axis scales
Page 17, Figure 6: While again the axis scales should be equal, this figure is somewhat odd to me and appears to compare “apples and pears”. Direct comparison is only possible with one modification of degrees of freedom.
Discussion
Lines around 421: I cannot agree with these conclusions/recommendations. It should go without saying that only experienced laboratories that adhere to GLP do qualify. That implicitly means that there is a very tight quality control and documentation. No other labs should be considered. To determine organic carbon (TOC), acid fumigation is a necessity. However, the related process must be clearly defined.
Lines 453ff: Here and on other occasions, the authors point out lab costs for some more time-consuming procedures. I would like to remind them that by far the most costly part of obtaining decent analytical results for anything is high-quality sample acquisition. The rest is relatively cheap and should not serve to argue for cost savings. More precisely here: 2 mm sieving should be beyond discussion. One may sieve to 8 mm or whatever in the field already to reduce the material to be transported to the lab and kept on hold in freezer or fridge, but that is irrelevant in this context. The authors seem not to know automated sieving machines (e.g., Fritsch, Retsch) that allow lab personnel to do other work while the sample(s) are being sieved. Automated sieving comes with the added advantage that it increases reproducibility of the process.
Line 460: Check spelling for Ryterr 2012
Line 461: In consequence to what I expressed above, I cannot agree with the suggestion made here.
Line 470: Grinding just like sieving should be free of individual bias. There are various mills on the market that allow for multiple (up to 8) samples to be ground to analytical grade in a few minutes with almost perfect homogeneity (as shown by laser granulometry).
Line 488ff: Again in addition to what I wrote above on drying, air-drying (20–25 °C) is the conditio-sine-qua-non. Yet, if no other critical analyses (e.g. mercury) need to be undertaken on that material, then 40–60 °C drying is better since it compensates for inhomogeneities in laboratory climatology. See also line 491.
Line 494ff: I do not understand the argumentation that their ‘results were not texture or OM-dependent…’ How so?
Line 516: better to use ‘it is’
Line 525f: Direct comparisons are only possible within one methodologically-consistent approach. One can run the EA prior to sample acidification to obtain TC, then run another aliquot after acidification to obtain TOC – the difference of which allows for the calculation of TIC. To shift instruments (methods) and determine, e.g., TC with one technique (e.g., Leco CS-analyzer), then TOC with EA is no good idea to obtain high-quality results. However, if done correctly in all steps, then you must expect very small errors between TC and TOC results from one and another method.
References
The reference list demands homogenization in formatting, bibliographical completeness, and accuracy of all citations. See, e.g., Bates et al. 2015; Bernoux and Cerri 2005; Lenth 2022; McCarthy et al. n.d.; R Core Team 2022.
Bottom line: As already mentioned, the motivation of the authors deserves applause. However, the submitted manuscript falls somewhat short of delivering what it takes to meet the self-set goals. I suggest a thorough revision prior to re-submission.
Citation: https://doi.org/10.5194/egusphere-2024-1470-RC2 -
AC1: 'Reply on RC2', Rebecca Even, 14 Aug 2024
Dear Editor and Referee 2,
We thank you for your time and attention to this manuscript and appreciate the feedback and suggestions provided. We have addressed each comment and included further information with our proposed revisions in the attached document.
Sincerely,
Rebecca Even and co-authors
Data sets
Soil carbon measurements R. Even https://zenodo.org/doi/10.5281/zenodo.11223422
Viewed
HTML | PDF | XML | Total | Supplement | BibTeX | EndNote |
---|---|---|---|---|---|---|
878 | 320 | 28 | 1,226 | 43 | 18 | 20 |