the Creative Commons Attribution 4.0 License.
Harmonisation of methane isotope ratio measurements from different laboratories using atmospheric samples
Abstract. Establishing interlaboratory compatibility among measurements of stable isotope ratios of atmospheric methane (δ13C-CH4 and δD-CH4) is challenging. Significant offsets are common because laboratories have different ties to the VPDB or SMOW-SLAP scales. Umezawa et al. (2018) surveyed numerous comparison efforts for CH4 isotope measurements conducted from 2003 to 2017 and found scale offsets of up to 0.5 ‰ for δ13C-CH4 and 13 ‰ for δD-CH4 between laboratories. These offsets considerably exceed the World Meteorological Organization Global Atmosphere Watch (WMO-GAW) network compatibility targets of 0.02 ‰ and 1 ‰, respectively.
We employ a method to establish scale offsets between laboratories using their reported CH4 isotope measurements on atmospheric samples. Our study includes data from eight laboratories with experience in high-precision isotope ratio mass spectrometry (IRMS) measurements for atmospheric CH4. The analysis relies exclusively on routine atmospheric measurements conducted by these laboratories at high-latitude stations in the Northern and Southern Hemispheres, where we assume each measurement represents sufficiently well-mixed air at the latitude for direct comparison. We use two methodologies for interlaboratory comparisons: (I) assessing differences between time-adjacent observation data and (II) smoothing the observed data using polynomial and harmonic functions before comparison. The results of both methods are consistent, and with a few exceptions, the overall average offsets between laboratories align well with those reported by Umezawa et al. (2018). This indicates that interlaboratory offsets remain robust over multi-year periods. The evaluation of routine measurements allows us to calculate the interlaboratory offsets from hundreds, and in some cases thousands, of measurements. Therefore, the uncertainty in the mean interlaboratory offset is not limited by the analytical error of a single analysis but by real atmospheric variability between the sampling dates and stations. Using the same method, we assess this uncertainty by investigating measurements from four high-latitude sites analysed by the INSTAAR laboratory. After applying the derived interlaboratory offsets, we present a harmonised time series for δ13C-CH4 and δD-CH4 at high northern and southern latitudes, covering the period from 1988 to 2023.
Competing interests: One of the co-authors, Thomas Röckmann, is on the editorial board of AMT.
Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.
Status: closed
-
RC1: 'Review of Dasgupta et al', Anonymous Referee #1, 18 Jul 2025
Manuscript summary
The Dasgupta et al. manuscript aims to examine interlaboratory offsets in atmospheric CH4 stable isotope measurements. The authors choose the approach of comparing atmospheric measurements at high latitude sites over the more traditional round-robin interlaboratory exercise in which cylinders containing identical air are measured at each lab. The authors argue that their approach offers the advantage of many samples over a long time span, resulting in overall more representative estimates of interlab offsets. The authors compare the interlab offsets obtained in this way with a prior study (Umezawa et al 2018) that used the round-robin approach. Finally, the interlab offsets are used to place all CH4 isotope data sets on the same scale to yield unified datasets for NH and SH high latitudes.
Major comments:
Evaluation and monitoring of inter-laboratory measurement offsets for CH4 isotopes is important, as CH4 isotopic measurements are a key constraint for the CH4 budget. The approach of intercomparing measurements (versus a round-robin exercise) is justifiable and has strong merits; mainly the large number of available samples and comparison over multiple years. The main drawback of this approach is spatial and temporal variability of air masses (much more important in Arctic than Antarctic), and this is thoroughly considered and discussed in the manuscript. I agree that applying this approach to relatively long-term data sets would result in some of the spatiotemporal variability effectively averaging out to zero, although systematic offsets between sites would remain. The statistical methods used in uncertainty estimation seem appropriate.
I think it would have been better to apply this approach in parallel with a more traditional round-robin intercomparison. Both approaches have merits, it has been quite a few years since the last intercomparison, and there does seem to be a fairly consistent offset between the prior round-robin offset estimates (Umezawa et al) and those obtained with the new approach (Fig. 3). That said, I realize that doing a round-robin with multiple labs can take years and is expensive so this is not something that I am recommending as part of the revisions.
In my opinion, the type of intercomparison /merging of some published datasets that is performed here is something that would typically be presented as part of the methodology for a study of global CH4 isotope trends / budget, rather than as a stand-alone paper. I would therefore recommend the following additions to the study to make it more publishable:
- Examine and include discussion of temporal trends in the inter-laboratory offsets (if any)
- Use the interlab offsets determined here to place all the CH4 isotope datasets (from all stations, not just high latitudes) from the laboratories involved in the intercomparison on the same measurement scale and publish that larger unified dataset as part of the study
Minor comments:
I found the discussion of some individual offsets between pairs of labs a bit scattered and hard to follow – perhaps consolidate into another table?
Line 458: “INSTAAR dD-CH4 values from SH are excluded because they stand out from the other time series, even after offset correction (Fig. 4a).”
There is no panel labeled “a” on Fig 4; also INSTAAR SH dD data are not shown at all on this figure
Line 241: “relative to NIWA” (remove “the”)
Citation: https://doi.org/10.5194/egusphere-2025-2439-RC1
AC1: 'Reply on RC1', Bibhasvata Dasgupta, 22 Aug 2025
RC1: 'Review of Dasgupta et al', Anonymous Referee #1, 18 Jul 2025 reply
Manuscript summary
The Dasgupta et al. manuscript aims to examine interlaboratory offsets in atmospheric CH4 stable isotope measurements. The authors choose the approach of comparing atmospheric measurements at high latitude sites over the more traditional round-robin interlaboratory exercise in which cylinders containing identical air are measured at each lab. The authors argue that their approach offers the advantage of many samples over a long time span, resulting in overall more representative estimates of interlab offsets. The authors compare the interlab offsets obtained in this way with a prior study (Umezawa et al 2018) that used the round-robin approach. Finally, the interlab offsets are used to place all CH4 isotope data sets on the same scale to yield unified datasets for NH and SH high latitudes.
Major comments:
Evaluation and monitoring of inter-laboratory measurement offsets for CH4 isotopes is important, as CH4 isotopic measurements are a key constraint for the CH4 budget. The approach of intercomparing measurements (versus a round-robin exercise) is justifiable and has strong merits; mainly the large number of available samples and comparison over multiple years. The main drawback of this approach is spatial and temporal variability of air masses (much more important in Arctic than Antarctic), and this is thoroughly considered and discussed in the manuscript. I agree that applying this approach to relatively long-term data sets would result in some of the spatiotemporal variability effectively averaging out to zero, although systematic offsets between sites would remain. The statistical methods used in uncertainty estimation seem appropriate.
I think it would have been better to apply this approach in parallel with a more traditional round-robin intercomparison. Both approaches have merits, it has been quite a few years since the last intercomparison, and there does seem to be a fairly consistent offset between the prior round-robin offset estimates (Umezawa et al) and those obtained with the new approach (Fig. 3). That said, I realize that doing a round-robin with multiple labs can take years and is expensive so this is not something that I am recommending as part of the revisions.
In my opinion, the type of intercomparison /merging of some published datasets that is performed here is something that would typically be presented as part of the methodology for a study of global CH4 isotope trends / budget, rather than as a stand-alone paper. I would therefore recommend the following additions to the study to make it more publishable:
- Examine and include discussion of temporal trends in the inter-laboratory offsets (if any)
- Use the interlab offsets determined here to place all the CH4 isotope datasets (from all stations, not just high latitudes) from the laboratories involved in the intercomparison on the same measurement scale and publish that larger unified dataset as part of the study
Response: We agree with the reviewer that our method is not a stand-alone solution for computing interlaboratory offsets but rather a complementary approach to traditional round-robin intercomparisons. There is actually a round-robin exercise presently ongoing, which involves more laboratories than the ones that participate in the present work. We still think that our approach is very useful by itself, especially for laboratories that perform high-precision atmospheric background measurements of the isotopic composition.
Further, we had initially indeed planned this part as a data harmonisation step for an inversion study. As this first part grew successively larger, and after establishing the most appropriate approach within our consortium, we decided to split it off and present this as a method manuscript where the data harmonisation effort is described in more detail than what would be possible if it were combined with an inversion study. The second part, building on this dataset, uses the offset-corrected, harmonised high-latitude isotope time series of methane to perform a 2-box Bayesian inversion. That manuscript (to be submitted) is aimed at source and sink apportionment and an evaluation of the δD-CH4 as an additional constraint on isotope-enabled atmospheric inversions in improving our understanding of the methane budget.
- Temporal trends in the inter-laboratory offsets (if any): We agree that co-located measurements can be used to monitor possible shifts in inter-laboratory offsets, and our consortium is actually making active use of this opportunity. However, this is particularly difficult for high-latitude sites because the sampling date is often not close to the analysis date. In particular, in the southern hemisphere (Antarctica), air samples are in many cases only shipped once per year to the laboratories. In addition, the UU laboratory re-measured cylinders collected by UHEI up to more than two decades after they were sampled. Thus, a general application of this approach is not easy, and it requires dedicated knowledge of the analytical conditions in each specialist laboratory. Therefore, we decided not to include this part in the manuscript, which is targeted at a wider community and should provide updated offsets between laboratories that can also be used by others in a straightforward manner to combine datasets from different laboratories.
- Interlaboratory offsets determined here to place all the CH4 isotope datasets (from all stations, not just high latitudes): We chose to offset-correct and harmonise only the high-latitude datasets. Methane isotopes show spatial gradients across the globe, and we chose the high-latitude regions because they are more remote from the main sources and reasonably well mixed; thus, records from different stations should still represent relatively similar air. This is not true for lower-latitude sites, which are affected more strongly, and in different ways, by local and regional sources.
Minor comments:
I found the discussion of some individual offsets between pairs of labs a bit scattered and hard to follow – perhaps consolidate into another table?
Response: We have chosen two reference laboratories, MPI-BGC and TU/NIPR, to harmonise datasets, which we think is the more appropriate and thorough approach rather than pair-wise comparisons. Including numbers with still more reference laboratories would result in too much data and take away from the central narrative of the work.
Line 458: “INSTAAR dD-CH4 values from SH are excluded because they stand out from the other time series, even after offset correction (Fig. 4a).”
There is no panel labeled “a” on Fig 4; also INSTAAR SH dD data are not shown at all on this figure
Response: Apologies, this should be just ‘Fig. 4’. The revised Fig. 4 now includes both INSTAAR SH dD and Pre-2015 MPI data.
Figure 4 (revised and attached): Harmonised data series with interlaboratory offset applied (mean ± uncertainty) to the observed data from eight laboratories using MPI-BGC as the reference laboratory. Relative to MPI-BGC, the TU/NIPR scale is enriched (+0.54 ‰) for δ13C-CH4 and depleted (−10.8 ‰) for δD-CH4.
Line 241: “relative to NIWA” (remove “the”)
Response: Agreed.
-
RC2: 'Comment on egusphere-2025-2439', Anonymous Referee #2, 27 Jul 2025
General comments
The manuscript describes a method to quantify inter-laboratory offsets for measurements of δ13C and δD in atmospheric methane, with the aim of harmonising each timeseries with a common reference. Offsets have been quantified previously by round-robin methods, and the method described here uses the timeseries of atmospheric measurements. After identifying periods of overlapping sampling, two approaches are presented to determine an offset relative to the selected reference laboratory MPI-BGC. These are a nearest-neighbour approach, which selects pairs of measurements sufficiently close in time and takes their difference, and a smoothing approach, which fits each timeseries with a function and takes the difference between pairs of functions. The offsets determined from the timeseries data are compared and found in agreement with previous intercomparison exercises (Umezawa et al 2018). Finally, the offsets are applied to the data to produce a harmonised timeseries for δ13C and δD in the Northern and Southern Hemispheres.
Specific comments
Nearest-neighbour approach:
It was not clear to me how the pairs of test and reference data points are selected. Is the data for one laboratory from all sites combined, then any test point matched with the closest reference in time regardless of site? Or is the nearest-neighbour approach applied to the data from each site separately?
Line 195 – Is the same value of spatial uncertainty used when two measurements at the same station are compared? For example, Table 1 shows that INSTAAR and MPI-BGC have overlapping records at Alert, so the comparison here uses measurements of the same air masses, while the other station pairings are geographically separated.
I suggest moving the two paragraphs from lines 378 and 388 from p. 13 into this section to describe the uncertainty calculation method.
Smoothed data approach:
Line 221 describes that the difference between two sites using the smoothed data approach is calculated from the difference between the long-term average of two sites. How does this account for the trend in the isotope ratio (i.e. the polynomial part)? That is, the mean of a longer running site will be earlier in the time series than a more recent site. Why not use the function to generate evenly spaced values for both sites then take the difference in the same way as the time-matching approach?
Line 228 – the authors recognise that the fitted curve should not be used to interpolate “significant gaps”. Can they provide a maximum duration that is used to exclude such portions?
Line 228 – “In this case, the extrapolated parts of the curve fit …” should be corrected to “In this case, the interpolated parts of the curve fit …”
Line 231 – the uncertainty calculation is unclear. In the description “average sum of squares of the root mean square error” what is being averaged here? The calculation of the offset was described on line 221 as the difference between the long-term average for both sites, which implies a single value. If the average is over all pairings of test site and reference site then should there also be a spatial uncertainty that accounts for co-located and geographically separate sites?
Harmonised long-term datasets
Line 459 - Can the authors provide more discussion for the statement “INSTAAR δD-CH4 values from SH are excluded because they stand out from the other time series, even after offset correction”. It is not clear what is standing out for these measurements, and does it show some conditions where the offset correction is ineffective? Section 3.2 shows little difference between the hemisphere-specific offsets. The reference to Fig. 4a also needs to be updated as the figure does not have sub-figures with letter labels.
Figure 5 – The line colours for Northern Hemisphere and Southern Hemisphere are very difficult to distinguish; can this be replotted more clearly?
Citation: https://doi.org/10.5194/egusphere-2025-2439-RC2
AC2: 'Reply on RC2', Bibhasvata Dasgupta, 22 Aug 2025
RC2: 'Comment on egusphere-2025-2439', Anonymous Referee #2, 27 Jul 2025 reply
General comments
The manuscript describes a method to quantify inter-laboratory offsets for measurements of δ13C and δD in atmospheric methane, with the aim of harmonising each timeseries with a common reference. Offsets have been quantified previously by round-robin methods, and the method described here uses the timeseries of atmospheric measurements. After identifying periods of overlapping sampling, two approaches are presented to determine an offset relative to the selected reference laboratory MPI-BGC. These are a nearest-neighbour approach, which selects pairs of measurements sufficiently close in time and takes their difference, and a smoothing approach, which fits each timeseries with a function and takes the difference between pairs of functions. The offsets determined from the timeseries data are compared and found in agreement with previous intercomparison exercises (Umezawa et al 2018). Finally, the offsets are applied to the data to produce a harmonised timeseries for δ13C and δD in the Northern and Southern Hemispheres.
Specific comments
Nearest-neighbour approach:
It was not clear to me how the pairs of test and reference data points are selected. Is the data for one laboratory from all sites combined, then any test point matched with the closest reference in time regardless of site? Or is the nearest-neighbour approach applied to the data from each site separately?
Response: The data from all high-latitude sites per laboratory and hemisphere are first combined into a lab-specific NH or SH dataset. Then each point from the reference lab is matched with the closest test lab data point in time. We apply this approach at the laboratory level instead of comparing site-by-site because it allows us to use larger and longer datasets. To account for possible differences between locations, we include a fixed spatial uncertainty in our calculations. We have clarified this in the revised manuscript.
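The matching step described in this response could be sketched roughly as follows. This is an illustration only, not the authors' code: the function name, the 7-day matching window (`max_gap_days`) and the sign convention (test minus reference) are assumptions, and the fixed spatial uncertainty mentioned above is deliberately left out because the response does not say how it is combined with the statistical term.

```python
import numpy as np

def nearest_neighbour_offset(t_ref, d_ref, t_test, d_test, max_gap_days=7.0):
    """Match each reference point to the closest test-lab point in time.

    t_ref, t_test: sample times in days since a common epoch (1-D arrays).
    d_ref, d_test: delta values in per mil.
    max_gap_days: hypothetical matching window, not a value from the manuscript.
    Returns (mean offset as test minus reference, standard error, number of pairs).
    """
    t_ref = np.asarray(t_ref, dtype=float)
    d_ref = np.asarray(d_ref, dtype=float)
    t_test = np.asarray(t_test, dtype=float)
    d_test = np.asarray(d_test, dtype=float)

    # Sort the test record so that searchsorted can locate neighbours.
    order = np.argsort(t_test)
    t_sorted, d_sorted = t_test[order], d_test[order]

    # Candidates: the test points just before and just after each reference time.
    right = np.clip(np.searchsorted(t_sorted, t_ref), 0, len(t_sorted) - 1)
    left = np.clip(right - 1, 0, len(t_sorted) - 1)
    use_left = np.abs(t_sorted[left] - t_ref) <= np.abs(t_sorted[right] - t_ref)
    nearest = np.where(use_left, left, right)

    # Keep only pairs that are close enough in time.
    keep = np.abs(t_sorted[nearest] - t_ref) <= max_gap_days
    diffs = d_sorted[nearest][keep] - d_ref[keep]
    n = int(diffs.size)
    if n == 0:
        return float("nan"), float("nan"), 0
    sem = float(diffs.std(ddof=1) / np.sqrt(n)) if n > 1 else float("nan")
    return float(diffs.mean()), sem, n
```

Called on the combined NH (or SH) record of a reference laboratory and a test laboratory, the function returns one mean offset per laboratory pair; because many pairs enter the mean, its standard error can be much smaller than the precision of a single analysis.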
Line 195 – Is the same value of spatial uncertainty used when two measurements at the same station are compared? For example, Table 1 shows that INSTAAR and MPI-BGC have overlapping records at Alert, so the comparison here uses measurements of the same air masses, while the other station pairings are geographically separated.
Response: The spatial uncertainty is a fixed value (0.06 ‰ for δ¹³C-CH₄ and 0.5 ‰ for δD-CH₄) based on differences between INSTAAR’s northern stations (which do not include co-located or same station data), and it represents typical variability across high-latitude sites. Since it reflects broader atmospheric variability, it is not adjusted for co-located comparisons. This approach provides a conservative and consistent estimate of atmospheric variability across the entire dataset.
I suggest moving the two paragraphs from lines 378 and 388 from p. 13 into this section to describe the uncertainty calculation method.
Response: As the method applied in this paper is also part of its discussion, we have continued the points in lines 378-388 in the discussion in section 4.
Smoothed data approach:
Line 221 describes that the difference between two sites using the smoothed data approach is calculated from the difference between the long-term average of two sites. How does this account for the trend in the isotope ratio (i.e. the polynomial part)? That is, the mean of a longer running site will be earlier in the time series than a more recent site. Why not use the function to generate evenly spaced values for both sites then take the difference in the same way as the time-matching approach?
Response: For each laboratory pair, we first apply the NOAA CCGCRV smoothing to generate evenly spaced curve points over the entire record. We then restrict each smoothed series to its common overlap period and compute the mean of those matched points; the difference between those two means is the offset. Because the curves are defined by the same number of evenly spaced points across the same dates, this intrinsically accounts for both the long‑term polynomial trend and the seasonal harmonics.
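A rough sketch of this smoothed-curve comparison is given below. It does not call the NOAA CCGCRV routine; as a simplified stand-in it uses a plain least-squares fit of a polynomial plus annual harmonics and omits the residual filtering that CCGCRV applies. The function names, the number of polynomial and harmonic terms, and the weekly grid spacing are all assumptions made for illustration.

```python
import numpy as np

def design_matrix(t, t0, n_poly=3, n_harm=2):
    """Polynomial terms in (t - t0) plus annual sine/cosine harmonics; t in decimal years."""
    x = np.asarray(t, dtype=float) - t0
    cols = [x**k for k in range(n_poly)]
    for h in range(1, n_harm + 1):
        cols.append(np.sin(2 * np.pi * h * x))
        cols.append(np.cos(2 * np.pi * h * x))
    return np.column_stack(cols)

def fit_curve(t, y, grid, n_poly=3, n_harm=2):
    """Fit the whole record by least squares and evaluate the curve on `grid`."""
    t0 = float(np.mean(t))
    A = design_matrix(t, t0, n_poly, n_harm)
    coef, *_ = np.linalg.lstsq(A, np.asarray(y, dtype=float), rcond=None)
    return design_matrix(grid, t0, n_poly, n_harm) @ coef

def smoothed_offset(t_ref, y_ref, t_test, y_test, step=1.0 / 52.0):
    """Mean difference (test minus reference) of the two curves over the common overlap."""
    start = max(np.min(t_ref), np.min(t_test))
    end = min(np.max(t_ref), np.max(t_test))
    if end <= start:
        raise ValueError("records do not overlap")
    # Same evenly spaced dates for both laboratories, restricted to the overlap.
    grid = np.arange(start, end + step / 2, step)
    curve_ref = fit_curve(t_ref, y_ref, grid)
    curve_test = fit_curve(t_test, y_test, grid)
    return float(np.mean(curve_test) - np.mean(curve_ref))
```

Because both curves are evaluated on identical dates within the overlap, trend and seasonal cycle contribute equally to both means and cancel in the difference, which is the point made in the response.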
Line 228 – the authors recognise that the fitted curve should not be used to interpolate “significant gaps”. Can they provide a maximum duration that is used to exclude such portions?
Response: We exclude any overlapping segment shorter than one year or containing gaps longer than six months, since the NOAA fit becomes unreliable over very sparse data. In practice, this only affected the UU vs. MPI‑BGC δ¹³C‑CH₄ comparison in the NH (2019–2022), which we have now noted explicitly in the revised text. By enforcing these minimum duration and maximum‑gap criteria, we ensure that all smoothed‑curve comparisons rest on robust, well‑constrained fits.
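The two screening criteria stated here (overlap of at least one year, no gap longer than six months) could be checked along the following lines. This is one possible reading of the response, not the authors' implementation: the function name is hypothetical, and the sketch assumes that a gap in either laboratory's record inside the overlap window disqualifies the segment.

```python
import numpy as np

def usable_overlap(t_ref, t_test, min_overlap_yr=1.0, max_gap_yr=0.5):
    """Return True if two records (times in decimal years) pass both criteria."""
    t_ref = np.sort(np.asarray(t_ref, dtype=float))
    t_test = np.sort(np.asarray(t_test, dtype=float))

    # Common overlap window of the two records.
    start = max(t_ref[0], t_test[0])
    end = min(t_ref[-1], t_test[-1])
    if end - start < min_overlap_yr:
        return False

    # Largest gap between consecutive samples inside the window, for each record.
    for t in (t_ref, t_test):
        inside = t[(t >= start) & (t <= end)]
        edges = np.concatenate(([start], inside, [end]))
        if np.max(np.diff(edges)) > max_gap_yr:
            return False
    return True
```

A laboratory pair that fails this check would simply be skipped in the smoothed-curve comparison.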
Line 228 – “In this case, the extrapolated parts of the curve fit …” should be corrected to “In this case, the interpolated parts of the curve fit …”
Response: Agreed.
Line 231 – the uncertainty calculation is unclear. In the description “average sum of squares of the root mean square error” what is being averaged here? The calculation of the offset was described on line 221 as the difference between the long-term average for both sites, which implies a single value. If the average is over all pairings of test site and reference site then should there also be a spatial uncertainty that accounts for co-located and geographically separate sites?
Response: In the smoothed‑data approach, we estimate uncertainty purely from how well each laboratory’s smoothed curve matches its own measurements. We calculate the root‑mean‑square error (RMSE) of the fit for each lab, that is, the typical deviation of the fitted curve points from the observed values, and then combine those two RMSEs into a single uncertainty by averaging their squared values and taking the square root of this average. Because this method compares two continuous, evenly spaced curves rather than individual station measurements, it does not include a separate spatial‑uncertainty term: any geographic variability is inherently captured in the residuals used to compute each curve’s RMSE.
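Written out, the combination described in this response corresponds to the expression below. The symbols are introduced here for illustration and do not appear in the discussion: $y_i$ are the $N$ observations of one laboratory, $f(t_i)$ is that laboratory's fitted curve at the sampling times, and $\sigma_{\mathrm{offset}}$ is the resulting uncertainty of the offset.

```latex
\mathrm{RMSE}_{\mathrm{lab}} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\bigl(y_i - f(t_i)\bigr)^{2}},
\qquad
\sigma_{\mathrm{offset}} = \sqrt{\frac{\mathrm{RMSE}_{\mathrm{ref}}^{2} + \mathrm{RMSE}_{\mathrm{test}}^{2}}{2}}
```

As a purely hypothetical numerical illustration, RMSE values of 0.06 ‰ and 0.08 ‰ would combine to $\sqrt{(0.0036 + 0.0064)/2} \approx 0.071$ ‰.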
Harmonised long-term datasets
Line 459 - Can the authors provide more discussion for the statement “INSTAAR δD-CH4 values from SH are excluded because they stand out from the other time series, even after offset correction”. It is not clear what is standing out for these measurements, and does it show some conditions where the offset correction is ineffective? Section 3.2 shows little difference between the hemisphere-specific offsets. The reference to Fig. 4a also needs to be updated as the figure does not have sub-figures with letter labels.
Response: We found that, even after applying the calculated lab‑wide offset, some SH INSTAAR records lie systematically ~5 ‰ heavier than other SH points around that time (now included in the revised Fig. 4, also pasted above). Because we impose a single, uniform offset per laboratory, the SH values remain biased, and it is difficult to filter out outliers. To prevent these residual artefacts from distorting the merged, smoothed time series, we have omitted the SH INSTAAR δD‑CH₄ data entirely from the harmonised plot (Fig. 5). However, the complete offset‑corrected INSTAAR SH dataset is included in the ICOS data portal for any users who wish to include it.
Section 3.2 does not report INSTAAR δD-CH4 hemisphere-specific offsets (no overlap with MPI-BGC), only TU/NIPR and UU.
Apologies, this should be just ‘Fig. 4’.
Figure 5 – The line colours for Northern Hemisphere and Southern Hemisphere are very difficult to distinguish; can this be replotted more clearly?
Response: The line colours for NH and SH in Fig. 5 have been updated.
Figure 5 (revised and attached): Merged and fitted data series with interlaboratory offset applied (mean ± uncertainty) to the observed data from eight laboratories using MPI-BGC as the reference laboratory. The NOAA CCGCRV algorithm is used for the fit, and the RMSE of residuals is plotted as the uncertainty band.
-
AC2: 'Reply on RC2', Bibhasvata Dasgupta, 22 Aug 2025
Status: closed
-
RC1: 'Review of Dasgupta et al', Anonymous Referee #1, 18 Jul 2025
Manuscript summary
The Dasgupta et al. manuscript aims to examine interlaboratory offsets in atmospheric CH4 stable isotope measurements. The authors choose the approach of comparing atmospheric measurements at high latitude sites over the more traditional round-robin interlaboratory exercise in which cylinders containing identical air are measured at each lab. The authors argue that their approach offers the advantage of many samples over a long time span, resulting in overall more representative estimates of interlab offsets. The authors compare the interlab offsets obtained in this way with a prior study (Umezawa et al 2018) that used the round-robin approach. Finally, the interlab offsets are used to place all CH4 isotope data sets on the same scale to yield unified datasets for NH and SH high latitudes.
Major comments:
Evaluation and monitoring of inter-laboratory measurement offsets for CH4 isotopes is important, as CH4 isotopic measurements are a key constraint for the CH4 budget. The approach of intercomparing measurements (versus a round-robin exercise) is justifiable and has strong merits; mainly the large number of available samples and comparison over multiple years. The main drawback of this approach is spatial and temporal variability of air masses (much more important in Arctic than Antarctic), and this is thoroughly considered and discussed in the manuscript. I agree that applying this approach to relatively long-term data sets would result in some of the spatiotemporal variability effectively averaging out to zero, although systematic offsets between sites would remain. The statistical methods used in uncertainty estimation seem appropriate.
I think it would have been better to apply this approach in parallel with a more traditional round-robin intercomparison. Both approaches have merits, it has been quite a few years since the last intercomparison, and there does seem to be a fairly consistent offset between the prior round-robin offset estimates (Umezawa et al) and those obtained with the new approach (Fig. 3). That said, I realize that doing a round-robin with multiple labs can take years and is expensive so this is not something that I am recommending as part of the revisions.
In my opinion, the type of intercomparison /merging of some published datasets that is performed here is something that would typically be presented as part of the methodology for a study of global CH4 isotope trends / budget, rather than as a stand-alone paper. I would therefore recommend the following additions to the study to make it more publishable:
- Examine and include discussion of temporal trends in the inter-laboratory offsets (if any)
- Use the interlab offsets determined here to place all the CH4 isotope datasets (from all stations, not just high latitudes) from the laboratories involved in the intercomparison on the same measurement scale and publish that larger unified dataset as part of the study
Minor comments:
I found the discussion of some individual offsets between pairs of labs a bit scattered and hard to follow – perhaps consolidate into another table?
Line 458: “INSTAAR dD-CH4 values from SH are excluded because they stand out from the other time series, even after offset correction (Fig. 4a).”
There is no panel labeled “a” on Fig 4; also INSTAAR SH dD data are not shown at all on this figure
Line 241: “relative to NIWA” (remove “the”)
Citation: https://doi.org/10.5194/egusphere-2025-2439-RC1 -
AC1: 'Reply on RC1', Bibhasvata Dasgupta, 22 Aug 2025
RC1: 'Review of Dasgupta et al', Anonymous Referee #1, 18 Jul 2025 reply
Manuscript summary
The Dasgupta et al. manuscript aims to examine interlaboratory offsets in atmospheric CH4 stable isotope measurements. The authors choose the approach of comparing atmospheric measurements at high latitude sites over the more traditional round-robin interlaboratory exercise in which cylinders containing identical air are measured at each lab. The authors argue that their approach offers the advantage of many samples over a long time span, resulting in overall more representative estimates of interlab offsets. The authors compare the interlab offsets obtained in this way with a prior study (Umezawa et al 2018) that used the round-robin approach. Finally, the interlab offsets are used to place all CH4 isotope data sets on the same scale to yield unified datasets for NH and SH high latitudes.
Major comments:
Evaluation and monitoring of inter-laboratory measurement offsets for CH4 isotopes is important, as CH4 isotopic measurements are a key constraint for the CH4 budget. The approach of intercomparing measurements (versus a round-robin exercise) is justifiable and has strong merits; mainly the large number of available samples and comparison over multiple years. The main drawback of this approach is spatial and temporal variability of air masses (much more important in Arctic than Antarctic), and this is thoroughly considered and discussed in the manuscript. I agree that applying this approach to relatively long-term data sets would result in some of the spatiotemporal variability effectively averaging out to zero, although systematic offsets between sites would remain. The statistical methods used in uncertainty estimation seem appropriate.
I think it would have been better to apply this approach in parallel with a more traditional round-robin intercomparison. Both approaches have merits, it has been quite a few years since the last intercomparison, and there does seem to be a fairly consistent offset between the prior round-robin offset estimates (Umezawa et al) and those obtained with the new approach (Fig. 3). That said, I realize that doing a round-robin with multiple labs can take years and is expensive so this is not something that I am recommending as part of the revisions.
In my opinion, the type of intercomparison/merging of some published datasets that is performed here is something that would typically be presented as part of the methodology for a study of global CH4 isotope trends/budget, rather than as a stand-alone paper. I would therefore recommend the following additions to the study to make it more publishable:
- Examine and include discussion of temporal trends in the inter-laboratory offsets (if any)
- Use the interlab offsets determined here to place all the CH4 isotope datasets (from all stations, not just high latitudes) from the laboratories involved in the intercomparison on the same measurement scale and publish that larger unified dataset as part of the study
Response: We agree with the reviewer that our method is not a stand-alone solution for computing interlaboratory offsets but rather a complementary approach to traditional round-robin intercomparisons. There is actually a round-robin exercise presently ongoing, which involves more laboratories than the ones that participate in the present work. We still think that our approach is very useful by itself, especially for laboratories that perform high-precision atmospheric background measurements of the isotopic composition.
Further, we had initially indeed planned this part as a data harmonisation step for an inversion study. As this first part grew successively larger, and after establishing the most appropriate approach within our consortium, we decided to split it off and present this as a method manuscript where the data harmonisation effort is described in more detail than what would be possible if it were combined with an inversion study. The second part, building on this dataset, uses the offset-corrected, harmonised high-latitude isotope time series of methane to perform a 2-box Bayesian inversion. That manuscript (to be submitted) is aimed at source and sink apportionment and an evaluation of the δD-CH4 as an additional constraint on isotope-enabled atmospheric inversions in improving our understanding of the methane budget.
- Temporal trends in the inter-laboratory offsets (if any): We agree that co-located measurements can be used to monitor possible shifts in inter-laboratory offsets, and our consortium is actually making active use of this opportunity. However, this is particularly difficult for high-latitude sites because the sampling date is often not close to the analysis date. In particular, in the southern hemisphere (Antarctica), air samples are in many cases only shipped once per year to the laboratories. In addition, the UU laboratory re-measured cylinders collected by UHEI up to more than two decades after they were sampled. Thus, a general application of this approach is not easy, and it requires dedicated knowledge of the analytical conditions in each specialist laboratory. Therefore, we decided not to include this part in the manuscript, which is targeted at a wider community and should provide updated offsets between laboratories that can also be used by others in a straightforward manner to combine datasets from different laboratories.
- Interlaboratory offsets determined here to place all the CH4 isotope datasets (from all stations, not just high latitudes): We chose to offset-correct and harmonise only high-latitude datasets. Methane isotopes show spatial gradients across the globe, and we chose only the high-latitude regions because they are more remote from the main sources and reasonably well mixed; thus, records from different stations should still represent relatively similar air. This is not true for lower-latitude sites, which are affected more strongly, and differently, by local and regional sources.
Minor comments:
I found the discussion of some individual offsets between pairs of labs a bit scattered and hard to follow – perhaps consolidate into another table?
Response: We have chosen two reference laboratories, MPI-BGC and TU/NIPR, to harmonise datasets, which we think is the more appropriate and thorough approach rather than pair-wise comparisons. Including numbers with still more reference laboratories would result in too much data and take away from the central narrative of the work.
Line 458: “INSTAAR dD-CH4 values from SH are excluded because they stand out from the other time series, even after offset correction (Fig. 4a).”
There is no panel labeled “a” on Fig 4; also INSTAAR SH dD data are not shown at all on this figure
Response: Apologies, this should be just ‘Fig. 4’. The revised Fig. 4 now includes both INSTAAR SH dD and Pre-2015 MPI data.
Figure 4 (revised and attached): Harmonised data series with interlaboratory offset applied (mean ± uncertainty) to the observed data from eight laboratories using MPI-BGC as the reference laboratory. Relative to MPI-BGC, the TU/NIPR scale is enriched (+0.54 ‰) for δ13C-CH4 and depleted (−10.8 ‰) for δD-CH4.
Line 241: “relative to NIWA” (remove “the”)
Response: Agreed.
RC2: 'Comment on egusphere-2025-2439', Anonymous Referee #2, 27 Jul 2025
General comments
The manuscript describes a method to quantify inter-laboratory offsets for measurements of δ13C and δD in atmospheric methane, with the aim of harmonising each timeseries with a common reference. Offsets have been quantified previously by round-robin methods, whereas the method described here uses the timeseries of atmospheric measurements. After identifying periods of overlapping sampling, two approaches are presented to determine an offset relative to the selected reference laboratory MPI-BGC. The first is a nearest-neighbour approach that selects pairs of measurements sufficiently close in time and takes their difference; the second fits each timeseries with a function and then takes the difference between pairs of functions. The offsets determined from the timeseries data are compared with previous intercomparison exercises (Umezawa et al. 2018) and found to be in agreement. Finally, the offsets are applied to the data to produce a harmonised timeseries for δ13C and δD in the Northern and Southern Hemispheres.
Specific comments
Nearest-neighbour approach:
It was not clear to me how the pairs of test and reference data points are selected. Is the data for one laboratory from all sites combined, then any test point matched with the closest reference in time regardless of site? Or is the nearest-neighbour approach applied to the data from each site separately?
Line 195 – Is the same value of spatial uncertainty used when two measurements at the same station are compared? For example, Table 1 shows that INSTAAR and MPI-BGC have overlapping records at Alert, so the comparison here uses measurements of the same air masses, while the other station pairings are geographically separated.
I suggest moving the two paragraphs from lines 378 and 388 from p. 13 into this section to describe the uncertainty calculation method.
Smoothed data approach:
Line 221 describes that the difference between two sites using the smoothed data approach is calculated from the difference between the long-term average of two sites. How does this account for the trend in the isotope ratio (i.e. the polynomial part)? That is, the mean of a longer running site will be earlier in the time series than a more recent site. Why not use the function to generate evenly spaced values for both sites then take the difference in the same way as the time-matching approach?
Line 228 – the authors recognise that the fitted curve should not be used to interpolate “significant gaps”. Can they provide a maximum duration that is used to exclude such portions?
Line 228 – “In this case, the extrapolated parts of the curve fit …” should be corrected to “In this case, the interpolated parts of the curve fit …”
Line 231 – the uncertainty calculation is unclear. In the description “average sum of squares of the root mean square error” what is being averaged here? The calculation of the offset was described on line 221 as the difference between the long-term average for both sites, which implies a single value. If the average is over all pairings of test site and reference site then should there also be a spatial uncertainty that accounts for co-located and geographically separate sites?
Harmonised long-term datasets
Line 459 - Can the authors provide more discussion for the statement “INSTAAR δD-CH4 values from SH are excluded because they stand out from the other time series, even after offset correction”. It is not clear what is standing out for these measurements, and does it show some conditions where the offset correction is ineffective? Section 3.2 shows little difference between the hemisphere-specific offsets. The reference to Fig. 4a also needs to be updated as the figure does not have sub-figures with letter labels.
Figure 5 – The line colours for Northern Hemisphere and Southern Hemisphere are very difficult to distinguish. Can this be replotted more clearly?
Citation: https://doi.org/10.5194/egusphere-2025-2439-RC2
AC2: 'Reply on RC2', Bibhasvata Dasgupta, 22 Aug 2025
Specific comments
Nearest-neighbour approach:
It was not clear to me how the pairs of test and reference data points are selected. Is the data for one laboratory from all sites combined, then any test point matched with the closest reference in time regardless of site? Or is the nearest-neighbour approach applied to the data from each site separately?
Response: The data from all high-latitude sites per laboratory and hemisphere are first combined into a lab-specific NH or SH dataset. Then each point from the reference lab is matched with the closest test lab data point in time. We apply this approach at the laboratory level instead of comparing site-by-site because it allows us to use larger and longer datasets. To account for possible differences between locations, we include a fixed spatial uncertainty in our calculations. We have clarified this in the revised manuscript.
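The matching step described in this response can be sketched as follows. This is a minimal illustration, not the authors' code: the column names `time` and `delta`, the function name, the 7-day matching window `max_gap`, and the test-minus-reference sign convention are all assumptions, since the response does not specify them.

```python
import pandas as pd

def nearest_neighbour_offset(ref, test, max_gap="7D"):
    """Pair each reference-lab sample with the test-lab sample closest in time.

    ref, test: DataFrames with a datetime column 'time' and a float column
    'delta', each already pooled over all high-latitude sites of one lab in
    one hemisphere. max_gap is an assumed matching window, not a value taken
    from the manuscript.
    """
    ref = ref.sort_values("time")
    test = test.sort_values("time").rename(
        columns={"time": "time_test", "delta": "delta_test"}
    )
    # For every reference row, find the nearest test row within max_gap.
    pairs = pd.merge_asof(
        ref, test,
        left_on="time", right_on="time_test",
        direction="nearest",
        tolerance=pd.Timedelta(max_gap),
    ).dropna(subset=["delta_test"])  # drop reference rows without a partner
    diff = pairs["delta_test"] - pairs["delta"]  # assumed sign: test minus reference
    return diff.mean(), diff.std(ddof=1), len(diff)
```

The returned standard deviation reflects only the scatter of the paired differences; the fixed spatial uncertainty mentioned in the response would have to be added separately.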
Line 195 – Is the same value of spatial uncertainty used when two measurements at the same station are compared? For example, Table 1 shows that INSTAAR and MPI-BGC have overlapping records at Alert, so the comparison here uses measurements of the same air masses, while the other station pairings are geographically separated.
Response: The spatial uncertainty is a fixed value (0.06 ‰ for δ¹³C-CH₄ and 0.5 ‰ for δD-CH₄) based on differences between INSTAAR’s northern stations (which do not include co-located or same station data), and it represents typical variability across high-latitude sites. Since it reflects broader atmospheric variability, it is not adjusted for co-located comparisons. This approach provides a conservative and consistent estimate of atmospheric variability across the entire dataset.
I suggest moving the two paragraphs from lines 378 and 388 from p. 13 into this section to describe the uncertainty calculation method.
Response: Because the method applied in this paper is also part of its discussion, we have kept the points from lines 378-388 in the discussion in Section 4.
Smoothed data approach:
Line 221 describes that the difference between two sites using the smoothed data approach is calculated from the difference between the long-term average of two sites. How does this account for the trend in the isotope ratio (i.e. the polynomial part)? That is, the mean of a longer running site will be earlier in the time series than a more recent site. Why not use the function to generate evenly spaced values for both sites then take the difference in the same way as the time-matching approach?
Response: For each laboratory pair, we first apply the NOAA CCGCRV smoothing to generate evenly spaced curve points over the entire record. We then restrict each smoothed series to its common overlap period and compute the mean of those matched points; the difference between those two means is the offset. Because the curves are defined by the same number of evenly spaced points across the same dates, this intrinsically accounts for both the long‑term polynomial trend and the seasonal harmonics.
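The sequence described here (fit, evaluate on an even grid, restrict to the overlap, difference the means) can be sketched as below. The curve fit is a plain least-squares polynomial-plus-harmonics model used as a simplified stand-in; it is not the NOAA CCGCRV code, which additionally filters the residuals. The function names, the number of polynomial and harmonic terms, the weekly grid step, and the test-minus-reference sign are assumptions for illustration. Times are decimal years.

```python
import numpy as np

def design_matrix(t, n_poly=3, n_harm=2):
    """Columns: 1, t, t^2, ... plus sin/cos annual harmonics (t in years)."""
    cols = [t ** k for k in range(n_poly)]
    for h in range(1, n_harm + 1):
        cols.append(np.sin(2 * np.pi * h * t))
        cols.append(np.cos(2 * np.pi * h * t))
    return np.column_stack(cols)

def fit_curve(t, y, t0):
    """Least-squares fit; returns a function evaluating the curve at new times."""
    t = np.asarray(t, dtype=float)
    coef, *_ = np.linalg.lstsq(design_matrix(t - t0), np.asarray(y), rcond=None)
    return lambda t_new: design_matrix(np.asarray(t_new) - t0) @ coef

def smoothed_offset(t_ref, y_ref, t_test, y_test, step=1 / 52):
    """Mean test-minus-reference difference of two fitted curves over their overlap."""
    start = max(np.min(t_ref), np.min(t_test))
    end = min(np.max(t_ref), np.max(t_test))
    if end <= start:
        raise ValueError("records do not overlap in time")
    grid = np.arange(start, end, step)  # same evenly spaced dates for both labs
    ref_curve = fit_curve(t_ref, y_ref, start)
    test_curve = fit_curve(t_test, y_test, start)
    return test_curve(grid).mean() - ref_curve(grid).mean()
```

Because both curves are evaluated on the same grid, a long-term trend shared by the two records cancels in the difference of the means rather than biasing it.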
Line 228 – the authors recognise that the fitted curve should not be used to interpolate “significant gaps”. Can they provide a maximum duration that is used to exclude such portions?
Response: We exclude any overlapping segment shorter than one year or containing gaps longer than six months, since the NOAA fit becomes unreliable over very sparse data. In practice, this only affected the UU vs. MPI‑BGC δ¹³C‑CH₄ comparison in the NH (2019–2022), which we have now noted explicitly in the revised text. By enforcing these minimum duration and maximum‑gap criteria, we ensure that all smoothed‑curve comparisons rest on robust, well‑constrained fits.
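The two screening criteria can be expressed as a small check. This is a sketch under assumptions: the function name is hypothetical, times are taken to be decimal years, and the gap criterion is applied to each laboratory's record separately within the common period, which the response does not state explicitly.

```python
import numpy as np

def overlap_is_usable(t_ref, t_test, min_length=1.0, max_gap=0.5):
    """Return True if the common period passes both screening criteria.

    The overlap must span at least min_length years, and neither record may
    have a gap longer than max_gap years inside it.
    """
    t_ref = np.asarray(t_ref, dtype=float)
    t_test = np.asarray(t_test, dtype=float)
    start = max(t_ref.min(), t_test.min())
    end = min(t_ref.max(), t_test.max())
    if end - start < min_length:
        return False
    for t in (t_ref, t_test):
        inside = np.sort(t[(t >= start) & (t <= end)])
        # Include the overlap edges so a gap at either end is also detected.
        edges = np.concatenate(([start], inside, [end]))
        if np.max(np.diff(edges)) > max_gap:
            return False
    return True
```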
Line 228 – “In this case, the extrapolated parts of the curve fit …” should be corrected to “In this case, the interpolated parts of the curve fit …”
Response: Agreed.
Line 231 – the uncertainty calculation is unclear. In the description “average sum of squares of the root mean square error” what is being averaged here? The calculation of the offset was described on line 221 as the difference between the long-term average for both sites, which implies a single value. If the average is over all pairings of test site and reference site then should there also be a spatial uncertainty that accounts for co-located and geographically separate sites?
Response: In the smoothed‑data approach, we estimate uncertainty purely from how well each laboratory’s smoothed curve matches its own measurements. We calculate the root‑mean‑square error (RMSE) of the fit for each lab, that is, the typical deviation of the fitted curve points from the observed values, and then combine those two RMSEs into a single uncertainty by averaging their squared values and taking the square root of this average. Because this method compares two continuous, evenly spaced curves rather than individual station measurements, it does not include a separate spatial‑uncertainty term: any geographic variability is inherently captured in the residuals used to compute each curve’s RMSE.
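Written out, the combination described above is u = sqrt((RMSE_ref^2 + RMSE_test^2) / 2). A minimal sketch follows; the function names are hypothetical and the inputs are assumed to be observed values and the fitted curve evaluated at the same sampling times.

```python
import numpy as np

def rmse(observed, fitted):
    """Root-mean-square deviation of the fitted curve from the observations."""
    residuals = np.asarray(observed, dtype=float) - np.asarray(fitted, dtype=float)
    return float(np.sqrt(np.mean(residuals ** 2)))

def combined_uncertainty(rmse_ref, rmse_test):
    """Square root of the average of the two squared RMSEs."""
    return float(np.sqrt((rmse_ref ** 2 + rmse_test ** 2) / 2.0))
```

For two laboratories with equal RMSE the combined value equals that RMSE, so this quantity behaves as a typical fit residual for the pair rather than as a standard error of the mean offset.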
Harmonised long-term datasets
Line 459 - Can the authors provide more discussion for the statement “INSTAAR δD-CH4 values from SH are excluded because they stand out from the other time series, even after offset correction”. It is not clear what is standing out for these measurements, and does it show some conditions where the offset correction is ineffective? Section 3.2 shows little difference between the hemisphere-specific offsets. The reference to Fig. 4a also needs to be updated as the figure does not have sub-figures with letter labels.
Response: We found that, even after applying the calculated lab-wide offset, some SH INSTAAR records lie systematically ~5 ‰ heavier than other SH points around that time (now included in the revised Fig. 4, also pasted above). Because we impose a single, uniform offset per laboratory, the SH values remain biased, and it is difficult to filter out outliers. To prevent these residual artefacts from distorting the merged, smoothed time series, we have omitted the SH INSTAAR δD-CH₄ data entirely from the harmonised plot (Fig. 5). However, the complete offset-corrected INSTAAR SH dataset is included in the ICOS data portal for any users who wish to include it.
Section 3.2 reports hemisphere-specific δD-CH4 offsets only for TU/NIPR and UU; it does not report them for INSTAAR, which has no overlap with MPI-BGC.
Apologies, the reference to Fig. 4a should be just ‘Fig. 4’.
Figure 5 – The line colours for Northern Hemisphere and Southern Hemisphere are very difficult to distinguish. Can this be replotted more clearly?
Response: The line colours for NH and SH in Fig. 5 have been updated.
Figure 5 (revised and attached): Merged and fitted data series with interlaboratory offset applied (mean ± uncertainty) to the observed data from eight laboratories using MPI-BGC as the reference laboratory. The NOAA CCGCRV algorithm is used for the fit, and the RMSE of residuals is plotted as the uncertainty band.