the Creative Commons Attribution 4.0 License.
Technical note: Obtaining accurate, high-frequency and long-term seawater pH data by using coupled lab-on-chip and optode sensing technologies
Abstract. The marine science community requires accurate, cost-effective, and reliable pH sensors capable of long-term, stable operations in-situ from coastal to deep-sea environments. Spectrophotometric pH sensors based on lab-on-chip (LOC) technology have been shown to offer long-term accuracy and can sample as often as every 10 minutes. However, for applications where higher-frequency measurements are important, this maximum sample rate may be limiting, as may the power required to operate the sensor.
In contrast, commercially available pH optodes (PyroScience GmbH) are relatively inexpensive, consume little power and have a small form factor, but with intense use the pH sensitive membrane can photo-oxidise, causing signal drift. The combination of LOC and optode technologies, however, can be used to provide long-term, high-frequency and high-stability in-situ pH data, but protocols to correct for sensor drift need to be developed and evaluated.
To examine sensor drift and develop protocols to account for it, we suspended two LOC pH sensors with two pH optodes at 0.5 m depth from a floating pontoon within a harbour in Southampton, UK for six months (June–December 2023). This is a highly dynamic tidal environment with substantial biofouling. The optode (AquapHOx-L-pH, PyroScience GmbH) and an independent pH sensor (Deep SeapHOx V2, Sea-Bird Scientific) measured at a high frequency (e.g., ≤5 min) alongside a LOC pH sensor measuring at a lower frequency (e.g., ≤2 hr). Triplicate lab-validated co-samples were collected each week, in addition to dedicated sensors monitoring the temperature, salinity, dissolved oxygen and tidal height. We find good agreement between the SeapHOx and LOC sensors, i.e., mean ∆pH = -0.022 ± 0.023 pH units (3,182 data points in common), in addition to individual field accuracies of <0.020 pH units. As expected, we found significant signal drift (e.g., generally ≤0.012 pH units per day) and offsets (e.g., 0.1–0.2 pH units) with the pH optodes after intensive use in a high biofouling environment. However, by coupling accurate LOC pH data to high frequency optode data, we corrected the optode signal drift/offset and achieved a similar field accuracy (<0.02 pH units) to the SeapHOx sensor even when using ultra-low LOC pH sensor measurement frequencies (e.g., several days to weeks). Overall, this work provides the oceanographic community with guidelines on how to achieve accurate, rapid and long-term pH measurements, while also balancing power requirements, by combining two complementary pH sensing technologies.
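The coupling described in the abstract can be illustrated with a minimal sketch. It assumes one plausible scheme, which is not necessarily the manuscript's own procedure: the optode-minus-LOC offset is evaluated at each LOC sample time, interpolated linearly in time, and subtracted from the optode record. The function names, the synthetic tidal signal and the drift values are hypothetical and chosen only to resemble the magnitudes quoted above.

```python
import numpy as np

def correct_optode(t_opt, ph_opt, t_ref, ph_ref):
    """Remove drift/offset from a high-frequency optode series using sparse reference pH.

    Assumed scheme (illustrative only): linear interpolation in time of the
    optode-minus-reference offset between consecutive reference samples.
    """
    t_opt, ph_opt, t_ref, ph_ref = (
        np.asarray(a, dtype=float) for a in (t_opt, ph_opt, t_ref, ph_ref)
    )
    # Optode reading at each reference time.
    opt_at_ref = np.interp(t_ref, t_opt, ph_opt)
    offset_at_ref = opt_at_ref - ph_ref
    # Offset at every optode time; np.interp holds the end values constant
    # outside the span of the reference samples.
    offset = np.interp(t_opt, t_ref, offset_at_ref)
    return ph_opt - offset

def synthetic_ph(t):
    """Hypothetical tidal pH signal (t in days)."""
    return 8.0 + 0.05 * np.sin(2.0 * np.pi * t / 0.5175)

t = np.arange(2881) * (5.0 / 1440.0)         # 10 days at 5 min resolution
optode = synthetic_ph(t) + 0.10 + 0.012 * t  # 0.1 offset plus 0.012 per day drift
t_loc = np.arange(0.0, 10.5, 2.0)            # one reference sample every 2 days
loc = synthetic_ph(t_loc)                    # reference assumed error-free here

corrected = correct_optode(t, optode, t_loc, loc)
print("max error before:", np.max(np.abs(optode - synthetic_ph(t))))
print("max error after: ", np.max(np.abs(corrected - synthetic_ph(t))))
```

Because the synthetic drift is exactly linear and the reference is noise-free, the correction here is near perfect; real data would add reference uncertainty and non-linear drift between reference points.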
Status: open (until 17 Feb 2026)
- RC1: 'Comment on egusphere-2025-5566', Anonymous Referee #1, 06 Jan 2026
- CC1: 'Comment on egusphere-2025-5566', Anthony Lucio, 16 Jan 2026
RC1: Line 74: ISFET-based pH warmup time depends on choice of reference electrode. If using the Cl- ISE, there is a longer conditioning requirement.
AJL: Thank you for pointing that out. We see that the Deep SeapHOx V2 (used in the present study) only has an external Ag/AgCl reference electrode and not in addition to an internal (gelled electrolyte) Ag/AgCl reference electrode. This results in a longer conditioning time and as such the salinity correction to the pH becomes quite important. We will make sure this is clearer in the text.
RC1: Table 1 (and text above): Size of sensors is a little unusual to include without more details- why list the size of the seafet if you deployed a seaphox? Is this just the sensor or the electronics, housing, power, etc.? Sensor footprint is different from a fully autonomous package.
AJL: The LOC and optode sensors are only pH sensors, so we wanted to compare the physical size to the most relevant ISFET-based pH only sensor (hence the SeaFET dimensions). The sizes listed are of the sensor exterior housing to give an indication of their overall footprint that is relevant to physical integration onto vehicles/platforms, but we should note that (as deployed) the Deep SeapHOx V2 and AquapHOx-L-pH (optode) were operated as fully autonomous systems whereas the LOC pH sensor utilised an external power supply.
RC1: Line 159: I don’t recall mention of pH scales used. It is important when comparing different sensors to describe which scale is being used and where/how conversions are being applied. What is the composition of the pyroscience calibration solutions?
AJL: The pH reported is the total proton scale (pHT). We will make sure this is clearly stated in the text. Unfortunately, the composition of the PyroScience buffer solutions is not disclosed but they do recommend using their specific pH 2 / pH 11 buffers instead of common commercial buffers that contain preservatives.
RC1: General notes...
AJL: Thank you for highlighting a few additional comments. We will address these general notes in our formal author response to be submitted in due course.
Citation: https://doi.org/10.5194/egusphere-2025-5566-CC1
RC2: 'Comment on egusphere-2025-5566', Anonymous Referee #2, 23 Jan 2026
This manuscript presents the results of an intercomparison between several pH sensors and laboratory measurements that were deployed in a challenging (in terms of biofouling) field environment. The experimental approach was very thorough and it seems likely that the dataset is excellent for doing the presented analysis. Overall it will be a useful contribution to the field. The concept of using a high-accuracy, low-resolution sensor to calibrate a low-accuracy, high-resolution one is interesting and does need more work in the context of specific sensor setups but it is not novel. There are a couple of limitations with the analysis. Primarily more evidence is needed to support the proposed approach, e.g. comparisons with other possible approaches and improvements to mitigate identified limitations, if the authors wish to present it as a guideline for the community to follow. My major points are in titled sections below, followed by minor comments and then technical corrections.
Instability of a 2-point regression
The issue of instability in a linear regression with 2 points (lines 331-333) is the major problem with the approach presented; it is mentioned briefly but not convincingly dealt with. Were this a study focused on reporting a particular observational dataset to interpret in some environmental context, it would probably be sufficient, because the uncertainties for the method used have been calculated and reported. But given the aim of this manuscript to provide guidelines for the research community on how to do this correction, it becomes essential here to do the extra work to see if the approach can be adapted to eliminate this issue, or at least to demonstrate that adaptations don’t add any value. The authors have already collected the data needed to test these things. For example, linear fits could be made over 3+ consecutive LOC points to reduce the sensitivity to individual points. Doing linear fits also means there are sharp transitions between gradients as points are crossed; how does doing some smoothing fit (e.g. PCHIP, moving average) between the subsampled LOC points affect the quality of the corrected data?
I recognise this requires some more work, but a paper proposing community guidelines should have done the due diligence to show that the guidelines are actually the best way of doing something. (While fully accepting that “the best way” will be a balance between complexity of the approach and accuracy of the results.) Alternatively, the relevant parts should be rephrased to indicate that this is a manuscript proposing and evaluating one potential way to do something, not claiming that this is a guideline that others should follow. Taking the latter choice would also reduce the impact of this study.
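The suggestion of fitting over 3+ consecutive LOC points can be sketched as follows. This is an illustrative assumption about how such a test might be set up, not the authors' method: for each optode time, a straight line is fitted by least squares to the `k` reference offsets nearest in time. The function name `windowed_offset`, the daily reference spacing and the single 0.02 outlier are all hypothetical.

```python
import numpy as np

def windowed_offset(t_opt, t_ref, offset_ref, k=3):
    """Offset at each optode time from a line fitted to the k nearest reference offsets.

    With evenly spaced references, k=2 gives the line through the bracketing pair;
    k>=3 averages over more points and is less sensitive to any single one.
    """
    t_opt = np.atleast_1d(np.asarray(t_opt, dtype=float))
    t_ref = np.asarray(t_ref, dtype=float)
    offset_ref = np.asarray(offset_ref, dtype=float)
    if k < 2 or k > t_ref.size:
        raise ValueError("k must be between 2 and the number of reference points")
    out = np.empty_like(t_opt)
    for i, t in enumerate(t_opt):
        idx = np.argsort(np.abs(t_ref - t))[:k]  # k nearest reference samples
        slope, intercept = np.polyfit(t_ref[idx], offset_ref[idx], 1)
        out[i] = slope * t + intercept
    return out

# Hypothetical case: linear drift sampled daily, with one reference point off by 0.02.
t_ref = np.arange(0.0, 11.0)
true_offset = 0.10 + 0.012 * t_ref
noisy_offset = true_offset.copy()
noisy_offset[5] += 0.02

t_eval = 5.0
truth = 0.10 + 0.012 * t_eval
err_k2 = windowed_offset(t_eval, t_ref, noisy_offset, k=2)[0] - truth
err_k3 = windowed_offset(t_eval, t_ref, noisy_offset, k=3)[0] - truth
print(f"error with k=2: {err_k2:.4f}, with k=3: {err_k3:.4f}")
```

In this toy case the k=3 fit spreads the outlier over three points, so the error at the outlier time is a third of the two-point value. A smoothing interpolant such as PCHIP could be swapped in for the line fit to test the sharp-transition concern in the same framework.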
Terminology: accuracy
Throughout, especially e.g. Table 3, Fig. 7 and associated discussion: the mean offset is referred to as “accuracy” which is not always helpful terminology. This leads to e.g. describing an apparent accuracy “minimum” at some intermediate correction interval (line 326). An alternative, and to me more convincing, interpretation of Table 3 is that there is some constant offset (-0.018) e.g. due to an offset between LOC and co-samples that cannot be improved upon by increasing resolution, but because this is negative and the initial offset is positive (+0.111), you necessarily have to pass through zero to get from one to the other. But it doesn’t mean that the results are “more accurate” at that intermediate point. Indeed if the initial offset happened to be more negative than this final constant value (or the constant value positive) then the apparent “accuracy minimum” would probably not appear. In this case, the accuracy minimum is a fluke and not a reproducible feature that would necessarily be found in other datasets that had a different offset between the LOC and co-samples.
Not helping my interpretation of the above is that I found it unclear exactly how this “accuracy” error was calculated. I’m assuming it’s corrected optode vs lab co-samples in the comment above. Please clarify or make more obviously explicit in the relevant parts of the discussion.
Finally the points stated to represent these local minima in the text (2 days for x and 1 day for 1-sigma) are not the lowest local minima in the table (1 week for x and 2 hours for 1-sigma), so I don’t follow why they were selected to be highlighted.
I think a big step towards a solution here would be to be more specific about what is meant in each statement and avoid using the somewhat ambiguous term “accuracy” when a more specifically meaningful alternative word is available.
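One way to make the terminology more specific is to report the mean offset, its scatter and the root-mean-square error as separate quantities. The sketch below uses hypothetical co-sample values, not data from Table 3, to show that a mean offset near zero can coexist with large scatter. The function name `offset_stats` is an assumption for illustration.

```python
import numpy as np

def offset_stats(ph_sensor, ph_reference):
    """Summarise sensor-minus-reference differences without a single 'accuracy' number."""
    d = np.asarray(ph_sensor, dtype=float) - np.asarray(ph_reference, dtype=float)
    return {
        "n": int(d.size),
        "mean_offset": float(d.mean()),        # bias (signed)
        "sd": float(d.std(ddof=1)),            # 1-sigma scatter about the mean
        "rmse": float(np.sqrt(np.mean(d**2))), # combines bias and scatter
    }

# Hypothetical corrected-optode values against lab co-samples.
reference = np.array([8.00, 8.02, 7.98, 8.01])
sensor = reference + np.array([0.05, -0.05, 0.04, -0.04])

stats = offset_stats(sensor, reference)
print(stats)
```

Here the mean offset is essentially zero because positive and negative differences cancel, while the RMSE stays above 0.04, which is why a zero crossing in the mean offset does not by itself indicate the best-performing correction interval.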
What is accurate?
The manuscript refers often to producing measurements that are “accurate” but does not define what this means – what constitutes “accurate” and whether something is accurate enough? It depends on the research question being asked of the data. Please could this be addressed briefly where relevant (e.g., Introduction and Conclusions, maybe relevant parts of R&D). Often this is done with reference to the GOA-ON “weather” and “climate” uncertainty targets (Newton et al., 2015), although other approaches are possible.
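As a small illustration of tying "accurate" to a stated target, the sketch below classifies a pH uncertainty against the GOA-ON objectives as they are commonly quoted (0.02 for "weather", 0.003 for "climate"). These thresholds and the function name `goa_on_class` are stated here as assumptions and should be checked against Newton et al. (2015) before use.

```python
# Commonly quoted GOA-ON pH uncertainty targets (assumed; verify against the source).
WEATHER_TARGET = 0.02
CLIMATE_TARGET = 0.003

def goa_on_class(ph_uncertainty):
    """Return the most stringent GOA-ON objective met by a given pH uncertainty."""
    if ph_uncertainty < 0:
        raise ValueError("uncertainty must be non-negative")
    if ph_uncertainty <= CLIMATE_TARGET:
        return "climate"
    if ph_uncertainty <= WEATHER_TARGET:
        return "weather"
    return "neither"

for u in (0.002, 0.02, 0.05):
    print(u, goa_on_class(u))
```

Under these assumed thresholds, a field uncertainty just under 0.02 would sit at the "weather" level rather than the "climate" level.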
Manufacturer claims
Manufacturer accuracy claims are presented in the Introduction and Methods. They are sometimes alluded to in the R&D but it might be useful to have a short paragraph or section that directly addresses if these accuracy claims could indeed be achieved by the various sensors in the tests here.
Minor comments
158 PyroScience have several different sensor caps available, with different pK values and returning results on different pH scales; please specify what was used.
159 The PyroScience optode software also has the option to add a third calibration point of a buffer (e.g., tris) within the measuring range to improve accuracy. Could the authors comment on if and how excluding this step may have affected their results and conclusions?
167 Was it really possible to always get the sample from the harbour, into the lab, in an optical cell, equilibrated to 20 °C, injected with mCP and measured in under 5 minutes? Impressive if so, but the relevant time to report would be the actual moment of measurement, not just the moment that the sample handling in the lab began – please check & confirm.
202 Does a “battery failure” refer to the battery running out of charge, or something else more dramatic? Please clarify.
Technical corrections
pH is dimensionless; please remove references to “pH units” throughout.
42 Grammar: change “and until recently, was” to e.g. (“which until recently was”).
43 If pH is calculated from DIC, TA or fCO2 then it is not a “measurement of pH”, please rephrase.
55 Provide a location for the NOC.
58 The ocean goes deeper than 6000 m, please rephrase “full ocean depth”.
67 Not clear specifically what “This” refers to.
77 Grammar: either “version” => “versions” and “are a cylinder” => “are cylinders”, or “are” => “is”. Also probably “sensors” => “sensor”.
108 Presumably “The NOC” should be “The harbour”, or make it clear that the harbour is at the NOC in the previous sentence.
155 “calibration-less” is a bit awkward; “calibration-free”?
Section 3.2 Several aspects of results that should be in past tense are written in the present. Also applies to other parts of the Results & Discussion. Some of these I have noted as technical corrections here but my list will be incomplete so please check through carefully.
290 Brackets around the two b terms at the end of Eq. (5) are unnecessary.
Table 3 Should n be the same in every row? I would have guessed it is how many LOC points were used in the calibration, which would be smaller for the longer correction intervals. If not, then please rewrite the caption to make it clearer what n means. If it is supposed to be the same then it doesn’t need to be a column in the table. Also, please mention in the caption which sensor data are being shown and what they are being compared to.
343 “30/80/2023” => “30/08/2023”.
359 “are reporting” => “were reporting” or “reported”.
376 “are tracking” => “were tracking”.
Citation: https://doi.org/10.5194/egusphere-2025-5566-RC2
Viewed
| HTML | PDF | XML | Total | Supplement | BibTeX | EndNote |
|---|---|---|---|---|---|---|
| 210 | 86 | 24 | 320 | 35 | 15 | 15 |
Here the authors present data from two deployments of pH sensors that operate at different frequencies. By deploying a pH optode alongside their LOC (lab on chip) pH sensor, they were able to demonstrate that the greater accuracy provided by the LOC pH measurement could be used to correct drift in the pH optode signal. This opened up the ability to measure pH at far greater frequency than by the LOC sensor alone, and for longer duration, by using the higher-frequency optode that is otherwise prone to more rapid drift. An ISFET-based pH sensor package was deployed alongside as well to provide an independent indication of optode performance and assessment of the LOC applied correction in addition to bottle samples. As part of the study, it was also determined how often the optode benefitted from the correction by the LOC sensor.
A few comments:
Line 74: ISFET-based pH warmup time depends on choice of reference electrode. If using the Cl- ISE, there is a longer conditioning requirement.
Table 1 (and text above): Size of sensors is a little unusual to include without more details- why list the size of the seafet if you deployed a seaphox? Is this just the sensor or the electronics, housing, power, etc.? Sensor footprint is different from a fully autonomous package.
Line 159: I don’t recall mention of pH scales used. It is important when comparing different sensors to describe which scale is being used and where/how conversions are being applied. What is the composition of the pyroscience calibration solutions?
General: