This work is distributed under the Creative Commons Attribution 4.0 License.
Tipping point analysis helps identify sensor phenomena in humidity data
Abstract. Humidity variables are important for monitoring climate. Unlike, for instance, temperature, they require data transformation to derive water vapour variables from observations. Hygrometer technologies have changed over the years and, in some cases, have been prone to sensor drift due to aging, condensation or contamination in service, requiring replacement. Analysis of these variables may provide rich insight into both instrumental and climate dynamics. We apply tipping point analysis to dew point and relative humidity values from hygrometers at 55 observing stations in the UK. Our results demonstrate that these techniques, which are usually used for studying geophysical phenomena, are also potentially useful for identifying historic instrumental changes that may be undocumented or lack metadata.
Status: final response (author comments only)
RC1: 'Comment on egusphere-2025-1461', Chris Boulton, 17 Jul 2025
Being a long-time user of early warning signals to predict tipping points, I found this paper to be quite an interesting read given its novel topic. I think the idea is there, but I have a few suggestions to improve the analysis and message of the paper to make it appeal to a broader audience.
My main suggestion would be to include variance analysis too as an EWS. There has been a lot of work recently that carries out the EWS analysis on Earth Observation data that is a merged product of multiple sensors. With an expected increase in the signal-to-noise ratio as newer sensors are included, the AR(1) should increase as observed in your work, but the variance would decrease at the same time. It seems that the combination of these would aid your work as it could rule out any ‘natural’ change in the system itself. Smith et al. (2023)* shows an example of this.
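To make the suggested pairing concrete, here is a minimal illustrative sketch (not code from the paper; all parameters are assumptions) of rolling lag-1 autocorrelation and variance on a synthetic AR(1) process whose additive measurement noise shrinks halfway through, mimicking a sensor upgrade:

```python
# Illustrative sketch, not from the manuscript: rolling AR(1) and variance
# as complementary early-warning indicators. The underlying dynamics are
# fixed; only the additive measurement noise shrinks halfway through,
# mimicking a sensor upgrade.
import numpy as np

rng = np.random.default_rng(0)
n, half = 4000, 2000
x = np.zeros(n)
for t in range(1, n):                      # fixed AR(1) dynamics, phi = 0.7
    x[t] = 0.7 * x[t - 1] + rng.normal()
meas_noise = np.concatenate([rng.normal(0, 1.0, half),   # old, noisy sensor
                             rng.normal(0, 0.2, half)])  # new, quieter sensor
y = x + meas_noise

def rolling_ews(series, window):
    """Lag-1 autocorrelation and variance in sliding windows."""
    ar1, var = [], []
    for i in range(series.size - window):
        w = series[i:i + window]
        w = w - w.mean()
        ar1.append(np.corrcoef(w[:-1], w[1:])[0, 1])
        var.append(w.var())
    return np.array(ar1), np.array(var)

ar1, var = rolling_ews(y, window=400)
# A sensor change pushes AR(1) up while variance drops; genuine critical
# slowing down would push both indicators up together.
print(f"AR(1): {ar1[:500].mean():.2f} -> {ar1[-500:].mean():.2f}")
print(f"var:   {var[:500].mean():.2f} -> {var[-500:].mean():.2f}")
```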
Linked to this, I think it’s important to highlight how these EWS are affected by the changes in measurement circumstances, from the viewpoint of the other stages in tipping point analysis such as prediction and the chance of false positives.
I also had some slight confusion about how and why ERA5 data is used at all. It would be good to explain that the large gaps in data (I assume) come from the station data and not ERA5. Also, why can’t the station data measurement just be used since the method used to create the reanalysis will not have the same issues?
I have a few more minor comments which should improve clarity:
Lines 77-80: This section is slightly confusing. I think it’s suggesting that the AR(1) has to reach 1 as a critical value but Kendall’s tau is also mentioned. I would be wary of saying that AR(1)=1 is critical when detrending has occurred in the time series it is calculated on as this alters the absolute value of AR(1). I would also refer to a ‘time series’ of the indicator throughout rather than ‘curve’.
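For what it is worth, a short sketch of the workflow I have in mind (assumed, not the authors' code): summarise the trend of the rolling-AR(1) indicator time series with Kendall's tau, rather than reading anything into the absolute AR(1) level after detrending:

```python
# Sketch (assumed workflow, not the authors' code): summarise the trend of
# an AR(1) indicator time series with Kendall's tau instead of waiting for
# the indicator to reach an absolute value of 1, since detrending shifts
# the absolute AR(1) level.
import numpy as np
from scipy.stats import kendalltau

rng = np.random.default_rng(1)
ar1_series = np.linspace(0.4, 0.7, 500) + rng.normal(0, 0.02, 500)  # toy indicator

tau, p_value = kendalltau(np.arange(ar1_series.size), ar1_series)
print(f"Kendall tau = {tau:.2f}, p = {p_value:.1e}")  # tau near 1: strong upward trend
```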
Lines 91-94: This section may not be needed. The potential plots here have not been estimated, for example.
Line 113: What was wrong with the station that wasn’t used?
Fig. 1: What window length is being used here? I would also centre the x-axis on -1 to 1 in both panels.
Page 6: I think it’s important to say what the actual window length was that was used, and personally I would suggest trying a longer window length as well to see how the results contrast given the discussion on this page. Also, does the BCP analysis require any a priori input on the change of form that is searched for (e.g. looking for a certain number of changepoints)? If so, this should be stated.
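On the a priori question, a minimal sketch using the `ruptures` library as a generic stand-in (the paper's BCP method may take different inputs) shows the two flavours of prior input: binary segmentation requires the number of breakpoints up front, while a penalty-based search does not:

```python
# Sketch of where a priori inputs enter changepoint detection, using the
# `ruptures` library as a generic stand-in (the manuscript's BCP method
# may take different inputs).
import numpy as np
import ruptures as rpt

rng = np.random.default_rng(2)
signal = np.concatenate([rng.normal(0, 1, 300), rng.normal(2, 1, 300)])

# Binary segmentation: the number of breakpoints is fixed a priori.
bkps_fixed = rpt.Binseg(model="l2").fit(signal).predict(n_bkps=1)
# PELT: a penalty parameter replaces the breakpoint count.
bkps_pen = rpt.Pelt(model="l2").fit(signal).predict(pen=10)
print(bkps_fixed, bkps_pen)  # both should report a break near index 300
```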
Fig. 2: I feel like this could be better represented by a continuous blue to red scale rather than the size of the circles as I find it hard to distinguish between the sizes (except the blue ones look small).
Fig. 3: The red box is not defined in the figure caption.
Line 180: There are no detections in the 1980s in Figures 7 or 8.
Fig. 6-8: It would be good to add the red crosses on the bottom panels each time to see how they match up more clearly.
Line 191: The shifts in the 1980s happen in the Appendix but not in the figures in the main paper.
Appendix Table: Is this for detections above 0.8 with the BCP analysis? If so, it should say so in the caption.
*Smith, T., Zotta, R.-M., Boulton, C. A., Lenton, T. M., Dorigo, W., and Boers, N.: Reliability of resilience estimation based on multi-instrument time series, Earth Syst. Dynam., 14, 173–183, https://doi.org/10.5194/esd-14-173-2023, 2023.
Citation: https://doi.org/10.5194/egusphere-2025-1461-RC1
RC2: 'Comment on egusphere-2025-1461', Anonymous Referee #2, 27 Jul 2025
The manuscript by Livina et al. aims to demonstrate how statistical techniques traditionally used for the detection of tipping points, typically applied to geophysical time series, can be extended to check the homogeneity of relative humidity time series, also in near-real time, to improve the monitoring of an observing network's data quality. This presents a new approach for identifying structural breaks caused by instrumentation changes, station relocations, or other inhomogeneities.
The homogenization of climate time series has been extensively addressed in the literature, as the authors also acknowledge. It remains a major challenge, especially when metadata is incomplete or unavailable, making it difficult to identify discontinuities using documentary evidence alone. In such cases, statistical methods are essential to detect and correct breakpoints, ensuring that time series are homogeneous and suitable for the analysis of trends and variability in climate studies.
In this context, applying Early Warning Signal (EWS) techniques for break detection is a promising approach. It may also help clarify the structural uncertainties inherent in the identification of such breaks. However, the manuscript has some limitations that require attention. Below, I provide general and specific comments, and I recommend major revisions.
General comments
1. The authors state that the uncertainty in the timing of breakpoint detection using EWS methods is "on the order of several weeks". This statement is too vague and should be quantified more precisely. Ideally, the uncertainty should be expressed as a time interval, with a more detailed explanation of its origin. For instance, spectral analysis or low-rank reconstructions such as Singular Spectrum Analysis (SSA) may introduce smoothing effects that act as data-driven low-pass filters. When high-frequency noise is removed, residuals may show inflated autocorrelation, which could bias diagnostics based on autocorrelation patterns, such as EWS and breakpoint detection. This aspect should be discussed explicitly (a minimal sketch of this smoothing effect follows the list below).
2. The authors focus exclusively on relative humidity (RH) time series. While RH may be more stationary than temperature and is indeed relevant for the surface energy balance, it is rarely used as a primary indicator in global warming assessments. Temperature, in contrast, is the most studied, standardized, and policy-relevant climate variable. Although it is important to explore all surface variables to fully understand climate change, the omission of temperature series limits the broader applicability and scientific impact of the study. The rationale for this choice should be better justified. Ideally, the method should also be tested or validated on temperature data, which are better documented and more commonly used in homogenization and climate trend studies.
3. RH measurements over the decades have been affected not only by changes in the mean but also in their variability, particularly before 1990. These changes may result from sensor issues such as limited sensitivity or response time. The authors should explain how such biases might influence the performance of EWS methods in detecting breakpoints, especially since EWS methods may rely on changes in variance and autocorrelation.
4. To evaluate the added value of the proposed EWS technique, it would be useful to compare it with more established methods, such as Observation-minus-Background (O–B) diagnostics from atmospheric reanalyses, at least for the two cases discussed in the main text. Comparison with widely used methods for detecting breakpoints and assessing the quality of time series may be beneficial to highlight the quality of the proposed approach. Such a comparison would help clarify why some breakpoints remain undetected by EWS and highlight the potential advantages or limitations of the approach compared to other techniques.
5. The exclusion of time series with unresolved breaks removes a substantial portion of potentially valuable climate information. The authors should take a clearer position on this issue: is the exclusion of parts of the time series with the largest temporal gaps recommended as a general strategy, or should these series be reconstructed through infilling or correction techniques? The manuscript would benefit from a more explicit discussion of this trade-off.
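Regarding general comment 1, a minimal sketch of the smoothing effect (illustrative only, with a moving average standing in crudely for an SSA low-rank reconstruction):

```python
# Illustrative sketch for general comment 1: low-pass filtering white noise
# (here a moving average, as a crude stand-in for an SSA reconstruction)
# inflates its lag-1 autocorrelation, which can mimic an early-warning signal.
import numpy as np

rng = np.random.default_rng(3)
white = rng.normal(size=5000)
smoothed = np.convolve(white, np.ones(11) / 11, mode="valid")  # 11-point moving average

def lag1(x):
    x = x - x.mean()
    return np.corrcoef(x[:-1], x[1:])[0, 1]

print(f"raw white noise AR(1): {lag1(white):.2f}")     # approximately 0
print(f"smoothed noise AR(1):  {lag1(smoothed):.2f}")  # approximately 0.9
```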
Specific comments
Line 117–120: Is temperature not recorded at the stations alongside RH? If it is, it is not fully clear why both pressure and dew point ERA5 data are needed here, especially if they are not being used to fill data gaps. Please clarify this point.
Line 124–125: Provide additional context on the use of Singular Spectrum Analysis to the reader. Explain how and why it was applied in this specific case.
Lines 143–145 vs. 182–184: The rationale for choosing short time windows to avoid conflating long-term changes with breakpoints appears inconsistent with later claims that long-term trends are still detectable. Please clarify the limitations and implications of the chosen window size.
Line 162–163: This section offers an excellent opportunity to demonstrate the utility of the EWS approach by comparing detected breakpoints with available metadata. Consider including a table (even in the appendix) with station metadata and detected breaks, and comment on how the detected breakpoint dates compare.
Line 167: It would be helpful to explain how the levels of autocorrelation and the resulting probabilities of breakpoint detection can vary depending on the nature and magnitude of the underlying break. For instance, abrupt shifts in the mean versus gradual drifts or changes in variance may produce different statistical signals, leading to varying detection sensitivity. Moreover, spectral smoothing or reconstruction methods (such as SSA) may affect the autocorrelation. These aspects should be discussed in detail to justify the robustness of the detection criteria.
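As a concrete illustration of the point about detection sensitivity (synthetic data, not from the manuscript), an abrupt mean shift, a gradual drift and a variance change leave quite different fingerprints in windowed variance and lag-1 autocorrelation:

```python
# Synthetic illustration (not from the manuscript): different break types
# leave different fingerprints in windowed variance and lag-1 autocorrelation.
import numpy as np

rng = np.random.default_rng(4)
n = 1000
base = rng.normal(0, 1, n)
t = np.arange(n)

mean_shift = base + np.where(t < n // 2, 0.0, 2.0)   # abrupt step in the mean
drift      = base + t * (2.0 / n)                    # slow linear drift
var_change = base * np.where(t < n // 2, 1.0, 2.0)   # step in the variance

def stats(x):
    """Variance and lag-1 autocorrelation of one window."""
    x = x - x.mean()
    return x.var(), np.corrcoef(x[:-1], x[1:])[0, 1]

w = 200
away     = slice(100, 100 + w)                        # window clear of the break
straddle = slice(n // 2 - w // 2, n // 2 + w // 2)    # window across the break

for name, series in [("mean shift", mean_shift), ("gradual drift", drift),
                     ("variance change", var_change)]:
    v0, a0 = stats(series[away])
    v1, a1 = stats(series[straddle])
    print(f"{name:16s} var {v0:.2f}->{v1:.2f}  AR(1) {a0:.2f}->{a1:.2f}")
# The mean shift inflates both indicators in the straddling window, the
# variance change inflates variance only, and the slow drift barely
# registers at this window length.
```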
Lines 168–170: Breakpoints may arise not only from instrumentation changes but also from relocations, calibration issues, or sensor malfunctions. This should be described in more detail.
Lines 171–172 (Figure 6–7): Bingley and Camborne stations reportedly show no documented changes before 1990. Is this because there were no changes, or are the metadata simply unavailable? This is important information for readers and should be clarified.
Lines 175–179: See general comment above regarding comparisons with other break detection methods.
References: Please consider expanding the bibliography to include more contributions from the broader climate science community on break detection, including methods based on metadata, O–B fields, Bayesian approaches, etc.
Citation: https://doi.org/10.5194/egusphere-2025-1461-RC2