Exploring the Capability of Surface-Observed Spectral Irradiance for Remote Sensing of Precipitable Water Vapor Amount under All-Sky Conditions

Khatri, Pradeep; Takamura, Tamio; Irie, Hitoshi

doi:10.5194/egusphere-2025-4074

Preprints

https://doi.org/10.5194/egusphere-2025-4074

17 Oct 2025

Pradeep Khatri, Tamio Takamura, and Hitoshi Irie

Abstract. Precipitable water vapor (PWV) is a key component of Earth’s climate and hydrological systems, yet its accurate and continuous observation under varying sky conditions remains challenging. This study demonstrates the strong potential of surface-based spectral irradiance measurements for PWV retrieval across a range of atmospheric conditions using deep neural network (DNN) models trained on water vapor absorption bands. Global, direct, and diffuse spectral irradiances observed at water vapor absorption bands of 929.0–997.3 nm, 800.9–840.5 nm, and 708.1–744.6 nm by a spectroradiometer (MS-700; EKO Instruments Co., Ltd., Japan) equipped with a rotating shadow-band system were used as test data, while PWV observed by a microwave radiometer (MP-1500; Radiometrics Corporation, USA) served as reference data for model training and validation. Models incorporating global, direct, and diffuse irradiances achieved the highest accuracy, exhibiting minimal errors and closely capturing seasonal PWV variations. Notably, even models using only global irradiance—an easier and more accessible measurement—maintained high predictive performance, with low errors and robust seasonal tracking. In contrast, models trained solely on clear-sky direct irradiance with limited data showed relatively higher errors and weaker generalization, underscoring the importance of data volume and diversity in DNN models. These results highlight the effectiveness of spectral irradiance-based approaches for continuous PWV estimation across a range of atmospheric conditions. Future research should incorporate additional spectral bands sensitive to constituents like aerosols and ozone to expand retrieval capability.

Received: 20 Aug 2025 – Discussion started: 17 Oct 2025

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: closed

RC1:
'Comment on egusphere-2025-4074', Anonymous Referee #2, 29 Oct 2025
General comment
This study provides practical insights for estimating PWV from spectral measurements of solar global, direct, and diffuse irradiances. The retrieval of PWV under all-sky conditions is useful for monitoring atmospheric conditions. However, one concern remains: the input data for the DNN model include “Day number of year” and “solar zenith angle”, so the model may have learned seasonal characteristics of the observation site. If the instrument is relocated to another site, would retraining be necessary?
The methodology details and results are clearly described. This paper is recommended after minor revision.
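
The question of whether a model leans on seasonal inputs can be probed with a permutation test: shuffle one input at a time and see how much the error grows. The following is a minimal sketch of that idea, not the authors' method. It assumes synthetic data, a linear least-squares fit as a stand-in for the DNN, and hypothetical feature names (`band_ratio`, `season`, `noise_feature`).

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root-mean-square error between two 1-D arrays."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean RMSE increase when each column of X is shuffled in turn."""
    rng = np.random.default_rng(seed)
    base = rmse(y, predict(X))
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])  # break the link to y
            scores[j] += rmse(y, predict(Xp)) - base
    return scores / n_repeats


# Synthetic data (all values are assumptions made for illustration only).
rng = np.random.default_rng(42)
n = 2000
doy = rng.integers(1, 366, n)                        # day number of year, 1..365
season = np.sin(2 * np.pi * (doy - 110) / 365.0)     # hypothetical seasonal input
pwv = 2.5 + 1.5 * season + rng.normal(0, 0.5, n)     # "reference" PWV in cm
band_ratio = np.exp(-0.3 * pwv) + rng.normal(0, 0.01, n)  # absorption-like proxy
noise_feature = rng.normal(size=n)                   # carries no information
X = np.column_stack([band_ratio, season, noise_feature])

# Stand-in model: ordinary least squares with an intercept (NOT a DNN).
coef, *_ = np.linalg.lstsq(np.column_stack([X, np.ones(n)]), pwv, rcond=None)


def predict(features):
    return np.column_stack([features, np.ones(len(features))]) @ coef


importance = permutation_importance(predict, X, pwv)
for name, score in zip(["band_ratio", "season", "noise_feature"], importance):
    print(f"{name:>14s}: RMSE increase = {score:.3f} cm")
```

A large error increase for the day-number-derived input relative to the spectral inputs would suggest the model is partly reproducing site climatology. The same function accepts any `predict` callable, so a trained network could be substituted for the linear stand-in.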

Specific comments
L50-53: GNSS precise point positioning is also used worldwide to retrieve PWV (e.g., Zhang et al., 2019, DOI:10.1109/JSTARS.2019.2906950).

Figure 2: How did you obtain the calibration constants at each wavelength? Is the radiometric calibration necessary in the retrieval method of this study?

L220-229: There are many technical terms here. If possible, include references for them.

L493-494: Could the discrepancies in Fig. 8b from July to September be attributed to the limited amount of training data available during the wet season?
Citation: https://doi.org/10.5194/egusphere-2025-4074-RC1
- AC2: 'Reply on RC1', Pradeep Khatri, 30 Nov 2025
  
  Please find our responses in the attached file.
  
  Citation: https://doi.org/10.5194/egusphere-2025-4074-AC2
RC2:
'Comment on egusphere-2025-4074', Anonymous Referee #1, 06 Nov 2025
GENERAL COMMENTS
The study by Khatri et al. investigates precipitable water vapour (PWV) retrieval using near-infrared spectral irradiances measured by a ground-based spectroradiometer (EKO MS-700) and deep neural network (DNN) models trained with a microwave radiometer (MWR) as the reference instrument. The manuscript explores the algorithm potential by testing individual spectral bands as well as their combination, and by evaluating spectral irradiances in different geometrical configurations (direct, global, diffuse, separately and jointly). The models generally demonstrate good performance, except for the one based only on direct irradiances under clear-sky conditions, which is mainly attributed to the limited amount of training data available.
Overall, the manuscript is clearly written and represents a valuable methodological study that could form the basis for new PWV retrieval approaches. However, as the primary motivation of the work is to extend monitoring to locations lacking reference instruments, certain concerns regarding the practical applicability of the proposed algorithm should be addressed before publication. These are detailed in the sections below.
SPECIFIC COMMENTS
1. To my understanding, the main motivation of the study is to develop an algorithm enabling the wider deployment of accurate PWV measurements using "inexpensive, robust sensors" (line 92), such as solar spectroradiometers. According to the authors, this would "enhance the retrieval of water vapour, especially in regions where high-cost instrumentation is scarce" (line 105) and be "valuable for expanding the operational viability of PWV monitoring in cost-constrained or logistically limited environments" (lines 528–529). However, the results indicate that a large amount of reference data, presumably covering a broad range of atmospheric conditions, must be collected before the DNN model can be applied, e.g. the training set used in the paper spans four years. This, in practice, requires co-located, high-cost reference instrumentation such as MWRs for a long time. Therefore, I suggest that the authors clarify several points to better demonstrate whether the proposed algorithm can be implemented under real-world conditions:
1a. The study focuses on a single site for both training and testing. Is the DNN model site-dependent? Must the algorithm be trained under conditions similar to those expected during future measurements? From my understanding, the DNN does not explicitly distinguish between instrument-dependent and site-dependent features, and relocating the spectroradiometer to a site with substantially different conditions would likely degrade retrieval accuracy.
1b. For new instruments or sites, how long would the training phase need to be? If several years of reference data are required before deployment, would such a training procedure be practical for large-scale implementation of the technique?
1c. How stable is the spectroradiometer expected to be, and how frequently should the model be recalibrated (e.g. to account for instrumental drift or degradation)?
2. Related to points 1b–1c: the manuscript states that the day number of the year is included as an input variable in the training. This effectively provides the algorithm with a prior on the most likely atmospheric conditions for a given time of year. Could the authors assess the relative importance of all input variables in the DNN, including the day number? How can they be confident that the agreement shown in Figs. 5 and 8 is not partly driven by the climatology implicit in the training data? Moreover, what would happen if the model were applied to a site with a very different PWV climatology?
3. What level of uncertainty can be expected or considered acceptable in this context? What are the typical uncertainties associated with the reference instruments, and what level of uncertainty would be tolerable depending on the intended application? Benchmark values should be introduced before discussing the results (e.g., RMSE) and before stating that the model performance is good.
4. The authors emphasise the need for "continuous retrievals" (line 73). However, the retrieval technique described in the manuscript, which relies on solar measurements, can only be applied during daylight hours, with an unavoidable interruption at night. I could not find any explicit reference to this limitation in the manuscript (although I may have missed it), and it should be clearly acknowledged and discussed.
5. Figures 5 and 8 are useful for illustrating PWV variations across the seasons. However, they are not ideal for assessing the quality of the comparison throughout the year, as the spread of the measurements is large and the whiskers do not provide information about point-to-point correspondence. If the aim is to demonstrate how the agreement between the two instruments varies seasonally, it might be more effective to compute the ratio of each pair of measurements and present a boxplot of that ratio as a function of month.
6. Are strictly clear-sky conditions required for direct-irradiance retrievals, or is it sufficient that the Sun is not obscured by clouds? Moreover, please explain why clear-sky conditions are necessary for direct irradiance but not for other geometries (line 416), as this may not be immediately evident to all readers.
7. Could the authors more clearly articulate the main advantages of using DNN-based techniques compared with DOAS-type retrievals or radiative transfer calculations?
8. A high-temporal-resolution example would be valuable, for instance, a time series of some days showing both the reference dataset and the corresponding DNN retrievals. At present, the paper includes only averaged or summary plots. Including a short time window with pronounced temporal variability (e.g., within a day or over a few days) would help illustrate how smooth or responsive the DNN retrievals are.
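
The per-month ratio comparison suggested in comment 5 can be sketched as follows. This is an illustrative example under stated assumptions, not an analysis of the paper's data: the arrays `pwv_ref` and `pwv_est` are synthetic, and the function name `monthly_ratio_stats` is hypothetical. It returns the quartiles that a boxplot of the estimate-to-reference ratio would display for each month.

```python
import numpy as np


def monthly_ratio_stats(months, pwv_ref, pwv_est):
    """Return {month: (q25, median, q75)} of the ratio pwv_est / pwv_ref."""
    months = np.asarray(months)
    ratio = np.asarray(pwv_est, dtype=float) / np.asarray(pwv_ref, dtype=float)
    stats = {}
    for m in np.unique(months):
        q25, med, q75 = np.percentile(ratio[months == m], [25, 50, 75])
        stats[int(m)] = (float(q25), float(med), float(q75))
    return stats


# Synthetic paired data (assumed values, 200 pairs per month).
rng = np.random.default_rng(0)
months = np.repeat(np.arange(1, 13), 200)
pwv_ref = 2.5 + 1.5 * np.sin(2 * np.pi * (months - 4) / 12)
pwv_ref = np.clip(pwv_ref + rng.normal(0, 0.2, months.size), 0.2, None)  # cm
pwv_est = pwv_ref * rng.normal(1.0, 0.05, months.size)  # about 5 % scatter

stats = monthly_ratio_stats(months, pwv_ref, pwv_est)
for m, (q25, med, q75) in stats.items():
    print(f"month {m:2d}: q25={q25:.3f} median={med:.3f} q75={q75:.3f}")
```

Because each ratio is formed from a matched pair, a median that drifts away from 1 in particular months points to a seasonal bias that overlapping box-and-whisker plots of the two datasets can hide.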
TECHNICAL REMARKS
Line 22: Please clarify what is meant by "limited data" in this context.

Lines 29–31: Bibliographic references needed.

Line 60, "most weather conditions": As this section discusses the limitations of different techniques, it would be important to specify which weather conditions are suitable for MWR measurements.

Line 171: Correct "cantered" to "centred".

Lines 173–175: At least one relevant bibliographic reference should be added.

Line 207: Please explain why ReLU activation functions were chosen for the DNN architecture.

Line 232: This is the first occurrence of the term "unseen" in the DNN context, also used in Section 4.1.2. It would be helpful to introduce or define it more clearly.

Line 265: Consider clarifying what is specifically meant by "feature engineering" in this context.

Line 387: Can the "more dynamic conditions" mentioned here be identified or quantified more precisely?

Line 424: Please specify that all-sky conditions refer to global irradiance, and clear-sky to direct irradiance.

Sections 4.2.2 and 5: The importance of data volume is reiterated several times. Consider removing a few redundant mentions.
Citation: https://doi.org/10.5194/egusphere-2025-4074-RC2
- AC1: 'Reply on RC2', Pradeep Khatri, 30 Nov 2025
  
  Please find our responses in the attached file.
  
  Citation: https://doi.org/10.5194/egusphere-2025-4074-AC1

Viewed

Total article views: 268 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
183	60	25	268	15	13

Views and downloads (calculated since 17 Oct 2025)

Month	HTML	PDF	XML	Total
Oct 2025	94	19	7	120
Nov 2025	59	6	9	74
Dec 2025	30	35	9	74

Viewed (geographical distribution)

Total article views: 260 (including HTML, PDF, and XML). Thereof 260 with geography defined and 0 with unknown origin.

Latest update: 20 Dec 2025

Short summary

Precipitable water vapor (PWV) is important for various climate and weather studies, but difficult to monitor under various weather conditions. This study shows that surface-based spectral irradiance combined with deep neural network models can accurately estimate PWV under various atmospheric conditions. Models using global, direct, and diffuse irradiances performed best, while even global-only data gave reliable results.

