Real-time plotting and evaluation of the data quality control from the CSIR-NGRI Magnetic observatories

Pavan Kumar, Vengala; Phani Chandrasekhar, Nelapatla; Sai Vijay Kumar, Potharaju

doi:10.5194/egusphere-2025-1587

22 Apr 2025

Abstract. Earth’s magnetic field, a dynamic shield influenced by internal and external forces, holds critical insights into space weather forecasting and the planet’s core dynamics. The Choutuppal (CPL) and Hyderabad (HYB) magnetic observatories in India are pioneering this field by delivering high-resolution geomagnetic data to INTERMAGNET with unprecedented speed and precision. Utilizing a novel, low-cost protocol, CPL transmits 1 s resolution data and HYB provides 1 min data, both achieving a latency of less than 300 s, making them among the first observatories worldwide to accomplish this feat. This rapid data transmission enhances global collaboration in space weather prediction, safeguarding critical infrastructure like satellites and power grids from solar storms.

To further elevate data utility, we developed Python-based software for real-time visualization and quality control at both observatories. This tool generates plots, performs initial quality checks, and computes first differences at 1 s and 1 min intervals, with a latency under 300 s. By enabling daily evaluation of data quality, the software facilitates the identification of anomalies and noise, supporting the preparation of quasi-definitive data essential for geomagnetic research. Our Python server and web applications are designed with the future in mind, integrating artificial intelligence (AI) and machine learning (ML) capabilities. These advancements at CPL and HYB are set to transform the processing, forecasting, and visualization of geomagnetic data. By improving both the accuracy and accessibility of these data, we aim to revolutionize geomagnetic research, making it more precise, accessible, and actionable.

Received: 03 Apr 2025 – Discussion started: 22 Apr 2025

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Journal article(s) based on this preprint

12 Dec 2025

Real-time plotting and evaluation of the data quality control from the CSIR-NGRI magnetic observatories

Pavan Kumar Vengala, Phani Chandrasekhar Nelapatla, and Sai Vijay Kumar Potharaju

Geosci. Instrum. Method. Data Syst., 14, 491–501, https://doi.org/10.5194/gi-14-491-2025, 2025

Interactive discussion

Status: closed

RC1: 'Comment on egusphere-2025-1587', Anonymous Referee #1, 08 May 2025

GENERAL COMMENTS
The manuscript presents a solution proposed by the authors for real-time transmission of magnetic data to the INTERMAGNET network. This solution has been implemented at the Indian geomagnetic observatories CSIR-NGRI in Choutuppal (CPL) and Hyderabad (HYB).
The authors have developed Python-based tools for real-time visualisation and quality control of one-second and one-minute geomagnetic data. The initial automated data control focuses mainly on analysing "first differences". Implementing the "first difference" method effectively detects and flags anthropogenic disturbances, making it easier to remove these disturbances before publishing Quasi-Definitive data. However, the article does not mention the commonly used data control method based on the Fv-Fs difference. It is unclear whether this method is used at the HYB and CPL observatories. If it is used, it should be mentioned in the manuscript; if not, the authors should explain why it was omitted.
The paper also describes the transition from a PHP server to a more advanced Django+Bokeh environment, which allows for better data management, AI/ML integration, and more efficient data processing, visualization, and forecasting of geomagnetic phenomena.
The manuscript represents a valuable technical and methodological contribution that will certainly be of interest to the geomagnetic community—both researchers and observatory operators. I believe that after making the corrections described in the Specific Comments and Technical Corrections sections, this manuscript is worth publishing.

SPECIFIC COMMENTS
Lines 13, 106      The INTERMAGNET manual (https://tech-man.intermagnet.org/stable/chapters/submitdata/introduction.html), chapter 6.4.1, gives the goals for near real-time performance: 30 s for 1 s data and 60 s for 1 min data. About 50% of observatories provide 1 min data with a delay <= 5 minutes; the relevant statistics are available at https://imag-data.bgs.ac.uk/GIN_V1/GINStatistics. The claim “achieving a latency of under 300 seconds and being one of the first observatories worldwide” is a bit debatable.

Lines 89-91         Although these satellites were used in the past, the preferred way to send data to the GINs now is through the internet, and satellite channels are only used as a backup option.
Line 132               “300 seconds (5 minutes)” – (5 minutes) is not necessary
Line 423               It is worth adding the DOI (https://doi.org/10.1007/s00024-023-03333-8)
Line 432               It is worth adding the DOI (https://doi.org/10.5194/gi-6-329-2017)
Line 449               It is worth adding the DOI (https://doi.org/10.4401/ag-4572)
Many lines          It would be good to make the data names consistent throughout the manuscript. Now, the one-second data is called both '1s' and '1-second'. The same issue concerns one-minute data.

TECHNICAL CORRECTIONS
Lines 33, 46, 52, 95, 96, 97, 339         year should be preceded by a comma
Lines 98, 140      publication year should be rather 2022, Nelaptla (typo error)
Line 673               Khumutov (typo error)
Lines 156, 214, 220, 230, 234, 258           All figures are low quality. For instance, the axis labels are difficult to read.

Citation: https://doi.org/10.5194/egusphere-2025-1587-RC1
- CC1: 'Reply on RC1', Nelapatla Phani Chandrasekhar, 21 May 2025
  
  Reply to RC1 comments
  
  Comment-1: The authors have developed Python-based tools for real-time visualisation and quality control of one-second and one-minute geomagnetic data. The initial automated data control focuses mainly on analysing "first differences". Implementing the "first difference" method effectively detects and flags anthropogenic disturbances, making it easier to remove these disturbances before publishing Quasi-Definitive data. However, the article does not mention the commonly used data control method based on the Fv-Fs difference. It is unclear whether this method is used at the HYB and CPL observatories. If it is used, it should be mentioned in the manuscript; if not, the authors should explain why it was omitted.
  
  Reply: We thank the reviewer for highlighting the omission of the Fv-Fs difference method in our manuscript and for seeking clarification on its use at the HYB and CPL observatories. We acknowledge the importance of this widely recognized data control technique and confirm that it is actively employed at both observatories to ensure high-quality geomagnetic data. Lines: 397-410.
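
As a rough illustration of the Fv-Fs check discussed here: Fv is the total field computed from the vector (variometer) components, and Fs is the independent scalar magnetometer reading, so their difference should stay nearly constant while both instruments behave well. The sketch below assumes X, Y, Z and Fs are given in nT. The function names and the 1 nT tolerance are hypothetical choices for the example and are not taken from the manuscript.

```python
import math
import statistics


def fv_minus_fs(x, y, z, fs):
    """Return Fv - Fs in nT, with Fv = sqrt(X^2 + Y^2 + Z^2)."""
    fv = math.sqrt(x * x + y * y + z * z)
    return fv - fs


def flag_delta_f(samples, tolerance_nt=1.0):
    """Flag samples whose Fv - Fs departs from the median by more than the tolerance.

    samples: list of (x, y, z, fs) tuples in nT.
    tolerance_nt: assumed tolerance for illustration only.
    Returns one boolean per sample (True = suspect, to be reviewed).
    """
    deltas = [fv_minus_fs(*s) for s in samples]
    baseline = statistics.median(deltas)
    return [abs(d - baseline) > tolerance_nt for d in deltas]


if __name__ == "__main__":
    demo = [
        (30000.0, 0.0, 40000.0, 50000.2),
        (30000.3, 0.0, 40000.0, 50000.4),
        (30010.0, 0.0, 40000.0, 50000.2),  # vector jump not seen by the scalar sensor
        (30000.0, 0.0, 40000.0, 50000.1),
    ]
    print(flag_delta_f(demo))
```

Using the median as the baseline keeps a single disturbed sample from shifting the reference level; a fixed reference value would work equally well if the site difference is known.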
  
  Comment-2: Lines 13, 106. The INTERMAGNET manual (https://tech-man.intermagnet.org/stable/chapters/submitdata/introduction.html), chapter 6.4.1, gives the goals for near real-time performance: 30 s for 1 s data and 60 s for 1 min data. About 50% of observatories provide 1 min data with a delay <= 5 minutes; the relevant statistics are available at https://imag-data.bgs.ac.uk/GIN_V1/GINStatistics. The claim “achieving a latency of under 300 seconds and being one of the first observatories worldwide” is a bit debatable.
  Reply: As suggested, we have gone through the manual and the link providing the statistics on the real-time data transmission. The following are our observations:
  Other observatories, such as BEL, HEL, and Hornsund in Poland, achieve a latency of around 5 minutes by utilizing VPN routers and backup servers, indicating the use of robust data transmission infrastructure. In comparison, NRCan, BGS, and ASP observatories employ satellite links and high-performance servers, reflecting their adoption of more advanced communication systems suitable for their operational requirements.
  On the other hand, the CPL, HYB, and TTB observatories seem to be better candidates based on the available evidence. These magnetic observatories transmit live data to INTERMAGNET GIN with a latency of 5 minutes or less on a daily basis, using low-cost, low-resource setups that do not require heavy infrastructure. While 50% of observatories provide 1-minute data with a delay of 5 minutes or less, the time frame varies from a minimum of 2 minutes to a maximum of several thousand minutes. However, detailed information about the infrastructure used for data transmission is lacking. Therefore, it is noted that CPL and HYB utilize lightweight data transfer systems using Python, rsync, and broadband service along with basic desktop computers. Meanwhile, TTB employs a minimal Raspberry Pi setup, which fulfills all necessary requirements for a cost-effective data transmission system.
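
To make the "Python, rsync, and broadband" setup described above more concrete, a transfer step of this kind might look like the following sketch. The authors' actual scripts are not public, so everything here is an assumption: the remote target, the retry count, and the function names are hypothetical placeholders. Only standard rsync options (`-a`, `-z`, `--partial`, `--timeout`) are used.

```python
import subprocess

# Hypothetical destination; the real upload target is not given in the source.
REMOTE_TARGET = "user@gin.example.org:/incoming/cpl/"


def build_rsync_command(local_file, remote_target=REMOTE_TARGET, io_timeout_s=30):
    """Build an rsync command line from standard rsync options."""
    return [
        "rsync",
        "-az",                        # archive mode, compress data in transit
        "--partial",                  # keep partially transferred files for the next try
        f"--timeout={io_timeout_s}",  # abort if the link stalls for this many seconds
        local_file,
        remote_target,
    ]


def push_file(local_file, remote_target=REMOTE_TARGET, retries=3, runner=subprocess.run):
    """Attempt the transfer up to `retries` times.

    Returns the attempt number that succeeded, or None if all attempts failed.
    `runner` is injectable so the logic can be exercised without a network.
    """
    cmd = build_rsync_command(local_file, remote_target)
    for attempt in range(1, retries + 1):
        result = runner(cmd, capture_output=True, text=True)
        if result.returncode == 0:
            return attempt
    return None
```

A scheduler such as cron, or a simple loop with a sleep, could call `push_file` each time a new data file is written; that choice, too, is an assumption rather than something stated in the reply.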
  
  Comment-3: Lines 89-91, Although these satellites were used in the past, the preferred way to send data to the GINs now is through the internet, and satellite channels are only used as a backup option.
  Reply: Included in the text as suggested. Lines: 91-93.
  
  Comment-4: Line 132, “300 seconds (5 minutes)” – (5 minutes) is not necessary
  
  Reply: Replaced "5 mins" with "300 s" throughout the manuscript as suggested.
  
  Comment-5: Line 423, It is worth adding the DOI (https://doi.org/10.1007/s00024-023-03333-8)
  Reply: Included the DOI as suggested.
  
  Comment-6: Line 432, It is worth adding the DOI (https://doi.org/10.5194/gi-6-329-2017)
  Reply: Included the DOI as suggested.
  
  Comment-7: Line 449, It is worth adding the DOI (https://doi.org/10.4401/ag-4572)
  Reply: Included the DOI as suggested.
  
  Comment-8: Many lines, it would be good to make the data names consistent throughout the manuscript. Now, the one-second data is called both '1s' and '1-second'. The same issue concerns one-minute data.
  Reply: Made the data names consistent throughout the manuscript as suggested.
  
  Comment-9: Lines 33, 46, 52, 95, 96, 97, 339, year should be preceded by a comma
  Reply: Added the comma as suggested.
  
  Comment-10: Lines 98, 140, publication year should be rather 2022, Nelaptla (typo error)
  Reply: Corrected the typo as suggested.
  
  Comment-11: Line 673, Khumutov (typo error)
  Reply: Corrected the typo as suggested.
  
  Comment-12: Lines 156, 214, 220, 230, 234, 258. All figures are low quality. For instance, the axis labels are difficult to read.
  Reply: Improved the resolution of the figures and increased the font size of the axis labels as suggested.
  
  Citation: https://doi.org/10.5194/egusphere-2025-1587-CC1
- AC1: 'Reply on RC1', Pavan Kumar Vengala, 03 Jun 2025
  
  Citation: https://doi.org/10.5194/egusphere-2025-1587-AC1
RC2: 'Comment on egusphere-2025-1587', Anonymous Referee #2, 16 May 2025

The present manuscript describes a newly developed tool for visualizing magnetometer data and automating quality control (QC). The authors also detail the process of near real-time data dissemination to the INTERMAGNET data hub and mention future plans to expand their service with machine learning and AI-based capabilities.
Developing automated QC methods is indeed a valuable contribution, as manual quality checks can be time-consuming and may delay data dissemination. I find the authors' work important, but certain sections of the manuscript would benefit from further elaboration and clarification.

Major Comments
-The authors cite a few related services used by other observatories (Khumutov et al., 2017; He et al., 2022; and MOSFiT by da Silva et al., 2023). Another relevant tool to consider is the MagPy package maintained by INTERMAGNET (https://github.com/geomagpy/magpy), which is widely used by observatories and provides similar features, including data plotting and automatic spike detection. A more precise comparison between the proposed system and existing tools (MagPy, MOSFiT, etc.) would strengthen the manuscript and clarify the novelty and advantages of the new system.
-It would be helpful to include a flowchart or schematic illustrating the processes involved in the automated QC and data transfer pipeline. This would aid the reader in understanding how the different components interact and when human intervention is required.
-To my understanding, the service is still under development. However, the manuscript should provide a more precise explanation of how the automatic QC process functions in practice. The current description mentions that the QC is based on First Differences (FD), but this approach can mistakenly flag legitimate geomagnetic variation (e.g. during storms) as noise. Do you automatically discard all FD-flagged data, or are these values manually reviewed? Clarifying this — possibly within the flowchart — is essential.
-It would be beneficial to discuss the risks associated with automated data cleaning, such as inadvertently removing valid data, and how these risks are mitigated (e.g., threshold tuning, post-flagging review, cross-validation with secondary data streams).
Minor Comments / Technical Corrections
-Please maintain consistent notation for data cadence (e.g., "1 sec", "1 min") throughout the text.
-Are the Python scripts or software packages publicly available? If so, a GitHub or repository link would be appreciated.
-The term “real-time” should be clearly defined. For example, does it mean a latency of less than 5 minutes?
-The website shown in the plots appears to be inaccessible. If it is not publicly available, please state this explicitly in the manuscript and clarify whether public access is planned for the future.
-Section 4 (Upgrading the PHP server to a Python server): It is unclear whether the migration to Python Django and Bokeh has already been completed, or if this remains part of future plans. Please clarify the implementation status.
-L86: No need to redefine the abbreviation GIN, as it is already explained in L57.
-L130: Same comment — GIN is already defined earlier.
-L154: The phrase “weekday’s data” is ambiguous. Do you mean data from one day or from an entire week?
-L200–202: The sentence is confusing and should be revised for clarity and word order.
-L226–227: The statement implies that 1-minute data is noisier than 1-second data. However, Figures 2 and 3 suggest the opposite. Please rephrase or clarify.
-L244–246: Presenting a long list of numerical values in the text is difficult to follow. Consider using a table for clarity.
-L355: It would be helpful to describe how the thresholds used in QC are determined — are they fixed, empirical, or adaptively set?
-L357–359: The sentence suggests that only specific devices can run the tool. Is this a hard requirement, or are these just tested and recommended devices? Please clarify.

-A brief performance benchmark (e.g., number of flagged points per day, false positive/negative rates if tested) could support claims about the system's effectiveness.

Citation: https://doi.org/10.5194/egusphere-2025-1587-RC2
- CC2: 'Reply on RC2', Nelapatla Phani Chandrasekhar, 21 May 2025
  Reply to RC2 comments
  
  Comment-1: The authors cite a few related services used by other observatories (Khumutov et al., 2017; He et al., 2022; and MOSFiT by da Silva et al., 2023). Another relevant tool to consider is the MagPy package maintained by INTERMAGNET (https://github.com/geomagpy/magpy), which is widely used by observatories and provides similar features, including data plotting and automatic spike detection. A more precise comparison between the proposed system and existing tools (MagPy, MOSFiT, etc.) would strengthen the manuscript and clarify the novelty and advantages of the new system.
  Reply: Here’s a precise comparison between MagPy, MOSFiT, and real-time first differences in geomagnetism based on their applications, methodologies, and use cases in geomagnetic data analysis:
  MagPy is a Python library specifically designed for processing and analyzing geomagnetic data. It offers features such as data filtering, visualization, and quality control. MagPy provides tools for baseline correction and noise removal, as well as the ability to compute differences (e.g., between observatories or components). However, it is not specifically optimized for calculating real-time first differences.
  
  MOSFiT is another Python tool developed to investigate the secular variation (SV) of the Earth's geomagnetic field. It is useful for detecting geomagnetic jerks and assessing the quality of geomagnetic observatory data. MOSFiT works with data from any INTERMAGNET geomagnetic observatory that is sampled at one-minute intervals.
  
  In contrast, our tool, also based on a Python library, offers immediate insights into rapid changes in the geomagnetic field between consecutive measurements (e.g., ΔB/Δt) on a per-second basis. This tool has been under observation for the past few days to evaluate the performance of recording systems and data quality in real-time. It is designed to be simple and computationally efficient.
  Comment-2: It would be helpful to include a flowchart or schematic illustrating the processes involved in the automated QC and data transfer pipeline. This would aid the reader in understanding how the different components interact and when human intervention is required.
  Reply: A flowchart illustrating the processes involved in the automated QC and data transfer pipeline is now included as Figure 2 in the manuscript.
  Comment-3: To my understanding, the service is still under development. However, the manuscript should provide a more precise explanation of how the automatic QC process functions in practice. The current description mentions that the QC is based on First Differences (FD), but this approach can mistakenly flag legitimate geomagnetic variation (e.g. during storms) as noise. Do you automatically discard all FD-flagged data, or are these values manually reviewed? Clarifying this, possibly within the flowchart, is essential.
  Reply: The QC is based on First Differences (FD), but FD-flagged data are not discarded automatically. The flagged data points at the two observatories (CPL and HYB) are manually reviewed and only then removed.
  
  Comment-4: It would be beneficial to discuss the risks associated with automated data cleaning, such as inadvertently removing valid data, and how these risks are mitigated (e.g., threshold tuning, post-flagging review, cross-validation with secondary data streams).
  Reply: Thank you for raising this important point. Automated geomagnetic data cleaning does carry inherent risks, such as the unintended removal of valid signals or the misclassification of noise. To mitigate these risks, we employ several strategies:
  Threshold Tuning: The outlier-detection threshold of ±0.2 nT is the value defined by INTERMAGNET.
  
  Post-Flagging Review: Rather than being removed immediately, suspect data points are flagged. This allows for manual verification when necessary.
  
  Cross-Validation: Multi-station comparisons (CPL against HYB and HYB against CPL) will be used to confirm anomalies and ensure consistency.
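
The flag-then-review and cross-station ideas listed above can be sketched as follows. This is a hypothetical illustration, not the authors' code: a first-difference exceedance seen at only one station is marked for manual review as a possibly local disturbance, while one seen at both stations at the same time is marked as a likely widespread signal. The ±0.2 nT threshold is the value quoted in the reply; the label strings and function names are invented for the example. Nothing is deleted at this stage.

```python
def first_differences(values):
    """Difference between each sample and the one before it."""
    return [cur - prev for prev, cur in zip(values, values[1:])]


def cross_station_labels(cpl_values, hyb_values, threshold_nt=0.2):
    """Label each time step by comparing first differences at CPL and HYB.

    Both series are assumed to be time-aligned and of equal cadence.
    Returns one label per first difference; data are only labelled, never removed.
    """
    labels = []
    pairs = zip(first_differences(cpl_values), first_differences(hyb_values))
    for d_cpl, d_hyb in pairs:
        hit_cpl = abs(d_cpl) > threshold_nt
        hit_hyb = abs(d_hyb) > threshold_nt
        if hit_cpl and hit_hyb:
            labels.append("both stations: likely widespread signal")
        elif hit_cpl:
            labels.append("CPL only: review")
        elif hit_hyb:
            labels.append("HYB only: review")
        else:
            labels.append("ok")
    return labels


if __name__ == "__main__":
    cpl = [100.0, 100.05, 101.0, 101.02, 103.0]
    hyb = [200.0, 200.03, 200.05, 200.04, 202.1]
    for label in cross_station_labels(cpl, hyb):
        print(label)
```

A single-station label is deliberately neutral, since a local exceedance can be natural as well as anthropogenic; the decision stays with the reviewer.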
  
  Comment-5: Please maintain consistent notation for data cadence (e.g., "1 sec", "1 min") throughout the text.
  Reply: Modified the data cadence as suggested throughout the text.
  
  Comment-6: Are the Python scripts or software packages publicly available? If so, a GitHub or repository link would be appreciated.
  Reply: At present, the Python scripts and associated software packages are not publicly available, in adherence to directives from our Director and institutional policies governing data and software dissemination.
  However, we are committed to supporting the scientific community and would be glad to explore avenues of collaboration or provide specific assistance to observatories or institutions that may require it. Please feel free to reach out with any specific needs or proposals.
  
  Comment-7: The term “real-time” should be clearly defined. For example, does it mean a latency of less than 5 minutes?
  Reply: Yes. In this manuscript, “real-time” means a latency of less than 5 minutes (300 s).
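
Under that definition, a latency check reduces to comparing the timestamp of the newest sample with the time at which it arrived. A minimal sketch, assuming timezone-aware UTC timestamps and the 300 s limit quoted in the manuscript; the function names are hypothetical.

```python
from datetime import datetime, timezone

LATENCY_LIMIT_S = 300  # the 5 minute limit quoted in the reply


def latency_seconds(sample_time, received_time):
    """Seconds elapsed between a sample's timestamp and its arrival."""
    return (received_time - sample_time).total_seconds()


def is_real_time(sample_time, received_time, limit_s=LATENCY_LIMIT_S):
    """True if the sample arrived within the limit (and not before it was taken)."""
    return 0 <= latency_seconds(sample_time, received_time) < limit_s


if __name__ == "__main__":
    sample = datetime(2025, 4, 22, 10, 0, 0, tzinfo=timezone.utc)
    arrived = datetime(2025, 4, 22, 10, 4, 10, tzinfo=timezone.utc)
    print(latency_seconds(sample, arrived), is_real_time(sample, arrived))
```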
  
  Comment-8: The website shown in the plots appears to be inaccessible. If it is not publicly available, please state this explicitly in the manuscript and clarify whether public access is planned for the future.
  Reply: The website shown in the manuscript is currently inaccessible, but it is accessible within the institute. We plan to make this website available for public access in the future.
  
  Comment-9: Section 4 (Upgrading the PHP server to a Python server): It is unclear whether the migration to Python Django and Bokeh has already been completed, or if this remains part of future plans. Please clarify the implementation status.
  Reply: Yes, migration to Python Django and Bokeh has already been completed and implemented.
  
  Comment-10: L86: No need to redefine the abbreviation GIN, as it is already explained in L57.
  Reply: Removed the abbreviation GIN as suggested.
  
  Comment-11: L130: Same comment — GIN is already defined earlier.
  Reply: Removed the abbreviation GIN as suggested.
  
  Comment-12: L154: The phrase “weekday’s data” is ambiguous. Do you mean data from one day or from an entire week?
  Reply: The term “weekday data” refers to storing data from day one to day seven, or collecting data for one week. In the upgrading process, we enhanced the server's storage capacity to several months (L156).
  
  Comment-13: L200–202: The sentence is confusing and should be revised for clarity and word order.
  Reply: The term FD refers to the difference between consecutive values in a dataset. A few words have also been removed from the sentence to avoid confusion (L236-237).
  
  Comment-14: L226–227: The statement implies that 1-minute data is noisier than 1-second data. However, Figures 2 and 3 suggest the opposite. Please rephrase or clarify.
  Reply: Yes, Figures 2 and 3 suggest that 1 s data is noisier than 1 min data. Corrected the text as suggested (L263-264).
  
  Comment-15: L244–246: Presenting a long list of numerical values in the text is difficult to follow. Consider using a table for clarity.
  Reply: Included the table for clarity:
  
  Data        Observatory   D (nT)   H (nT)   Z (nT)   F (nT)
  FD: 1 min   CPL           ±0.5     ±1.5     ±0.3     ±1.5
  FD: 1 min   HYB           ±0.5     ±1.5     ±0.3     ±1.5
  FD: 1 s     CPL           ±0.1     ±0.1     ±0.5     ±0.5
  FD: 1 s     HYB           ±0.1     ±0.1     ±2       ±2
  
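
To illustrate how the tabulated limits could be applied, the sketch below computes first differences (each value minus the previous one) for a single component and flags those that exceed the limit for the given station and cadence. The dictionary layout and function name are hypothetical; the numbers are copied from the table above, and flagged points are only marked, in line with the reply to Comment-3.

```python
# First-difference limits in nT, copied from the table in the reply.
FD_LIMITS_NT = {
    ("CPL", "1min"): {"D": 0.5, "H": 1.5, "Z": 0.3, "F": 1.5},
    ("HYB", "1min"): {"D": 0.5, "H": 1.5, "Z": 0.3, "F": 1.5},
    ("CPL", "1s"): {"D": 0.1, "H": 0.1, "Z": 0.5, "F": 0.5},
    ("HYB", "1s"): {"D": 0.1, "H": 0.1, "Z": 2.0, "F": 2.0},
}


def flag_first_differences(values, station, cadence, component):
    """Return one flag per sample: True where |value - previous| exceeds the limit.

    The first sample has no predecessor and is never flagged.
    """
    limit = FD_LIMITS_NT[(station, cadence)][component]
    flags = [False]
    for prev, cur in zip(values, values[1:]):
        flags.append(abs(cur - prev) > limit)
    return flags


if __name__ == "__main__":
    z = [17000.0, 17000.2, 17001.0, 17001.1]
    print(flag_first_differences(z, "CPL", "1s", "Z"))
    print(flag_first_differences(z, "HYB", "1s", "Z"))
```

The same 0.8 nT step in Z is flagged under the CPL 1 s limit but not under the looser HYB 1 s limit, which shows why the limits are kept per station rather than shared.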
  Comment-16: L355: It would be helpful to describe how the thresholds used in QC are determined: are they fixed, empirical, or adaptively set?
  Reply: Included in the text: Lines: 397-410.
  Comment-17: L357–359: The sentence suggests that only specific devices can run the tool. Is this a hard requirement, or are these just tested and recommended devices? Please clarify.
  Reply: Thank you for your insightful question. The devices listed—such as Raspberry Pi, Omega2 LTE, Libre Computer Board Le Potato, and others—are representative examples of low-power, remote-deployable hardware platforms that align well with the intended application environment. Although the current Python-based tool has not undergone formal testing on these specific devices, its modular and platform-agnostic architecture allows for straightforward extension and adaptation to such hardware. We are confident that, with necessary environment-specific adjustments and dependency management, the tool can be effectively deployed on these and similar devices to facilitate real-time data acquisition and quality control in remote observatory settings.
  Comment-18: A brief performance benchmark (e.g., number of flagged points per day, false positive/negative rates if tested) could support claims about the system's effectiveness.
  Reply: An example is already illustrated in Figure 4. Another instance is presented below, where we observed a sudden increase in the amplitude of the vector components at CPL due to the impact of a lightning strike. Our tool recorded this event during the real-time quality control check, and it was not deleted. These natural anomalies will be flagged during the data processing phase. The same event is not observed in HYB.
  
  Citation: https://doi.org/10.5194/egusphere-2025-1587-CC2
- AC2: 'Reply on RC2', Pavan Kumar Vengala, 03 Jun 2025
  Reply to RC2 comments
  
  Comment-1: The authors cite a few related services used by other observatories (Khumutov et al., 2017; He et al., 2022; and MOSFiT by da Silva et al., 2023). Another relevant tool to consider is the MagPy package maintained by INTERMAGNET (https://github.com/geomagpy/magpy), which is widely used by observatories and provides similar features, including data plotting and automatic spike detection. A more precise comparison between the proposed system and existing tools (MagPy, MOSFiT, etc.) would strengthen the manuscript and clarify the novelty and advantages of the new system.
  Reply: Here’s a precise comparison between MagPy, MOSFiT, and real-time first differences in geomagnetism based on their applications, methodologies, and use cases in geomagnetic data analysis:
  MagPy is a Python library specifically designed for processing and analyzing geomagnetic data. It offers features such as data filtering, visualization, and quality control. MagPy provides tools for baseline correction and noise removal, as well as the ability to compute differences (e.g., between observatories or components). However, it is not specifically optimized for calculating real-time first differences.
  
  MOSFiT is another Python tool developed to investigate the secular variation (SV) of the Earth's geomagnetic field. It is useful for detecting geomagnetic jerks and assessing the quality of geomagnetic observatory data. MOSFiT works with data from any INTERMAGNET geomagnetic observatory that is sampled at one-minute intervals.
  
  In contrast, our tool, also based on a Python library, offers immediate insights into rapid changes in the geomagnetic field between consecutive measurements (e.g., ΔB/Δt) on a per-second basis. This tool has been under observation for the past few days to evaluate the performance of recording systems and data quality in real-time. It is designed to be simple and computationally efficient.
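
For illustration only: a first difference is the change between consecutive samples divided by the sampling interval. The minimal Python sketch below assumes NumPy and a hypothetical list of 1 s H-component values. It is not the authors' code, which is not publicly available, and the function name is invented for this example.

```python
import numpy as np

def first_differences(values, dt_seconds=1.0):
    """Return the change between consecutive samples per unit time (nT/s)."""
    b = np.asarray(values, dtype=float)
    return np.diff(b) / dt_seconds

# Hypothetical 1 s samples of the H component in nT (not real observatory data)
h_nt = [38950.10, 38950.12, 38950.11, 38951.90, 38950.13]
fd = first_differences(h_nt)
print(np.round(fd, 2))
```

A series of N samples yields N-1 first differences, so a single spike appears as two large values of opposite sign.
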
  Comment-2: It would be helpful to include a flowchart or schematic illustrating the processes involved in the automated QC and data transfer pipeline. This would aid the reader in understanding how the different components interact and when human intervention is required.
  Reply: A flowchart illustrating the processes involved in the automated QC and data transfer pipeline is now included as Figure 2 in the manuscript.
  Comment-3: To my understanding, the service is still under development. However, the manuscript should provide a more precise explanation of how the automatic QC process functions in practice. The current description mentions that the QC is based on First Differences (FD), but this approach can mistakenly flag legitimate geomagnetic variation (e.g. during storms) as noise. Do you automatically discard all FD-flagged data, or are these values manually reviewed? Clarifying this, possibly within the flowchart, is essential.
  Reply: The QC is based on First Differences (FD), but this approach will not automatically discard all the FD-flagged data. The flagged data points between the observatories (CPL and HYB) are manually reviewed and then removed.
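
The flag-then-review workflow described in this reply can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function name and sample values are hypothetical, and the 0.5 nT limit is the 1 s Z value for CPL taken from the table in the reply to Comment-15. The key point is that the data array is never modified; only a boolean mask is produced for a human to inspect.

```python
import numpy as np

def flag_by_first_difference(values, threshold_nt):
    """Return a boolean mask marking samples that end a step larger than the threshold.

    The input values are left untouched; flagged samples are only candidates
    for manual review, not deletions.
    """
    b = np.asarray(values, dtype=float)
    exceeds = np.abs(np.diff(b)) > threshold_nt
    flags = np.zeros(b.shape, dtype=bool)
    flags[1:] = exceeds  # mark the sample at the end of each large step
    return flags

# Hypothetical 1 s Z-component samples in nT with a one-sample spike
z_nt = np.array([100.0, 100.1, 103.0, 100.2, 100.1])
flags = flag_by_first_difference(z_nt, threshold_nt=0.5)
review_indices = np.flatnonzero(flags)  # indices passed on for manual review
```

Both the spike and the sample that follows it are flagged, because the return to the normal level is itself a large step. A reviewer would then decide which of the two is genuinely suspect.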
  
  Comment-4: It would be beneficial to discuss the risks associated with automated data cleaning, such as inadvertently removing valid data, and how these risks are mitigated (e.g., threshold tuning, post-flagging review, cross-validation with secondary data streams).
  Reply: Thank you for raising this important point. Automated geomagnetic data cleaning does carry inherent risks, such as the unintended removal of valid signals or the misclassification of noise. To mitigate these risks, we employ several strategies:
  Threshold Tuning: The outlier-detection threshold of ±0.2 nT is defined by INTERMAGNET.
  
  Post-Flagging Review: Rather than being removed immediately, suspect data points are flagged. This allows for manual verification when necessary.
  
  Cross-Validation: Multi-station comparisons (CPL against HYB, and HYB against CPL) will be used to confirm anomalies and ensure consistency.
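
One possible way to express such a two-station comparison is sketched below. It is a hypothetical illustration, not the authors' procedure: it assumes each station already has a boolean flag series on a common time base, and it uses an invented tolerance of one sample to absorb small timing offsets. Flags that coincide at both stations are separated from flags seen at only one.

```python
import numpy as np

def split_by_other_station(flags_a, flags_b, tolerance=1):
    """Split station A's flags into those matched at station B and those seen only at A.

    A flag at A counts as matched if B has a flag within +/- tolerance samples.
    """
    a = np.asarray(flags_a, dtype=bool)
    b = np.asarray(flags_b, dtype=bool)
    # Widen B's flags by the tolerance so slightly offset events still match
    kernel = np.ones(2 * tolerance + 1, dtype=int)
    b_near = np.convolve(b.astype(int), kernel, mode="same") > 0
    return a & b_near, a & ~b_near

# Hypothetical flag series for the same time window at the two observatories
cpl_flags = [False, False, True, False, False, True, False]
hyb_flags = [False, False, False, True, False, False, False]
common, cpl_only = split_by_other_station(cpl_flags, hyb_flags)
```

An event seen at only one station is not necessarily artificial: the lightning strike mentioned in the reply to Comment-18 was recorded at CPL but not at HYB. A split like this can therefore only prioritise points for manual review, not classify them.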
  
  Comment-5: Please maintain consistent notation for data cadence (e.g., "1 sec", "1 min") throughout the text.
  Reply: Modified the data cadence as suggested throughout the text.
  
  Comment-6: -Are the Python scripts or software packages publicly available? If so, a GitHub or repository link would be appreciated.
  Reply: At present, the Python scripts and associated software packages are not publicly available, in adherence to directives from our Director and institutional policies governing data and software dissemination.
  However, we are committed to supporting the scientific community and would be glad to explore avenues of collaboration or provide specific assistance to observatories or institutions that may require it. Please feel free to reach out with any specific needs or proposals.
  
  Comment-7: The term “real-time” should be clearly defined. For example, does it mean a latency of less than 5 minutes?
  Reply: Yes, "real-time" here means a latency of less than 5 minutes.
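
This definition can be written as a simple check. The sketch below is illustrative only: it assumes latency is measured from the timestamp of the most recent sample to the moment it is received, and the function and constant names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# "Real-time" as defined in the reply: latency of less than 5 minutes (300 s)
MAX_LATENCY = timedelta(seconds=300)

def is_real_time(last_sample_utc, received_utc, max_latency=MAX_LATENCY):
    """Return True if the delay between sampling and reception is below the limit."""
    return (received_utc - last_sample_utc) < max_latency

# Hypothetical timestamps: a sample received 120 s after it was recorded
sampled = datetime(2025, 5, 1, 0, 0, 0, tzinfo=timezone.utc)
received = sampled + timedelta(seconds=120)
within_limit = is_real_time(sampled, received)
```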
  
  Comment-8: The website shown in the plots appears to be inaccessible. If it is not publicly available, please state this explicitly in the manuscript and clarify whether public access is planned for the future.
  Reply: The website shown in the manuscript is currently inaccessible, but it is accessible within the institute. We plan to make this website available for public access in the future.
  
  Comment-9: Section 4 (Upgrading the PHP server to a Python server): It is unclear whether the migration to Python Django and Bokeh has already been completed, or if this remains part of future plans. Please clarify the implementation status.
  Reply: Yes, migration to Python Django and Bokeh has already been completed and implemented.
  
  Comment-10: L86: No need to redefine the abbreviation GIN, as it is already explained in L57.
  Reply: Removed the abbreviation GIN as suggested.
  
  Comment-11: L130: Same comment — GIN is already defined earlier.
  Reply: Removed the abbreviation GIN as suggested.
  
  Comment-12: L154: The phrase “weekday’s data” is ambiguous. Do you mean data from one day or from an entire week?
  Reply: The term “weekday data” refers to storing data from day one to day seven, or collecting data for one week. In the upgrading process, we enhanced the server's storage capacity to several months (L156).
  
  Comment-13: L200–202: The sentence is confusing and should be revised for clarity and word order.
  Reply: The term FD refers to the difference between consecutive values in a dataset. A few words have also been removed from the sentence to avoid confusion (L236-237).
  
  Comment-14: L226–227: The statement implies that 1-minute data is noisier than 1-second data. However, Figures 2 and 3 suggest the opposite. Please rephrase or clarify.
  Reply: Yes, Figures 2 and 3 suggest that 1s data is noisier than 1 min data. Corrected the text as suggested (L263-264).
  
  Comment-15: L244–246: Presenting a long list of numerical values in the text is difficult to follow. Consider using a table for clarity.
  Reply: Included the table for clarity.

  FD cadence  Station  D (nT)  H (nT)  Z (nT)  F (nT)
  1 min       CPL      ±0.5    ±1.5    ±0.3    ±1.5
  1 min       HYB      ±0.5    ±1.5    ±0.3    ±1.5
  1 s         CPL      ±0.1    ±0.1    ±0.5    ±0.5
  1 s         HYB      ±0.1    ±0.1    ±2      ±2
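
As an illustration of how such limits might be held in software, the sketch below stores the tabulated values in a lookup keyed by cadence and station. The dictionary layout and function name are assumptions made for this example and do not describe the authors' tool. The numbers are copied from the table above.

```python
# First-difference limits in nT, copied from the table in this reply
FD_LIMITS_NT = {
    ("1min", "CPL"): {"D": 0.5, "H": 1.5, "Z": 0.3, "F": 1.5},
    ("1min", "HYB"): {"D": 0.5, "H": 1.5, "Z": 0.3, "F": 1.5},
    ("1s", "CPL"): {"D": 0.1, "H": 0.1, "Z": 0.5, "F": 0.5},
    ("1s", "HYB"): {"D": 0.1, "H": 0.1, "Z": 2.0, "F": 2.0},
}

def exceeds_limit(cadence, station, component, first_difference_nt):
    """Return True if the absolute first difference is above the tabulated limit."""
    limit = FD_LIMITS_NT[(cadence, station)][component]
    return abs(first_difference_nt) > limit

# The same 0.8 nT step in 1 s Z data is outside the CPL limit but inside the HYB limit
cpl_exceeds = exceeds_limit("1s", "CPL", "Z", 0.8)
hyb_exceeds = exceeds_limit("1s", "HYB", "Z", 0.8)
```
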
  Comment-16: L355: It would be helpful to describe how the thresholds used in QC are determined: are they fixed, empirical, or adaptively set?
  Reply: Included in the text: Lines: 397-410.
  Comment-17: L357–359: The sentence suggests that only specific devices can run the tool. Is this a hard requirement, or are these just tested and recommended devices? Please clarify.
  Reply: Thank you for your insightful question. The devices listed—such as Raspberry Pi, Omega2 LTE, Libre Computer Board Le Potato, and others—are representative examples of low-power, remote-deployable hardware platforms that align well with the intended application environment. Although the current Python-based tool has not undergone formal testing on these specific devices, its modular and platform-agnostic architecture allows for straightforward extension and adaptation to such hardware. We are confident that, with necessary environment-specific adjustments and dependency management, the tool can be effectively deployed on these and similar devices to facilitate real-time data acquisition and quality control in remote observatory settings.
  Comment-18: A brief performance benchmark (e.g., number of flagged points per day, false positive/negative rates if tested) could support claims about the system's effectiveness.
  Reply: An example is already illustrated in Figure 4. Another instance is presented below, where we observed a sudden increase in the amplitude of the vector components at CPL due to the impact of a lightning strike. Our tool recorded this event during the real-time quality control check, and it was not deleted. These natural anomalies will be flagged during the data processing phase. The same event is not observed in HYB.
  
  Citation: https://doi.org/10.5194/egusphere-2025-1587-AC2
EC1: 'Comment on egusphere-2025-1587 by Editor', Anne Neska, 23 May 2025

Dear Authors and Reviewers,
Regarding our common effort in the editorial process of the manuscript "Real-time plotting and evaluation of the data quality control from the CSIR- NGRI Magnetic observatories", I would like to thank both Reviewers for their thorough work in providing very qualified and specific comments, and the Authors for their prompt reply. As far as I can see, we all agree that the manuscript is a valuable contribution to the field of how to control the quality of geomagnetic observatory data and make them accessible, and that it can be published after minor revision. The Authors have already responded to most comments in a constructive way and have promised improvements in a revised version, but there is one point that requires clarification from my side. It concerns comment no. 2 of Reviewer #1, who raised very well-documented doubts about the claim that observatories CPL and HYB are among the first to deliver data within 300 s. In their reply, the Authors explain that this "among the first" does not refer to real-time delivery as such but to delivery by means of a certain technology. This of course has to be specified in the revised text (the Authors do not indicate that they would do this); I will pay attention to this point.
Of course, Reviewers may also respond to any part of the Authors' reply at any time.
We are looking forward to reading a revised version, to be uploaded by the Authors by May 28th. If you need more time or encounter problems with uploading, please let me know.
Best wishes and regards,
Anne Neska
Handling Editor

Citation: https://doi.org/10.5194/egusphere-2025-1587-EC1
EC2: 'Comment on egusphere-2025-1587 - Addition by Editor', Anne Neska, 23 May 2025

I noticed just now that there is a revised version already; it is somewhat hidden as an attachment to the Authors' reply. So, dear Reviewers, let us have a look at it and come back to it before May 28. My apologies for the confusion.

Citation: https://doi.org/10.5194/egusphere-2025-1587-EC2

Interactive discussion

Status: closed

RC1:
'Comment on egusphere-2025-1587', Anonymous Referee #1, 08 May 2025

GENERAL COMMENTS
The manuscript presents a solution proposed by the authors for real-time transmission of magnetic data to the INTERMAGNET network. This solution has been implemented at the Indian geomagnetic observatories CSIR-NGRI in Choutuppal (CPL) and Hyderabad (HYB).
The authors have developed Python-based tools for real-time visualisation and quality control of one-second and one-minute geomagnetic data. The initial automated data control focuses mainly on analysing "first differences". Implementing the "first difference" method effectively detects and flags anthropogenic disturbances, making it easier to remove these disturbances before publishing Quasi-Definitive data. However the article does not mention the commonly used data control method based on the Fv-Fs difference. It is unclear whether this method is not used at the HYB and CPL observatories. If it is used, it should be mentioned in the manuscript; if not, the authors should explain why it was omitted.
The paper also describes the transition from a PHP server to a more advanced Django+Bokeh environment, which allows for better data management, AI/ML integration, and more efficient data processing, visualization, and forecasting of geomagnetic phenomena.
The manuscript represents a valuable technical and methodological contribution that will certainly be of interest to the geomagnetic community—both researchers and observatory operators. I believe that after making the corrections described in the Specific Comments and Technical Corrections sections, this manuscript is worth publishing.

SPECIFIC COMMENTS
Lines 13, 106: The Intermagnet Manual (https://tech-man.intermagnet.org/stable/chapters/submitdata/introduction.html), chapter 6.4.1, states goals for near real-time performance: 30s for 1s-data, 60s for 1m-data. About 50% of observatories provide 1m data with a delay <= 5 minutes. The relevant statistics are available at https://imag-data.bgs.ac.uk/GIN_V1/GINStatistics. The claim “achieving a latency of under 300 seconds and being one of the first observatories worldwide” is a bit debatable.

Lines 89-91: Although these satellites were used in the past, the preferred way to send data to the GINs now is through the internet, and satellite channels are only used as a backup option.
Line 132: “300 seconds (5 minutes)” – (5 minutes) is not necessary
Line 423: It is worth adding the DOI (https://doi.org/10.1007/s00024-023-03333-8)
Line 432: It is worth adding the DOI (https://doi.org/10.5194/gi-6-329-2017)
Line 449: It is worth adding the DOI (https://doi.org/10.4401/ag-4572)
Many lines: It would be good to make the data names consistent throughout the manuscript. Now, the one-second data is called both '1s' and '1-second'. The same issue concerns one-minute data.

TECHNICAL CORRECTIONS
Lines 33, 46, 52, 95, 96, 97, 339: year should be preceded by a comma
Lines 98, 140: publication year should be rather 2022, Nelaptla (typo error)
Line 673: Khumutov (typo error)
Lines 156, 214, 220, 230, 234, 258: All figures are low quality. For instance, the axis labels are difficult to read.

Citation: https://doi.org/10.5194/egusphere-2025-1587-RC1
- CC1: 'Reply on RC1', Nelapatla Phani Chandrasekhar, 21 May 2025
  
  Reply to RC1 comments
  
  Comment-1: -The authors have developed Python-based tools for real-time visualisation and quality control of one-second and one-minute geomagnetic data. The initial automated data control focuses mainly on analysing "first differences". Implementing the "first difference" method effectively detects and flags anthropogenic disturbances, making it easier to remove these disturbances before publishing Quasi-Definitive data. However the article does not mention the commonly used data control method based on the Fv-Fs difference. It is unclear whether this method is not used at the HYB and CPL observatories. If it is used, it should be mentioned in the manuscript; if not, the authors should explain why it was omitted.
  
  Reply: We thank the reviewer for highlighting the omission of the Fv-Fs difference method in our manuscript and for seeking clarification on its use at the HYB and CPL observatories. We acknowledge the importance of this widely recognized data control technique and confirm that it is actively employed at both observatories to ensure high-quality geomagnetic data. Lines: 397-410.
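
For readers unfamiliar with the Fv-Fs check, a minimal sketch follows. It assumes the common convention that Fv is the total field computed from the vector components (here H and Z, in nT) and Fs is the total field from an independent scalar magnetometer. The function name and sample values are hypothetical, and the reply does not describe how the check is actually implemented at CPL and HYB.

```python
import numpy as np

def fv_minus_fs(h_nt, z_nt, f_scalar_nt):
    """Return Fv - Fs, where Fv = sqrt(H^2 + Z^2) is the vector-derived total field."""
    fv = np.hypot(np.asarray(h_nt, dtype=float), np.asarray(z_nt, dtype=float))
    return fv - np.asarray(f_scalar_nt, dtype=float)

# Hypothetical values in nT: the second sample shows a 2 nT mismatch
h_nt = [30000.0, 30000.0]
z_nt = [40000.0, 40000.0]
fs_nt = [50000.0, 49998.0]
delta_f = fv_minus_fs(h_nt, z_nt, fs_nt)
```

A difference that stays close to a constant suggests the vector and scalar instruments agree. A sudden change points to a problem in one of them rather than to a natural field variation, which would affect both equally.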
  
  Comment-2: Lines 13, 106, Intermagnet Manual https://tech-man.intermagnet.org/stable/chapters/submitdata/introduction.html chapter 6.4.1 says about goals. For near real-time performance: 30s for 1s-data, 60s for 1m-data. About 50% of observatories provide 1m data with a delay <= 5 minutes. The relevant statistics are available at https://imag-data.bgs.ac.uk/GIN_V1/GINStatistics. The claim “achieving a latency of under 300 seconds and being one of the first observatories worldwide” is a bit debatable.
  Reply: As suggested, we have gone through the manual and the link providing the statistics on the real-time data transmission. The following are our observations:
  Other observatories, such as BEL, HEL, and Hornsund in Poland, achieve a latency of around 5 minutes by utilizing VPN routers and backup servers, indicating the use of robust data transmission infrastructure. In comparison, NRCan, BGS, and ASP observatories employ satellite links and high-performance servers, reflecting their adoption of more advanced communication systems suitable for their operational requirements.
  On the other hand, the CPL, HYB, and TTB observatories seem to be better candidates based on the available evidence. These magnetic observatories transmit live data to INTERMAGNET GIN with a latency of 5 minutes or less on a daily basis, using low-cost, low-resource setups that do not require heavy infrastructure. While 50% of observatories provide 1-minute data with a delay of 5 minutes or less, the time frame varies from a minimum of 2 minutes to a maximum of several thousand minutes. However, detailed information about the infrastructure used for data transmission is lacking. Therefore, it is noted that CPL and HYB utilize lightweight data transfer systems using Python, rsync, and broadband service along with basic desktop computers. Meanwhile, TTB employs a minimal Raspberry Pi setup, which fulfills all necessary requirements for a cost-effective data transmission system.
  
  Comment-3: Lines 89-91, Although these satellites were used in the past, the preferred way to send data to the GINs now is through the internet, and satellite channels are only used as a backup option.
  Reply: Included in the text as suggested. Lines: 91-93.
  
  Comment-4: Line 132, “300 seconds (5 minutes)” – (5 minutes) is not necessary
  
  Reply: Modified 5 mins as 300s throughout the manuscript as suggested.
  
  Comment-5: Line 423, It is worth adding the DOI (https://doi.org/10.1007/s00024-023-03333-8)
  Reply: Included DOI as suggested
  Comment-6: Line 432, it is worth adding the DOI (https://doi.org/10.5194/gi-6-329-2017)
  Reply: Included DOI as suggested
  Comment-7: Line 449, It is worth adding the DOI (https://doi.org/10.4401/ag-4572)
  Reply: Included DOI as suggested
  Comment-8: Many lines, it would be good to make the data names consistent throughout the manuscript. Now, the one-second data is called both '1s' and '1-second'. The same issue concerns one-minute data.
  Reply: Modified the data names throughout the manuscript as suggested.
  Comment-9: Lines 33, 46, 52, 95, 96, 97, 339, year should be preceded by a comma
  Reply: Included comma as suggested
  Comment-10: Lines 98, 140, publication year should be rather 2022, Nelaptla (typo error)
  Reply: Corrected the typo error as suggested
  Comment-11: Line 673, Khumutov (typo error)
  Reply: Corrected the typo error as suggested
  Comment-12: Lines 156, 214, 220, 230, 234, 258. All figures are low quality. For instance, the axis labels are difficult to read.
  Reply: Improved the resolution of the figures and font size increased for axis labels as suggested.
  
  Citation: https://doi.org/10.5194/egusphere-2025-1587-CC1
- AC1: 'Reply on RC1', Pavan Kumar Vengala, 03 Jun 2025
  
  Reply to RC1 comments
  
  Comment-1: -The authors have developed Python-based tools for real-time visualisation and quality control of one-second and one-minute geomagnetic data. The initial automated data control focuses mainly on analysing "first differences". Implementing the "first difference" method effectively detects and flags anthropogenic disturbances, making it easier to remove these disturbances before publishing Quasi-Definitive data. However the article does not mention the commonly used data control method based on the Fv-Fs difference. It is unclear whether this method is not used at the HYB and CPL observatories. If it is used, it should be mentioned in the manuscript; if not, the authors should explain why it was omitted.
  
  Reply: We thank the reviewer for highlighting the omission of the Fv-Fs difference method in our manuscript and for seeking clarification on its use at the HYB and CPL observatories. We acknowledge the importance of this widely recognized data control technique and confirm that it is actively employed at both observatories to ensure high-quality geomagnetic data. Lines: 397-410.
  
  Comment-2: Lines 13, 106, Intermagnet Manual https://tech-man.intermagnet.org/stable/chapters/submitdata/introduction.html chapter 6.4.1 says about goals. For near real-time performance: 30s for 1s-data, 60s for 1m-data. About 50% of observatories provide 1m data with a delay <= 5 minutes. The relevant statistics are available at https://imag-data.bgs.ac.uk/GIN_V1/GINStatistics. The claim “achieving a latency of under 300 seconds and being one of the first observatories worldwide” is a bit debatable.
  
  Reply: As suggested, we have gone through the manual and the link providing the statistics on the real-time data transmission. The following are our observations:
  
  Other observatories, such as BEL, HEL, and Hornsund in Poland, achieve a latency of around 5 minutes by utilizing VPN routers and backup servers, indicating the use of robust data transmission infrastructure. In comparison, NRCan, BGS, and ASP observatories employ satellite links and high-performance servers, reflecting their adoption of more advanced communication systems suitable for their operational requirements.
  
  On the other hand, the CPL, HYB, and TTB observatories seem to be better candidates based on the available evidence. These magnetic observatories transmit live data to INTERMAGNET GIN with a latency of 5 minutes or less on a daily basis, using low-cost, low-resource setups that do not require heavy infrastructure. While 50% of observatories provide 1-minute data with a delay of 5 minutes or less, the time frame varies from a minimum of 2 minutes to a maximum of several thousand minutes. However, detailed information about the infrastructure used for data transmission is lacking. Therefore, it is noted that CPL and HYB utilize lightweight data transfer systems using Python, rsync, and broadband service along with basic desktop computers. Meanwhile, TTB employs a minimal Raspberry Pi setup, which fulfills all necessary requirements for a cost-effective data transmission system.
  
  Comment-3: Lines 89-91, Although these satellites were used in the past, the preferred way to send data to the GINs now is through the internet, and satellite channels are only used as a backup option.
  
  Reply: Included in the text as suggested. Lines: 91-93.
  
  Comment-4: Line 132, “300 seconds (5 minutes)” – (5 minutes) is not necessary
  
  Reply: Modified 5 mins as 300s throughout the manuscript as suggested.
  
  Comment-5: Line 423, It is worth adding the DOI (https://doi.org/10.1007/s00024-023-03333-8)
  
  Reply: Included DOI as suggested
  
  Comment-6: Line 432, it is worth adding the DOI (https://doi.org/10.5194/gi-6-329-2017)
  
  Reply: Included DOI as suggested
  
  Comment-7: Line 449, It is worth adding the DOI (https://doi.org/10.4401/ag-4572)
  
  Reply: Included DOI as suggested
  
  Comment-8: Many lines, it would be good to make the data names consistent throughout the manuscript. Now, the one-second data is called both '1s' and '1-second'. The same issue concerns one-minute data.
  
  Reply: Modified the data names throughout the manuscript as suggested.
  
  Comment-9: Lines 33, 46, 52, 95, 96, 97, 339, year should be preceded by a comma
  
  Reply: Included comma as suggested
  
  Comment-10: Lines 98, 140, publication year should be rather 2022, Nelaptla (typo error)
  
  Reply: Corrected the typo error as suggested
  
  Comment-11: Line 673, Khumutov (typo error)
  
  Reply: Corrected the typo error as suggested
  
  Comment-12: Lines 156, 214, 220, 230, 234, 258. All figures are low quality. For instance, the axis labels are difficult to read.
  
  Reply: Improved the resolution of the figures and font size increased for axis labels as suggested.
  
  Citation: https://doi.org/10.5194/egusphere-2025-1587-AC1
RC2:
'Comment on egusphere-2025-1587', Anonymous Referee #2, 16 May 2025

The present manuscript describes a newly developed tool for visualizing magnetometer data and automating quality control (QC). The authors also detail the process of near real-time data dissemination to the INTERMAGNET data hub and mention future plans to expand their service with machine learning and AI-based capabilities.
Developing automated QC methods is indeed a valuable contribution, as manual quality checks can be time-consuming and may delay data dissemination. I find the authors' work important, but certain sections of the manuscript would benefit from further elaboration and clarification.

Major Comments
-The authors cite a few related services used by other observatories (Khumutov et al., 2017; He et al., 2022; and MOSFiT by da Silva et al., 2023). Another relevant tool to consider is the MagPy package maintained by INTERMAGNET (https://github.com/geomagpy/magpy), which is widely used by observatories and provides similar features, including data plotting and automatic spike detection. A more precise comparison between the proposed system and existing tools (MagPy, MOSFiT, etc.) would strengthen the manuscript and clarify the novelty and advantages of the new system.
-It would be helpful to include a flowchart or schematic illustrating the processes involved in the automated QC and data transfer pipeline. This would aid the reader in understanding how the different components interact and when human intervention is required.
-To my understanding, the service is still under development. However, the manuscript should provide a more precise explanation of how the automatic QC process functions in practice. The current description mentions that the QC is based on First Differences (FD), but this approach can mistakenly flag legitimate geomagnetic variation (e.g. during storms) as noise. Do you automatically discard all FD-flagged data, or are these values manually reviewed? Clarifying this — possibly within the flowchart — is essential.
-It would be beneficial to discuss the risks associated with automated data cleaning, such as inadvertently removing valid data, and how these risks are mitigated (e.g., threshold tuning, post-flagging review, cross-validation with secondary data streams).
Minor Comments / Technical Corrections
-Please maintain consistent notation for data cadence (e.g., "1 sec", "1 min") throughout the text.
-Are the Python scripts or software packages publicly available? If so, a GitHub or repository link would be appreciated.
-The term “real-time” should be clearly defined. For example, does it mean a latency of less than 5 minutes?
-The website shown in the plots appears to be inaccessible. If it is not publicly available, please state this explicitly in the manuscript and clarify whether public access is planned for the future.
-Section 4 (Upgrading the PHP server to a Python server): It is unclear whether the migration to Python Django and Bokeh has already been completed, or if this remains part of future plans. Please clarify the implementation status.
-L86: No need to redefine the abbreviation GIN, as it is already explained in L57.
-L130: Same comment — GIN is already defined earlier.
-L154: The phrase “weekday’s data” is ambiguous. Do you mean data from one day or from an entire week?
-L200–202: The sentence is confusing and should be revised for clarity and word order.
-L226–227: The statement implies that 1-minute data is noisier than 1-second data. However, Figures 2 and 3 suggest the opposite. Please rephrase or clarify.
-L244–246: Presenting a long list of numerical values in the text is difficult to follow. Consider using a table for clarity.
-L355: It would be helpful to describe how the thresholds used in QC are determined — are they fixed, empirical, or adaptively set?
-L357–359: The sentence suggests that only specific devices can run the tool. Is this a hard requirement, or are these just tested and recommended devices? Please clarify.

-A brief performance benchmark (e.g., number of flagged points per day, false positive/negative rates if tested) could support claims about the system's effectiveness.

Citation: https://doi.org/10.5194/egusphere-2025-1587-RC2
- CC2:
  'Reply on RC2', Nelapatla Phani Chandrasekhar, 21 May 2025
  Reply to RC2 comments
  
  Comment-1: -The authors cite a few related services used by other observatories (Khumutov et al., 2017; He et al., 2022; and MOSFiT by da Silva et al., 2023). Another relevant tool to consider is the MagPy package maintained by INTERMAGNET (https://github.com/geomagpy/magpy), which is widely used by observatories and provides similar features, including data plotting and automatic spike detection. A more precise comparison between the proposed system and existing tools (MagPy, MOSFiT, etc.) would strengthen the manuscript and clarify the novelty and advantages of the new system.
  Reply: Here’s a precise comparison between MagPy, MOSFiT, and real-time first differences in geomagnetism based on their applications, methodologies, and use cases in geomagnetic data analysis:
  MagPy is a Python library specifically designed for processing and analyzing geomagnetic data. It offers features such as data filtering, visualization, and quality control. MagPy provides tools for baseline correction and noise removal, as well as the ability to compute differences (e.g., between observatories or components). However, it is not specifically optimized for calculating real-time first differences.
  
  MOSFiT is another Python tool developed to investigate the secular variation (SV) of the Earth's geomagnetic field. It is useful for detecting geomagnetic jerks and assessing the quality of geomagnetic observatory data. MOSFiT works with data from any INTERMAGNET geomagnetic observatory that is sampled at one-minute intervals.
  
  In contrast, our tool, also based on a Python library, offers immediate insights into rapid changes in the geomagnetic field between consecutive measurements (e.g., ΔB/Δt) on a per-second basis. This tool has been under observation for the past few days to evaluate the performance of recording systems and data quality in real-time. It is designed to be simple and computationally efficient.
  Comment-2: It would be helpful to include a flowchart or schematic illustrating the processes involved in the automated QC and data transfer pipeline. This would aid the reader in understanding how the different components interact and when human intervention is required.
  Reply: A flowchart illustrating the processes involved in the automated QC and data transfer pipeline is now included as Figure 2 in the manuscript.
  Comment-3: To my understanding, the service is still under development. However, the manuscript should provide a more precise explanation of how the automatic QC process functions in practice. The current description mentions that the QC is based on First Differences (FD), but this approach can mistakenly flag legitimate geomagnetic variation (e.g. during storms) as noise. Do you automatically discard all FD-flagged data, or are these values manually reviewed? Clarifying this, possibly within the flowchart, is essential.
  Reply: The QC is based on First Differences (FD), but this approach will not automatically discard all the FD-flagged data. The flagged data points between the observatories (CPL and HYB) are manually reviewed and then removed.
  
  Comment-4: It would be beneficial to discuss the risks associated with automated data cleaning, such as inadvertently removing valid data, and how these risks are mitigated (e.g., threshold tuning, post-flagging review, cross-validation with secondary data streams).
  Reply: Thank you for raising this important point. Automated geomagnetic data cleaning does carry inherent risks, such as the unintended removal of valid signals or the misclassification of noise. To mitigate these risks, we employ several strategies:
  Threshold Tuning: The outlier-detection threshold of ±0.2 nT is defined by INTERMAGNET.
  
  Post-Flagging Review: Rather than being removed immediately, suspect data points are flagged. This allows for manual verification when necessary.
  
  Cross-Validation: Multi-station comparisons (CPL against HYB, and HYB against CPL) will be used to confirm anomalies and ensure consistency.
  
  Comment-5: Please maintain consistent notation for data cadence (e.g., "1 sec", "1 min") throughout the text.
  Reply: Modified the data cadence as suggested throughout the text.
  
  Comment-6: -Are the Python scripts or software packages publicly available? If so, a GitHub or repository link would be appreciated.
  Reply: At present, the Python scripts and associated software packages are not publicly available, in adherence to directives from our Director and institutional policies governing data and software dissemination.
  However, we are committed to supporting the scientific community and would be glad to explore avenues of collaboration or provide specific assistance to observatories or institutions that may require it. Please feel free to reach out with any specific needs or proposals.
  
  Comment-7: The term “real-time” should be clearly defined. For example, does it mean a latency of less than 5 minutes?
  Reply: Yes, "real-time" here means a latency of less than 5 minutes.
  
  Comment-8: The website shown in the plots appears to be inaccessible. If it is not publicly available, please state this explicitly in the manuscript and clarify whether public access is planned for the future.
  Reply: The website shown in the manuscript is currently inaccessible, but it is accessible within the institute. We plan to make this website available for public access in the future.
  
  Comment-9: Section 4 (Upgrading the PHP server to a Python server): It is unclear whether the migration to Python Django and Bokeh has already been completed, or if this remains part of future plans. Please clarify the implementation status.
  Reply: Yes, migration to Python Django and Bokeh has already been completed and implemented.
  
  Comment-10: L86: No need to redefine the abbreviation GIN, as it is already explained in L57.
  Reply: Removed the abbreviation GIN as suggested.
  
  Comment-11: L130: Same comment — GIN is already defined earlier.
  Reply: Removed the abbreviation GIN as suggested.
  
  Comment-12: L154: The phrase “weekday’s data” is ambiguous. Do you mean data from one day or from an entire week?
  Reply: The term “weekday’s data” refers to data stored from day one to day seven, i.e., one week of data. During the upgrade, we increased the server's storage capacity to several months (L156).
  
  Comment-13: L200–202: The sentence is confusing and should be revised for clarity and word order.
  Reply: The term FD refers to the difference between consecutive values in a dataset. A few words have also been removed from the sentence to avoid confusion (L236-237).
  
  Comment-14: L226–227: The statement implies that 1-minute data is noisier than 1-second data. However, Figures 2 and 3 suggest the opposite. Please rephrase or clarify.
  Reply: Yes, Figures 2 and 3 suggest that 1 s data is noisier than 1 min data. The text has been corrected as suggested (L263-264).
  
  Comment-15: L244–246: Presenting a long list of numerical values in the text is difficult to follow. Consider using a table for clarity.
  Reply: A table has been included for clarity:
  FD cadence	Observatory	D (nT)	H (nT)	Z (nT)	F (nT)
  1 min	CPL	±0.5	±1.5	±0.3	±1.5
  1 min	HYB	±0.5	±1.5	±0.3	±1.5
  1 s	CPL	±0.1	±0.1	±0.5	±0.5
  1 s	HYB	±0.1	±0.1	±2	±2
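The values in the table above could be held in a small lookup structure so that each component is checked against its own FD range. The sketch below is only an illustration: treating the tabulated ranges as flagging limits is an assumption, and the names `FD_LIMITS_NT` and `exceeds_limit` are hypothetical.

```python
# FD ranges (nT) copied from the table; using them as flagging limits is an assumption.
FD_LIMITS_NT = {
    ("1min", "CPL"): {"D": 0.5, "H": 1.5, "Z": 0.3, "F": 1.5},
    ("1min", "HYB"): {"D": 0.5, "H": 1.5, "Z": 0.3, "F": 1.5},
    ("1s", "CPL"): {"D": 0.1, "H": 0.1, "Z": 0.5, "F": 0.5},
    ("1s", "HYB"): {"D": 0.1, "H": 0.1, "Z": 2.0, "F": 2.0},
}


def exceeds_limit(cadence, station, component, previous, current):
    """Return True if the first difference of one component lies outside its tabulated range."""
    limit = FD_LIMITS_NT[(cadence, station)][component]
    return abs(current - previous) > limit


# The same 0.7 nT step in Z at 1 s cadence is outside the CPL range but inside the HYB range.
print(exceeds_limit("1s", "CPL", "Z", 100.0, 100.7))  # True
print(exceeds_limit("1s", "HYB", "Z", 100.0, 100.7))  # False
```

Keeping the limits per observatory and per cadence reflects the table, where the 1 s ranges for Z and F differ between CPL and HYB.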
  Comment-16: L355: It would be helpful to describe how the thresholds used in QC are determined: are they fixed, empirical, or adaptively set?
  Reply: Included in the text: Lines: 397-410.
  Comment-17: L357–359: The sentence suggests that only specific devices can run the tool. Is this a hard requirement, or are these just tested and recommended devices? Please clarify.
  Reply: Thank you for your insightful question. The devices listed—such as Raspberry Pi, Omega2 LTE, Libre Computer Board Le Potato, and others—are representative examples of low-power, remote-deployable hardware platforms that align well with the intended application environment. Although the current Python-based tool has not undergone formal testing on these specific devices, its modular and platform-agnostic architecture allows for straightforward extension and adaptation to such hardware. We are confident that, with necessary environment-specific adjustments and dependency management, the tool can be effectively deployed on these and similar devices to facilitate real-time data acquisition and quality control in remote observatory settings.
  Comment-18: A brief performance benchmark (e.g., number of flagged points per day, false positive/negative rates if tested) could support claims about the system's effectiveness.
  Reply: An example is already illustrated in Figure 4. Another instance is presented below, where we observed a sudden increase in the amplitude of the vector components at CPL due to the impact of a lightning strike. Our tool recorded this event during the real-time quality control check, and it was not deleted. These natural anomalies will be flagged during the data processing phase. The same event is not observed in HYB.
  
  Citation: https://doi.org/10.5194/egusphere-2025-1587-CC2
- AC2:
  'Reply on RC2', Pavan Kumar Vengala, 03 Jun 2025
  Reply to RC2 comments
  
  Comment-1: The authors cite a few related services used by other observatories (Khumutov et al., 2017; He et al., 2022; and MOSFiT by da Silva et al., 2023). Another relevant tool to consider is the MagPy package maintained by INTERMAGNET (https://github.com/geomagpy/magpy), which is widely used by observatories and provides similar features, including data plotting and automatic spike detection. A more precise comparison between the proposed system and existing tools (MagPy, MOSFiT, etc.) would strengthen the manuscript and clarify the novelty and advantages of the new system.
  Reply: Here’s a precise comparison between MagPy, MOSFiT, and real-time first differences in geomagnetism based on their applications, methodologies, and use cases in geomagnetic data analysis:
  MagPy is a Python library specifically designed for processing and analyzing geomagnetic data. It offers features such as data filtering, visualization, and quality control. MagPy provides tools for baseline correction and noise removal, as well as the ability to compute differences (e.g., between observatories or components). However, it is not specifically optimized for calculating real-time first differences.
  
  MOSFiT is another Python tool developed to investigate the secular variation (SV) of the Earth's geomagnetic field. It is useful for detecting geomagnetic jerks and assessing the quality of geomagnetic observatory data. MOSFiT works with data from any INTERMAGNET geomagnetic observatory that is sampled at one-minute intervals.
  
  In contrast, our tool, also based on a Python library, offers immediate insights into rapid changes in the geomagnetic field between consecutive measurements (e.g., ΔB/Δt) on a per-second basis. This tool has been under observation for the past few days to evaluate the performance of recording systems and data quality in real-time. It is designed to be simple and computationally efficient.
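To make the idea of a per-sample first difference (ΔB/Δt) concrete, here is a minimal streaming sketch in Python. It is an illustration under stated assumptions, not the authors' implementation: the class name `StreamingFirstDifference` and its interface are hypothetical, and timestamps are assumed to be in seconds.

```python
class StreamingFirstDifference:
    """Compute the rate of change between consecutive samples as they arrive.

    Only the previous sample is kept, so memory use stays constant.
    """

    def __init__(self):
        self._last_time = None
        self._last_value = None

    def update(self, time_s, value_nt):
        """Return the rate in nT/s, or None for the first sample or a non-increasing timestamp."""
        rate = None
        if self._last_time is not None and time_s > self._last_time:
            rate = (value_nt - self._last_value) / (time_s - self._last_time)
        self._last_time, self._last_value = time_s, value_nt
        return rate


fd = StreamingFirstDifference()
print(fd.update(0, 100.0))   # None (no previous sample yet)
print(fd.update(1, 100.5))   # 0.5
print(fd.update(3, 101.5))   # 0.5 (one missing sample, so the 1.0 nT step spans 2 s)
```

Dividing by the elapsed time rather than assuming a fixed 1 s step keeps the rate meaningful across short data gaps, which is one possible design choice for a real-time feed.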
  Comment-2: It would be helpful to include a flowchart or schematic illustrating the processes involved in the automated QC and data transfer pipeline. This would aid the reader in understanding how the different components interact and when human intervention is required.
  Reply: A flowchart illustrating the processes involved in the automated QC and data transfer pipeline is now included as Figure 2 in the manuscript.
  
  Citation: https://doi.org/10.5194/egusphere-2025-1587-AC2
EC1: 'Comment on egusphere-2025-1587 by Editor', Anne Neska, 23 May 2025

Dear Authors and Reviewers,
Regarding our common effort in the editorial process of the manuscript "Real-time plotting and evaluation of the data quality control from the CSIR- NGRI Magnetic observatories", I would like to thank both Reviewers for their thorough work and their well-qualified, specific comments, and the Authors for their prompt reply. As I see it, we all agree that the manuscript is a valuable contribution to the field of quality control of geomagnetic observatory data and of making such data accessible, and that it can be published after minor revision. The Authors have already responded to most comments constructively and have promised improvements in a revised version, but one point requires clarification from my side. It concerns comment no. 2 of Reviewer #1, who raised very well-documented doubts about the claim that observatories CPL and HYB are among the first to deliver data within 300 s. In their reply, the Authors explain that "among the first" refers not to real-time delivery as such but to delivery by means of a certain technology. This of course has to be specified in the revised text (the Authors do not state that they will do this); I will pay attention to this point.
Of course, Reviewers may also respond to any part of the Authors' reply at any time.
We look forward to reading a revised version, to be uploaded by the Authors by May 28th. If you need more time or encounter problems with uploading, please let me know.
Best wishes and regards,
Anne Neska
Handling Editor

Citation: https://doi.org/10.5194/egusphere-2025-1587-EC1
EC2: 'Comment on egusphere-2025-1587 - Addition by Editor', Anne Neska, 23 May 2025

I noticed just now that there is a revised version already; it is somewhat hidden as an attachment to the Authors' reply. So, dear Reviewers, let us have a look at it and come back to it before May 28. My apologies for the confusion.

Citation: https://doi.org/10.5194/egusphere-2025-1587-EC2

Peer review completion

AR – Author's response | RR – Referee report | ED – Editor decision | EF – Editorial file upload

AR by Pavan Kumar Vengala on behalf of the Authors (03 Jun 2025) Author's response Author's tracked changes Manuscript

ED: Publish subject to minor revisions (review by editor) (09 Jun 2025) by Anne Neska

AR by Pavan Kumar Vengala on behalf of the Authors (12 Jun 2025) Author's response Author's tracked changes Manuscript

ED: Publish as is (25 Jun 2025) by Anne Neska

AR by Pavan Kumar Vengala on behalf of the Authors (26 Jun 2025) Manuscript

Journal article(s) based on this preprint

12 Dec 2025

Real-time plotting and evaluation of the data quality control from the CSIR-NGRI magnetic observatories

Pavan Kumar Vengala, Phani Chandrasekhar Nelapatla, and Sai Vijay Kumar Potharaju

Geosci. Instrum. Method. Data Syst., 14, 491–501, https://doi.org/10.5194/gi-14-491-2025, 2025

Viewed

Total article views: 2,438 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
1,980	358	100	2,438	117	172

Views and downloads (calculated since 22 Apr 2025)

Month	HTML	PDF	XML	Total
Apr 2025	92	20	6	118
May 2025	168	40	18	226
Jun 2025	76	34	24	134
Jul 2025	38	20	0	58
Aug 2025	164	6	4	174
Sep 2025	964	20	6	990
Oct 2025	78	34	4	116
Nov 2025	80	14	4	98
Dec 2025	88	24	4	116
Jan 2026	84	62	12	158
Feb 2026	52	20	6	78
Mar 2026	54	36	6	96
Apr 2026	13	14	4	31
May 2026	22	10	2	34
Jun 2026	7	4	0	11

Viewed (geographical distribution)

Total article views: 2,438 (including HTML, PDF, and XML) Thereof 2,438 with geography defined and 0 with unknown origin.

Latest update: 25 Jun 2026

Download

The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Short summary

Python-based software was developed for real-time visualization and quality control at India’s CPL and HYB geomagnetic observatories. The tool generates plots, conducts quality checks, and computes first differences at 1 s and 1 min intervals with a latency under 300 s, aiding anomaly detection and quasi-definitive data preparation. Designed for future integration with AI/ML, this system enhances geomagnetic data accuracy and accessibility, revolutionizing research, forecasting, and visualization.

