Next-generation Ionospheric Model for Operations – Validation and Demonstration for Space Weather and Research
Abstract. The Next-generation Ionospheric Model for Operations (NIMO) is an assimilative geospace model developed to address operational space weather needs in the ionosphere. NIMO harnesses contributions from both near real-time data and a state-of-the-art implementation of ionospheric theory to provide hindcasts, nowcasts, and forecasts for operational or research purposes. NIMO is currently configured to assimilate various types of electron density measurements through the Ionospheric Data Assimilation Four-Dimensional (IDA-4D) data assimilation scheme. Information from the neutral atmosphere is provided by empirical models. The ionospheric chemistry and transport calculations are handled within NIMO using a version of the Sami3 is Also a Model of the Ionosphere (SAMI3) model designed to have a realistic geomagnetic field and to run effectively on a parallel processing system. This article discusses how NIMO is configured, demonstrates potential use cases for the research community, and validates hindcast runs using a new suite of metrics designed to allow repeatable, quantitative, model-independent evaluations against publicly available observations; these metrics may be adopted by any ionospheric global circulation or regional space weather model.
Status: final response (author comments only)
- RC1: 'Comment on egusphere-2025-4967', Anonymous Referee #1, 04 Nov 2025
- RC2: 'Comment on egusphere-2025-4967', Anonymous Referee #2, 02 Dec 2025
This study provides a validation of a data assimilation system developed by the US Naval Research Laboratory using observations from ionosondes, GNSS receivers, altimeters, and in situ instruments. The study is potentially a very important illustration of the performance of an operational model; however, there are a number of deficiencies that make it challenging to interpret the implications of the results and undermine the value of the study. At present the study is essentially a validation of a model that is not publicly available, with little description of its implementation. I struggle somewhat in identifying the key scientific value presented here, but I believe that this can be corrected through the addition of some details regarding the assimilation system structure. The study isn't representative of the operational performance of the model, since it is being used in hindcast. The study can't be used to advance understanding of the implementation of such a system, since there are precious few details of the actual implementation of the system. The study isn't trying to say anything about the relative value of the different datasets used. With that said, the main value seems to lie in establishing an approach for model validation, with this system serving as an example; however, no links are provided to the datasets used, the data processing is opaque, and the ingested data is not available, so others could not, for example, test their own data assimilation systems under the same conditions, which seems to contravene the point of the last sentence of the abstract. The study is furthermore both thorough and vague at the same time: at times it has considerable detail and rigor, while at other times it lacks detail and consistency. Based on this and the comments below, I recommend that the study be returned to the authors for major revisions.
In terms of composition, in general the writing is clear, well-constructed, and presented logically, with only minor typos. The manuscript figures are clear and excellently composed, aside from missing units in the last two figures. The captions are informative, with only one lacking a detail or two.
Major Comments:
- The applications description in lines 22-30 is under-referenced and there is no description in the introduction of other data assimilation systems that exist. As this is advocating the performance and novelty of an operational data assimilation system, there should be at least some acknowledgment or discussion of what methods and systems already exist. Is there a reason you have chosen IDA4D vs. other approaches? At least some discussion to that effect would be valuable to understanding the design trades involved in developing the model, particularly within an operational context, which has different challenges and restrictions than reanalysis-type or scientific data assimilation systems.
- Lines 49-50: This pre-processor requires significant further description. Data preprocessing, particularly in a real-time scenario where steps such as GNSS phase leveling or ionosonde data filtering can be much more challenging than in reanalysis configurations, ultimately sets the bounds on the information content available from the measurements. While the study is validated in hindcast here, it is stated as an operational capacity, capable of near real-time operation, so the pre-processing is a significant, necessary part of such a system and must be adequately described. At the very least, the processing applied to the hindcast ingested TEC data should be described and it should be caveated whether the approach is the same as that which would be applied in real time. A minimal example of the kind of leveling step that should be documented is sketched after this list.
- Lines 50-51: How? The code hasn't been described in any detail, so it is hard to both corroborate this claim or understand the limits of the model.
- Lines 56-58: What is included in the assimilation state space? If you're updating the electron density at t0 and propagating to t1 without updating the external drivers of SAMI3 (EUV flux, thermosphere, winds, etc...) SAMI3 will largely just revert to determining an electron density self-consistent with the external drivers. The ionosphere has very little memory by virtue of the fast recombination rate and nearly instantaneous response to external driving, such that the prior electron density is only a small factor in the subsequent state. More information is needed here, and some sort of demonstration of the forward propagation in the assimilation step showing that the model isn't just falling back to the state that is self-consistent with external drivers is essential to understanding how information from previous timesteps is being leveraged in the assimilation.
- NIMO System: The details on how the assimilation is conducted are extremely limited. Such details are essential to interpret the validations later in this study. Details regarding the construction of the assimilation a-priori and measurement error covariances at a minimum should be provided, particularly given the apparent overfitting to ionosonde observations seen later in the manuscript.
- Lines 91-93: This is not a recent change, so it's odd that the AMTB option was selected here when the newer default option could have been used instead. The authors mention that this difference would be discussed in later sections, but it is not mentioned again after this point. Also, given that the authors are using IRI-2016, can they confirm that they have updated the IRI's internal ig_rz.dat and apf107.dat files to ensure that the model has been run in a nominal, rather than forecast, configuration in their study? For reference, if the PyPI version by Michael Hirsch is the one that has been used, without modification, then the index files would have last been updated in February/May 2019. A minimal index-update sketch is given after this list.
- Lines 117 – 130, Ionosonde data Quality and Repeatability: All manually scaled data adhering to the URSI guidelines includes provision of qualifying letters to attest to the consistency and accuracy of the data. The reliability of those qualifying letters as a specification of manual scaling performance was assessed and validated in the 1970s and again in the 1980s as part of the URSI INAG endorsement process for the URSI handbook and guidelines.
- Dandenault Study: That study involved having inexperienced scalers scale ionograms without any substantive training according to the URSI guidelines and does not represent the scaling performance of manual scaling as a whole. Any scaling that adheres to the URSI guidelines is accurate and precise to within 0.05 MHz unless a corresponding qualifying letter is prescribed by the scaler, in which case the error threshold of the qualifying letter should be considered. In that study, many of the participants accidentally scaled the F1 peak as the F2 peak, a mistake one would expect of an autoscaling routine but not one I have ever seen made by a scaler adhering to the URSI guidelines and with corresponding training. The accuracy of ARTIST that you cite above this was determined against manually scaled data, so it is somewhat contradictory to imply that the accuracy of manually scaled data is not sufficient to warrant getting the data manually scaled instead of using autoscaled data when it was sufficient to establish the performance of ARTIST in the first place.
- Lines 127-130, Ionosonde data quality and processing: What efforts were made? You mention that the qualifying letters are not sufficient in and of themselves, but have you employed any filtering to your dataset or have you perhaps employed some preferential qualifying letters or confidence scores for certain parameters? It doesn't seem sufficient to just say that everything has errors so we didn't bother implementing anything as is currently implied.
- Lines 132-134: This needs to be clarified: are data from all of the locations in Figure 4 passed to the assimilation or are some of them not included at all? If data is being rejected internally the criteria and methodology of that should be described somewhere. I would think that GNSS data would be often removed or down-sampled due to the over-correlated error covariance, but I would not think that the relatively infrequent ionosonde observations would ever be rejected on the basis of "dominating the assimilation".
- Residuals vs. validation: In many instances the model is being compared to data that was assimilated either in part or in full. In all such instances it should be made clear that this is the case and it should be caveated that such comparisons are residuals, not independent validation.
- Ingested Data, Lines 177-178: If the sTEC was acquired through a different source and used a different bias estimation approach, it should be described here. The location of the assimilated GNSS stations should be added to figure 5 so that the reader can understand what amount and relative distribution of GNSS data was ingested into the model. It is also highly likely that your separate dataset is also included in the Madrigal one or is highly collocated, so comparison to TEC here is likely mainly an assessment of residual performance rather than independent validation.
- Lines 199-201: Reference needed unless this was an assessment you have conducted, in which case it would benefit from being illustrated. Regardless, this does not mean that JASON2 is a correct reference, see for example: https://doi.org/10.1007/s00190-021-01564-y
- Section 4.3.1: Ionosonde-based validation should likely be broken down by latitude region. The selected ionosonde dataset is highly heterogeneous with a strong bias toward mid latitudes. Overall metrics using ionosonde data will thus be strongly biased toward the performance at mid latitudes. At the very least a comment should be added that the overall metric is likely strongly skewed toward mid latitudes.
- Lines 293-294: Why do you believe that foF2 and hmF2 are more reliable than the parameters of the other layers? foE and foF1 do have some challenges in their scaling, but they are not as significant as those in the F2 peak and these layers are generally very stable and well represented by even climatology, so one would imagine that they are relatively easy goals. Given that the main stated objective is OTHR, which is highly sensitive to E and lower F-region plasma density, I would think that validation in that domain would be of critical importance to this work. In fact, one of the largest drawbacks of physics-based models compared to the IRI is their significant limitations in capturing the E-Region and F1-layer characteristics, so it is particularly important here, given your background model, to assess how it is doing below the F2 peak. It is, in fact, quite odd that despite having bottomside measurements either by ionosondes or ISRs, no illustration of the vertical structure of the assimilation and background model error statistics is provided. Given the importance for the stated application and the challenges experienced by many physics-based models in the bottomside, I think it is essential that the authors provide at least some illustration of the performance of the model in terms of its vertical structure, if only by comparison to ISR data.
- Background model performance: It would be very valuable, if not essential, for the study to include comparison to the background model used for the assimilation system to better understand the innovation induced by the inclusion of the measurements (i.e. to understand how much the data moves the state away from the background).
- Figure 8: This is a plot of residuals, where the ionosonde data was included in the assimilation itself. Can the authors comment on what filtering was used in the ionosonde data during fitting? It appears that the assimilated ionosonde data is being overfit, given the significant outlier that follows the ionosonde observations, but the assimilation doesn't do the same for a subsequent outlier. Were the observations from the second outlier not assimilated while they were in the first case? How are you assigning uncertainty to the ionosonde observations in the assimilation? In this case, at 0130 UT on April 10, 2020, the error is a second-hop trace scaled in error and with a very low assigned confidence value. This concern returns in Figure 11.
- Figure 11: The assimilation is very clearly overfit to the collocated ionosonde observations here, where scaling errors are dominating the variability of the assimilation result. If that is somehow not the case, then the presence of the large anomalous swings in the assimilation must be explained. This figure is pointed to in the text as an example, but the contents of the figure and the behaviour of the assimilation demonstrated therein are not addressed or discussed anywhere in the manuscript. There needs to be some discussion of what is happening in the NIMO output in this figure. The variations seen look nothing like true variations seen in the ISR observations.
- Line 341: Why? The whole point of the ISRs over the ionosondes should be that the ISRs provide unambiguous vertical structure information. Reducing the comparison to just hmF2 and foF2 likely biases the assessment, since the ISRs are themselves, in most cases, already calibrated against local ionosonde observations that you likely assimilate, and collocated hmF2 and foF2 observations were available at these locations from other instruments; it seems like a missed opportunity here to understand the vertical structure of model performance.
- Validation consistency: The authors repeatedly switch between what metrics they present for which comparisons. The authors should provide the same set of metrics for each comparison. RMSE should not be missing from Figures 12 and 13, just as correlation should not be missing from Figure 10, etc. Given that the authors have spent considerable time establishing the importance of each of these metrics, they should apply them equally to all comparisons. The same can be said for Figure 14, where RMSE returns but r disappears. The absence of particular metrics in certain validations could give the reader the false impression that the authors have been cherry-picking metrics. A small helper that applies one metric set uniformly is sketched after this list.
- Acknowledgements – Ionosonde data: The authors do not provide an acknowledgment for the ionosonde data used and do not adhere to the rules of the road for use of ionosonde data. Rules of the Road: https://giro.uml.edu/didbase/RulesOfTheRoad.html Acknowledgement List: http://giro.uml.edu/didbase/acknowledgements.html
- Acknowledgements – Madrigal TEC: Please adhere to the Madrigal TEC recommended practice described below for acknowledging use of Madrigal TEC products: https://cedar.openmadrigal.org/static/siteSpecific/tec_sources.html Also, you should provide a DOI and reference to the relevant datasets used in this study. Madrigal provides a tool for composing DOIs for sets of data if necessary. This can be done using Madrigal's globalCitation.py script in the Python API wrapper.
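To make the pre-processing request (lines 49-50 comment above) concrete, the sketch below shows one common carrier-to-code leveling step for GNSS slant TEC. It is illustrative only: the function name, the per-arc structure, and the elevation weighting are my own assumptions and are not claimed to match the NIMO pre-processor.

```python
import numpy as np

def level_phase_tec(tec_phase, tec_code, elevation_deg):
    """Shift relative carrier-phase TEC onto the absolute (noisier) code TEC level.

    All inputs cover one connected phase arc; the elevation-weighted mean offset
    between the two estimates is added to the phase-derived TEC.
    """
    w = np.sin(np.radians(elevation_deg)) ** 2  # down-weight low-elevation samples
    offset = np.sum(w * (tec_code - tec_phase)) / np.sum(w)
    return tec_phase + offset
```

Even a short description at this level of detail, plus the arc-rejection criteria, would let readers judge the information content of the ingested TEC.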
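Regarding the IRI index files (lines 91-93 comment above), refreshing them is a small operation; a minimal sketch is below, assuming the files are fetched from the irimodel.org indices directory and written into the package's data directory. Both the URL and the local path are assumptions that should be checked against the authors' installation.

```python
import pathlib
import urllib.request

INDEX_FILES = ("ig_rz.dat", "apf107.dat")
BASE_URL = "http://irimodel.org/indices/"      # assumed hosting location; verify
DATA_DIR = pathlib.Path("iri2016/data/index")  # hypothetical install path; adjust

for name in INDEX_FILES:
    # Download each index file and overwrite the stale copy shipped with the package.
    with urllib.request.urlopen(BASE_URL + name) as resp:
        (DATA_DIR / name).write_bytes(resp.read())
```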
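On validation consistency, one way to guarantee a uniform metric set across Figures 8-14 is a single helper applied to every model-observation pair; the sketch below is illustrative, with names of my own choosing rather than anything from the manuscript.

```python
import numpy as np

def skill_metrics(model, obs):
    """Bias, RMSE, and Pearson r for one model-observation comparison."""
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    good = np.isfinite(model) & np.isfinite(obs)
    diff = model[good] - obs[good]
    return {
        "bias": diff.mean(),
        "rmse": np.sqrt(np.mean(diff ** 2)),
        "r": np.corrcoef(model[good], obs[good])[0, 1],
    }
```

Reporting this same triplet for every figure would remove any impression of metric selection.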
Minor Comments:
- Line 53, UV data: Is there a reference for this or was it done as part of this study?
- Lines 65-66: This would imply that your forward propagation includes no storm-related behaviour of any sort, except, perhaps some minimal bleed-in from MSIS's storm response. While geomagnetic indices are used in your implementation, they would only end up being passed to MSIS, correct?
- Line 69: lowercase a (unless you mean the 24-hour average, in which case the capital is correct, but the three-hour is a bit confusing since formally Ap is calculated as a daily value at the end of each day and not as a sliding value).
- Line 78: Clarification is needed: Do you mean that you only ingest the slant TEC in the podTEC files or do you incorporate other processed RO products as well?
- Line 100 “validations” -> “validation”?
- Line 110: This is not strictly correct; scaling is the process of isolating the complete ionospheric virtual height trace. The process you are referring to is trace inversion, which is a separate process conducted after scaling.
- Line 113, version 4 or 5: The data also includes observations using v4.5. Despite its subversioning, 4.5 is a distinct version of ARTIST using a different approach from both v4 and v5.
- Line 114 scaling method unknown: It is not strictly unknown; it is just not reported by the quick char tool on the website. In most cases, this is either Autoscala for ionosondes operated by INGV or the Australian software suite if operated by the Australian Bureau of Meteorology. Russian ionosonde data is a mix of manually scaled and data scaled by Autoscala. That information is, however, contained in the ionosonde Standard Archiving Output files.
- Lines 127-128: Worth citing the following to set some bounds on this: https://doi.org/10.3390/rs12172671
- ISR Data Calibration: Is this something you have done separately, or are you just using the data as it appears on Madrigal? If Madrigal, just cite the appropriate DOIs using their aggregate DOI creation tool.
- Lines 172-173: This is out of date. The revised processing was reported in https://amt.copernicus.org/articles/9/1303/2016/amt-9-1303-2016.html
- Line 173: While it is reported as an error, it is actually the "standard error" associated with the grid average (i.e. sigma/sqrt(N)). Also, I don't believe that it is true that they are typically on the order of a tenth of a TECU. Opening a random file from 2021, I get a global mean dtec of 0.85 TECU and median of 0.92 TECU, with the distribution peaking at 1.4 TECU and appearing very multi-modal, with only 0.02% of all dtec values being less than 0.1 TECU. The randomly chosen file is that from May 26th, 2021, if you'd like to verify; the quick check is sketched after this list. Regardless, this is not indicative of the error in the measurement. In relative TEC perhaps, but there is no bias determination method that can claim precision at this level. Even the best approaches settle in around 1 TECU, if only because of the uncertainty in the residual error from phase leveling, amongst other geometric limitations. The authors are again directed to https://amt.copernicus.org/articles/9/1303/2016/amt-9-1303-2016.html for a more up-to-date assessment of Madrigal TEC bias accuracy/precision or to https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2023SW003611 for an external assessment of their performance.
- Line 327: To an extent this is true, but previous extensive assessments of the performance of autoscaling, which you cite in your introduction, can be used to set some bounds on our expectations of these errors. I would recommend revisiting the autoscaling performance numbers from the Galkin and other papers cited earlier for context, particularly in foF2.
- Figure 11: The dates and times of this example must be provided so that the measurements can be corroborated and the dataset can be verified.
- Lines 435-436: missing "is"?
- Figures 19 and 20: Please add units to the axes where appropriate.
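For reference, the dtec check mentioned in the Line 173 comment can be reproduced with a few lines such as the sketch below. It assumes the standard Madrigal HDF5 table layout and 'dtec' column name; the file name is only an example corresponding to the May 26, 2021 download.

```python
import h5py
import numpy as np

# Example file name only; any Madrigal vertical TEC HDF5 download should work.
with h5py.File("gps210526g.001.hdf5", "r") as f:
    table = f["Data/Table Layout"][:]             # standard Madrigal record array
    dtec = np.asarray(table["dtec"], dtype=float)

dtec = dtec[np.isfinite(dtec)]
print(f"mean dtec   : {dtec.mean():.2f} TECU")
print(f"median dtec : {np.median(dtec):.2f} TECU")
print(f"< 0.1 TECU  : {100.0 * np.mean(dtec < 0.1):.2f} % of values")
```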
Citation: https://doi.org/10.5194/egusphere-2025-4967-RC2
Title: Next-generation Ionospheric Model for Operations (NIMO): Validation and Demonstration for Space Weather and Research
Authors: A. G. Burrell et al.
Journal: EGUsphere
Recommendation: Moderate revision before acceptance
General Comments
This paper presents the Next-generation Ionospheric Model for Operations (NIMO), a physics-based data assimilation system that combines the SAMI3 model with the IDA-4D framework. The authors provide a detailed description of the model configuration and a validation using multiple data sources, including ionosondes, incoherent scatter radars, GPS TEC, JASON altimetry, and in situ plasma density from CINDI, DMSP, and ICON.
Overall, this is a well-structured contribution to the ionospheric modelling and space weather community. It demonstrates NIMO’s ability to deliver high-fidelity ionospheric specifications and forecasts and to outperform empirical models such as IRI-2016 under various geomagnetic conditions.
The study is methodologically sound, comprehensive, and of clear relevance for both research and operational use. However, a few issues require clarification or enhancement before publication.
Specifically, the manuscript would benefit from the clarifications and enhancements detailed in the specific comments below. With these improvements, I would recommend acceptance after minor to moderate revision.
Specific Comments
Technical Corrections
Summary Recommendation
Decision: Recommended for publication after minor revision.
Suggested Actions Before Acceptance
Overall assessment:
This manuscript represents a valuable contribution to the field of space weather modelling. After addressing the relatively minor methodological clarifications and presentation issues listed above, it will merit acceptance for publication in Annales Geophysicae (or equivalent EGU journal).