Knowledge-inspired fusion strategies for the inference of PM2.5 values with a Neural Network

Dabrowski, Matthieu; Mennesson, José; Riedi, Jérôme; Djeraba, Chaabane; Nabat, Pierre

doi:10.5194/egusphere-2024-2676

Preprints

https://doi.org/10.5194/egusphere-2024-2676

Preprints

29 Oct 2024

| 29 Oct 2024

Knowledge-inspired fusion strategies for the inference of PM2.5 values with a Neural Network

Matthieu Dabrowski, José Mennesson, Jérôme Riedi, Chaabane Djeraba, and Pierre Nabat

Abstract. Ground-level concentrations of Particulate Matter (more precisely PM2.5) are a strong indicator of air quality, which is now widely recognized to impact human health. Accurately inferring or predicting PM2.5 concentrations is therefore an important step for health hazard monitoring and the implementation of air quality related policies. Various methods have been used to achieve this objective, and Neural Networks are one of the most recent and popular solutions.

In this study, a limited set of quantities that are known to impact the relation between column AOD and surface PM2.5 concentrations are used as input of several networks architectures to investigate how different fusion strategies can impact and help explain predicted PM2.5 concentrations. Different models are trained on two different sets of simulated data, namely global scale atmospheric composition reanalysis provided by the Copernicus Atmospheric Monitoring Service (CAMS) as well as higher resolution data simulated over Europe with the Centre National de Recherches Météorologiques ALADIN model.

Based on an extensive set of experiments, this work proposes several models of knowledge-inspired Neural Networks, achieving interesting results both from the performance and interpretability points of view. Specifically, novel architectures based on BC-GANs (which are able to leverage information from sparse ground observation networks) and on more traditional UNets, employing various information fusion methods, are designed and evaluated against each other. Our results can serve as baseline benchmark for other studies and be used to develop further optimised models for the inference of PM2.5 concentrations from AOD at either global or regional scale.

Received: 27 Aug 2024 – Discussion started: 29 Oct 2024

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Download & links

Preprint (PDF, 3883 KB)

Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Preprint (3883 KB)

Download & links

The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Journal article(s) based on this preprint

23 Jun 2025

Knowledge-inspired fusion strategies for the inference of PM_2.5 values with a neural network

Matthieu Dabrowski, José Mennesson, Jérôme Riedi, Chaabane Djeraba, and Pierre Nabat

Geosci. Model Dev., 18, 3707–3733, https://doi.org/10.5194/gmd-18-3707-2025,https://doi.org/10.5194/gmd-18-3707-2025, 2025

Short summary

Matthieu Dabrowski, José Mennesson, Jérôme Riedi, Chaabane Djeraba, and Pierre Nabat

Interactive discussion

Status: closed

RC1:
'Comment on egusphere-2024-2676', Anonymous Referee #1, 16 Dec 2024

The authors developed a set of surrogate models to predict PM2.5 from AOD and multiple climate input variables, mostly based on the existing simulation of CAMS and ALADIN. Two neural networks, UNet and GAN, were used and different input data fusion techniques were evaluated to assess the performance of the neural networks. Decent amounts of work were put into the manuscript, based on which I think it is a valuable piece of work to guide future fast emulation of PM2.5. I suggest a moderate revision with one major suggestion and multiple minor comments outlined below.
My main suggestion is the writing style in the results and conclusions sections (i.e., Sections 6 and 7). The descriptions were broken into multiple discontinuous small paragraphs, making the reading pretty hard. It reads more like a draft or oral presentation, instead of a research article. I would suggest reorganizing each subsection into a handful of coherent ‘big’ paragraphs.
Minor comments:
Line 25: (Martin et al., 2019) report --> Martin et al. (2019) report
Line 93: for PM2.5 which results and performances ---> for PM2.5 whose results and performances
Line 125: caracterize --> characterize
Lines 319 and 362: Please formally describe the Boundary Conditions-GAN/loss (in a mathematical way).
Section 5.3: Please provide the mathematical equations for MAE, MBE, and FSIM
Eqs.(3) and (4): Please provide the definitions of M_{i,j}, C_{i,j}, and N.
Table 1: How many epochs were used in training? Please provide some samples of training/test losses over epoch to check the convergence/overfitting of the model.
Line 439: the inference time increases?
Line 576: are specifics to --> are specific to
Line 590: However, and while --> However, while
Line 594: As suggested by (Zhou et al., 2024) --> As suggested by Zhou et al. (2024)

Citation: https://doi.org/10.5194/egusphere-2024-2676-RC1
- AC1:
  'Reply on RC1', Matthieu Dabrowski, 22 Jan 2025
  First, we would like to thank the reviewers for their helpful comments and observations. We have taken into account most of the suggested modifications into a new version of this manuscript or are providing justifications otherwise.
  
  My main suggestion is the writing style in the results and conclusions sections (i.e., Sections 6 and 7). The descriptions were broken into multiple discontinuous small paragraphs, making the reading pretty hard. It reads more like a draft or oral presentation, instead of a research article. I would suggest reorganizing each subsection into a handful of coherent ‘big’ paragraphs.
  
  In our new version, section 6 and 7 have been reorganised into fewer and more self contained paragraphs.
  
  Line 25: (Martin et al., 2019) report --> Martin et al. (2019) report
  
  Line 93: for PM2.5 which results and performances ---> for PM2.5 whose results and performances
  
  Line 125: caracterize --> characterize
  
  Each of these typos and formulations have been corrected the suggested way in the new version of our manuscript.
  
  Lines 319 and 362: Please formally describe the Boundary Conditions-GAN/loss (in a mathematical way).
  
  Section 5.3: Please provide the mathematical equations for MAE, MBE, and FSIM
  
  Eqs.(3) and (4): Please provide the definitions of M_{i,j}, C_{i,j}, and N.
  
  Our new version now contains more detailed mathematical descriptions of almost all these elements. The only exception concerns the mathematical formulation of the FSIM metric.
  Indeed, the FSIM is a complex image quality indicator, based on the concepts of Gradient Magnitude and Phase Congruency, which conception and computation details can not be easily nor briefly summarized within our own paper through a few simple equations. Instead several complex intermediary equations and mathematical concepts would need to be explained in details for a mathematical description of this metric to be satisfactory. It is our belief that this explanation would get away from the main topic of our article and we prefer to refer interested readers to the relevant publication by Zhang et al (2011).
  For an in-depth description of the FSIM metric we would like to recommend the reading of the article in which it is first introduced : Zhang, L., Zhang, L., Mou, X., and Zhang, D.: FSIM: A Feature Similarity Index for Image Quality Assessment, IEEE Transactions on Image Processing, 20, 2378–2386, https://doi.org/10.1109/TIP.2011.2109730, 2011. This article is cited in our manuscript.
  The authors of this article have made available the MATLAB source code of this metric: https://web.comp.polyu.edu.hk/cslzhang/IQA/FSIM/FSIM.htm.
  As far as our work is concerned, we have used an existing python implementation of this same metric provided by the piq library : https://piq.readthedocs.io/en/latest/.
  
  Table 1: How many epochs were used in training? Please provide some samples of training/test losses over epoch to check the convergence/overfitting of the model.
  
  We have added an appendix on the convergence of our models, containing two graphics describing respectively the evolution of the loss function values during training, and the distribution of MAE values among test samples.
  “Figure D1 provides a graph of training loss values over iterations, showing clearly the convergence of the model. This corresponds to the training of a UNet model using exlcusively the AOD as input. In this experiment as in all other experiments presented in this article, the models are trained on 500 epochs.
  Figure D2 gives, for the same model, an overview of the MAE values for the different test samples. A few test samples stand out as having a significantly worse MAE than others, but the maximum MAE for these samples remains below 3 , which is satisfying.”
  Figure D1. Graph of training loss during supervised learning over iterations
  Figure D2. Graph of MAE values during testing over sample date
  
  Line 439: the inference time increases?
  
  Line 576: are specifics to --> are specific to
  
  Line 590: However, and while --> However, while
  
  Line 594: As suggested by (Zhou et al., 2024) --> As suggested by Zhou et al. (2024)
  
  Each of these typos and formulations have been corrected as suggested in the new version of our manuscript.
  
  Citation: https://doi.org/10.5194/egusphere-2024-2676-AC1
RC2:
'Comment on egusphere-2024-2676', Anonymous Referee #2, 23 Dec 2024
The manuscript extensively studies the use of multiple meteorological variables and aerosol optical properties as inputs for a deep learning model to infer PM2.5 concentrations from Aerosol Optical Depth (AOD). The researchers explored different network architectures and fusion strategies, conducting numerous experiments primarily on the CAMS dataset, with some additional experiments on the ALADIN dataset.
The strengths of the paper are evident. The study is comprehensive, utilizing extensive error metrics and conducting a thorough investigation of different variants of the deep learning models. The results will be beneficial to the readership of the journal.
However, one significant weakness is the presentation, particularly regarding performance metrics and the visualization of model performance using radar charts.
Error metrics:
It is not clear how the relative versions of MAE and MBE (rMAE and rMBE) are defined. Please provide explicit definitions with mathematical formulations.

The summarizing scores (total score, timeless score, reduced score) undermine the purpose of having multiple error metrics. The different versions of summarizing scores also contradict the claim of easier model comparisons. Consequently, performance tables (e.g., table 2) are cluttered with both the 5 error metrics and additional 3 scores. I suggest prioritizing a subset of these metrics or scores to avoid confusion and overwhelming the readers.

The rationale behind the normalization of inference time in equation 5 needs clarification. Mixing inference time with other error metrics adds to the confusion. Consider presenting the inference time as a standalone performance metric and eliminating the total score.

The definitions of the total score, timeless score, and reduced score require justification. While it is clear that QE, MAE, MBE, and FSIM are simple equally weighted average scores, their distribution (or typical range) and sensitivity are not the same. A simple average may assign higher weights to the most sensitive score.

Current presentation issues when comparing model performance:
It is not straightforward to compare the same metric across different models using radar charts. This design highlights differences between the metrics of the same model.

The values of the metrics are challenging to read from the plots. I suggest annotating each entry on the charts.

The common legend in figure 8 is counterintuitive, though it is understood that this was done to save space. The line styles and colors are simple enough to be described in captions.
Citation: https://doi.org/10.5194/egusphere-2024-2676-RC2
- AC2:
  'Reply on RC2', Matthieu Dabrowski, 22 Jan 2025
  We would like to thank the reviewer for his insightful observations and comments. Some of the suggested modifications were applied to a new version of this manuscript while justification is provided hereafter otherwise.
  
  It is not clear how the relative versions of MAE and MBE (rMAE and rMBE) are defined. Please provide explicit definitions with mathematical formulations.
  
  In our new version mathematical formulations of the rMAE and rMBE are now provided.
  
  The summarizing scores (total score, timeless score, reduced score) undermine the purpose of having multiple error metrics. The different versions of summarizing scores also contradict the claim of easier model comparisons. Consequently, performance tables (e.g., table 2) are cluttered with both the 5 error metrics and additional 3 scores. I suggest prioritizing a subset of these metrics or scores to avoid confusion and overwhelming the readers.
  
  We would like to specify that we do agree with the referee that these scores may not be suited for an in-depth analysis of our results, and may make the analysis of our results harder. These scores are mainly used to justify the selection of some of our experiments among the numerous we have performed before comparing them, and provide a quantifiable rationale for this selection. Once this subset of experiments is selected, a more in-depth analysis can be realised based on dedicated metrics. Therefore we decided to only display the values of our metrics in the revised manuscript, and have updated the description of our scores accordingly. However our selection protocol, using our scores, remains the same. Our total score is also used in our radar charts to highlight the overall best results.
  
  The rationale behind the normalization of inference time in equation 5 needs clarification. Mixing inference time with other error metrics adds to the confusion. Consider presenting the inference time as a standalone performance metric and eliminating the total score.
  
  The rationale behind the choice of the 0.05s threshold for the normalization of inference time in equation 5 has been clarified in our new version : “Regarding the inference time, the threshold of 0.05s is used as it is the maximum inference time among our experiences with our deep learning models.” (section 5.4)
  Regarding the use of the total score, modifications have been made and are detailed in our response to the second comment.
  
  The definitions of the total score, timeless score, and reduced score require justification. While it is clear that QE, MAE, MBE, and FSIM are simple equally weighted average scores, their distribution (or typical range) and sensitivity are not the same. A simple average may assign higher weights to the most sensitive score.
  
  As stated in our answer to the second comment, these scores are not meant to be used as a way to analyse our results, but simply to provide a clearly defined protocol to select a subset of our results for analysis. Only the metrics should be used in the results analysis itself. This has now been made clearer in our new version of the manuscript.
  
  It is not straightforward to compare the same metric across different models using radar charts. This design highlights differences between the metrics of the same model.
  
  We do understand the referee's opinion regarding our choice to use radar charts to display our results overviews. However we failed to find a more satisfactory way to represent this amount of experimental results in a brief way. We do want to bring the attention of the referee to the fact that to realise this graph, all metrics have been normalized, the same way they were for the computation of the scores. This allows us to represent all metrics on the same value scale. This has now been made clearer in the new version of our manuscript.
  
  The values of the metrics are challenging to read from the plots. I suggest annotating each entry on the charts.
  
  Some of the entries on our radar charts are now annotated. We chose to annotate only the entries corresponding to the experiments which lead to the best results, as annotating all entries would have hindered the readability of these graphs.
  
  The common legend in figure 8 is counterintuitive, though it is understood that this was done to save space. The line styles and colors are simple enough to be described in captions.
  
  The common legend has been replaced with a paragraph describing the different colors and styles of the lines, in section 6.1, and a summary of this paragraph is present in the captions of our radar charts.
  
  Citation: https://doi.org/10.5194/egusphere-2024-2676-AC2

Interactive discussion

Status: closed

RC1:
'Comment on egusphere-2024-2676', Anonymous Referee #1, 16 Dec 2024

The authors developed a set of surrogate models to predict PM2.5 from AOD and multiple climate input variables, mostly based on the existing simulation of CAMS and ALADIN. Two neural networks, UNet and GAN, were used and different input data fusion techniques were evaluated to assess the performance of the neural networks. Decent amounts of work were put into the manuscript, based on which I think it is a valuable piece of work to guide future fast emulation of PM2.5. I suggest a moderate revision with one major suggestion and multiple minor comments outlined below.
My main suggestion is the writing style in the results and conclusions sections (i.e., Sections 6 and 7). The descriptions were broken into multiple discontinuous small paragraphs, making the reading pretty hard. It reads more like a draft or oral presentation, instead of a research article. I would suggest reorganizing each subsection into a handful of coherent ‘big’ paragraphs.
Minor comments:
Line 25: (Martin et al., 2019) report --> Martin et al. (2019) report
Line 93: for PM2.5 which results and performances ---> for PM2.5 whose results and performances
Line 125: caracterize --> characterize
Lines 319 and 362: Please formally describe the Boundary Conditions-GAN/loss (in a mathematical way).
Section 5.3: Please provide the mathematical equations for MAE, MBE, and FSIM
Eqs.(3) and (4): Please provide the definitions of M_{i,j}, C_{i,j}, and N.
Table 1: How many epochs were used in training? Please provide some samples of training/test losses over epoch to check the convergence/overfitting of the model.
Line 439: the inference time increases?
Line 576: are specifics to --> are specific to
Line 590: However, and while --> However, while
Line 594: As suggested by (Zhou et al., 2024) --> As suggested by Zhou et al. (2024)

Citation: https://doi.org/10.5194/egusphere-2024-2676-RC1
- AC1:
  'Reply on RC1', Matthieu Dabrowski, 22 Jan 2025
  First, we would like to thank the reviewers for their helpful comments and observations. We have taken into account most of the suggested modifications into a new version of this manuscript or are providing justifications otherwise.
  
  My main suggestion is the writing style in the results and conclusions sections (i.e., Sections 6 and 7). The descriptions were broken into multiple discontinuous small paragraphs, making the reading pretty hard. It reads more like a draft or oral presentation, instead of a research article. I would suggest reorganizing each subsection into a handful of coherent ‘big’ paragraphs.
  
  In our new version, section 6 and 7 have been reorganised into fewer and more self contained paragraphs.
  
  Line 25: (Martin et al., 2019) report --> Martin et al. (2019) report
  
  Line 93: for PM2.5 which results and performances ---> for PM2.5 whose results and performances
  
  Line 125: caracterize --> characterize
  
  Each of these typos and formulations have been corrected the suggested way in the new version of our manuscript.
  
  Lines 319 and 362: Please formally describe the Boundary Conditions-GAN/loss (in a mathematical way).
  
  Section 5.3: Please provide the mathematical equations for MAE, MBE, and FSIM
  
  Eqs.(3) and (4): Please provide the definitions of M_{i,j}, C_{i,j}, and N.
  
  Our new version now contains more detailed mathematical descriptions of almost all these elements. The only exception concerns the mathematical formulation of the FSIM metric.
  Indeed, the FSIM is a complex image quality indicator, based on the concepts of Gradient Magnitude and Phase Congruency, which conception and computation details can not be easily nor briefly summarized within our own paper through a few simple equations. Instead several complex intermediary equations and mathematical concepts would need to be explained in details for a mathematical description of this metric to be satisfactory. It is our belief that this explanation would get away from the main topic of our article and we prefer to refer interested readers to the relevant publication by Zhang et al (2011).
  For an in-depth description of the FSIM metric we would like to recommend the reading of the article in which it is first introduced : Zhang, L., Zhang, L., Mou, X., and Zhang, D.: FSIM: A Feature Similarity Index for Image Quality Assessment, IEEE Transactions on Image Processing, 20, 2378–2386, https://doi.org/10.1109/TIP.2011.2109730, 2011. This article is cited in our manuscript.
  The authors of this article have made available the MATLAB source code of this metric: https://web.comp.polyu.edu.hk/cslzhang/IQA/FSIM/FSIM.htm.
  As far as our work is concerned, we have used an existing python implementation of this same metric provided by the piq library : https://piq.readthedocs.io/en/latest/.
  
  Table 1: How many epochs were used in training? Please provide some samples of training/test losses over epoch to check the convergence/overfitting of the model.
  
  We have added an appendix on the convergence of our models, containing two graphics describing respectively the evolution of the loss function values during training, and the distribution of MAE values among test samples.
  “Figure D1 provides a graph of training loss values over iterations, showing clearly the convergence of the model. This corresponds to the training of a UNet model using exlcusively the AOD as input. In this experiment as in all other experiments presented in this article, the models are trained on 500 epochs.
  Figure D2 gives, for the same model, an overview of the MAE values for the different test samples. A few test samples stand out as having a significantly worse MAE than others, but the maximum MAE for these samples remains below 3 , which is satisfying.”
  Figure D1. Graph of training loss during supervised learning over iterations
  Figure D2. Graph of MAE values during testing over sample date
  
  Line 439: the inference time increases?
  
  Line 576: are specifics to --> are specific to
  
  Line 590: However, and while --> However, while
  
  Line 594: As suggested by (Zhou et al., 2024) --> As suggested by Zhou et al. (2024)
  
  Each of these typos and formulations have been corrected as suggested in the new version of our manuscript.
  
  Citation: https://doi.org/10.5194/egusphere-2024-2676-AC1
RC2:
'Comment on egusphere-2024-2676', Anonymous Referee #2, 23 Dec 2024
The manuscript extensively studies the use of multiple meteorological variables and aerosol optical properties as inputs for a deep learning model to infer PM2.5 concentrations from Aerosol Optical Depth (AOD). The researchers explored different network architectures and fusion strategies, conducting numerous experiments primarily on the CAMS dataset, with some additional experiments on the ALADIN dataset.
The strengths of the paper are evident. The study is comprehensive, utilizing extensive error metrics and conducting a thorough investigation of different variants of the deep learning models. The results will be beneficial to the readership of the journal.
However, one significant weakness is the presentation, particularly regarding performance metrics and the visualization of model performance using radar charts.
Error metrics:
It is not clear how the relative versions of MAE and MBE (rMAE and rMBE) are defined. Please provide explicit definitions with mathematical formulations.

The summarizing scores (total score, timeless score, reduced score) undermine the purpose of having multiple error metrics. The different versions of summarizing scores also contradict the claim of easier model comparisons. Consequently, performance tables (e.g., table 2) are cluttered with both the 5 error metrics and additional 3 scores. I suggest prioritizing a subset of these metrics or scores to avoid confusion and overwhelming the readers.

The rationale behind the normalization of inference time in equation 5 needs clarification. Mixing inference time with other error metrics adds to the confusion. Consider presenting the inference time as a standalone performance metric and eliminating the total score.

The definitions of the total score, timeless score, and reduced score require justification. While it is clear that QE, MAE, MBE, and FSIM are simple equally weighted average scores, their distribution (or typical range) and sensitivity are not the same. A simple average may assign higher weights to the most sensitive score.

Current presentation issues when comparing model performance:
It is not straightforward to compare the same metric across different models using radar charts. This design highlights differences between the metrics of the same model.

The values of the metrics are challenging to read from the plots. I suggest annotating each entry on the charts.

The common legend in figure 8 is counterintuitive, though it is understood that this was done to save space. The line styles and colors are simple enough to be described in captions.
Citation: https://doi.org/10.5194/egusphere-2024-2676-RC2
- AC2:
  'Reply on RC2', Matthieu Dabrowski, 22 Jan 2025
  We would like to thank the reviewer for his insightful observations and comments. Some of the suggested modifications were applied to a new version of this manuscript while justification is provided hereafter otherwise.
  
  It is not clear how the relative versions of MAE and MBE (rMAE and rMBE) are defined. Please provide explicit definitions with mathematical formulations.
  
  In our new version mathematical formulations of the rMAE and rMBE are now provided.
  
  The summarizing scores (total score, timeless score, reduced score) undermine the purpose of having multiple error metrics. The different versions of summarizing scores also contradict the claim of easier model comparisons. Consequently, performance tables (e.g., table 2) are cluttered with both the 5 error metrics and additional 3 scores. I suggest prioritizing a subset of these metrics or scores to avoid confusion and overwhelming the readers.
  
  We would like to specify that we do agree with the referee that these scores may not be suited for an in-depth analysis of our results, and may make the analysis of our results harder. These scores are mainly used to justify the selection of some of our experiments among the numerous we have performed before comparing them, and provide a quantifiable rationale for this selection. Once this subset of experiments is selected, a more in-depth analysis can be realised based on dedicated metrics. Therefore we decided to only display the values of our metrics in the revised manuscript, and have updated the description of our scores accordingly. However our selection protocol, using our scores, remains the same. Our total score is also used in our radar charts to highlight the overall best results.
  
  The rationale behind the normalization of inference time in equation 5 needs clarification. Mixing inference time with other error metrics adds to the confusion. Consider presenting the inference time as a standalone performance metric and eliminating the total score.
  
  The rationale behind the choice of the 0.05s threshold for the normalization of inference time in equation 5 has been clarified in our new version : “Regarding the inference time, the threshold of 0.05s is used as it is the maximum inference time among our experiences with our deep learning models.” (section 5.4)
  Regarding the use of the total score, modifications have been made and are detailed in our response to the second comment.
  
  The definitions of the total score, timeless score, and reduced score require justification. While it is clear that QE, MAE, MBE, and FSIM are simple equally weighted average scores, their distribution (or typical range) and sensitivity are not the same. A simple average may assign higher weights to the most sensitive score.
  
  As stated in our answer to the second comment, these scores are not meant to be used as a way to analyse our results, but simply to provide a clearly defined protocol to select a subset of our results for analysis. Only the metrics should be used in the results analysis itself. This has now been made clearer in our new version of the manuscript.
  
  It is not straightforward to compare the same metric across different models using radar charts. This design highlights differences between the metrics of the same model.
  
  We do understand the referee's opinion regarding our choice to use radar charts to display our results overviews. However we failed to find a more satisfactory way to represent this amount of experimental results in a brief way. We do want to bring the attention of the referee to the fact that to realise this graph, all metrics have been normalized, the same way they were for the computation of the scores. This allows us to represent all metrics on the same value scale. This has now been made clearer in the new version of our manuscript.
  
  The values of the metrics are challenging to read from the plots. I suggest annotating each entry on the charts.
  
  Some of the entries on our radar charts are now annotated. We chose to annotate only the entries corresponding to the experiments which lead to the best results, as annotating all entries would have hindered the readability of these graphs.
  
  The common legend in figure 8 is counterintuitive, though it is understood that this was done to save space. The line styles and colors are simple enough to be described in captions.
  
  The common legend has been replaced with a paragraph describing the different colors and styles of the lines, in section 6.1, and a summary of this paragraph is present in the captions of our radar charts.
  
  Citation: https://doi.org/10.5194/egusphere-2024-2676-AC2

Peer review completion

AR – Author's response | RR – Referee report | ED – Editor decision | EF – Editorial file upload

AR by Matthieu Dabrowski on behalf of the Authors (22 Jan 2025) Author's response Author's tracked changes Manuscript

ED: Referee Nomination & Report Request started (27 Jan 2025) by Po-Lun Ma

RR by Anonymous Referee #1 (31 Jan 2025)

RR by Anonymous Referee #2 (10 Feb 2025)

ED: Publish as is (17 Feb 2025) by Po-Lun Ma

AR by Matthieu Dabrowski on behalf of the Authors (27 Feb 2025)

Journal article(s) based on this preprint

23 Jun 2025

Knowledge-inspired fusion strategies for the inference of PM_2.5 values with a neural network

Matthieu Dabrowski, José Mennesson, Jérôme Riedi, Chaabane Djeraba, and Pierre Nabat

Geosci. Model Dev., 18, 3707–3733, https://doi.org/10.5194/gmd-18-3707-2025,https://doi.org/10.5194/gmd-18-3707-2025, 2025

Short summary

Matthieu Dabrowski, José Mennesson, Jérôme Riedi, Chaabane Djeraba, and Pierre Nabat

Data sets

Knowledge-inspired fusion strategies for the inference of PM2.5 values with a Neural Network - CAMS data for experiments Matthieu Dabrowski https://doi.org/10.5281/zenodo.13929498

CNRM-ALADIN64 - Regional climate simulation over the Euro-Mediterranean region Marc Mallet and Pierre Nabat http://dx.doi.org/10.25326/703

Model code and software

Knowledge-inspired fusion strategies for the inference of PM2.5 values with a Neural Network - code for experiments Matthieu Dabrowski https://doi.org/10.5281/zenodo.13947256

Matthieu Dabrowski, José Mennesson, Jérôme Riedi, Chaabane Djeraba, and Pierre Nabat

Viewed

Total article views: 2,436 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
1,454	476	506	2,436	91	188

HTML: 1,454
PDF: 476
XML: 506
Total: 2,436
BibTeX: 91
EndNote: 188

Views and downloads (calculated since 29 Oct 2024)

Month	HTML	PDF	XML	Total
Oct 2024	122	32	6	160
Nov 2024	108	26	10	144
Dec 2024	92	30	4	126
Jan 2025	78	16	8	102
Feb 2025	36	16	88	140
Mar 2025	56	16	114	186
Apr 2025	24	24	98	146
May 2025	32	22	94	148
Jun 2025	96	12	50	158
Jul 2025	32	12	2	46
Aug 2025	90	18	0	108
Sep 2025	322	10	2	334
Oct 2025	26	6	0	32
Nov 2025	42	38	6	86
Dec 2025	48	54	0	102
Jan 2026	90	30	4	124
Feb 2026	56	30	4	90
Mar 2026	48	28	14	90
Apr 2026	22	28	1	51
May 2026	22	17	0	39
Jun 2026	7	3	0	10
Jul 2026	5	8	1	14

Cumulative views and downloads (calculated since 29 Oct 2024)

Month	HTML	PDF	XML	Total
Oct 2024	122	32	6	160
Nov 2024	108	26	10	144
Dec 2024	92	30	4	126
Jan 2025	78	16	8	102
Feb 2025	36	16	88	140
Mar 2025	56	16	114	186
Apr 2025	24	24	98	146
May 2025	32	22	94	148
Jun 2025	96	12	50	158
Jul 2025	32	12	2	46
Aug 2025	90	18	0	108
Sep 2025	322	10	2	334
Oct 2025	26	6	0	32
Nov 2025	42	38	6	86
Dec 2025	48	54	0	102
Jan 2026	90	30	4	124
Feb 2026	56	30	4	90
Mar 2026	48	28	14	90
Apr 2026	22	28	1	51
May 2026	22	17	0	39
Jun 2026	7	3	0	10
Jul 2026	5	8	1	14

Viewed (geographical distribution)

Total article views: 2,429 (including HTML, PDF, and XML) Thereof 2,429 with geography defined and 0 with unknown origin.

Country	#	Views	%

Latest update: 22 Jul 2026

Download

The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Preprint (3883 KB)
Metadata XML

Short summary

This work focuses on the prediction of aerosol concentration values at ground level, which are a strong indicator of air quality, using Artificial Neural Networks. A study of different variables and their efficiency as inputs for these models is also proposed, and reveals that the best results are obtained when using all of them. Comparison of networks architectures and information fusion methods allows the extraction of knowledge on the most efficient methods in the context of this study.


Total:	0
HTML:	0
PDF:	0
XML:	0