the Creative Commons Attribution 4.0 License.
Classification accuracy and compatibility across devices of a new Rapid-E+ flow cytometer
Abstract. The study evaluated a new model of the Plair SA air flow cytometer, Rapid-E+, and assessed its suitability for airborne pollen monitoring within operational networks. Key features of the new model are compared with those of the previous one, Rapid-E. A machine learning algorithm is constructed and evaluated for (i) classification of reference pollen types in laboratory conditions and (ii) monitoring in real-life field campaigns. The second goal of the study was to evaluate the device usability in forthcoming monitoring networks, which would require similarity and reproducibility of the measurement signal across devices. We employed three devices and analysed the (dis-)similarities of their measurements in laboratory conditions. The lab evaluation showed recognition performance similar to that of Rapid-E, but field measurements, taken when several pollen types were present in the air simultaneously, showed notably lower agreement of Rapid-E+ with manual Hirst-type observations than was found for the older model. An exception was the total-pollen measurements. Comparison across the Rapid-E+ devices revealed noticeable differences in fluorescence measurements between the three devices tested. As a result, applying the recognition algorithm trained on the data of one device to another device led to large errors. The study confirmed the potential of the fluorescence measurements for discrimination between different pollen classes, but each monitor needed to be trained individually to achieve acceptable skill. The large uncertainty of fluorescence measurements and their variability between devices need to be addressed to improve the device usability.
-
Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Interactive discussion
Status: closed
-
RC1: 'Comment on egusphere-2024-187', Anonymous Referee #1, 02 May 2024
This manuscript “Classification accuracy and compatibility across devices of a new Rapid-E+ flow cytometer” describes the evaluation of a new instrument, the Rapid-E+, upgraded from a previous model made by Plair SA, and its ability to monitor pollen compared alongside a manual Hirst-type sampler. The necessary training of a classification algorithm to distinguish pollen types is detailed and lab evaluation is followed up by field evaluation, and cross-comparison with instruments at other sites to assess method generalisability. The study is thorough and comprehensive, looking into the detail of the different modalities of data obtained for different pollen types across different instruments.
The manuscript is of rigorous scientific quality and reports findings that are useful in this field to further the advancement of automated pollen monitoring. It is written and presented concisely and generally clearly, with ample supporting information in the Appendices. There are only some minor technical points that I would address before continuing to publication.
Please see below for specific comments by line.
Abstract
Line 22: I would use the term ‘instrument’ instead of ‘monitor’.
Introduction
Line 29: “Buters et al. 2022”
Line 30: “monitoring instruments”
Materials and Methods
Line 49-50: Not sure in this sentence exactly how the Rapid-E+ compares to the Rapid-E. Perhaps alter to “In particular the Rapid-E+ samples at a faster flow rate of 5 l min-1 (compared to 2.8 l min-1 for the Rapid-E), and records all particles passing through a 447 nm scattering laser into 4 size bins (>0.3 µm, >0.5 µm, >1 µm, >5 µm) unlike the Rapid-E which…?” (does the Rapid-E not have different size bins?)
Line 55-56: “also allows for adjusting the gain of the fluorescence spectrum and lifetime detectors”
Line 72: “Three Rapid-E+ air flow cytometers were involved in this study.”
Line 72: “…in Novi Sad, Serbia, …”
Line 73: “the Novi Sad laboratory” is very nondescript. Details about the organisation that runs the Novi Sad laboratory may be helpful, and the environment?
Line 78: “The test period allowed for the exploration of measurement performance of the automatic bioaerosol monitoring instrument in a variety of conditions characteristic of the Pannonian Plain in [where?]. This region contains a large diversity of pollen and fungal spores…” This sentence was quite long so I suggest splitting it into two, e.g. where I have done so.
Line 82: “the period of seasonal allergies” – perhaps a little more description specifically as to what these seasonal allergies are in this place?
Line 83: “when large quantities of ragweed pollen are recorded in the air”
Line 85: “the main features of diurnal variations”
Line 89: “Reference pollen for training was collected locally.”
Line 98: “to ensure identity” - could you explain this better?
Line 102: “exposed to pollen using the Swisens Atomizer”
Line 103: “expose pollen to the Novi Sad and Osijek devices.”
Line 106: “validating”
Line 109: Could say “colocated” instead of side-by-side.
Results and discussion
Line 201: Are these precision, recall and F1 scores averaged across scores for each pollen classification? If so, just mention they are averaged to avoid confusion, if not, I am unsure how the score differs from the discrimination of pollen from “other”.
Line 207: By ‘the classification algorithm with high accuracy’ do you mean the one that achieved F1 score of 0.86 as opposed to 0.84? Or simply that the algorithm managed to distinguish these pollen types with high accuracy, regardless as to which? Perhaps it may be better to write something like one of the following, depending on which you meant to avoid confusion…
“It is interesting to note that the latter classification algorithm (with merged classes) distinguished Urtica and Parietaria from Broussonetia despite these pollen grains being morphologically similar.”
Or
“It is interesting to note that the classification algorithm distinguished Urtica and Parietaria from Broussonetia with high accuracy, despite these pollen grains being morphologically similar.”
Fig. 2: The numbers and names are a bit small and blurry, would be good to make the characters a little larger if possible.
Line 226: what are the exact dates referred to here?
Line 235: Best to define PSLs in brackets for good measure as it is mentioned for the first time in this manuscript.
Line 241: At a glance, this sentence was a little confusing, I would correct it to something like: “Automatic detections of total pollen, as well as Juglans, Morus and Ambrosia, have a statistically significant positive correlation with…”
Line 243: “for most pollen classes” or “for most of the pollen classes”
Line 245: Perhaps rephrase as, for example, “Pollen grains that occur simultaneously in the air had a clear tendency to be confused amongst each other, which was expected…”
Line 261: “As demonstrated for the Rapid-E, this problem also exists for the Rapid-E+.”
Line 278: I would probably start a new sentence and replace the second i.e. before ‘different timing…’ with something else. This sentence is a bit confusing and long. Is it saying that since some pollen classes were comparable across devices, the differences observed across others shouldn’t be due to doing lab work at different times and different methods of pollen exposure to the instrument? Or are you saying each lab followed the same procedures so it shouldn’t be an issue?
Fig. 5 writing font too small and am unsure what I am looking at in 5D, can labels be added to the x, y and colour axes?
Fig. 6 again writing font too small.
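The question raised for Line 201, whether the reported precision, recall and F1 scores are averaged over the pollen classes, can be illustrated with a minimal sketch. It assumes an unweighted (macro) average over classes; the manuscript's actual averaging scheme is not stated here, and the labels below are hypothetical, not data from the study.

```python
from collections import Counter


def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)}, computed one class at a time."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = {}
    for c in labels:
        precision = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        recall = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[c] = (precision, recall, f1)
    return scores


def macro_average(scores):
    """Unweighted mean of each metric over classes (assumed averaging scheme)."""
    n = len(scores)
    return tuple(sum(s[i] for s in scores.values()) / n for i in range(3))


# Hypothetical labels for six particles, for illustration only.
y_true = ["Ambrosia", "Ambrosia", "Urtica", "Urtica", "Morus", "Morus"]
y_pred = ["Ambrosia", "Urtica", "Urtica", "Urtica", "Morus", "Ambrosia"]

scores = per_class_scores(y_true, y_pred)
macro_p, macro_r, macro_f1 = macro_average(scores)
for label, (p, r, f) in scores.items():
    print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
print(f"macro: precision={macro_p:.2f} recall={macro_r:.2f} f1={macro_f1:.2f}")
```

A single macro-averaged F1 can hide large differences between classes, which is why stating the averaging explicitly helps the reader compare it with the pollen versus "other" score.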
-
AC3: 'Reply on RC1', Mikhail Sofiev, 19 Jun 2024
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2024/egusphere-2024-187/egusphere-2024-187-AC3-supplement.pdf
-
CC1: 'A very interesting and robust study', Matt Smith, 09 May 2024
The authors present a very interesting and robust study examining the classification accuracy and compatibility across devices of a new Rapid-E+ flow cytometer for examining airborne pollen. The paper is generally well written, although it could do with thorough editing with specific focus on the use of articles. I have listed some minor comments below that I hope will help.
My one comment about the methods relates to the use of the Hirst type trap (Lines 161 to 165). When calibrating such sensitive instruments as the Rapid-E and Rapid-E+, it is important to remove as much uncertainty as possible. The authors might therefore consider counting whole slides from the Hirst type trap to reduce error. Obviously, this is not always feasible when examining whole seasons, but even examining a small subset of slides in this way might provide some interesting insights. Although I note that correlations were only conducted for days when average pollen concentrations measured by the manual method exceeded 10 pollen m−3 in order to reduce uncertainty.
Minor comments
Line 47 – “which is a new model stemming from the PA-300 (Crouzy et al., 2016) and Rapid-E (Sauliene et al., 2019)”.
Line 49 – “In particular, Rapid-E+ samples at a flow rate of 5 l min-1”
Line 53 – “Like its predecessor”
Line 73 – “was trained in the Novi Sad laboratory”
Line 74 – “owned by the City of Osijek in Croatia”
Line 74 – “and the Finnish Meteorological Institute”
Lines 79/80 – “for the Pannonian Plain”
Lines 85/86 – “or capturing the main features”
Line 91 – “Scientific names should be italics” (review throughout including figures and tables).
Lines 98/99 – “To ensure identification”
Line 102 – by using a Swisens Atomizer
Line 193 – “It is interesting to note that after the start of rainfall the coarse particles”
Line 196 – The following lacks clarity and should be rewritten: "However, quite low flow rate"
Line 208 – “despite these pollen grains being morphologically similar” (note that the plural of pollen is pollen)
Line 245 – “There was a clear tendency towards confusion of different pollen occurring”
Table 1 – It would be interesting to see the correlation coefficients for Taxaceae/Cupressaceae combined and for the Urticaceae family, as many pollen monitoring networks do not separate these into different genera due to the difficulty in identification.
Line 256 – “Repeating it for each device in a network is unfeasible”
Lines 261/262 – The following text lacks clarity and needs reworking, perhaps linked to another sentence: "Demonstrated for Rapid-E, the problem also existed for Rapid-E+ (Fig. 4)".
Line 263 – pollen not pollens
Line 274 – pollen not pollens
Line 277 – “Although this was not seen for all pollen types, there are pollen classes with comparable”
Line 317 – “datasets, the creation of which is a highly demanding process”.
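The filtering mentioned in this comment, where correlations are computed only for days on which the manual (Hirst) daily mean exceeds 10 pollen m−3, can be sketched as follows. This is an illustration under assumptions: the use of the Pearson coefficient, the function names and the daily values are all hypothetical and are not taken from the manuscript.

```python
import math


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def filtered_correlation(manual, automatic, threshold=10.0):
    """Correlate only the days where the manual count exceeds the threshold.

    Returns (r, number_of_days_used). The threshold of 10 pollen m-3 follows
    the comment above; the choice of Pearson's r is an assumption.
    """
    pairs = [(m, a) for m, a in zip(manual, automatic) if m > threshold]
    if len(pairs) < 3:
        raise ValueError("too few days above the threshold")
    xs, ys = zip(*pairs)
    return pearson(xs, ys), len(pairs)


# Hypothetical daily mean concentrations (pollen m-3), for illustration only.
manual = [2, 8, 15, 40, 120, 60, 5]
automatic = [30, 1, 20, 35, 100, 70, 0]

r, n_days = filtered_correlation(manual, automatic)
print(f"r = {r:.2f} over {n_days} days above the threshold")
```

Low-count days carry the largest relative sampling error in the manual method, so dropping them removes noise at the cost of a smaller sample.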
Citation: https://doi.org/10.5194/egusphere-2024-187-CC1
AC1: 'Reply on CC1', Mikhail Sofiev, 19 Jun 2024
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2024/egusphere-2024-187/egusphere-2024-187-AC1-supplement.pdf
-
RC2: 'Comment on egusphere-2024-187', Anonymous Referee #2, 20 May 2024
A new device from the company Plair SA, Rapid-E+, is investigated in the current study. A two-step classification was applied. In the first step, pollen are separated from non-pollen particles. In the second step, pollen are classified into 27 pollen classes. It was established that, as with the previous device, Rapid-E, a large discrepancy remains between the signals measured by different devices. Therefore, individual models need to be trained for every device. Overall, the paper is well prepared. Some minor points must be corrected before final publication.
The paragraph about the model used (lines 135-150) should be extended. ResNet-18 has 4 2-layer blocks. What does 4-block-layer or 3-block-layer mean? In the context of ResNet-style models, a block is a container of layers, so a block is a larger unit than a layer. It seems that not all of the neural networks have 18 layers, because their architectures are different. To present the architectures to readers, it would be good to prepare an architecture table like Table 3 in the paper (https://arxiv.org/pdf/1803.06131). It would also be useful to show the size of the input arrays received by each mode sub-network. The scattering images of Rapid-E were of variable length. What is the case for Rapid-E+? If they are of variable size, how was the issue solved?
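As a reference point for the block versus layer distinction, the conventional layer count of the standard ResNet-18 (He et al., 2016) can be worked out in a few lines. This sketch describes only the standard architecture; how the manuscript's "4-block" and "3-block" sub-networks map onto it is not known here, so the three-stage variant below is purely hypothetical.

```python
def weighted_layers(stages, blocks_per_stage, convs_per_block,
                    stem_convs=1, fc_layers=1):
    """Count weighted layers the way ResNet depths are conventionally quoted.

    Only the stem convolution, the convolutions inside residual blocks and the
    final fully connected layer are counted; the 1x1 shortcut (downsampling)
    convolutions are left out by convention.
    """
    return stem_convs + stages * blocks_per_stage * convs_per_block + fc_layers


# Standard ResNet-18: 4 stages, each with 2 basic blocks of 2 convolutions.
resnet18_depth = weighted_layers(stages=4, blocks_per_stage=2, convs_per_block=2)
print(resnet18_depth)  # 18

# Hypothetical variant with one stage removed, to show that changing the
# number of blocks changes the depth, so such a network is no longer "18-layer".
three_stage_depth = weighted_layers(stages=3, blocks_per_stage=2, convs_per_block=2)
print(three_stage_depth)  # 14
```

An architecture table of the kind requested would make these counts, and the input size for each sub-network, explicit.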
It would seem that the intensity in the graphs shown in Figure B2 of Appendix B should be positive. However, a large part of the shaded area, which is bounded by the curves calculated by adding and subtracting the standard deviation to/from the mean, is in the negative range. The standard deviation is appropriate for characterizing dispersion when the values follow a normal distribution. Here, the distribution does not appear to be normal and is, moreover, asymmetric. In this case, it is preferable to show the median curve as a solid line in the center and to delimit the shaded area by curves corresponding to quantiles symmetrical with respect to the median.
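The point about Figure B2 can be shown with a small sketch that compares a mean ± standard deviation band with a median and symmetric-quantile band for a skewed, non-negative sample. The intensity values and the choice of the 16th and 84th percentiles are assumptions made for illustration; they are not data or settings from the manuscript.

```python
import math


def quantile(sorted_vals, q):
    """Quantile by linear interpolation between the closest ranks."""
    pos = q * (len(sorted_vals) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def band_mean_std(values):
    """Return (mean - std, mean, mean + std) using the population std."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return mean - std, mean, mean + std


def band_quantiles(values, lower_q=0.16, upper_q=0.84):
    """Return (lower quantile, median, upper quantile).

    The quantile levels are symmetric about 0.5; 0.16 and 0.84 are an assumed
    choice that matches +/- one standard deviation for a normal distribution.
    """
    s = sorted(values)
    return quantile(s, lower_q), quantile(s, 0.5), quantile(s, upper_q)


# Hypothetical skewed intensities for one spectral channel (all non-negative).
intensities = [0, 0, 1, 1, 2, 2, 3, 5, 40, 120]

ms_low, ms_mid, ms_high = band_mean_std(intensities)
q_low, q_mid, q_high = band_quantiles(intensities)
print(f"mean +/- std band: {ms_low:.2f} .. {ms_mid:.2f} .. {ms_high:.2f}")
print(f"quantile band:     {q_low:.2f} .. {q_mid:.2f} .. {q_high:.2f}")
```

For this sample the lower edge of the mean ± std band falls below zero although no value is negative, while the quantile band stays within the range of the data.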
Citation: https://doi.org/10.5194/egusphere-2024-187-RC2
AC2: 'Reply on RC2', Mikhail Sofiev, 19 Jun 2024
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2024/egusphere-2024-187/egusphere-2024-187-AC2-supplement.pdf
Viewed
HTML | PDF | XML | Total | BibTeX | EndNote |
---|---|---|---|---|---|
368 | 102 | 36 | 506 | 20 | 20 |
Branko Sikoparija
Predrag Matavulj
Isidora Simovic
Predrag Radisic
Sanja Brdar
Vladan Minic
Danijela Tesendic
Evgeny Kadantsev
Julia Palamarchuk