A free, open-source method for automated mapping of quantitative mineralogy from energy-dispersive X-ray spectroscopy scans of rock thin sections

Reed, Miles Mark; Ferrier, Ken L.; Nachlas, William O.; Schneider, Bil; Arson, Chloé; Xu, Tingting; Shen, Xianda; West, Nicole

doi:10.5194/egusphere-2024-1017

Preprints

https://doi.org/10.5194/egusphere-2024-1017

09 Apr 2024

Abstract. Quantitative mapping of minerals in rock thin sections delivers data on mineral abundance, size, and spatial arrangement that are useful for many geoscience and engineering disciplines. Although automated methods for mapping mineralogy exist, these are often expensive, associated with proprietary software, or require programming skills, which limits their usage. Here we present a free, open-source method for automated mineralogy mapping from energy-dispersive spectroscopy (EDS) scans of rock thin sections. This method uses a random forest machine learning image classification algorithm within the QGIS geographic information system and Orfeo Toolbox, which are both free and open source. To demonstrate the utility of this method, we apply it to 14 rock thin sections from the well-studied Rio Blanco tonalite lithology of Puerto Rico. Measurements of mineral abundance inferred from our method compare favourably to previous measurements of mineral abundance inferred from X-ray diffraction and point counts on thin sections. The model-generated mineral maps agree with independent, manually delineated mineral maps at a mean rate of 95 %, with accuracies as high as 96 % for the most abundant phase (plagioclase) and as low as 72 % for the least abundant phase (apatite) in these samples. We show that the default random forest hyperparameters in Orfeo Toolbox yielded high accuracy in the model-generated mineral maps, and we demonstrate how users can determine the sensitivity of the mineral maps to hyperparameter values and input features. These results show that this method can be used to generate accurate maps of major mineral phases in rock thin sections using entirely free and open-source applications.
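The abundance and agreement figures quoted in the abstract can be illustrated with a minimal sketch. It assumes two small, made-up label maps (one manual, one model-generated) and one possible definition of per-mineral accuracy, namely the fraction of manually labelled pixels of a mineral that the model labels the same way; the paper may define it differently. The map contents and function names are hypothetical and are not data or code from the study.

```python
from collections import Counter

# Hypothetical 4 x 6 label maps (one mineral name per pixel); not data from the paper.
manual = [
    ["plag", "plag", "plag", "qtz", "qtz", "bt"],
    ["plag", "plag", "plag", "qtz", "qtz", "bt"],
    ["plag", "plag", "ap",   "qtz", "bt",  "bt"],
    ["plag", "plag", "ap",   "qtz", "bt",  "bt"],
]
model = [
    ["plag", "plag", "plag", "qtz", "qtz", "bt"],
    ["plag", "plag", "plag", "qtz", "qtz", "bt"],
    ["plag", "plag", "plag", "qtz", "bt",  "bt"],
    ["plag", "plag", "ap",   "qtz", "qtz", "bt"],
]


def flatten(label_map):
    """Turn a 2-D label map into a flat list of pixel labels."""
    return [label for row in label_map for label in row]


def modal_abundance(label_map):
    """Fraction of pixels assigned to each mineral."""
    labels = flatten(label_map)
    counts = Counter(labels)
    return {mineral: n / len(labels) for mineral, n in counts.items()}


def overall_agreement(reference, predicted):
    """Fraction of pixels with the same label in both maps."""
    pairs = list(zip(flatten(reference), flatten(predicted)))
    return sum(r == p for r, p in pairs) / len(pairs)


def per_mineral_agreement(reference, predicted):
    """For each reference mineral, the fraction of its pixels that match."""
    total, matched = Counter(), Counter()
    for r, p in zip(flatten(reference), flatten(predicted)):
        total[r] += 1
        matched[r] += (r == p)
    return {mineral: matched[mineral] / total[mineral] for mineral in total}


print(modal_abundance(model))
print(overall_agreement(manual, model))
print(per_mineral_agreement(manual, model))
```

In this toy case the scarce mineral ("ap") has the lowest per-mineral agreement because a single mislabelled pixel is a large share of its area, which is one plausible reason a rare phase can score lower than an abundant one.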

Received: 03 Apr 2024 – Discussion started: 09 Apr 2024

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Download & links

Preprint (PDF, 37937 KB)

Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Journal article(s) based on this preprint

02 Sep 2025

A free, open-source method for automated mapping of quantitative mineralogy from energy-dispersive X-ray spectroscopy scans of rock thin sections

Miles M. Reed, Ken L. Ferrier, William O. Nachlas, Bil Schneider, Chloé Arson, Tingting Xu, Xianda Shen, and Nicole West

Geosci. Instrum. Method. Data Syst., 14, 193–209, https://doi.org/10.5194/gi-14-193-2025, 2025

Interactive discussion

Status: closed

RC1: 'Comment on egusphere-2024-1017', Anonymous Referee #1, 25 Apr 2024

Review: A free, open-source method for automated mapping of quantitative mineralogy from energy-dispersive X-ray spectroscopy scans of rock thin sections
General Comments
The manuscript describes the analysis of 14 tonalite thin sections by SEM-EDS mapping and demonstrates an approach to mineral identification using the random forest classifier. The results are analyzed for the modal mineralogy and some geometric properties of the samples. The focus is on the approach, which was realized through open-source software only, making it accessible to any user of such data without the need for programming skills or expensive software. The need for and advantages of free software are an important topic within the community. High prices and restrictions of proprietary software are often a problem. Therefore, developing and providing access to this kind of analysis is an important contribution to the scientific community and the readers of the journal.
In general, the results of the manuscript are convincing and the conclusions sound. The findings are well presented. The discussion might need some additions (see specific comments below). The structure, however, seemed a bit problematic, as there is some redundancy or repetition between the general approach and the method used. The reader has to move around a lot in the manuscript when a topic is split into four different chapters. The reader would benefit from a more condensed description.
I would suggest making a shorter version of the general approach (current chapter 2) with no discussion or recommendations (l. 134f for the procedure), but just the plain and neutral description of the method. Then in chapter 3 explain in more detail why certain parameters are important and were used in this study. Then the discussion should mention further explanations and limitations of the methods. Otherwise, parts of the manuscript have no real separation of results and discussion/interpretation.
As a personal preference: The style of the manuscript is in parts similar to a user manual. For a scientific paper, I would suggest aiming for a more neutral style of description in the results part.
In general, this work is useful research and should be published after the comments have been addressed.
Specific comments:
L. 28: Do you consider hyperparameter a generally known expression? If not, it would be better to use a more general word in the abstract.

Ll. 39f: I would suggest quoting the original papers regarding quantitative mineralogy, e.g.:

• Sutherland, D. N. and Gottlieb, P.: Application of automated quantitative mineralogy in mineral processing, Miner. Eng., 4, 753–762, 1991.

• Sutherland, D., Gottlieb, P., Jackson, R., Wilkie, G., and Stewart, P.: Measurement in section of particles of known composition, Miner. Eng., 1, 317–326, 1988.

• Gu, Y.: Automated scanning electron microscope based mineral liberation analysis: An introduction to JKMRC/FEI mineral liberation analyser, Journal of Minerals and Materials Characterization and Engineering, 2, 33, 2003.

L. 52f: I am not aware of a WDS system that can map mineral phases. I would be interested in more information.

L. 53f: I am not sure about LA-ICP-MS, here. From the quotation, you might mean Laser Induced Breakdown Spectroscopy (LIBS).

L. 77: “predicted mineralogy”. You cannot predict mineralogy. What you mean is the modal mineralogy or mineral abundance.

Ll. 88ff: The last paragraph of the introduction is normally a short summary of the hypothesis or goal of the paper and how it might be reached. You start with the goal in l. 88, but then continue to explain classification algorithms. Maybe you could move the algorithm part up and leave the goals to the last paragraph.

L. 96. “random sampling”: Do you mean a random sample?

L. 110. “In the remainder of this study”: Better: Furthermore, …

L. 113. “hope”: You could write that you intend to make it available to a broader community, which is good, but hope is the wrong word in this context. An ideal scientist would be neutral. The same applies to l.614.

L. 118: You should clarify that you mean EDS data from a SEM. There are other methods producing EDS spectra, some of which you list in the introduction, so until you mention it, the reader assumes EDS data from any source.

L. 127: How do you deal with overlapping peaks (e.g. S/As/Pb or Ti/Ba etc.)? What influence would overlaps have on the classification? Would it be possible to have several lines (e.g. K-alpha, K-beta, L-alpha) for one element?

Ll. 132ff: This raises several questions: If you need to know the minerals already, how would you do that? This would mean a lot of work done twice (manual identification + RF classification). What happens, if you miss one or more minerals? Can you analyze unknown samples?

L. 138 “electron beam”: It could also be X-rays or laser from what you mentioned in the introduction. Consider also the comment for l. 118.

L. 139: How would the smoothing help, when a peak was identified incorrectly? Please clarify.

L. 158: I believe 30% training data would be standard. How did you select the training areas? Randomly?

L. 165 expression “mineral phase”: “phase” is a defined technical term. The algorithm assigns a mineral name to a pixel, since EDS data contain only chemical information, which cannot always distinguish phases (polymorphs, limitations in element analysis such as light elements).

Ll. 164ff: What about minerals that have not been trained? Are all pixels classified? Can the similarity between training pixel and the classified pixel be described?

Ll. 173f: How did you find that 10 pixels are best? I assume this would vary based on the sample type, grain size and resolution of the image. Can you explain why pixel sizes influence accuracy? I would assume that, with less spatial blur (mean filter), the results would be more accurate. Could it be a user bias, as users prefer large grains for training areas?

L. 175: You should explain voting ties and how they are treated in the introduction.

L. 221: How can a beam step size and magnification result in a map? Please rephrase. Also, is beam step size the same as pixel size?

Chapter 3.2: If you have a mean raster with 20’000’000 pixels and measure 200 ms per pixel, how can you calculate an acquisition time of 3.5 h? Please explain.

L. 239: Why this recommendation? What if apatite is important to a study? Why would you combine minerals? One would assume that this affects accuracy negatively. Do you have any information on how many training areas are necessary and how large they should be? You should have a discussion on the selection of training areas in the discussion section.

L. 246f: Move the explanation of virtual rasters to the overview of the method or the introduction.

Figure 2: Could you plot the data from XRD and point counting as well for comparison? Then you could shorten the paragraph above.

L. 337: Why did you choose these three thin sections?

Ll. 363f “This indicates that the models correctly predicted apatite when attempted but the models often neglected to predict apatite.”: Why would the models not try to predict apatite, if they are trained for apatite? Please explain.

L. 356 and Ll. 391f: You use questions to structure your manuscript. I would suggest avoiding that and using a more descriptive style.

Table 3: You could add a line with the standard parameters for comparison. If I understood correctly, you used the standard values for what you showed in figure 2?

Ll. 535-543: How do you evaluate the effort to label training data compared to what you describe? To what extent can the training data be extrapolated to unknown samples or other studies? As the training requires most of the work and time, what are the limits of this approach? – Some of the questions are answered further on in the manuscript, which raises the question of whether the text could be restructured.

Ll. 545-558: How can you make sure that you actually have identified only one grain of e.g. biotite? With EDX, you get the chemical distribution, but if you have monomineralic aggregates or touching biotite grains, how can you separate those grains in order to calculate correct grain areas/sizes? – After reading chapter 5.3, I realized that you addressed part of these comments. The manuscript would benefit in structure and clarity if you could combine the paragraphs or address the problems/limitations directly in the discussion parts where you mention the analysis.

General Discussion: Do you have information on the time invested in creating and testing the models? Can you compare the time required for a mineralogical analysis (e.g. XRD) with that of your approach?
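The arithmetic behind the Chapter 3.2 comment above can be checked directly. This is a minimal sketch that takes the referee's figures (20,000,000 pixels, 200 ms per pixel, 3.5 h) at face value and assumes a strictly serial acquisition with one dwell per pixel, which may not be how the manuscript's scans were acquired. The function names are hypothetical.

```python
def acquisition_hours(n_pixels: int, dwell_seconds: float) -> float:
    """Hours needed to visit every pixel once at a fixed dwell time."""
    return n_pixels * dwell_seconds / 3600.0


def implied_dwell_ms(n_pixels: int, hours: float) -> float:
    """Per-pixel dwell time (ms) implied by a total acquisition time."""
    return hours * 3600.0 / n_pixels * 1000.0


n_pixels = 20_000_000

# 20,000,000 pixels x 0.2 s = 4,000,000 s, i.e. roughly 1111 hours.
serial_hours = acquisition_hours(n_pixels, 0.200)

# Conversely, 3.5 h spread over 20,000,000 pixels is 0.63 ms per pixel.
dwell_ms = implied_dwell_ms(n_pixels, 3.5)

print(f"{serial_hours:.0f} h at 200 ms per pixel")
print(f"{dwell_ms:.2f} ms per pixel for a 3.5 h scan")
```

Under these assumptions the two figures differ by a factor of about 300, which suggests that the 200 ms value refers to something other than a single per-pixel dwell over the full raster.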
Technical corrections:

L. 108 “requires no programming on the part of the user”: This sounds a bit odd. Do you mean no programming by the user?

L. 177: Can a map be interrogated?

L. 219: “studied” instead of “study”?

L. 500: comma is unnecessary.

Citation: https://doi.org/10.5194/egusphere-2024-1017-RC1

CC1: 'Comment on egusphere-2024-1017 minor revision, addition of one figure', Bernhard Schulz, 17 Jul 2024

The text is well written and organised. The limits of the method are given. The authors tested their approach on one of the simplest geological objects, an undeformed granite. It apparently works by producing a spectral map. However, this approach to SEM-AM is of very limited use for many mineral processing applications when compared to professional software platforms, e.g. MLA 3.1 by FEI.
The authors should provide a further schematic figure which explains how the primary EDS signal is transferred to the element map; see the more detailed comment in the text.

Citation: https://doi.org/10.5194/egusphere-2024-1017-CC1

RC2: 'Comment on egusphere-2024-1017', Anonymous Referee #2, 09 Feb 2025

The authors should address the real added value of their work with respect to the pertinent literature.
The assumption that results obtained from training on one region are useful for a separate test area is not clearly stated.

Citation: https://doi.org/10.5194/egusphere-2024-1017-RC2

AC1: 'Responses to referee and public comments on egusphere-2024-1017', Miles Reed, 24 Mar 2025

Thank you for the reviews and comments. The attached pdf contains our detailed responses.

Citation: https://doi.org/10.5194/egusphere-2024-1017-AC1

Peer review completion

AR – Author's response | RR – Referee report | ED – Editor decision | EF – Editorial file upload

AR by Miles Reed on behalf of the Authors (24 Mar 2025) Author's response Author's tracked changes Manuscript

ED: Publish as is (07 Jun 2025) by Francesco Soldovieri

AR by Miles Reed on behalf of the Authors (11 Jun 2025) Manuscript

Viewed

Total article views: 1,012 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
624	353	35	1,012	32	48

Views and downloads (calculated since 09 Apr 2024)

Month	HTML	PDF	XML	Total
Apr 2024	123	36	9	168
May 2024	39	8	1	48
Jun 2024	56	21	3	80
Jul 2024	50	31	6	87
Aug 2024	16	14	2	32
Sep 2024	17	11	0	28
Oct 2024	15	24	0	39
Nov 2024	13	14	1	28
Dec 2024	9	11	0	20
Jan 2025	12	17	2	31
Feb 2025	39	15	2	56
Mar 2025	31	26	1	58
Apr 2025	27	23	0	50
May 2025	10	21	3	34
Jun 2025	47	33	2	82
Jul 2025	25	29	2	56
Aug 2025	92	19	1	112
Sep 2025	3	0	0	3
Oct 2025	0	0	0	0
Nov 2025	0	0	0	0
Dec 2025	0	0	0	0
Jan 2026	0	0	0	0
Feb 2026	0	0	0	0
Mar 2026	0	0	0	0
Apr 2026	0	0	0	0

Viewed (geographical distribution)

Total article views: 1,007 (including HTML, PDF, and XML) Thereof 1,007 with geography defined and 0 with unknown origin.

Latest update: 11 Apr 2026

Short summary

We constructed an easy-to-use, open-source method for mapping minerals in rock thin sections. This method is implemented with the geographical information system QGIS and the Orfeo Toolbox plugin using random forest image classification on scanning electron microscope data. We applied the method to 14 rock thin sections. Mineral abundance estimates from our method compare favorably to previously published estimates, and the resulting maps agree with manually derived mineral maps in both location and mineral type at a rate of 96 %.
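As a rough illustration of pixel-based random forest classification of element maps, the sketch below uses scikit-learn in place of the QGIS and Orfeo Toolbox workflow described in the summary, so it is not the authors' implementation. Everything in it is assumed for illustration: the synthetic 60 x 60 image, the four element bands, the values in `signatures`, and the 300-pixel training sample.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Hypothetical mean intensities in four element bands (Si, Al, Ca, Fe).
minerals = ["quartz", "plagioclase", "biotite"]
signatures = np.array([
    [0.90, 0.05, 0.02, 0.03],  # quartz: Si-dominated
    [0.55, 0.30, 0.12, 0.03],  # plagioclase: Si, Al, Ca
    [0.40, 0.20, 0.02, 0.38],  # biotite: Fe-bearing
])

# Synthetic "thin section": a label map plus noisy element bands.
height, width = 60, 60
truth = rng.choice(len(minerals), size=(height, width), p=[0.3, 0.5, 0.2])
bands = signatures[truth] + rng.normal(0.0, 0.03, size=(height, width, 4))

# Flatten to (n_pixels, n_bands): each pixel is one sample.
X = bands.reshape(-1, 4)
y = truth.ravel()

# Treat a small random subset of pixels as manually labelled training data.
train_idx = rng.choice(X.shape[0], size=300, replace=False)
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X[train_idx], y[train_idx])

# Classify every pixel and reshape the result back into a mineral map.
mineral_map = model.predict(X).reshape(height, width)

# Modal abundance is the fraction of pixels assigned to each mineral.
counts = np.bincount(mineral_map.ravel(), minlength=len(minerals))
abundance = counts / counts.sum()

# Pixel-wise agreement with the synthetic reference map.
agreement = float((mineral_map == truth).mean())

for name, fraction in zip(minerals, abundance):
    print(f"{name}: {fraction:.3f}")
print(f"agreement: {agreement:.3f}")
```

The synthetic classes are well separated, so agreement here is much easier to achieve than on real EDS data, where mixed pixels, grain boundaries, and scarce phases make classification harder.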

