Preprints
https://doi.org/10.5194/egusphere-2024-1141
11 Jun 2024

OpenMindat v1.0.0 R package: A machine interface to Mindat open data to facilitate data-intensive geoscience discoveries

Xiang Que, Jiyin Zhang, Weilin Chen, Jolyon Ralph, and Xiaogang Ma

Abstract. Data-driven knowledge discovery technologies such as machine learning and deep learning are revealing increasingly interesting patterns in complex, large-scale Earth science data. Mindat ("mindat.org"), one of the world's largest mineral databases, contains vast amounts of knowledge that has yet to be mined. Through a project called OpenMindat, an application programming interface (API) has been set up to enable open data query and access from Mindat. This paper presents an open-source R package (OpenMindat v1.0.0) that connects users to that API, supporting data-intensive query and access with the aim of enabling new insights and geoscience discoveries. The package is designed to be user-friendly and extensible. It exposes the capabilities of the Mindat API, covering the data subjects of geomaterials (e.g., rocks, minerals, synonyms, varieties, mixtures, and commodities), localities, and the International Mineralogical Association (IMA)-approved mineral list. In addition to providing functions for querying those data subjects, the package supports exporting data to formats such as CSV, JSON-LD, and TTL. In practice, these functions require only minimal coding, which makes them convenient for users with limited experience in the R environment. The package is openly available on GitHub under the MIT license, together with detailed tutorial documentation. Mineralogy and many other geoscience disciplines now face opportunities enabled by open data. Research topics such as mineral network analysis, mineral association rule mining, mineral ecology, mineral evolution, and critical minerals have already benefited from Mindat's open data. We hope this R package will accelerate such data-intensive studies and lead to more scientific discoveries.
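
To illustrate what exporting query results to CSV, JSON-LD, and TTL can involve, the following is a minimal sketch in Python using only the standard library. It is not the OpenMindat R package or its API: the record fields, the `example.org` namespaces, and the function names are all assumptions made for this illustration.

```python
import csv
import io
import json

# Hypothetical records; field names are illustrative, not the Mindat API schema.
RECORDS = [
    {"id": 1, "name": "Quartz", "formula": "SiO2"},
    {"id": 2, "name": "Halite", "formula": "NaCl"},
]

# Hypothetical namespaces used only for this sketch.
BASE = "https://example.org/mineral/"
VOCAB = "https://example.org/vocab#"


def to_csv(records):
    """Serialize records as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=["id", "name", "formula"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


def to_jsonld(records):
    """Serialize records as a JSON-LD document with an inline @context."""
    doc = {
        "@context": {"name": VOCAB + "name", "formula": VOCAB + "formula"},
        "@graph": [
            {"@id": f"{BASE}{r['id']}", "name": r["name"], "formula": r["formula"]}
            for r in records
        ],
    }
    return json.dumps(doc, indent=2)


def to_turtle(records):
    """Serialize records as Turtle (TTL) triples."""
    lines = [f"@prefix v: <{VOCAB}> .", ""]
    for r in records:
        # json.dumps yields a double-quoted, escaped string literal.
        lines.append(f"<{BASE}{r['id']}> v:name {json.dumps(r['name'])} ;")
        lines.append(f"    v:formula {json.dumps(r['formula'])} .")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_csv(RECORDS))
    print(to_jsonld(RECORDS))
    print(to_turtle(RECORDS))
```

The three functions take the same list of records, so the choice of output format stays independent of how the records were retrieved. A real workflow would likely use a dedicated RDF library rather than hand-built strings.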

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Journal article(s) based on this preprint

23 Jul 2025
The OpenMindat v1.0.0 R package: a machine interface to Mindat open data to facilitate data-intensive geoscience discoveries
Xiang Que, Jiyin Zhang, Weilin Chen, Jolyon Ralph, and Xiaogang Ma
Geosci. Model Dev., 18, 4455–4467, https://doi.org/10.5194/gmd-18-4455-2025, 2025

Interactive discussion

Status: closed

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • CC1: 'Comment on egusphere-2024-1141', Wenqiang Tang, 14 Nov 2024
  • CC2: 'Comment on egusphere-2024-1141', Sensen Wu, 14 Nov 2024
  • RC1: 'Comment on egusphere-2024-1141', Anonymous Referee #1, 25 Nov 2024
  • RC2: 'Comment on egusphere-2024-1141', Dominik Hezel, 21 Dec 2024
  • AC1: 'Comment on egusphere-2024-1141', Xiaogang Ma, 31 Jan 2025
  • AC2: 'Comment on egusphere-2024-1141', Xiaogang Ma, 31 Jan 2025

Peer review completion

AR: Author's response | RR: Referee report | ED: Editor decision | EF: Editorial file upload
AR by Xiaogang Ma on behalf of the Authors (31 Jan 2025)
ED: Referee Nomination & Report Request started (01 Feb 2025) by Andy Wickert
RR by Dominik Hezel (04 Mar 2025)
RR by Anonymous Referee #1 (22 Mar 2025)
ED: Publish subject to technical corrections (08 May 2025) by Andy Wickert
AR by Xiaogang Ma on behalf of the Authors (14 May 2025)

Viewed

Total article views: 1,028 (including HTML, PDF, and XML)
  • HTML: 665
  • PDF: 207
  • XML: 156
  • Total: 1,028
  • BibTeX: 37
  • EndNote: 39
Views and downloads, including cumulative totals, are calculated since 11 Jun 2024.

Viewed (geographical distribution)

Total article views: 1,015 (including HTML, PDF, and XML), of which 1,015 have a defined geography and 0 are of unknown origin.
Latest update: 23 Jul 2025

The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Short summary
This paper describes an R package that serves as the machine interface to the open data of Mindat.org, one of the world's most widely used databases of mineral species and their distribution. Over the past decades, many geoscientists have used Mindat data, but an open data service was never fully established. The machine interface described in this paper offers an efficient way to meet these data needs.