This work is distributed under the Creative Commons Attribution 4.0 License.
Multiphysics property prediction from hyperspectral drill core data
Abstract. Hyperspectral data provides rich quantitative information on both the mineralogical and fine-scale textural properties of rocks, which in turn largely control their petrophysical characteristics. We therefore developed a deep learning model to predict petrophysical properties directly from hyperspectral drill core data. Our model learns relevant features from high-dimensional hyperspectral data and co-registered sonic, gamma-gamma density and gamma-ray logs to infer slowness, density, and gamma-ray counts. We demonstrate the performance of this approach on data acquired in the Spremberg region of Germany. Our results show that, with meticulous pre-processing and thorough data cleaning, one can overcome the difference in acquisition resolution and learn the relationship between hyperspectral data and petrophysics. Using a test dataset from a spatially independent borehole, we generate a pixel-resolution (≈ 1 mm²) model of the petrophysical properties and resample it to match the measured logs. This test indicates substantial accuracy, with R² scores and root-mean-square errors (RMSE) of 0.70 and 16.55 μs m⁻¹, 0.86 and 0.06 g cm⁻³, and 0.90 and 15.29 API for the slowness, density, and gamma-ray readings, respectively. Overall, our findings lay the groundwork for building deep learning models that can learn to predict physical and mechanical rock properties from hyperspectral data. Such models could provide the high-resolution but large-extent data needed to bridge the different scales of mechanical and petrophysical characterisation.
Status: open (until 07 Feb 2025)
- RC1: 'Comment on egusphere-2024-3448', Andres Ortega Lucero & Steven Micklethwaite (co-review team), 19 Dec 2024
Please find attached the document revision.
- RC2: 'Comment on egusphere-2024-3448', McLean Trott, 20 Jan 2025
Great job, interesting work. A few comments and suggested changes:
Some food for thought... If I'm using gamma logs as input parameters to predict formation, for instance, and I also have VNIR-SWIR-MWIR-LWIR data for the same holes and find that it can accurately predict gamma values, why not predict lithology directly? The same logic applies to sonic logs: if the reason for acquiring sonic logs is to log porosity/permeability, why not predict that directly rather than travel times? This is just a thought exercise to spur you to think about end-user applications; it in no way invalidates your work.
Section 3.3 (Data co-registration). Downhole geophysical tools typically start measuring distance from surface at 0 and measure in a linear fashion downhole based on how much line has been unspooled. They really have the best depth registration of virtually all drillhole analysis methods, including core scanning. Core scanning hardware typically registers depth between driller blocks or on a per-box basis. This is less accurate and, depending on the circumstances, may differ significantly from the depths provided by wireline geophysical tools. Section 3.3 does not address this issue of co-registration. Or perhaps there is an underlying assumption that the scanned data depths are accurate and correspond to the geophysical depths? Either way, this should be addressed or at least acknowledged; it is one of the greatest barriers to performing ML workflows on drillhole data. One possible reconciliation step is sketched below.
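For instance, a minimal sketch (not the authors' method) of estimating a bulk depth shift between a wireline log and a core-scan-derived proxy by cross-correlation; the variable names, synthetic data, and the 0.1 m sample interval are all illustrative assumptions:

```python
# A minimal sketch: estimate a bulk depth shift between a wireline log and a
# core-scan-derived proxy via cross-correlation on a common depth grid.
# Names, synthetic data, and the 0.1 m sample interval are assumptions.
import numpy as np

step = 0.1                                    # depth sample interval (m), assumed
rng = np.random.default_rng(0)
depth = np.arange(100.0, 200.0, step)
log = rng.normal(size=depth.size)             # wireline log on its own depth scale
proxy = np.roll(log, 7) + 0.1 * rng.normal(size=depth.size)  # core-scan proxy with a 0.7 m offset

# Standardise both series and locate the lag with maximum cross-correlation.
a = (log - log.mean()) / log.std()
b = (proxy - proxy.mean()) / proxy.std()
xcorr = np.correlate(a, b, mode="full")
lag = int(np.argmax(xcorr)) - (depth.size - 1)

# Sign convention depends on which series is taken as reference; verify it
# against a known offset before applying the shift to real data.
print(f"estimated depth shift: {lag * step:+.1f} m")
```

In practice the shift may vary downhole (core loss, stick-up corrections), so a piecewise or windowed version of this idea would likely be needed.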
Section 3.5 (Data balancing): You've used HDBSCAN to cluster the data and further identify noise (class -1), which you've removed from the dataset. You've mentioned resiliency to hyperparameter selection; it is actually very well established that HDBSCAN outcomes are highly sensitive to hyperparameter selection, particularly the distance metric, min_samples, and min_cluster_size parameters. scikit-learn can automate hyperparameter selection using Randomized Search Cross Validation, which seeks to optimize the validity index over iterated hyperparameters; otherwise hyperparameter tuning is highly manual and hugely impacts the number of clusters and cluster distributions returned. I'd strongly suggest addressing this; one such search is sketched below.
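For instance, a minimal sketch of a randomized search over HDBSCAN hyperparameters scored with the DBCV validity index; the synthetic data, parameter ranges, and number of draws are illustrative assumptions, not the authors' settings:

```python
# A hedged sketch: randomized hyperparameter search for HDBSCAN, scoring each
# candidate with the DBCV validity index. Data and ranges are assumptions.
import numpy as np
import hdbscan
from hdbscan.validity import validity_index

rng = np.random.default_rng(42)
X = rng.random((2000, 10)).astype(np.float64)    # stand-in for the hyperspectral feature matrix

best_score, best_params = -np.inf, None
for _ in range(25):                              # number of random draws (assumed)
    params = {
        "min_cluster_size": int(rng.integers(20, 200)),
        "min_samples": int(rng.integers(5, 50)),
        "metric": "euclidean",
    }
    labels = hdbscan.HDBSCAN(**params).fit_predict(X)
    n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
    if n_clusters < 2:
        continue                                 # DBCV needs at least two real clusters
    score = validity_index(X, labels)            # higher is better, in [-1, 1]
    if score > best_score:
        best_score, best_params = score, params

print(best_params, best_score)
```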
In the 3rd paragraph of Section 3.5 you describe using stratified sampling; on what category did you stratify? The drillhole IDs? The HDBSCAN clusters? Did you stratify on bins of a petrophysical parameter? Please clarify; one possible option is sketched below.
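For instance, if the HDBSCAN cluster labels were the stratification target, a minimal sketch (with illustrative synthetic data; nothing here is taken from the paper) might look like:

```python
# A minimal sketch, assuming the HDBSCAN cluster labels are the stratification
# target; the data and split fraction are illustrative assumptions.
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((1000, 10))                   # feature matrix (e.g. hyperspectral pixels)
y = rng.random(1000)                         # petrophysical target (e.g. density)
clusters = rng.integers(0, 5, size=1000)     # stand-in for HDBSCAN cluster labels

# stratify=clusters keeps each cluster's proportion identical in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=clusters, random_state=0
)
```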
In general I think you've done a great job of transmitting complex themes accessibly. The workflow described, aside from the above points, is clear and seems robust.
Viewed
HTML | PDF | XML | Total | BibTeX | EndNote
---|---|---|---|---|---
141 | 49 | 9 | 199 | 5 | 5