Preprints
https://doi.org/10.5194/egusphere-2024-2818
30 Sep 2024

Unbiased statistical length analysis of linear features: Adapting survival analysis to geological applications

Gabriele Benedetti, Stefano Casiraghi, Daniela Bertacchi, and Andrea Luigi Paolo Bistacchi

Abstract. A proper quantitative statistical characterization of fracture length (or height) is of paramount importance when analysing outcrops of fractured rocks. Past literature suggested a non-parametric approach, based on circular scanlines, for the unbiased estimation of the mean fracture length. However, requirements have shifted, and there is now an increasing demand for parametric solutions that correctly estimate and compare all the parameters (e.g. mean and standard deviation) of several types of distributions. These changing requirements have highlighted the absence, in the geological literature, of properly structured theoretical work on this topic, and in particular on the different biases that affect this estimate. Here we propose to tackle the right-censoring bias, caused by the limited size of outcrops relative to fracture length, by applying survival analysis techniques. Survival analysis is a branch of statistics focused on modelling time-to-event data and on correctly estimating model parameters from data affected by censoring. After discussing both theoretical and practical aspects of survival analysis applied to geological datasets, we propose a novel approach for selecting the most representative parametric model (i.e. statistical distribution), combining a direct visual approach with distance statistics modified to accommodate censored data. The proposed approach has been applied to real outcrop data, correctly estimating censored length distributions. We also show the effect of the censoring percentage on crude parametric estimates that do not use this paradigm. The theory and techniques discussed here are wrapped in an easily installable open-source Python package called FracAbility (https://github.com/gecos-lab/FracAbility).
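
To make the right-censoring bias described in the abstract concrete, the following minimal sketch compares a crude mean with a censoring-aware estimate on synthetic data. It is not the FracAbility API and not the authors' workflow. It assumes, purely for illustration, exponentially distributed lengths (chosen because the censored maximum-likelihood estimate has a closed form) and a single fixed cut-off `window`, which is a strong simplification of a real outcrop boundary. All names in the block are hypothetical.

```python
import random


def exponential_mean_censored(lengths, censored):
    """Maximum-likelihood mean of an exponential model under right censoring.

    lengths  : measured lengths (for censored traces, only the visible part)
    censored : parallel list of booleans, True when the trace is cut off
    The estimate is the total measured length divided by the number of
    complete (uncensored) traces.
    """
    n_complete = sum(1 for c in censored if not c)
    if n_complete == 0:
        raise ValueError("at least one complete (uncensored) length is required")
    return sum(lengths) / n_complete


def naive_mean(lengths):
    """Crude estimate that treats every measured length as complete."""
    return sum(lengths) / len(lengths)


# Synthetic example (assumed values, not data from the preprint).
rng = random.Random(42)
true_mean = 10.0
window = 8.0  # hypothetical fixed censoring length

true_lengths = [rng.expovariate(1.0 / true_mean) for _ in range(5000)]
lengths = [min(x, window) for x in true_lengths]   # what can be measured
censored = [x > window for x in true_lengths]      # which traces are cut off

naive = naive_mean(lengths)
corrected = exponential_mean_censored(lengths, censored)
print(f"censored fraction: {sum(censored) / len(censored):.2f}")
print(f"naive mean: {naive:.2f}  censoring-aware mean: {corrected:.2f}")
```

In this toy setting the crude mean falls well below the true mean, while the censoring-aware estimate recovers it approximately. Other candidate distributions generally need a numerical maximization of the censored likelihood rather than a closed form.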

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Status: final response (author comments only)

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • CC1: 'Comment on egusphere-2024-2818', Stephen Laubach, 05 Nov 2024
    • AC1: 'Reply on CC1', Gabriele Benedetti, 08 Nov 2024
  • RC1: 'Comment on egusphere-2024-2818', Sarah Weihmann, 22 Nov 2024
  • RC2: 'Comment on egusphere-2024-2818', David Healy, 05 Dec 2024

Data sets

Input shapefiles (Stefano Casiraghi): https://github.com/gecos-lab/FracAbility/tree/main/paper_materials

Model code and software

FracAbility source code (Gabriele Benedetti): https://github.com/gecos-lab/FracAbility


Viewed

Total article views: 446 (including HTML, PDF, and XML)
  • HTML: 247
  • PDF: 107
  • XML: 92
  • Total: 446
  • BibTeX: 2
  • EndNote: 4
Views and downloads calculated since 30 Sep 2024.

Viewed (geographical distribution)

Total article views: 433 (including HTML, PDF, and XML), of which 433 have a defined geography and 0 have an unknown origin.
Latest update: 13 Dec 2024
Short summary
At any scale, the limited size of a study area introduces a bias in the interpretation of linear features, known as right-censoring bias. We show the effects of ignoring this bias and apply survival analysis techniques to obtain unbiased estimates for multiple parametric distributions in three censored length datasets. Finally, we propose a novel approach to select the most representative model from a sensible candidate pool using the Probability Integral Transform technique.