Unbiased statistical length analysis of linear features: Adapting survival analysis to geological applications
Abstract. A sound quantitative statistical characterization of fracture length (or height) is of paramount importance when analysing outcrops of fractured rocks. Past literature suggested a non-parametric approach, based on circular scanlines, for the unbiased estimation of the mean fracture length. However, requirements have shifted, and there is now an increasing demand for parametric solutions that correctly estimate and compare all the parameters (e.g. mean and standard deviation) of several types of distributions. These changing requirements have highlighted the absence, in the geological literature, of properly structured theoretical work on this topic, and in particular on the different biases that affect these estimates. Here we propose to tackle the right-censoring bias, caused by the limited size of outcrops with respect to fracture length, by applying survival analysis, a branch of statistics focused on modelling time-to-event data and on correctly estimating model parameters from data affected by censoring. After discussing both theoretical and practical aspects of survival analysis applied to geological datasets, we propose a novel approach for selecting the most representative parametric model (i.e. statistical distribution), combining a direct visual approach with distance statistics modified to accommodate censored data. The proposed approach has been applied to real outcrop data, correctly estimating censored length distributions. We also show the effect of the censoring percentage on crude parametric estimates that do not account for censoring. The theory and techniques discussed here are wrapped in an easily installable open-source Python package called FracAbility (https://github.com/gecos-lab/FracAbility).
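To make the right-censoring bias concrete, the following minimal Python sketch compares a crude mean, which treats every measured length as complete, with a censoring-aware maximum likelihood estimate. It is only an illustration under strong assumptions and does not use or reproduce the FracAbility API: lengths are assumed to follow an exponential distribution, every fracture longer than a single fixed observable length `window` is assumed to be censored at that length (a simplification of real outcrop geometry), and the function names `simulate_outcrop`, `naive_mean` and `exponential_mean_censored` are hypothetical.

```python
import random


def simulate_outcrop(true_mean, window, n, seed=0):
    """Draw n exponential lengths and right-censor them at `window`.

    Assumption: a single fixed censoring length stands in for the
    outcrop boundary; real censoring depends on fracture position.
    """
    rng = random.Random(seed)
    true_lengths = [rng.expovariate(1.0 / true_mean) for _ in range(n)]
    measured = [min(x, window) for x in true_lengths]
    censored = [x > window for x in true_lengths]
    return measured, censored


def naive_mean(lengths):
    """Crude estimate: treat censored lengths as if they were complete."""
    return sum(lengths) / len(lengths)


def exponential_mean_censored(lengths, censored):
    """Maximum likelihood mean of an exponential model with right censoring.

    Complete lengths contribute the density and censored lengths the
    survival function, which gives: total measured length divided by
    the number of complete (uncensored) measurements.
    """
    n_complete = sum(1 for c in censored if not c)
    if n_complete == 0:
        raise ValueError("at least one uncensored length is required")
    return sum(lengths) / n_complete


if __name__ == "__main__":
    measured, censored = simulate_outcrop(true_mean=10.0, window=8.0, n=5000)
    share = sum(censored) / len(censored)
    print(f"censored fraction:     {share:.2f}")
    print(f"naive mean:            {naive_mean(measured):.2f}")
    print(f"censoring-aware mean:  {exponential_mean_censored(measured, censored):.2f}")
```

In this sketch the crude mean falls well below the true mean used in the simulation, while the censoring-aware estimate stays close to it. The exponential model is chosen only because its censored likelihood has a closed form; other distributions generally require numerical optimisation.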