Impact of sedimentary facies on machine learning of acoustic impedance from seismic data: lessons from a geologically realistic 3D model
We have developed a new machine-learning (ML) workflow that uses random forest (RF) regression to predict sedimentary-rock properties from stacked and migrated 3D seismic data. Training, validation, and testing are performed with 40 features extracted from a geologically realistic 46 × 66-trace model built in the Miocene Powderhorn Field in South Texas. We focus on how the RF model responds to sedimentary facies and on strategies for achieving better prediction under various data conditions. We use explained variation (R²) and root-mean-square (rms) prediction error to map the relationship between prediction quality and sedimentary facies. In the single-well model, the R² and rms error maps closely resemble sand-percentage (lithofacies) maps, showing that facies control the quality of the ML model. Training with a small well data set (1–10 wells) leads to low and unstable test scores (R² = 0.2–0.7). With more training wells (as many as 1000), the R² score increases and stabilizes (R² = 0.7–0.9), effectively correcting the facies bias. Stratigraphic and spatial features are useful and should be included. Weak to moderate random noise (−20 to −15 dB) lowers the training score only slightly (R² decrease < 0.05) and should not be a major concern. Models supported by sparse wells can outperform linear regression and model-based inversion and can be useful if applied with caution. In the best-case scenario (500 wells), the predicted model largely duplicates the true model, with a significant improvement in accuracy (R² = 0.85) and stability. Such results can be applied in most, if not all, exploration and production practices.
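As an illustration of how per-trace explained variation and rms error maps might be computed, the following is a minimal sketch, not the authors' implementation. It assumes the conventional definition of explained variation, 1 - SS_res / SS_tot, evaluated independently at each trace location. The function names, the nested-list layout of the models, and the sample values are all hypothetical.

```python
import math


def explained_variation(true_trace, pred_trace):
    """Assumed definition: 1 - SS_res / SS_tot for one trace."""
    mean_true = sum(true_trace) / len(true_trace)
    ss_res = sum((t - p) ** 2 for t, p in zip(true_trace, pred_trace))
    ss_tot = sum((t - mean_true) ** 2 for t in true_trace)
    if ss_tot == 0.0:
        return float("nan")  # undefined when the true trace is constant
    return 1.0 - ss_res / ss_tot


def rms_error(true_trace, pred_trace):
    """Root-mean-square prediction error for one trace."""
    n = len(true_trace)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(true_trace, pred_trace)) / n)


def score_maps(true_model, pred_model):
    """Return (r2_map, rms_map), one value per trace location.

    Hypothetical layout: model[i][j] is the list of impedance samples
    at inline i and crossline j.
    """
    r2_map, rms_map = [], []
    for true_row, pred_row in zip(true_model, pred_model):
        r2_map.append([explained_variation(t, p) for t, p in zip(true_row, pred_row)])
        rms_map.append([rms_error(t, p) for t, p in zip(true_row, pred_row)])
    return r2_map, rms_map


# Hypothetical 1 x 2 grid with four samples per trace (illustrative values only).
true_model = [[[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]]]
pred_model = [[[1.0, 2.0, 3.0, 4.0], [3.0, 4.0, 6.0, 7.0]]]

r2_map, rms_map = score_maps(true_model, pred_model)
print(r2_map)
print(rms_map)
```

Maps produced this way could then be compared visually with a sand-percentage map to look for the kind of facies control described above.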