Advanced Machine Learning Methods for Prediction of Fracture Closure Pressure

Learning Methods ◽

Pressure Derivative ◽

Fracture Geometry ◽

Personal Bias ◽

Fracture Closure

Abstract: Determining the closure pressure is crucial for optimal hydraulic fracturing design and successful execution of the fracturing treatment. Diagnostic tests run before the main fracturing treatment have advanced significantly, providing more information about the pattern of fracture propagation and fluid performance so that designs can be optimized. The goal is to inject a small volume of fracturing fluid to break down the formation and create a small fracture geometry; once pumping stops, the pressure decline is analyzed to observe the fracture closure. Many analytical methods, such as the G-function and square-root-of-time analyses, have been developed to determine the fracture closure pressure. In some cases the closure pressure is difficult to determine, and personal bias and differing field experience make it challenging to interpret changes in the pressure derivative slope and identify fracture closure. These conditions include: (1) high-permeability reservoirs, where fracture closure occurs very quickly because of rapid fluid leak-off; (2) extremely low-permeability reservoirs, which require a long shut-in time for the fluid to leak off before the closure pressure can be determined; and (3) non-ideal fluid leak-off behavior under complex conditions. The objective of this study is to apply machine learning methods to implement a predesigned algorithm that executes the required tasks and predicts the fracture closure pressure while minimizing the shortcomings of closure-pressure determination under non-ideal or subjective conditions. This paper demonstrates training different supervised machine learning algorithms to help predict fracture closure pressure. The workflow uses the datasets to train and optimize the models, which are subsequently used to predict the closure pressure of testing data. The output results are then compared with actual results from more than 120 DFIT data points.
We further propose an integrated approach to feature selection and dataset processing and study the effects of data processing on the success of the model prediction. The results from this study limit the subjectivity of the interpretation and the reliance on the experience of the personnel interpreting the data. We speculate that linear regression and MLP neural network algorithms can yield high scores in the prediction of fracture closure pressure.
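The train, predict, and compare workflow described in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions: the data are synthetic, the single input feature (instantaneous shut-in pressure, ISIP) and the linear relationship used to generate it are hypothetical and do not come from the paper, and a one-feature least-squares fit stands in for the linear regression and MLP models the authors mention.

```python
import random


def fit_simple_linear(xs, ys):
    """Closed-form ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r2_score(y_true, y_pred):
    """Coefficient of determination between observed and predicted values."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def make_synthetic_dfit(n, seed=0):
    """Hypothetical ISIP (psi) and closure pressure (psi); NOT real DFIT data."""
    rng = random.Random(seed)
    isip = [rng.uniform(5000.0, 9000.0) for _ in range(n)]
    closure = [0.9 * x - 200.0 + rng.gauss(0.0, 50.0) for x in isip]
    return isip, closure


# Train on 80% of the synthetic points, then score on the held-out 20%.
isip, closure = make_synthetic_dfit(120)
split = int(0.8 * len(isip))
intercept, slope = fit_simple_linear(isip[:split], closure[:split])
predicted = [intercept + slope * x for x in isip[split:]]
test_r2 = r2_score(closure[split:], predicted)
print(f"slope={slope:.3f}, held-out R^2={test_r2:.3f}")
```

Scoring on data the model never saw during fitting mirrors the comparison against testing data described above; a real study would use several engineered features and tuned models rather than this single-feature fit.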

Hydraulic Flow Unit Classification and Prediction Using Machine Learning Techniques: A Case Study from the Nam Con Son Basin, Offshore Vietnam

Energies ◽

10.3390/en14227714 ◽

2021 ◽

Vol 14 (22) ◽

pp. 7714

Author(s):

Ha Quang Man ◽

Doan Huy Hien ◽

Kieu Duy Thong ◽

Bui Viet Dung ◽

Nguyen Minh Hoa ◽

...

Keyword(s):

Machine Learning ◽

Flow Unit ◽

Support Vector ◽

Learning Methods ◽

Log Data ◽

Hydraulic Flow ◽

Core Data ◽

The test study area is the Miocene reservoir of the Nam Con Son Basin, offshore Vietnam. In the study we used unsupervised learning to automatically cluster hydraulic flow units (HU) based on flow zone indicators (FZI) in a core plug dataset. Then we applied supervised learning to predict HU by combining core and well log data. We tested several machine learning algorithms. In the first phase, we derived hydraulic flow unit clusters from the porosity and permeability of core data using unsupervised machine learning methods such as Ward’s method, K-means, Self-Organizing Map (SOM) and Fuzzy C-means (FCM). Then we applied supervised machine learning methods including Artificial Neural Networks (ANN), Support Vector Machines (SVM), Boosted Tree (BT) and Random Forest (RF). We combined both core and log data to predict HU logs for the full well section of the wells without core data. We used four wells with six logs (GR, DT, NPHI, LLD, LSS and RHOB) and 578 cores from the Miocene reservoir to train, validate and test the data. Our goal was to show that the correct combination of core and well log data would provide reservoir engineers with a tool for HU classification and estimation of permeability in a continuous geological profile. Our research showed that machine learning effectively boosts the prediction of permeability, reduces uncertainty in reservoir modeling, and improves project economics.
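The first phase described above (clustering core plugs by flow zone indicator) can be illustrated as follows. The FZI formula is the standard one (reservoir quality index divided by normalized porosity, with permeability in mD and porosity as a fraction). The four core plugs are invented values, and the tiny one-dimensional K-means on log10(FZI) is a simplified stand-in for the clustering methods the abstract lists, not the authors' implementation.

```python
import math


def flow_zone_indicator(permeability_md, porosity_frac):
    """FZI in micrometres from permeability (mD) and fractional porosity."""
    if not 0.0 < porosity_frac < 1.0:
        raise ValueError("porosity must be a fraction strictly between 0 and 1")
    rqi = 0.0314 * math.sqrt(permeability_md / porosity_frac)  # reservoir quality index
    phi_z = porosity_frac / (1.0 - porosity_frac)              # normalized porosity
    return rqi / phi_z


def kmeans_1d(values, k, iterations=50):
    """Minimal 1-D K-means; returns (labels, centres) with centres in ascending order."""
    ordered = sorted(values)
    # Initialise the centres at evenly spaced quantiles of the data.
    centres = [ordered[int((i + 0.5) * len(ordered) / k)] for i in range(k)]

    def assign(current):
        return [min(range(k), key=lambda j: abs(v - current[j])) for v in values]

    for _ in range(iterations):
        labels = assign(centres)
        updated = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            updated.append(sum(members) / len(members) if members else centres[j])
        if updated == centres:
            break
        centres = updated
    return assign(centres), centres


# Hypothetical core plugs: (permeability in mD, porosity as a fraction).
plugs = [(1.0, 0.10), (1.2, 0.11), (100.0, 0.20), (150.0, 0.22)]
log_fzi = [math.log10(flow_zone_indicator(k_md, phi)) for k_md, phi in plugs]
labels, centres = kmeans_1d(log_fzi, k=2)
print(labels)
```

Clustering on log10(FZI) rather than FZI itself is a common choice because FZI tends to span orders of magnitude; the resulting labels would then serve as the HU targets for the supervised phase.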

Comparative Analysis of Machine Learning Algorithms on Surface Enhanced Raman Spectra of Clinical Staphylococcus Species

Frontiers in Microbiology ◽

10.3389/fmicb.2021.696921 ◽

2021 ◽

Vol 12 ◽

Author(s):

Jia-Wei Tang ◽

Qing-Hua Liu ◽

Xiao-Cong Yin ◽

Ya-Cheng Pan ◽

Peng-Bo Wen ◽

...

Keyword(s):

Machine Learning ◽

Raman Spectroscopy ◽

Raman Spectra ◽

Surface Enhanced Raman Spectroscopy ◽

Learning Methods ◽

Surface Enhanced ◽

Surface Enhanced Raman

Raman spectroscopy (RS) is a widely used analytical technique based on the detection of molecular vibrations in a defined system, which generates Raman spectra that contain unique and highly resolved fingerprints of the system. However, the low intensity of the normal Raman scattering effect greatly hinders its application. The recently emerged surface enhanced Raman spectroscopy (SERS) technique overcomes this problem by mixing metal nanoparticles such as gold and silver with samples, which enhances the signal intensity of Raman effects by orders of magnitude compared with regular RS. In clinical and research laboratories, SERS offers great potential for fast, sensitive, label-free, and non-destructive microbial detection and identification with the assistance of appropriate machine learning (ML) algorithms. However, choosing an appropriate algorithm for a specific group of bacterial species remains challenging because, with the large volumes of data generated during SERS analysis, not all algorithms achieve a relatively high accuracy. In this study, we compared three unsupervised machine learning methods and 10 supervised machine learning methods on 2,752 SERS spectra from 117 Staphylococcus strains belonging to nine clinically important Staphylococcus species in order to test the capacity of different machine learning methods for rapid bacterial differentiation and accurate prediction. According to the results, density-based spatial clustering of applications with noise (DBSCAN) showed the best clustering capacity (Rand index 0.9733), while a convolutional neural network (CNN) topped all other supervised machine learning methods as the best model for predicting Staphylococcus species via SERS spectra (ACC 98.21%, AUC 99.93%).
Taken together, this study shows that machine learning methods are capable of distinguishing closely related Staphylococcus species and therefore have great potential for application in bacterial pathogen diagnosis in clinical settings.
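The Rand index used above to compare clustering methods is the fraction of sample pairs on which two labelings agree (both place the pair in the same group, or both place it in different groups). A minimal sketch follows; the label lists are invented for illustration and have nothing to do with the SERS dataset.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Fraction of sample pairs on which two labelings agree (range 0 to 1)."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label lists must have the same length")
    if len(labels_true) < 2:
        raise ValueError("need at least two samples to form a pair")
    agree = 0
    total = 0
    for i, j in combinations(range(len(labels_true)), 2):
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        if same_true == same_pred:
            agree += 1
        total += 1
    return agree / total


# Cluster names do not matter, only which samples are grouped together.
print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
print(rand_index([0, 0, 1, 1], [0, 1, 0, 1]))
```

Because the index depends only on pairwise co-membership, it can score an unsupervised method such as DBSCAN against known species labels even though the cluster identifiers are arbitrary.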

Defect Detection in Atomic Resolution Transmission Electron Microscopy Images Using Machine Learning

Mathematics ◽

10.3390/math9111209 ◽

2021 ◽

Vol 9 (11) ◽

pp. 1209

Author(s):

Philip Cho ◽

Aihua Wood ◽

Krishnamurthy Mahalingam ◽

Kurt Eyink

Keyword(s):

Electron Microscopy ◽

Machine Learning ◽

Transmission Electron Microscopy ◽

Point Defects ◽

Defect Detection ◽

Learning Methods ◽

Transmission Electron

Point defects play a fundamental role in the discovery of new materials due to their strong influence on material properties and behavior. At present, imaging techniques based on transmission electron microscopy (TEM) are widely employed for characterizing point defects in materials. However, current methods for defect detection predominantly involve visual inspection of TEM images, which is laborious and poses difficulties in materials where defect related contrast is weak or ambiguous. Recent efforts to develop machine learning methods for the detection of point defects in TEM images have focused on supervised methods that require labeled training data that is generated via simulation. Motivated by a desire for machine learning methods that can be trained on experimental data, we propose two self-supervised machine learning algorithms that are trained solely on images that are defect-free. Our proposed methods use principal components analysis (PCA) and convolutional neural networks (CNN) to analyze a TEM image and predict the location of a defect. Using simulated TEM images, we show that PCA can be used to accurately locate point defects in the case where there is no imaging noise. In the case where there is imaging noise, we show that incorporating a CNN dramatically improves model performance. Our models rely on a novel approach that uses the residual between a TEM image and its PCA reconstruction.
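The residual idea in the last sentence can be sketched for the noise-free case: fit PCA on defect-free data only, reconstruct a new image from the learned components, and flag the pixel where the reconstruction error is largest. Everything concrete here is an assumption made for illustration: images are flattened to short 1-D lists, a single principal component is used, the lattice pattern is invented, and the paper's CNN stage for noisy images is not included.

```python
import math


def mean_vector(rows):
    """Per-pixel mean over the training patches."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def first_principal_component(rows, mean, iterations=100):
    """Leading principal direction via power iteration on the covariance."""
    centred = [[x - m for x, m in zip(row, mean)] for row in rows]
    dim = len(mean)
    vec = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iterations):
        # Apply the (unnormalised) covariance matrix without forming it.
        new = [0.0] * dim
        for row in centred:
            score = sum(r * v for r, v in zip(row, vec))
            for i, r in enumerate(row):
                new[i] += score * r
        norm = math.sqrt(sum(x * x for x in new))
        if norm == 0.0:
            break
        vec = [x / norm for x in new]
    return vec


def pca_residual(image, mean, component):
    """Difference between an image and its one-component PCA reconstruction."""
    centred = [x - m for x, m in zip(image, mean)]
    score = sum(c * v for c, v in zip(centred, component))
    return [c - score * v for c, v in zip(centred, component)]


def locate_defect(image, mean, component):
    """Index of the pixel with the largest absolute residual."""
    residual = pca_residual(image, mean, component)
    return max(range(len(residual)), key=lambda i: abs(residual[i]))


# Hypothetical defect-free "lattice" patches that differ only in brightness.
base = [1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
training = [[a * b for b in base] for a in (0.8, 0.9, 1.0, 1.1, 1.2)]
mean = mean_vector(training)
component = first_principal_component(training, mean)

# A test patch with one missing atom column at index 4.
defective = [1.2 * b for b in base]
defective[4] = 0.0
print(locate_defect(defective, mean, component))
```

Training only on defect-free patches means a defect cannot be represented by the learned components, so it survives in the residual; with imaging noise the residual becomes harder to read, which is where the abstract reports that a CNN helps.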

Systematic literature review of machine learning methods used in the analysis of real-world data for patient-provider decision making

BMC Medical Informatics and Decision Making ◽

10.1186/s12911-021-01403-2 ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Alan Brnabic ◽

Lisa M. Hess

Keyword(s):

Machine Learning ◽

Decision Making ◽

Literature Review ◽

Systematic Literature Review ◽

Real World ◽

Learning Algorithms ◽

External Validation ◽

Learning Methods ◽

Abstract. Background: Machine learning is a broad term encompassing a number of methods that allow the investigator to learn from the data. These methods may permit large real-world databases to be more rapidly translated to applications to inform patient-provider decision making. Methods: This systematic literature review was conducted to identify published observational research that employed machine learning to inform decision making at the patient-provider level. The search strategy was implemented and studies meeting eligibility criteria were evaluated by two independent reviewers. Relevant data related to study design, statistical methods and strengths and limitations were identified; study quality was assessed using a modified version of the Luo checklist. Results: A total of 34 publications from January 2014 to September 2020 were identified and evaluated for this review. There were diverse methods, statistical packages and approaches used across identified studies. The most common methods included decision tree and random forest approaches. Most studies applied internal validation but only two conducted external validation. Most studies utilized one algorithm, and only eight studies applied multiple machine learning algorithms to the data. Seven items on the Luo checklist failed to be met by more than 50% of published studies. Conclusions: A wide variety of approaches, algorithms, statistical software, and validation strategies were employed in the application of machine learning methods to inform patient-provider decision making. There is a need to ensure that multiple machine learning approaches are used, that the model selection strategy is clearly defined, and that both internal and external validation are performed, so that decisions for patient care are made with the highest quality evidence. Future work should routinely employ ensemble methods incorporating multiple machine learning algorithms.

FRI0046 PHARMACOGENOMICS-DRIVEN INDIVIDUALIZED PREDICTION OF TREATMENT RESPONSE TO METHOTREXATE IN PATIENTS WITH RHEUMATOID ARTHRITIS: A MACHINE LEARNING APPROACH

Annals of the Rheumatic Diseases ◽

10.1136/annrheumdis-2020-eular.4993 ◽

2020 ◽

Vol 79 (Suppl 1) ◽

pp. 598.2-598

Author(s):

E. Myasoedova ◽

A. Athreya ◽

C. S. Crowson ◽

R. Weinshilboum ◽

L. Wang ◽

...

Keyword(s):

Rheumatoid Arthritis ◽

Machine Learning ◽

Research Support ◽

Eular Response ◽

Learning Methods ◽

Early Ra ◽

Genome Wide

Background: Methotrexate (MTX) is the most common anchor drug for rheumatoid arthritis (RA), but the risk of missing the opportunity for early effective treatment with alternative medications is substantial given the delayed onset of MTX action and the 30-40% inadequate response rate. There is a compelling need to accurately predict MTX response prior to treatment initiation, which would allow effective identification of patients at RA onset who are likely to respond to MTX. Objectives: To test the ability of machine learning approaches with clinical and genomic biomarkers to predict MTX response with replications in independent samples. Methods: Age, sex, clinical, serological and genome-wide association study (GWAS) data from 647 patients with early RA of European ancestry (336 recruited in United Kingdom [UK]; 307 recruited across Europe; 70% female; 72% rheumatoid factor [RF] positive; mean age 54 years; mean baseline Disease Activity Score with 28-joint count [DAS28] 5.65) of the PhArmacogenetics of Methotrexate in RA (PAMERA) consortium were used in this study. The genomics data comprised 160 genome-wide significant single nucleotide polymorphisms (SNPs) with p < 1×10⁻⁵ associated with risk of RA and MTX metabolism. DAS28 score was available at baseline and at the 3-month follow-up visit. Response to MTX monotherapy at a dose of ≥15 mg/week was defined as good or moderate by the EULAR response criteria at the 3-month follow-up visit. Supervised machine-learning methods were trained with 5 repeats of 10-fold cross-validation using data from PAMERA’s 336 UK patients. Class imbalance (higher % of MTX responders) in training was accounted for by using the synthetic minority oversampling technique (SMOTE). Prediction performance was validated in PAMERA’s 307 European patients (not used in training). Results: Age, sex, RF positivity and baseline DAS28 data predicted MTX response with 58% accuracy in UK and European patients (p = 0.7).
However, supervised machine-learning methods that combined demographics, RF positivity, baseline DAS28 and genomic SNPs predicted EULAR response at 3 months with an area under the receiver operating characteristic curve (AUC) of 0.83 (p = 0.051) in UK patients, and achieved a prediction accuracy (fraction of correctly predicted outcomes) of 76.2% (p = 0.054) in the European patients, with sensitivity of 72% and specificity of 77%. The addition of genomic data improved the predictive accuracy of MTX response by 19% and achieved cross-site replication. Baseline DAS28 scores and the SNPs rs12446816, rs13385025, rs113798271, and rs2372536 were among the top predictors of MTX response. Conclusion: Pharmacogenomic biomarkers combined with DAS28 scores predicted MTX response in patients with early RA more reliably than demographics and DAS28 scores alone. Using pharmacogenomic biomarkers to identify MTX responders at early stages of RA may help to guide effective RA treatment choices, including timely escalation of RA therapies. Further studies on personalized prediction of response to MTX and other anti-rheumatic treatments are warranted to optimize control of RA disease and improve outcomes in patients with RA. Disclosure of Interests: Elena Myasoedova: None declared, Arjun Athreya: None declared, Cynthia S. Crowson Grant/research support from: Pfizer research grant, Richard Weinshilboum Shareholder of: co-founder and stockholder in OneOme, Liewei Wang: None declared, Eric Matteson Grant/research support from: Pfizer, Consultant of: Boehringer Ingelheim, Gilead, TympoBio, Arena Pharmaceuticals, Speakers bureau: Simply Speaking
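The minority oversampling step mentioned in the Methods of this abstract appears to refer to SMOTE, which creates new minority-class samples by interpolating between an existing minority sample and one of its nearest minority-class neighbours. The sketch below shows that core idea only; the two features (baseline DAS28 and age), the four patient rows, and the parameter names are hypothetical, and this is not the authors' code or any particular library's implementation.

```python
import random


def smote(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic rows by interpolating between minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        idx = rng.randrange(len(minority))
        base = minority[idx]
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for j, p in enumerate(minority) if j != idx]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical minority-class rows: [baseline DAS28, age in years].
non_responders = [[5.1, 48.0], [5.6, 55.0], [6.0, 61.0], [4.8, 50.0]]
new_rows = smote(non_responders, n_new=6)
for row in new_rows:
    print([round(v, 2) for v in row])
```

Oversampling should be applied to the training folds only, so that the held-out patients used for validation remain untouched real observations.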

Machine-learning based prediction of Cushing’s syndrome in dogs attending UK primary-care veterinary practice

Scientific Reports ◽

10.1038/s41598-021-88440-z ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Imogen Schofield ◽

David C. Brodbelt ◽

Noel Kennedy ◽

Stijn J. M. Niessen ◽

David B. Church ◽

...

Keyword(s):

Machine Learning ◽

Clinical Decision Making ◽

Predictive Performance ◽

Clinical Decision ◽

Cushing's Syndrome ◽

Learning Methods ◽

Clinical Records

Abstract: Cushing’s syndrome is an endocrine disease in dogs that negatively impacts the quality of life of affected animals. Cushing’s syndrome can be a challenging diagnosis to confirm; new methods to aid diagnosis are therefore warranted. Four machine-learning algorithms were applied to predict a future diagnosis of Cushing’s syndrome, using structured clinical data from the VetCompass programme in the UK. Dogs suspected of having Cushing’s syndrome were included in the analysis and classified based on the final reported diagnosis within their clinical records. Demographic and clinical features available at the point of first suspicion by the attending veterinarian were included within the models. The machine-learning methods were able to classify the recorded Cushing’s syndrome diagnoses with good predictive performance. The LASSO penalised regression model showed the best overall performance when applied to the test set, with an AUROC = 0.85 (95% CI 0.80–0.89), sensitivity = 0.71, specificity = 0.82, PPV = 0.75 and NPV = 0.78. The findings of our study indicate that machine-learning methods could predict the future diagnosis made by a practicing veterinarian. New approaches using these methods could support clinical decision-making and contribute to improved diagnosis of Cushing’s syndrome in dogs.
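The sensitivity, specificity, PPV and NPV figures reported above all derive from the four cells of a confusion matrix. A short sketch of those definitions follows; the counts in the example are invented and are not the study's data.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic accuracy measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true positives among all diseased
        "specificity": tn / (tn + fp),  # true negatives among all non-diseased
        "ppv": tp / (tp + fp),          # diseased among all predicted positive
        "npv": tn / (tn + fn),          # non-diseased among all predicted negative
    }


# Invented counts for illustration only.
metrics = diagnostic_metrics(tp=30, fp=10, tn=50, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike sensitivity and specificity, PPV and NPV shift with disease prevalence in the tested population, so values measured in dogs already suspected of the disease would not carry over unchanged to unselected dogs.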

Identifying Spatiotemporal Patterns in Land Use and Cover Samples from Satellite Image Time Series

Remote Sensing ◽

10.3390/rs13050974 ◽

2021 ◽

Vol 13 (5) ◽

pp. 974

Author(s):

Lorena Alves Santos ◽

Karine Ferreira ◽

Michelle Picoli ◽

Gilberto Camara ◽

Raul Zurita-Milla ◽

...

Keyword(s):

Machine Learning ◽

Land Use ◽

Time Series ◽

Satellite Image ◽

Spatiotemporal Patterns ◽

Learning Methods ◽

Self Organizing Maps ◽

Land Use And Cover

The use of satellite image time series analysis and machine learning methods brings new opportunities and challenges for land use and cover change (LUCC) mapping over large areas. One of these challenges is the need for samples that properly represent the high variability of land use and cover classes over large areas, both to train supervised machine learning methods and to produce accurate LUCC maps. This paper addresses this challenge and presents a method to identify spatiotemporal patterns in land use and cover samples in order to infer subclasses from the phenological and spectral information provided by satellite image time series. The proposed method uses self-organizing maps (SOMs) to reduce the data dimensionality, creating primary clusters. From these primary clusters, it uses hierarchical clustering to create subclusters that recognize the intra-class variability intrinsic to different regions and periods, mainly in large areas and over multiple years. To show how the method works, we use MODIS image time series associated with samples of cropland and pasture classes over the Cerrado biome in Brazil. The results show that the proposed method is suitable for identifying spatiotemporal patterns in land use and cover samples that can be used to infer subclasses, mainly for crop types.

Identification of Village Building via Google Earth Images and Supervised Machine Learning Methods

Remote Sensing ◽

10.3390/rs8040271 ◽

2016 ◽

Vol 8 (4) ◽

pp. 271 ◽

Cited By ~ 29

Author(s):

Zhiling Guo ◽

Xiaowei Shao ◽

Yongwei Xu ◽

Hiroyuki Miyazaki ◽

Wataru Ohira ◽

...

Keyword(s):

Machine Learning ◽

Google Earth ◽

Learning Methods ◽

Seeing It All: Evaluating Supervised Machine Learning Methods for the Classification of Diverse Otariid Behaviours

PLoS ONE ◽

10.1371/journal.pone.0166898 ◽

2016 ◽

Vol 11 (12) ◽

pp. e0166898 ◽

Cited By ~ 15

Author(s):

Monique A. Ladds ◽

Adam P. Thompson ◽

David J. Slip ◽

David P. Hocking ◽

Robert G. Harcourt

Keyword(s):

Machine Learning ◽

Learning Methods ◽

Acoustic feature-based sentiment analysis of call center data

10.32469/10355/66751 ◽

2017 ◽

Author(s):

Zeshan Peng

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Emotion Recognition ◽

Sentiment Analysis ◽

Call Center ◽

Language Recognition ◽

Acoustic Features ◽

Learning Methods ◽

With the advancement of machine learning methods, audio sentiment analysis has become an active research area in recent years. For example, business organizations are interested in persuasion tactics from vocal cues and acoustic measures in speech. A typical approach is to find a set of acoustic features from audio data that can indicate or predict a customer's attitude, opinion, or emotional state. For audio signals, acoustic features have been widely used in many machine learning applications, such as music classification, language recognition, and emotion recognition. For emotion recognition, previous work shows that pitch and speech rate are important features. This thesis focuses on determining sentiment from call center audio records, each containing a conversation between a sales representative and a customer. The sentiment of an audio record is considered positive if the conversation ended with an appointment being made, and negative otherwise. In this project, a data processing and machine learning pipeline for this problem has been developed. It consists of three major steps: 1) an audio record is split into segments by speaker turns; 2) acoustic features are extracted from each segment; and 3) classification models are trained on the acoustic features to predict sentiment. Different sets of features have been used, and different machine learning methods, including classical machine learning algorithms and deep neural networks, have been implemented in the pipeline. In our deep neural network method, the feature vectors of audio segments are stacked in temporal order into a feature matrix, which is fed into deep convolutional neural networks as input. Experimental results based on real data show that acoustic features, such as Mel frequency cepstral coefficients, timbre and Chroma features, are good indicators of sentiment.
Temporal information in an audio record can be captured by deep convolutional neural networks for improved prediction accuracy.
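The stacking step described above, where per-segment feature vectors are arranged in temporal order into a matrix for a convolutional network, might look like the sketch below. The segment structure (a `start` time plus a `features` list), the fixed matrix height, and the zero-padding and truncation rules are all assumptions made for illustration; the thesis abstract does not specify them.

```python
def build_feature_matrix(segments, max_segments, n_features):
    """Stack segment feature vectors by start time into a fixed-size matrix."""
    ordered = sorted(segments, key=lambda seg: seg["start"])
    rows = []
    for seg in ordered[:max_segments]:  # truncate overly long conversations
        if len(seg["features"]) != n_features:
            raise ValueError("every segment needs exactly n_features values")
        rows.append(list(seg["features"]))
    while len(rows) < max_segments:     # zero-pad short conversations
        rows.append([0.0] * n_features)
    return rows


# Hypothetical speaker-turn segments with three made-up acoustic features each.
segments = [
    {"start": 12.4, "features": [0.3, 0.1, 0.7]},
    {"start": 0.0, "features": [0.5, 0.2, 0.9]},
    {"start": 5.1, "features": [0.4, 0.6, 0.8]},
]
matrix = build_feature_matrix(segments, max_segments=4, n_features=3)
for row in matrix:
    print(row)
```

Keeping rows in time order is what lets a convolutional filter spanning several adjacent rows respond to how features change from one speaker turn to the next.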