Improving Cost Estimation in Internet Advertising Using Machine Learning: Preliminary Results

Abstract Study question: Can we develop a real-time diagnostic tool for chronic endometritis (CE) by using attenuated total reflection-Fourier transform infrared (ATR-FTIR) spectroscopy to evaluate biopsies obtained during hysteroscopy? Summary answer: A discrimination model based on the absorbance data was developed by machine learning techniques, differentiating between positive and negative CE histopathology with 97% accuracy. What is known already: CE is diagnosed in approximately 15% of infertile women who undergo in vitro fertilization (IVF), in 42% of women with recurrent implantation failure (RIF), and in 57.8% of women with recurrent pregnancy loss (RPL). Diagnosis is made by endometrial biopsy, and the presence of plasma cells in the endometrial stroma is the generally accepted histological diagnostic criterion. However, the histological detection of CE is time-consuming and difficult. ATR-FTIR spectroscopy is a non-destructive method that can provide valuable information on biochemical changes that occur during pathological processes, such as inflammation and cancer. Study design, size, duration: We performed a prospective study in which fresh biopsies of endometrium were obtained during standard hysteroscopies. Each biopsy was examined by the spectrophotometer and afterward by histopathological analysis in which multiple myeloma oncogene 1 (MUM–1) staining for plasma cells, a marker of CE, was performed. We planned to investigate 80 samples to develop a discrimination model, and another 40 samples for validation of the model. The study was planned to last two years. Participants/materials, setting, methods: Women who underwent hysteroscopy as part of an infertility evaluation were recruited. The hysteroscopies and the biopsy evaluation were performed at the same center. A cut-off of 8 MUM–1 positive cells per 10 high power fields (HPF) was set. We compared the spectroscopy analysis of the positive CE group (≥8) and the negative CE group (<8).
Machine learning techniques were used to build discrimination models. Data analysis was performed using the Matlab and Unscrambler software packages. Main results and the role of chance: We present preliminary results of our study. Forty-two women were recruited from January 2020 until November 2020. Of the 42 measured spectra, three were discarded due to high measurement noise. Of the remaining 39 biopsies, 33 had MUM–1<8 (CE negative group) and 6 had MUM–1≥8 (CE positive group). Measured spectra of tissue smears from the CE negative and positive groups differed from each other in the spectral range of 850–990 cm⁻¹ (p < 0.05). This wavenumber range can be associated with the C-H in-plane bend in the alkene group (CnH2n). A discriminant model was developed between the groups using Principal Component Analysis and Linear Discriminant Analysis. The accuracy obtained by the model was 97%. We divided the 39 hysteroscopies into two groups based on the CE signs: “Negative hysteroscopic-CE” and “Positive hysteroscopic-CE”. Positive hysteroscopic signs were micropolyps, strawberry pattern, hyperemia, punctuation, or pale endometrium. Twenty-three samples were in the Negative group and 16 samples were in the Positive group. However, measured spectra of tissue smears from the negative and positive hysteroscopy groups were not significantly different. The correlation coefficient between hysteroscopy groups and MUM–1 score was r = 0.29, meaning that the characteristic signs of CE in hysteroscopy were only weakly correlated with the histopathology. Limitations, reasons for caution: First, these are preliminary results and we need to investigate more samples to validate our model. Second, diagnostic criteria for CE are diverse in the literature and we chose 8 MUM–1 positive cells in 10 HPF, a criterion which may not be accepted by all experts in the field.
Wider implications of the findings: ATR-FTIR spectroscopy is highly sensitive to molecular changes and has been utilized as a diagnostic tool in a variety of clinical studies. While histopathological results take about two weeks, ATR-FTIR spectroscopy might give us the possibility to diagnose CE in real-time, allowing an immediate initiation of the appropriate treatment. Trial registration number: ClinicalTrials.gov Identifier: NCT04197167
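The group assignment by MUM–1 cut-off and the correlation check described in the abstract can be illustrated with a minimal Python sketch. The study itself reports using Matlab and Unscrambler, so this is only an illustration of the arithmetic, not the authors' code. The function names and all sample values are hypothetical and are not study data.

```python
from math import sqrt

CUTOFF = 8  # MUM-1 positive cells per 10 HPF, the cut-off stated in the abstract


def ce_positive(mum1_count, cutoff=CUTOFF):
    """Return True when the count meets the histological cut-off (>= cutoff)."""
    return mum1_count >= cutoff


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def accuracy(predicted, actual):
    """Fraction of predictions that match the reference labels."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)


# Hypothetical values for illustration only (NOT data from the study).
mum1_counts = [0, 2, 9, 1, 12, 3]
hysteroscopy_positive = [0, 1, 1, 0, 0, 1]  # 1 = visual CE signs present

histology_labels = [ce_positive(c) for c in mum1_counts]
r = pearson_r(hysteroscopy_positive, mum1_counts)
```

With one binary variable, as here, the Pearson coefficient is the point-biserial correlation. The abstract does not say which coefficient the authors computed, so that choice is an assumption.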


Combining Machine Learning and Classic Drilling Theories to Improve Rate of Penetration Prediction

10.2118/202202-ms ◽ 2021

Author(s): Hongbao Zhang ◽ Baoping Lu ◽ Lulu Liao ◽ Hongzhi Bao ◽ Zhifa Wang ◽ ...

Keyword(s): Machine Learning ◽ Data Processing ◽ Cost Estimation ◽ Linear Process ◽ Parameters Optimization ◽ Data Driven ◽ Rock Properties ◽ Rate Of Penetration ◽ Modelling Method ◽ Drilling Parameters

Abstract Theoretically, the rate of penetration (ROP) model is the basis for drilling parameter design, selection of ROP improvement tools, and drilling time and cost estimation. Currently, ROP modelling follows two main approaches, equation-based and machine learning, and machine learning performs better because of its capacity to model high-dimensional, non-linear processes. However, in deep or deviated wells, the ROP prediction accuracy of machine learning is generally unsatisfactory, mainly because the energy loss along the wellbore and drill string is non-negligible and it is difficult to account for the effect of wellbore geometry in purely data-driven machine learning models. Therefore, it is necessary to develop a robust ROP modelling method for different scenarios. In this paper, the performance of several equation-based methods and machine learning methods is evaluated using data from 82 wells, and the technical features and applicable scopes of the different methods are analysed. A new machine learning based ROP modelling method suitable for different well path types is proposed. An integrated data processing pipeline was designed to deal with noisy data, missing data, and discrete variables. Factors affecting ROP were analysed, including mechanical parameters, hydraulic parameters, bit characteristics, rock properties, wellbore geometry, etc. Several new features were created from classic drilling theories, such as downhole weight on bit (DWOB), hydraulic impact force, formation heterogeneity index, etc., to improve the efficiency of learning from data. A random forest model was trained using cross-validation and hyperparameter optimization. Field test results show that the model can predict the ROP in different hole sections (vertical, deviated and horizontal) and different drilling modes (sliding and rotating drilling), and that the average accuracy meets the requirements of well planning.
A novel data processing and feature engineering workflow was designed according to the characteristics of ROP modelling in different well path types. Integrated data-driven ROP modelling and optimization software was developed, including functions for mechanical specific energy analysis, bit wear analysis and prediction, 2D and 3D ROP sensitivity analysis, offset well benchmarking, ROP prediction, drilling parameter constraint analysis, cost per meter prediction, etc., providing quantitative evidence for drilling parameter optimization, drilling tool selection and well time estimation.
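As an illustration of a feature derived from classic drilling theory, the sketch below computes mechanical specific energy (MSE), which the abstract names as one of the software's analysis functions. It assumes Teale's formulation in field units; the abstract does not state which formulation or units the authors used. The function names, column names, and sample values are hypothetical.

```python
from math import pi


def bit_area_in2(bit_diameter_in):
    """Cross-sectional area of the bit in square inches."""
    return pi * bit_diameter_in ** 2 / 4.0


def mechanical_specific_energy(wob_lbf, rpm, torque_ftlbf, rop_ft_per_hr,
                               bit_diameter_in):
    """MSE in psi, assuming Teale's formulation in field units:
    MSE = WOB / A + 120 * pi * RPM * T / (A * ROP)."""
    if rop_ft_per_hr <= 0 or bit_diameter_in <= 0:
        raise ValueError("ROP and bit diameter must be positive")
    area = bit_area_in2(bit_diameter_in)
    thrust_term = wob_lbf / area
    rotary_term = 120.0 * pi * rpm * torque_ftlbf / (area * rop_ft_per_hr)
    return thrust_term + rotary_term


def add_mse_feature(records, bit_diameter_in):
    """Return copies of the input rows with an added 'mse_psi' column."""
    enriched = []
    for row in records:
        new_row = dict(row)
        new_row["mse_psi"] = mechanical_specific_energy(
            row["wob_lbf"], row["rpm"], row["torque_ftlbf"],
            row["rop_ft_per_hr"], bit_diameter_in,
        )
        enriched.append(new_row)
    return enriched


# Hypothetical drilling records for illustration only (not field data).
records = [
    {"wob_lbf": 20000.0, "rpm": 120.0, "torque_ftlbf": 5000.0, "rop_ft_per_hr": 50.0},
    {"wob_lbf": 20000.0, "rpm": 120.0, "torque_ftlbf": 5000.0, "rop_ft_per_hr": 100.0},
]
features = add_mse_feature(records, bit_diameter_in=8.5)
```

A derived column like `mse_psi` could then be passed to a regression model alongside the raw drilling parameters. The function accepts whichever weight on bit is supplied; in deep or deviated wells, a downhole estimate (DWOB) would presumably be more representative than the surface reading.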


Indirect Estimation of Clastic Reservoir Rock Grain Size from Wireline Logs Using a Supervised Nearest Neighbor Algorithm: Preliminary Results

10.2118/205156-ms ◽ 2021

Author(s): Fatai Adesina Anifowose ◽ Mokhles Mustafa Mezghani ◽ Saeed Saad Shahrani

Keyword(s): Machine Learning ◽ Grain Size ◽ Nearest Neighbor ◽ Reservoir Rock ◽ Alternative Methods ◽ Preliminary Results ◽ Wireline Logs ◽ Core Descriptions ◽ Core Description ◽ Clastic Reservoir

Abstract Reservoir rock textural properties such as grain size are typically estimated by direct visual observation of the physical texture of core samples. Grain size is one of the important inputs to petrophysical characterization, sedimentological facies classification, identification of depositional environments, and saturation models. A continuous log of grain size distribution over targeted reservoir sections is usually required for these applications. Core descriptions are typically not available over an entire targeted reservoir section. Physical core data may also be damaged during retrieval or due to plugging. Alternative methods proposed in the literature are not sustainable because of their input data requirements and their inflexibility when applied in environments with different geological settings. This paper presents the preliminary results of our investigation of a new methodology based on machine learning technology to complement and enhance the traditional core description and the alternative methods. We developed and optimized supervised machine learning models comprising K-nearest neighbor (KNN), support vector machines (SVM), and decision tree (DT) to indirectly estimate reservoir rock grain size for a new well or targeted reservoir sections from historical wireline logs and archival core descriptions. We used anonymized datasets consisting of nine wells from a clastic reservoir. Seven of the wells were used to train and optimize the models while the remaining two were reserved for validation. The grain size types range from clay to pebbles. The performance of the models confirmed the feasibility of this approach. The KNN, SVM, and DT models demonstrated the capability to estimate the grain size for the test wells, matching the actual data with accuracies ranging from a minimum of 60% to close to 80%. This is an accomplishment taking into account the uncertainties inherent in the core analysis data.
Further analysis of the results showed that the KNN model was the most accurate of the three models. For future studies, we will explore more advanced classification algorithms and implement new class labeling strategies to improve the accuracy of this methodology. The attainment of this objective will further help to handle the complexity in the grain size estimation challenge and reduce the current turnaround time for core description.
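The well-wise train/validation split and the nearest-neighbor classification described in the abstract can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the abstract does not give the distance metric, the value of k, or the input logs. Euclidean distance, majority voting, the feature layout, the well names, and the sample values are all assumptions.

```python
from collections import Counter
from math import dist


def split_by_well(samples, validation_wells):
    """Split samples so that whole wells are held out for validation."""
    train = [s for s in samples if s["well"] not in validation_wells]
    valid = [s for s in samples if s["well"] in validation_wells]
    return train, valid


def knn_predict(train_X, train_y, query, k=3):
    """Predict a class label by majority vote among the k nearest neighbors
    (Euclidean distance). Vote ties go to the label seen first among the
    distance-sorted neighbors."""
    if not 1 <= k <= len(train_X):
        raise ValueError("k must be between 1 and the number of training samples")
    neighbors = sorted(zip(train_X, train_y),
                       key=lambda pair: dist(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical, already-scaled log readings (e.g. two wireline logs per depth
# sample) with grain-size classes; values are illustrative, not well data.
samples = [
    {"well": "W1", "x": (0.10, 0.20), "grain": "fine"},
    {"well": "W1", "x": (0.15, 0.25), "grain": "fine"},
    {"well": "W2", "x": (0.20, 0.10), "grain": "fine"},
    {"well": "W2", "x": (0.80, 0.90), "grain": "coarse"},
    {"well": "W3", "x": (0.85, 0.80), "grain": "coarse"},
    {"well": "W3", "x": (0.90, 0.95), "grain": "coarse"},
    {"well": "W4", "x": (0.12, 0.20), "grain": "fine"},
    {"well": "W4", "x": (0.88, 0.90), "grain": "coarse"},
]

train, valid = split_by_well(samples, validation_wells={"W4"})
train_X = [s["x"] for s in train]
train_y = [s["grain"] for s in train]
predictions = [knn_predict(train_X, train_y, s["x"], k=3) for s in valid]
```

Holding out entire wells, rather than random depth samples, keeps validation samples from sharing a well with the training data, which matches the seven-train/two-validation setup in the abstract. Real wireline logs would need scaling to comparable ranges before any distance-based method is applied.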
