THE APPLICATION OF PATTERN RECOGNITION AND MACHINE LEARNING TO DETERMINE CEMENT CHANNELING & BOND QUALITY FROM AZIMUTHAL CEMENT BOND LOGS

2021 ◽  
Author(s):  
Andrew Imrie ◽  

Cement bond log interpretation methods consist of human pattern recognition and evaluation of the quality of the downhole isolation. Typically, a log interpreter compares acquisition data to their predefined classifications of cement bond quality. This paper outlines a complementary technique of intelligent cement evaluation and the implementation of the analysis of cement evaluation data by utilizing automatic pattern matching and machine learning. The proposed method is capable of defining bond quality across multiple distinct subclassifications through analysis of image data using pattern recognition. Libraries of real log responses are used as comparisons to input data, and additionally may be supplemented with synthetic data. Using machine learning and image-based pattern recognition, the bond quality is classified into succinct categories to determine the presence of channeling. Successful classifications of the input data can then be added to the libraries, thus improving future analysis through an iterative process. The system uses the outputs of a conventional azimuthal ultrasonic scanning cement evaluation log and 5-ft CBL waveform to conclude a cement bond interpretation. The 5-ft CBL waveform is an optional addition to the process and improves the interpretation. The system searches for similarities between the acquisition data and that contained in the library. These similarities are compared to evaluate the bonding. The process is described in two parts: i) image collection and library classification and ii) pattern recognition and interpretation. The former is the process of generating a readable library of reference data from historical cement evaluation logs and laboratory measurements, and the latter is the machine learning and comparison method. Example results are shown with good correlations between automated analysis and interpreter analysis. The system is shown to be particularly effective at the automated identification of channeling of varying sizes, something which would be a challenge when using only the scalar curve representation of azimuthal data. Previously published methodologies for automated classification of bond quality typically utilize scalar data, whereas this approach utilizes image-based pattern recognition for automated learning and intelligent cement evaluation (ALICE). A discussion is presented on the limitations and merits of the ALICE process, which include quality control, the removal of analyst bias during interpretation, and the fact that such a system will continually improve in accuracy through supervised training.
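The paper does not reproduce its matching algorithm; as a rough, hypothetical sketch of library-based image pattern matching for bond-quality classification (all names such as `classify_window` and the toy library below are invented for illustration), one could compare an azimuthal-image window against labeled reference images by normalized cross-correlation:

```python
# Minimal sketch of library-based image matching for bond-quality
# classification (hypothetical names; not the paper's actual algorithm).
import numpy as np

def ncc(a: np.ndarray, b: np.ndarray) -> float:
    """Normalized cross-correlation between two equally sized image windows."""
    a = (a - a.mean()) / (a.std() + 1e-9)
    b = (b - b.mean()) / (b.std() + 1e-9)
    return float((a * b).mean())

def classify_window(window: np.ndarray, library: dict) -> str:
    """Return the bond-quality label whose reference images best match `window`."""
    scores = {label: max(ncc(window, ref) for ref in refs)
              for label, refs in library.items()}
    return max(scores, key=scores.get)

# Toy usage: a "channel" appears as a low-amplitude vertical stripe
# on an azimuthal impedance map.
rng = np.random.default_rng(0)
good = rng.normal(1.0, 0.05, (64, 36))
channel = good.copy(); channel[:, 10:14] -= 0.8
library = {"good bond": [good], "channeling": [channel]}
print(classify_window(channel + rng.normal(0, 0.02, channel.shape), library))
```

A system like ALICE would use far richer features and grow its library iteratively; this sketch only shows the compare-against-references pattern.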

2021 ◽  
Author(s):  
Yerdos A. Ordabayev ◽  
Larry J. Friedman ◽  
Jeff Gelles ◽  
Douglas L. Theobald

Multi-wavelength single-molecule fluorescence colocalization (CoSMoS) methods allow elucidation of complex biochemical reaction mechanisms. However, analysis of CoSMoS data is intrinsically challenging because of low image signal-to-noise ratios, non-specific surface binding of the fluorescent molecules, and analysis methods that require subjective inputs to achieve accurate results. Here, we use Bayesian probabilistic programming to implement Tapqir, an unsupervised machine learning method based on a holistic, physics-based causal model of CoSMoS data. This method accounts for uncertainties in image analysis due to photon and camera noise, optical non-uniformities, non-specific binding, and spot detection. Rather than merely producing a binary “spot/no spot” classification of unspecified reliability, Tapqir objectively assigns spot classification probabilities that allow accurate downstream analysis of molecular dynamics, thermodynamics, and kinetics. We both quantitatively validate Tapqir performance against simulated CoSMoS image data with known properties and also demonstrate that it implements fully objective, automated analysis of experiment-derived data sets with a wide range of signal, noise, and non-specific binding characteristics.
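Tapqir's actual model is a physics-based Bayesian causal model fit by probabilistic programming; the following toy sketch (assumed intensity distributions and invented parameters, not Tapqir's model) only illustrates the headline idea of assigning a spot probability rather than a binary label:

```python
# Toy illustration of probabilistic (rather than binary) spot classification:
# the posterior probability that an image window contains a spot, from a
# two-component Gaussian likelihood on summed intensity. A sketch of the
# general idea only, not Tapqir's physics-based model.
import numpy as np
from scipy.stats import norm

def spot_probability(intensity, p_spot=0.3,
                     mu_bg=100.0, sd_bg=10.0,       # assumed background model
                     mu_spot=140.0, sd_spot=15.0):  # assumed spot model
    """P(spot | intensity) by Bayes' rule under the assumed likelihoods."""
    like_spot = norm.pdf(intensity, mu_spot, sd_spot) * p_spot
    like_bg = norm.pdf(intensity, mu_bg, sd_bg) * (1.0 - p_spot)
    return like_spot / (like_spot + like_bg)

for x in (95.0, 120.0, 150.0):
    print(f"intensity {x:5.1f} -> P(spot) = {spot_probability(x):.3f}")
```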


eLife ◽  
2020 ◽  
Vol 9 ◽  
Author(s):  
Martin Hailstone ◽  
Dominic Waithe ◽  
Tamsin J Samuels ◽  
Lu Yang ◽  
Ita Costello ◽  
...  

A major challenge in cell and developmental biology is the automated identification and quantitation of cells in complex multilayered tissues. We developed CytoCensus: an easily deployed implementation of supervised machine learning that extends convenient 2D ‘point-and-click’ user training to 3D detection of cells in challenging datasets with ill-defined cell boundaries. In tests on such datasets, CytoCensus outperforms other freely available image analysis software in accuracy and speed of cell detection. We used CytoCensus to count stem cells and their progeny, and to quantify individual cell divisions from time-lapse movies of explanted Drosophila larval brains, comparing wild-type and mutant phenotypes. We further illustrate the general utility and future potential of CytoCensus by analysing the 3D organisation of multiple cell classes in Zebrafish retinal organoids and cell distributions in mouse embryos. CytoCensus opens the possibility of straightforward and robust automated analysis of developmental phenotypes in complex tissues.
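As a loose illustration of the point-and-click supervised approach (not CytoCensus's implementation; every name and parameter below is hypothetical), a classifier can be trained on small patches around user-clicked cell centres and then applied across the image:

```python
# Sketch: user clicks supply positive examples; a patch classifier then
# scores candidate locations. Synthetic 2D data stand in for real tissue.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def patch_features(img, yx, r=2):
    """Flatten a (2r+1) x (2r+1) patch around (y, x) into a feature vector."""
    y, x = yx
    return img[y - r:y + r + 1, x - r:x + r + 1].ravel()

rng = np.random.default_rng(1)
yy, xx = np.mgrid[0:64, 0:64]
img = rng.normal(0, 0.1, (64, 64))
centres = [(16, 16), (40, 48)]
for cy, cx in centres:  # paint two blob-like "cells"
    img += np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / 8.0)

pos = [patch_features(img, c) for c in centres]  # the user's clicks
neg = []
while len(neg) < 50:  # background samples away from any clicked cell
    p = (int(rng.integers(4, 60)), int(rng.integers(4, 60)))
    if all((p[0] - cy) ** 2 + (p[1] - cx) ** 2 > 64 for cy, cx in centres):
        neg.append(patch_features(img, p))

clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(pos + neg, [1] * len(pos) + [0] * len(neg))
print("P(cell) at a centre:", clf.predict_proba([patch_features(img, (16, 16))])[0, 1])
```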


Author(s):  
Ricardo Vilalta ◽  
Tomasz Stepinski

Spacecraft orbiting a selected suite of planets and moons of our solar system are continuously sending long sequences of data back to Earth. The availability of such data provides an opportunity to invoke tools from machine learning and pattern recognition to extract patterns that can help to understand the geological processes shaping planetary surfaces. Due to the marked interest of the scientific community in this particular planet, we base our current discussion on Mars, where there are presently three spacecraft in orbit (NASA's Mars Odyssey Orbiter, NASA's Mars Reconnaissance Orbiter, and ESA's Mars Express). Despite the abundance of available data describing the Martian surface, only a small fraction of the data is being analyzed in detail, because current techniques for data analysis of planetary surfaces rely on simple visual inspection and descriptive characterization of surface landforms (Wilhelms, 1990). The demand for automated analysis of the Martian surface has prompted the use of machine learning and pattern recognition tools to generate geomorphic maps, which are thematic maps of landforms (or topographical expressions). Examples of landforms are craters, valley networks, hills, and basins. Machine learning can play a vital role in automating the process of geomorphic mapping. A learning system can be employed either to fully automate the process of discovering meaningful landform classes using clustering techniques, or to predict the class of unlabeled landforms (after an expert has manually labeled a representative sample of the landforms) using classification techniques. The impact of these techniques on the analysis of Mars topography can be of immense value due to the sheer size of the Martian surface that remains unmapped. While it is now clear that machine learning can greatly help in automating the detailed analysis of Mars' surface (Stepinski et al., 2007; Stepinski et al., 2006; Bue and Stepinski, 2006; Stepinski and Vilalta, 2005), an interesting problem arises when an automated data analysis has produced a novel classification of a specific site's landforms. The problem lies in the interpretation of this new classification as compared to traditionally derived classifications generated through visual inspection by domain experts. Is the new classification novel in all senses? Is it only partially novel, with many landforms matching existing classifications? This article discusses how to assess the value of clusters generated by machine learning tools as applied to the analysis of Mars' surface.
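A minimal sketch of the two modes described above, using scikit-learn on synthetic stand-ins for DEM-derived terrain features (the features and class structure are invented for illustration):

```python
# Sketch: unsupervised discovery of landform classes (clustering) versus
# prediction from expert-labeled examples (classification).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
# Each row: [slope, curvature] for one surface cell; three synthetic landforms.
craters = rng.normal([0.8, -0.9], 0.1, (100, 2))
plains  = rng.normal([0.1,  0.0], 0.1, (100, 2))
ridges  = rng.normal([0.7,  0.8], 0.1, (100, 2))
X = np.vstack([craters, plains, ridges])

# Mode 1: discover landform classes with no labels.
clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Mode 2: an expert labels a small sample; a classifier labels the rest.
y = np.array([0] * 100 + [1] * 100 + [2] * 100)
idx = rng.choice(len(X), 30, replace=False)
clf = KNeighborsClassifier().fit(X[idx], y[idx])
print("cluster sizes:", np.bincount(clusters))
print("classification accuracy:", clf.score(X, y))
```

The article's open question then becomes concrete: how well do the discovered clusters line up with the expert-derived classes?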


2021 ◽  
Vol 5 (1(82)) ◽  
pp. 26-28
Author(s):  
D. Konarev ◽  
A. Gulamov

Methods of improving the accuracy of pre-trained networks are discussed, with images of ships as the input data. The networks are built and trained using the Keras and TensorFlow machine learning libraries. Fine-tuning of previously trained convolutional artificial neural networks for pattern recognition tasks is described; fine-tuning of the VGG16 and VGG19 networks is done using Keras Applications. The accuracy of the VGG16 network with fine-tuning of the last convolution block increased from 94.38% to 95.21%, an increase of only 0.83 percentage points. The accuracy of the VGG19 network with fine-tuning of the last convolution block increased from 92.97% to 96.39%, a gain of 3.42 percentage points.
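A minimal sketch of the fine-tuning recipe described above, assuming a binary ship classification task (the classification head and hyperparameters are illustrative, not the authors' exact setup):

```python
# Load VGG16 from Keras Applications, freeze everything except the last
# convolution block, and attach a small classification head.
import tensorflow as tf
from tensorflow.keras.applications import VGG16

base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
# Freeze all layers, then unfreeze the last convolution block (block5).
for layer in base.layers:
    layer.trainable = layer.name.startswith("block5")

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(2, activation="softmax"),  # e.g. ship / no ship
])
# A low learning rate is typical when fine-tuning pre-trained weights.
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.summary()
# model.fit(train_ds, validation_data=val_ds, epochs=10)  # dataset not shown
```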


2020 ◽  
Author(s):  
Victor Anton ◽  
Jannes Germishuys ◽  
Matthias Obst

This paper describes a data system to analyse large amounts of subsea movie data for marine ecological research. The system consists of three distinct modules for data management and archiving, citizen science, and machine learning in a high-performance computing environment. It allows scientists to upload underwater footage to a customised citizen science website hosted by Zooniverse, where volunteers from the public classify the footage. Classifications with high agreement among citizen scientists are then used to train machine learning algorithms. An application programming interface allows researchers to test the algorithms and track biological objects in new footage. We tested our system using recordings from remotely operated vehicles (ROVs) in a Marine Protected Area, the Kosterhavet National Park in Sweden. Results indicate a strong decline of cold-water corals in the park over a period of 15 years, showing that our system can effectively extract valuable occurrence and abundance data for key ecological species from underwater footage. We argue that the combination of citizen science tools, machine learning, and high-performance computing is key to successfully analysing large amounts of image data in the future, suggesting that these services should be consolidated and interlinked by national and international research infrastructures.
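The agreement rule can be sketched as follows (the thresholds are hypothetical; the paper does not specify them here):

```python
# Keep a citizen-science classification as machine learning training data
# only when enough volunteers agree on it.
from collections import Counter

def consensus_label(votes, min_votes=5, min_agreement=0.8):
    """Return the majority label if agreement is high enough, else None."""
    if len(votes) < min_votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count / len(votes) >= min_agreement else None

clip_votes = ["coral", "coral", "coral", "fish", "coral", "coral"]
print(consensus_label(clip_votes))          # "coral" (5/6 = 0.83 agreement)
print(consensus_label(["coral", "fish"]))   # None: too few votes
```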


Author(s):  
Mira Kim ◽  
Kyunghee Chae ◽  
Seungwoo Lee ◽  
Hong-Jun Jang ◽  
Sukil Kim

Collecting valid information from electronic sources to detect the potential outbreak of infectious disease is time-consuming and labor-intensive. The automated identification of relevant information using machine learning is necessary to respond to a potential disease outbreak. A total of 2864 documents were collected from various websites and subsequently manually categorized and labeled by two reviewers. Accurate labels for the training and test data were provided based on reviewer consensus. Two machine learning algorithms (ConvNet and bidirectional long short-term memory, BiLSTM) and two classification methods (DocClass and SenClass) were used for classifying the documents. Precision, recall, F1, accuracy, and area under the curve were measured to evaluate the performance of each model. ConvNet yielded higher average, minimum, and maximum accuracies (87.6%, 85.2%, and 91.1%, respectively) than BiLSTM with DocClass, while BiLSTM performed better than ConvNet with SenClass, with average, minimum, and maximum accuracies of 92.8%, 92.6%, and 93.3%, respectively. BiLSTM with SenClass yielded an overall accuracy of 92.9% in classifying infectious disease occurrences. Machine learning performed comparably to a human expert given a particular text extraction system. This study suggests that analyzing information from websites using machine learning can achieve high accuracy when abundant articles and documents are available.
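A hedged sketch of a BiLSTM document classifier of the kind evaluated above (layer sizes, vocabulary, and sequence length are assumptions, not the paper's exact model):

```python
# Binary classifier: is a document outbreak-related or not?
import tensorflow as tf

VOCAB, MAXLEN = 20000, 200  # assumed vocabulary size and sequence length
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(MAXLEN,)),
    tf.keras.layers.Embedding(VOCAB, 128),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy", tf.keras.metrics.AUC(name="auc")])
model.summary()
# model.fit(x_train, y_train, validation_data=(x_test, y_test))  # data elided
```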


2010 ◽  
Vol 15 (7) ◽  
pp. 726-734 ◽  
Author(s):  
Aabid Shariff ◽  
Joshua Kangas ◽  
Luis Pedro Coelho ◽  
Shannon Quinn ◽  
Robert F. Murphy

The field of high-content screening and analysis consists of a set of methodologies for automated discovery in cell biology and drug development using large amounts of image data. In most cases, imaging is carried out by automated microscopes, often assisted by automated liquid handling and cell culture. Image processing, computer vision, and machine learning are used to automatically process high-dimensional image data into meaningful cell biological results. The key is creating automated analysis pipelines typically consisting of 4 basic steps: (1) image processing (normalization, segmentation, tracing, tracking), (2) spatial transformation to bring images to a common reference frame (registration), (3) computation of image features, and (4) machine learning for modeling and interpretation of data. An overview of these image analysis tools is presented here, along with brief descriptions of a few applications.
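The four-step pipeline can be sketched with scikit-image and scikit-learn (an illustration of the pattern, not any particular screening platform's code):

```python
import numpy as np
from skimage import filters, measure
from sklearn.ensemble import RandomForestClassifier

def cell_features(image: np.ndarray) -> np.ndarray:
    # 1) Image processing: normalize, then segment with an Otsu threshold.
    img = (image - image.min()) / (np.ptp(image) + 1e-9)
    mask = img > filters.threshold_otsu(img)
    labels = measure.label(mask)
    # 2) Registration: a no-op for one image (frames would be aligned here).
    # 3) Feature computation for each segmented object.
    props = [p for p in measure.regionprops(labels, intensity_image=img)
             if p.area >= 5]  # drop noise specks
    return np.array([[p.area, p.perimeter, p.mean_intensity] for p in props])

rng = np.random.default_rng(0)
demo = rng.random((128, 128))
demo[30:50, 30:50] += 1.0  # one bright synthetic "cell"
features = cell_features(demo)
print(features.shape)
# 4) Machine learning on the per-object feature vectors (labels elided):
# clf = RandomForestClassifier().fit(features, phenotype_labels)
```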


Author(s):  
. Anika ◽  
Navpreet Kaur

The paper presents a formal review of the early detection of heart disease, a major cause of death. Computational science has the potential to detect disease at earlier stages automatically. In this review paper we describe machine learning for disease detection; machine learning is a method of data analysis that automates analytical model building. Various techniques have been developed to predict cardiac disease from MRI cases, with automated classification performed using machine learning and feature extraction using CellProfiler and GLCM. CellProfiler is free, public-domain software developed by the Broad Institute's Imaging Platform, and GLCM (the gray-level co-occurrence matrix) is a statistical method for examining texture. The review surveys these and other techniques for detecting cardiovascular disease.
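GLCM texture features, as mentioned above, can be computed with scikit-image; a small sketch on a synthetic stand-in for an MRI slice:

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

rng = np.random.default_rng(0)
img = rng.integers(0, 256, (64, 64), dtype=np.uint8)  # stand-in for an MRI slice

# Co-occurrence of gray levels at distance 1, horizontally and vertically.
glcm = graycomatrix(img, distances=[1], angles=[0, np.pi / 2],
                    levels=256, symmetric=True, normed=True)
for prop in ("contrast", "homogeneity", "energy", "correlation"):
    print(prop, graycoprops(glcm, prop).ravel())
```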


2019 ◽  
Vol 14 (5) ◽  
pp. 406-421 ◽  
Author(s):  
Ting-He Zhang ◽  
Shao-Wu Zhang

Background: Revealing the subcellular location of a newly discovered protein can bring insight into its function and guide research at the cellular level. The experimental methods currently used to identify protein subcellular locations are both time-consuming and expensive, so it is highly desirable to develop computational methods for efficiently and effectively identifying protein subcellular locations. In particular, the rapidly increasing number of protein sequences entering the genome databases has called for the development of automated analysis methods. Methods: In this review, we describe recent advances in predicting protein subcellular locations with machine learning from the following aspects: i) protein subcellular location benchmark dataset construction, ii) protein feature representation and feature descriptors, iii) common machine learning algorithms, iv) cross-validation test methods and assessment metrics, and v) web servers. Result & Conclusion: Given the large number of protein sequences generated by high-throughput technologies, attention should be paid to four future directions for predicting protein subcellular locations with machine learning. One direction is the selection of novel and effective features (e.g., statistical, physical-chemical, evolutionary) from the sequences and structures of proteins. Another is the feature fusion strategy. The third is the design of a powerful predictor, and the fourth is the prediction of proteins with multiple location sites.
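As a minimal illustration of aspect ii), here is one classic feature representation, amino acid composition, feeding a standard classifier (the sequences and labels are toy examples, not a benchmark dataset):

```python
from collections import Counter
from sklearn.svm import SVC

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def aa_composition(seq: str) -> list:
    """20-dimensional amino acid composition feature vector."""
    counts = Counter(seq)
    return [counts[a] / len(seq) for a in AMINO_ACIDS]

seqs = ["MKKLLPTAAAGLLLLAAQPAMA", "MDDDIAALVVDNGSGMCKAGFAGDDAPR"]
locations = ["membrane", "cytoplasm"]  # toy labels
clf = SVC().fit([aa_composition(s) for s in seqs], locations)
print(clf.predict([aa_composition("MKKAAALLLPLLQA")]))
```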


2020 ◽  
Vol 15 ◽  
Author(s):  
Elham Shamsara ◽  
Sara Saffar Soflaei ◽  
Mohammad Tajfard ◽  
Ivan Yamshchikov ◽  
Habibollah Esmaili ◽  
...  

Background: Coronary artery disease (CAD) is an important cause of mortality and morbidity globally. Objective: The early prediction of CAD would be valuable in identifying individuals at risk and in focusing resources on its prevention. In this paper, we aimed to establish a diagnostic model to predict CAD by using three approaches of ANN (pattern recognition-ANN, LVQ-ANN, and competitive ANN). Methods: One promising method for early prediction of disease based on risk factors is machine learning. Among different machine learning algorithms, artificial neural network (ANN) algorithms have been applied widely in medicine and in a variety of real-world classifications. An ANN is a non-linear computational model inspired by the human brain to analyze and process complex datasets. Results: The different ANN methods investigated in this paper indicate that in both the pattern recognition ANN and LVQ-ANN methods, the predictions of the Angiography+ class have high accuracy. Moreover, in the competitive ANN, the correlation between the individuals in cluster "c" and the Angiography+ class is very high. This accuracy indicates a significant difference between some of the input features of the Angiography+ class and those of the other two output classes. A comparison among the chosen weights in these three methods in separating the control class and Angiography+ shows that hs-CRP, FSG, and WBC are the most substantial excitatory weights in recognizing Angiography+ individuals, while HDL-C and MCH are determined to be inhibitory weights. Furthermore, the effect of decomposing the multi-class problem into a set of binary classes, and of random sampling, on the accuracy of the diagnostic model is investigated. Conclusion: This study confirms that pattern recognition-ANN had the highest accuracy of performance among the different ANN methods. This is due to the back-propagation procedure, in which the network classifies input variables based on labeled classes. The results of binarization show that decomposing the multi-class set into binary sets can achieve higher accuracy.
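A hedged sketch of a pattern-recognition ANN for this kind of risk classification, using scikit-learn's MLPClassifier as a stand-in for the paper's network; the feature names follow the abstract, but the data and the generating rule are synthetic:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 200
# Columns: hs-CRP, FSG, WBC, HDL-C, MCH (synthetic values).
X = rng.normal(size=(n, 5))
# Synthetic rule echoing the abstract: high hs-CRP/FSG/WBC push toward
# Angiography+, high HDL-C/MCH push away from it.
logits = X[:, 0] + X[:, 1] + X[:, 2] - X[:, 3] - X[:, 4]
y = (logits + rng.normal(0, 0.5, n) > 0).astype(int)  # 1 = Angiography+

Xs = StandardScaler().fit_transform(X)
clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                    random_state=0).fit(Xs, y)
print("training accuracy:", clf.score(Xs, y))
```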

