scholarly journals QUATgo: Protein quaternary structural attributes predicted by two-stage machine learning approaches with heterogeneous feature encoding

PLoS ONE ◽  
2020 ◽  
Vol 15 (4) ◽  
pp. e0232087
Author(s):  
Chi-Hua Tung ◽  
Ching-Hsuan Chien ◽  
Chi-Wei Chen ◽  
Lan-Ying Huang ◽  
Yu-Nan Liu ◽  
...  
2021 ◽  
Author(s):  
Qurrat Ul Ain ◽  
Harith Al-Sahaf ◽  
Bing Xue ◽  
Mengjie Zhang

Developing a computer-aided diagnostic system for detecting various skin malignancies from images has attracted many researchers. Unlike many machine learning approaches such as Artificial Neural Networks, Genetic Programming (GP) automatically evolves models with flexible representation. GP successfully provides effective solutions using its intrinsic ability to select prominent features (i.e. feature selection) and build new features (i.e. feature construction). Existing approaches have utilized GP to construct new features from the complete set of original features and the set of operators. However, the complete set of features may contain redundant or irrelevant features that do not provide useful information for classification. This study aims to develop a two-stage GP method where the first stage selects prominent features, and the second stage constructs new features from these selected features and operators such as multiplication in a wrapper approach to improve the classification performance. To include local, global, texture, color, and multi-scale image properties of skin images, GP selects and constructs features extracted from local binary patterns and pyramid-structured wavelet decomposition. The accuracy of this GP method is assessed using two real-world skin image datasets captured from the standard camera and specialized instruments, and compared with commonly used classification algorithms, three state-of-the-art, and an existing embedded GP method. The results reveal that this new approach of feature selection and feature construction effectively helps improve the performance of the machine learning classification algorithms. Unlike other black-box models, the evolved models by GP are interpretable, therefore the proposed method can assist dermatologists to identify prominent features, which has been shown by further analysis on the evolved models.


2021 ◽  
Author(s):  
Qurrat Ul Ain ◽  
Harith Al-Sahaf ◽  
Bing Xue ◽  
Mengjie Zhang

Developing a computer-aided diagnostic system for detecting various skin malignancies from images has attracted many researchers. Unlike many machine learning approaches such as Artificial Neural Networks, Genetic Programming (GP) automatically evolves models with flexible representation. GP successfully provides effective solutions using its intrinsic ability to select prominent features (i.e. feature selection) and build new features (i.e. feature construction). Existing approaches have utilized GP to construct new features from the complete set of original features and the set of operators. However, the complete set of features may contain redundant or irrelevant features that do not provide useful information for classification. This study aims to develop a two-stage GP method where the first stage selects prominent features, and the second stage constructs new features from these selected features and operators such as multiplication in a wrapper approach to improve the classification performance. To include local, global, texture, color, and multi-scale image properties of skin images, GP selects and constructs features extracted from local binary patterns and pyramid-structured wavelet decomposition. The accuracy of this GP method is assessed using two real-world skin image datasets captured from the standard camera and specialized instruments, and compared with commonly used classification algorithms, three state-of-the-art, and an existing embedded GP method. The results reveal that this new approach of feature selection and feature construction effectively helps improve the performance of the machine learning classification algorithms. Unlike other black-box models, the evolved models by GP are interpretable, therefore the proposed method can assist dermatologists to identify prominent features, which has been shown by further analysis on the evolved models.


2019 ◽  
Vol 70 (3) ◽  
pp. 214-224
Author(s):  
Bui Ngoc Dung ◽  
Manh Dzung Lai ◽  
Tran Vu Hieu ◽  
Nguyen Binh T. H.

Video surveillance is emerging research field of intelligent transport systems. This paper presents some techniques which use machine learning and computer vision in vehicles detection and tracking. Firstly the machine learning approaches using Haar-like features and Ada-Boost algorithm for vehicle detection are presented. Secondly approaches to detect vehicles using the background subtraction method based on Gaussian Mixture Model and to track vehicles using optical flow and multiple Kalman filters were given. The method takes advantages of distinguish and tracking multiple vehicles individually. The experimental results demonstrate high accurately of the method.


2017 ◽  
Author(s):  
Sabrina Jaeger ◽  
Simone Fulle ◽  
Samo Turk

Inspired by natural language processing techniques we here introduce Mol2vec which is an unsupervised machine learning approach to learn vector representations of molecular substructures. Similarly, to the Word2vec models where vectors of closely related words are in close proximity in the vector space, Mol2vec learns vector representations of molecular substructures that are pointing in similar directions for chemically related substructures. Compounds can finally be encoded as vectors by summing up vectors of the individual substructures and, for instance, feed into supervised machine learning approaches to predict compound properties. The underlying substructure vector embeddings are obtained by training an unsupervised machine learning approach on a so-called corpus of compounds that consists of all available chemical matter. The resulting Mol2vec model is pre-trained once, yields dense vector representations and overcomes drawbacks of common compound feature representations such as sparseness and bit collisions. The prediction capabilities are demonstrated on several compound property and bioactivity data sets and compared with results obtained for Morgan fingerprints as reference compound representation. Mol2vec can be easily combined with ProtVec, which employs the same Word2vec concept on protein sequences, resulting in a proteochemometric approach that is alignment independent and can be thus also easily used for proteins with low sequence similarities.


2019 ◽  
Author(s):  
Oskar Flygare ◽  
Jesper Enander ◽  
Erik Andersson ◽  
Brjánn Ljótsson ◽  
Volen Z Ivanov ◽  
...  

**Background:** Previous attempts to identify predictors of treatment outcomes in body dysmorphic disorder (BDD) have yielded inconsistent findings. One way to increase precision and clinical utility could be to use machine learning methods, which can incorporate multiple non-linear associations in prediction models. **Methods:** This study used a random forests machine learning approach to test if it is possible to reliably predict remission from BDD in a sample of 88 individuals that had received internet-delivered cognitive behavioral therapy for BDD. The random forest models were compared to traditional logistic regression analyses. **Results:** Random forests correctly identified 78% of participants as remitters or non-remitters at post-treatment. The accuracy of prediction was lower in subsequent follow-ups (68%, 66% and 61% correctly classified at 3-, 12- and 24-month follow-ups, respectively). Depressive symptoms, treatment credibility, working alliance, and initial severity of BDD were among the most important predictors at the beginning of treatment. By contrast, the logistic regression models did not identify consistent and strong predictors of remission from BDD. **Conclusions:** The results provide initial support for the clinical utility of machine learning approaches in the prediction of outcomes of patients with BDD. **Trial registration:** ClinicalTrials.gov ID: NCT02010619.


Sign in / Sign up

Export Citation Format

Share Document