Image-based effective feature generation for protein structural class and ligand binding prediction

2020 ◽  
Vol 6 ◽  
pp. e253
Author(s):  
Nafees Sadique ◽  
Al Amin Neaz Ahmed ◽  
Md Tajul Islam ◽  
Md. Nawshad Pervage ◽  
Swakkhar Shatabda

Proteins are the building blocks of all cells in humans and in all other living organisms, and they perform most of the work in a living organism. Proteins are biomolecules (macromolecules) that are polymers of amino acid monomers. The tertiary structure of a protein represents its three-dimensional shape. A protein's functions, classification and binding sites are governed by its tertiary structure. If two protein structures are alike, then the two proteins can be of the same kind, implying a similar structural class and similar ligand binding properties. In this paper, we use the protein tertiary structure to generate effective features for structural similarity applications, namely detecting structural class and ligand binding. First, we analyze the effectiveness of a group of image-based features for predicting the structural class of a protein. These features are derived from the image generated from the distance matrix of the tertiary structure of a given protein. They include the local binary pattern (LBP) histogram, the Gabor filtered LBP histogram, the separate row multiplication matrix with uniform LBP histogram, the neighbor block subtraction matrix with uniform LBP histogram, and atom bond. The separate row multiplication matrix and neighbor block subtraction matrix filters, as well as atom bond, are our novel contributions. The experiments were done on a standard benchmark dataset. We demonstrate the effectiveness of these features over a large variety of supervised machine learning algorithms. Experiments suggest that the support vector machine is the best performing classifier on the selected dataset using this set of features. We believe the excellent performance of Hybrid LBP in terms of accuracy will motivate researchers and practitioners to use it to identify protein structural class. To facilitate that, a classification model using Hybrid LBP is readily available for use at http://brl.uiu.ac.bd/PL/.
Protein-ligand binding is responsible for managing the tasks of biological receptors, which help to cure diseases, among other functions. Therefore, predicting binding between a protein and a ligand is important for understanding a protein's activity and for accelerating docking computations in virtual screening-based drug design. Protein-ligand binding prediction requires the three-dimensional tertiary structure of the target protein, which is searched for ligand binding. In this paper, we propose a supervised learning algorithm for predicting protein-ligand binding, which is a similarity-based clustering approach using the same set of features. Our algorithm works better than the most popular and widely used machine learning algorithms.
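
As a rough illustration of the kind of image-based feature the abstract above describes, the following Python sketch builds a pairwise distance matrix from atom coordinates, rescales it to a grayscale image, and computes a basic 8-neighbour LBP histogram. It is not the authors' implementation: the function names, the linear 0-255 scaling, the `>=` comparison rule and the random stand-in coordinates are all assumptions made for illustration, and the Gabor filtered, separate row multiplication, neighbor block subtraction, uniform LBP and atom bond variants are not reproduced here.

```python
import numpy as np


def distance_matrix(coords):
    """Pairwise Euclidean distances between atom coordinates of shape (n, 3)."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def to_gray_image(dist):
    """Scale distances linearly to 0..255 (assumed scaling) to treat the matrix as an image."""
    peak = dist.max()
    if peak == 0:
        return np.zeros(dist.shape, dtype=np.uint8)
    return np.round(255.0 * dist / peak).astype(np.uint8)


def lbp_histogram(image):
    """Basic 8-neighbour LBP codes of interior pixels as a normalised 256-bin histogram."""
    img = image.astype(np.int32)
    h, w = img.shape
    center = img[1:-1, 1:-1]
    # Neighbour offsets (row, column), visited clockwise from the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(center)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit wherever the neighbour is at least as bright as the centre.
        codes += (neighbour >= center).astype(np.int32) << bit
    hist = np.bincount(codes.ravel(), minlength=256).astype(float)
    return hist / hist.sum()


# Hypothetical input: random coordinates standing in for 20 atoms of a structure.
coords = np.random.default_rng(0).normal(size=(20, 3))
features = lbp_histogram(to_gray_image(distance_matrix(coords)))
print(features.shape)
```

The resulting 256-value vector has a fixed length regardless of the number of atoms, which is what allows it to be passed to a standard supervised classifier.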

2019 ◽  
Author(s):  
Nafees Sadique ◽  
Al Amin Neaz Ahmed ◽  
Md Tajul Islam ◽  
Md. Nawshad Pervage ◽  
Swakkhar Shatabda

Proteins are the building blocks of all cells in humans and in all other living organisms, and they perform most of the work in a living organism. Proteins are biomolecules (macromolecules) that are polymers of amino acid monomers. The tertiary structure of a protein represents its three-dimensional shape. A protein's functions, classification and binding sites are governed by its tertiary structure. If two protein structures are alike, then the two proteins can be of the same kind, implying a similar structural class and similar ligand binding properties. In this paper, we use protein structure to generate effective features for structural similarity applications, namely detecting structural class and ligand binding. First, we analyze the effectiveness of a group of image-based features for predicting the structural class of a protein. These features are derived from the image generated from the distance matrix of the tertiary structure of a given protein. They include the local binary pattern histogram, the Gabor filtered local binary pattern histogram, the separate row multiplication matrix with uniform local binary pattern histogram, the neighbour block subtraction matrix with uniform local binary pattern histogram, and atom bond. The experiments were done on a standard benchmark dataset. We demonstrate the effectiveness of these features over a large variety of supervised machine learning algorithms. Experiments suggest that Random Forest is the best performing classifier on the selected dataset using this set of features. We believe the excellent performance of Hybrid LBP in terms of accuracy will motivate researchers and practitioners to use it to identify protein structural class. To facilitate that, a classification model using Hybrid LBP is readily available for use at http://brl.uiu.ac.bd/PL/. Protein-ligand binding is responsible for managing the tasks of biological receptors, which help to cure diseases, among other functions.
Therefore, predicting binding between a protein and a ligand is important for understanding a protein's activity and for accelerating docking computations in virtual screening-based drug design. Protein-ligand binding prediction requires the three-dimensional tertiary structure of the target protein, which is searched for ligand binding. In this paper, we propose a supervised learning algorithm for predicting protein-ligand binding, which is a similarity-based clustering approach using the same set of features. Our algorithm works better than most popular and widely used machine learning algorithms.


Crystals ◽  
2021 ◽  
Vol 11 (4) ◽  
pp. 324
Author(s):  
Lin Zhu ◽  
Mehdi D. Davari ◽  
Wenjin Li

In the postgenomic age, rapid growth in the number of sequence-known proteins has been accompanied by much slower growth in the number of structure-known proteins (as a result of experimental limitations), and a widening gap between the two is evident. Because protein function is linked to protein structure, successful prediction of protein structure is of significant importance in protein function identification. Foreknowledge of protein structural class can help improve protein structure prediction with significant medical and pharmaceutical implications. Thus, a fast, suitable, reliable, and reasonable computational method for protein structural class prediction has become pivotal in bioinformatics. Here, we review recent efforts in protein structural class prediction from protein sequence, with particular attention paid to new feature descriptors, which extract information from protein sequence, and the use of machine learning algorithms in both feature selection and the construction of new classification models. These new feature descriptors include amino acid composition, sequence order, physicochemical properties, multiprofile Bayes, and secondary structure-based features. Machine learning methods, such as artificial neural networks (ANNs), support vector machine (SVM), K-nearest neighbor (KNN), random forest, deep learning, and examples of their application are discussed in detail. We also present our view on possible future directions, challenges, and opportunities for the applications of machine learning algorithms for prediction of protein structural classes.
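
Of the sequence-based descriptors listed above, amino acid composition is the simplest: the fraction of each of the 20 standard residues in the sequence. A minimal Python sketch follows. The function name, the alphabetical residue ordering and the example sequence are assumptions made for illustration, not something specified by the review.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code (alphabetical order is an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return a 20-value list with the fraction of each standard residue in the sequence."""
    counts = Counter(sequence.upper())
    # Non-standard symbols (for example X or gaps) are ignored in the total.
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical toy sequence, used only to show the call.
composition = amino_acid_composition("AACD")
print(dict(zip(AMINO_ACIDS, composition)))
```

Because the vector always has 20 entries, sequences of any length map to inputs of the same size for classifiers such as SVM, KNN or random forest.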


2021 ◽  
Vol 8 (1) ◽  
pp. 205395172110135
Author(s):  
Florian Jaton

This theoretical paper considers the morality of machine learning algorithms and systems in the light of the biases that ground their correctness. It begins by presenting biases not as a priori negative entities but as contingent external referents—often gathered in benchmarked repositories called ground-truth datasets—that define what needs to be learned and allow for performance measures. I then argue that ground-truth datasets and their concomitant practices—that fundamentally involve establishing biases to enable learning procedures—can be described by their respective morality, here defined as the more or less accounted experience of hesitation when faced with what pragmatist philosopher William James called “genuine options”—that is, choices to be made in the heat of the moment that engage different possible futures. I then stress three constitutive dimensions of this pragmatist morality, as far as ground-truthing practices are concerned: (I) the definition of the problem to be solved (problematization), (II) the identification of the data to be collected and set up (databasing), and (III) the qualification of the targets to be learned (labeling). I finally suggest that this three-dimensional conceptual space can be used to map machine learning algorithmic projects in terms of the morality of their respective and constitutive ground-truthing practices. Such techno-moral graphs may, in turn, serve as equipment for greater governance of machine learning algorithms and systems.

