Recent Trends in Machine Learning-based Protein Fold Recognition Methods

2020 ◽  
Vol 11 (4) ◽  
pp. 11233-11243

Proteins are macromolecules that enable life. A protein's function is determined by its three-dimensional structure and shape. Understanding how a linear sequence of amino acid residues folds into a three-dimensional structure remains challenging. Machine learning-based methods may help significantly in reducing the gap between known protein sequences and known structures. Identifying protein folds from a sequence can help predict protein tertiary structure, determine protein function, and give insights into protein-protein interactions. This work focuses on the following aspects: the kinds of features (sequential, structural, functional, and evolutionary) extracted to represent a protein sequence, and the different methods of extracting these features. It also details the machine learning algorithms used, with their respective settings, and protein fold recognition structures. A detailed performance comparison of well-known works is also given.

2019 ◽  
Author(s):  
Mohammad Saleh Refahi ◽  
A. Mir ◽  
Jalal A. Nasiri

Abstract: Protein fold recognition plays a crucial role in discovering the three-dimensional structure of proteins and their functions. Several approaches have been employed for the prediction of protein folds. Some of these approaches are based on extracting features from protein sequences and using a strong classifier. Feature extraction techniques generally utilize syntactical, evolutionary, and physicochemical information. In recent years, finding an efficient technique for integrating discriminative features has received increasing attention. In this study, we integrate the Auto-Cross-Covariance (ACC) and Separated Dimer (SD) evolutionary feature extraction methods. The resulting features are scored by Information Gain (IG) to select a subset of discriminative features. On three benchmark datasets, DD, RDD and EDD, the support vector machine (SVM) results show more than 6% improvement in accuracy.
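
The integration of ACC and SD features described in this abstract can be illustrated with a minimal sketch. It assumes the commonly cited definitions of both transforms over a position-specific scoring matrix (PSSM): ACC as the lagged covariance between mean-centred PSSM columns, and SD as the sum of products of column scores at positions separated by a fixed gap. The function names (`acc_features`, `sd_features`, `combined_features`), the toy matrix, and the normalization are illustrative assumptions and may differ from what the authors used. The Information Gain scoring and the SVM step are not shown.

```python
from typing import List

# A PSSM is modelled as L rows (residues) by N columns (amino acid types).
# Real PSSMs have N = 20; the toy example below uses N = 2 for readability.
Matrix = List[List[float]]


def column_means(pssm: Matrix) -> List[float]:
    """Average score of each column over all sequence positions."""
    n_rows, n_cols = len(pssm), len(pssm[0])
    return [sum(row[c] for row in pssm) / n_rows for c in range(n_cols)]


def acc_features(pssm: Matrix, max_lag: int) -> List[float]:
    """Auto- and cross-covariance of PSSM columns for lags 1..max_lag.

    For columns a, b and lag g the feature is the mean over positions j of
    (S[j][a] - mean_a) * (S[j + g][b] - mean_b). The a == b entries are the
    auto-covariance terms, the a != b entries the cross-covariance terms.
    """
    length, n_cols = len(pssm), len(pssm[0])
    if not 1 <= max_lag < length:
        raise ValueError("max_lag must be in [1, sequence length - 1]")
    means = column_means(pssm)
    feats: List[float] = []
    for lag in range(1, max_lag + 1):
        span = length - lag
        for a in range(n_cols):
            for b in range(n_cols):
                total = sum(
                    (pssm[j][a] - means[a]) * (pssm[j + lag][b] - means[b])
                    for j in range(span)
                )
                feats.append(total / span)
    return feats


def sd_features(pssm: Matrix, gap: int) -> List[float]:
    """Separated-dimer features: for each column pair (m, n), the sum over
    positions i of S[i][m] * S[i + gap][n]."""
    length, n_cols = len(pssm), len(pssm[0])
    if not 1 <= gap < length:
        raise ValueError("gap must be in [1, sequence length - 1]")
    return [
        sum(pssm[i][m] * pssm[i + gap][n] for i in range(length - gap))
        for m in range(n_cols)
        for n in range(n_cols)
    ]


def combined_features(pssm: Matrix, max_lag: int, gap: int) -> List[float]:
    """Concatenate ACC and SD vectors into one fixed-length representation."""
    return acc_features(pssm, max_lag) + sd_features(pssm, gap)


# Toy 4-residue, 2-column matrix (hypothetical values, not real PSSM data).
toy_pssm = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
vector = combined_features(toy_pssm, max_lag=1, gap=1)
print(len(vector), vector)
```

Because both transforms reduce a variable-length PSSM to a vector whose size depends only on the number of columns and the chosen lags, proteins of different lengths map to the same feature space, which is what a fixed-input classifier such as an SVM requires.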


2020 ◽  
Vol 6 (36) ◽  
pp. eabc0023
Author(s):  
Jānis Rūmnieks ◽  
Ilva Liekniņa ◽  
Gints Kalniņš ◽  
Mihails Šišovs ◽  
Ināra Akopjana ◽  
...  

The single-stranded RNA (ssRNA) bacteriophages are among the simplest known viruses with small genomes and exceptionally high mutation rates. The number of ssRNA phage isolates has remained very low, but recent metagenomic studies have uncovered an immense variety of distinct uncultured ssRNA phages. The coat proteins (CPs) in these genomes are particularly diverse, with notable variation in length and often no recognizable similarity to previously known viruses. We recombinantly expressed metagenome-derived ssRNA phage CPs to produce virus-like particles and determined the three-dimensional structure of 22 previously uncharacterized ssRNA phage capsids covering nine distinct CP types. The structures revealed substantial deviations from the previously known ssRNA phage CP fold, uncovered an unusual prolate particle shape, and revealed a previously unseen dsRNA binding mode. These data expand our knowledge of the evolution of viral structural proteins and are of relevance for applications such as ssRNA phage–based vaccine design.

