Protein Remote Homology Detection by Combining Pseudo Dimer Composition with an Ensemble Learning Method

2016 ◽  
Vol 13 (2) ◽  
pp. 86-91 ◽  
Author(s):  
Bin Liu ◽  
Junjie Chen ◽  
Shanyi Wang
Author(s):  
N. Srinivasan ◽  
G. Agarwal ◽  
R. M. Bhaskara ◽  
R. Gadkari ◽  
O. Krishnadev ◽  
...  

In the post-genomic era, biological databases are growing at a tremendous rate. Despite the rapid accumulation of biological information, the functions and other biological properties of many putative gene products of various organisms remain either unknown or obscure. This paper examines how strategic integration of large biological databases and combinations of various kinds of biological information help address some of the fundamental questions on protein structure, function and interactions. New developments in function recognition by remote homology detection and strategic use of sequence databases aid recognition of the functions of newly discovered proteins. Knowledge of 3-D structures and the combined use of sequences and 3-D structures of homologous protein domains greatly expand the power of remote homology detection. The authors also demonstrate how combined consideration of the functions of individual domains of multi-domain proteins helps in recognizing gross biological attributes. This paper also discusses a few cases in which combining disparate biological datasets or disparate biological information yields new insights into protein-protein interactions between a host and a pathogen. Finally, the authors discuss how combining low-resolution structural data on gigantic multi-component assemblies, obtained using cryoEM studies, with atomic-level 3-D structures of the components is effective in inferring finer features of the assembly.


Biomolecules ◽  
2019 ◽  
Vol 10 (1) ◽  
pp. 26 ◽  
Author(s):  
Guillermin Agüero-Chapin ◽  
Deborah Galpert ◽  
Reinaldo Molina-Ruiz ◽  
Evys Ancede-Gallardo ◽  
Gisselle Pérez-Machado ◽  
...  

Alignment-free (AF) methodologies have grown in popularity over the last decades as alternatives to alignment-based (AB) algorithms for comparative sequence analysis. They have been especially useful for detecting remote homologs within the twilight zone of highly diverse gene/protein families and superfamilies. The most popular alignment-free methodologies, as well as their applications to classification problems, have been described in previous reviews. Although a new set of graph theory-derived sequence/structural descriptors has been gaining relevance in the detection of remote homology, these descriptors have been omitted as AF predictors when the topic is addressed. Here, we first go over the most popular AF approaches used for detecting homology signals within the twilight zone and then present the state-of-the-art tools that encode graph theory-derived sequence/structure descriptors, along with their success in identifying remote homologs. We also highlight the trend of integrating AF features/measures with AB ones, either within the same prediction model or by combining the predictions of different algorithms through voting/weighting strategies, to improve the detection of remote signals. Lastly, we briefly discuss the efforts made to scale up AB and AF features/measures for the comparison of multiple genomes and proteomes. Alongside the experience gained in remote homology detection with both the most popular AF tools and other less known ones, we report our own experience with the graphical–numerical methodologies MARCH-INSIDE, TI2BioP, and ProtDCal. We also present a new Python-based tool (SeqDivA) with a friendly graphical user interface (GUI) for delimiting the twilight zone by using several similar criteria.
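
As a rough illustration of the two ideas in the abstract above (an alignment-free sequence comparison, and combining the scores of different methods by weighting), the following Python sketch may help. It is not taken from any of the tools named above: it uses plain dipeptide (dimer) frequencies, which are simpler than the pseudo dimer composition in the title of the first article, and the function names, the cosine measure, the weights and the 0.5 threshold are all assumptions chosen for the example.

```python
from collections import Counter
from itertools import product
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered pairs of the 20 standard amino acids.
DIMERS = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dimer_composition(seq):
    """Return the 400-dimensional normalized dipeptide frequency vector.

    Hypothetical helper: plain dimer frequencies only, with no
    physicochemical ("pseudo") terms. Non-standard residues are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts[d] for d in DIMERS)
    if total == 0:
        return [0.0] * len(DIMERS)
    return [counts[d] / total for d in DIMERS]


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def weighted_vote(scores, weights, threshold=0.5):
    """Combine per-method scores in [0, 1] by a weighted average.

    Returns the combined score and whether it reaches the (assumed) threshold.
    """
    combined = sum(s * w for s, w in zip(scores, weights)) / sum(weights)
    return combined, combined >= threshold


if __name__ == "__main__":
    # Two made-up sequences, used only to exercise the functions.
    af_score = cosine_similarity(dimer_composition("MKVLAAGIVGL"),
                                 dimer_composition("MKILAAGLVGV"))
    # 0.4 stands in for a normalized alignment-based score (assumed value).
    print(weighted_vote([af_score, 0.4], weights=[1.0, 2.0]))
```

In practice the alignment-based score would come from an actual alignment tool and the weights would be tuned on a benchmark; here they are placeholders.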

