GFICLEE: ultrafast tree-based phylogenetic profile method inferring gene function at the genomic-wide level

BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Yang Fang ◽  
Menglong Li ◽  
Xufeng Li ◽  
Yi Yang

Abstract Background: Phylogenetic profiling is widely used to predict novel members of large protein complexes and biological pathways. Although methods that incorporate phylogenetic trees have significantly improved prediction accuracy, computational efficiency remains an issue that limits their genome-wide application. Results: Here we introduce a new tree-based phylogenetic profiling algorithm named GFICLEE, which infers common single and continuous loss (SCL) events in the evolutionary patterns. We validated our algorithm with human pathways from three databases and compared its computational efficiency with that of current tree-based methods on genome datasets of 10 different scales. Our algorithm has better predictive performance with high computational efficiency. Conclusions: GFICLEE is a new method for inferring gene function genome-wide. Its accuracy and computational efficiency make it possible to explore gene functions at the genome-wide level on a personal computer.

PeerJ ◽  
2017 ◽  
Vol 5 ◽  
pp. e3712 ◽  
Author(s):  
Yulong Niu ◽  
Chengcheng Liu ◽  
Shayan Moghimyfiroozabad ◽  
Yi Yang ◽  
Kambiz N. Alavian

Direct and indirect functional links between proteins as well as their interactions as part of larger protein complexes or common signaling pathways may be predicted by analyzing the correlation of their evolutionary patterns. Based on phylogenetic profiling, here we present a highly scalable and time-efficient computational framework for predicting linkages within the whole human proteome. We have validated this method through analysis of 3,697 human pathways and molecular complexes and a comparison of our results with the prediction outcomes of previously published co-occurrency model-based and normalization methods. Here we also introduce PrePhyloPro, a web-based software that uses our method for accurately predicting proteome-wide linkages. We present data on interactions of human mitochondrial proteins, verifying the performance of this software. PrePhyloPro is freely available at http://prephylopro.org/phyloprofile/.
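As a minimal illustration of what "correlation of evolutionary patterns" can mean in phylogenetic profiling, the sketch below compares binary presence/absence profiles with two generic similarity measures (Jaccard index and Pearson correlation). It is not the method used by PrePhyloPro or GFICLEE; the gene names, the eight-species profiles, and the helper functions `jaccard` and `pearson` are hypothetical and chosen only for illustration.

```python
from math import sqrt

def jaccard(a, b):
    """Jaccard index of two binary profiles (1 = gene present in a species)."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0

def pearson(a, b):
    """Pearson correlation of two equal-length profiles; 0.0 if one is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)

# Hypothetical profiles: one column per species, 1 = ortholog present.
profiles = {
    "geneA": [1, 1, 0, 1, 0, 0, 1, 1],
    "geneB": [1, 1, 0, 1, 0, 0, 1, 0],  # nearly the same pattern as geneA
    "geneC": [0, 0, 1, 0, 1, 1, 0, 0],  # exact complement of geneA
}

if __name__ == "__main__":
    names = sorted(profiles)
    for i, g in enumerate(names):
        for h in names[i + 1:]:
            print(g, h,
                  round(jaccard(profiles[g], profiles[h]), 3),
                  round(pearson(profiles[g], profiles[h]), 3))
```

Genes with similar profiles (here geneA and geneB) score high and become candidates for a functional link. Measures of this kind treat species as independent observations, which is the limitation that tree-based methods aim to address.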


2021 ◽  
Author(s):  
Jonathan Filee ◽  
Hubert J. Becker ◽  
Lucille Mellottee ◽  
Zhihui Li ◽  
Jean-Christophe Lambry ◽  
...  

Little is known about the evolution and biosynthetic function of DNA precursor and the folate metabolism in the Asgard group of archaea. As Asgard occupy a key position in the archaeal and eukaryotic phylogenetic trees, we have exploited very recently emerged genome and metagenome sequence information to investigate these central metabolic pathways. Our genome-wide analyses revealed that the recently cultured Asgard archaeon Candidatus Prometheoarchaeum syntrophicum strain MK-D1 (Psyn) contains a complete folate-dependent network for the biosynthesis of DNA/RNA precursors, amino acids and syntrophic amino acid utilization. Altogether our experimental and computational data suggest that phylogenetic incongruences of functional folate-dependent enzymes from Asgard archaea reflect their persistent horizontal transmission from various bacterial groups, which has rewired the key metabolic reactions in an important and recently identified archaeal phylogenetic group. We also experimentally validated the functionality of the lateral gene transfer of Psyn thymidylate synthase ThyX. This enzyme uses bacterial-like folates efficiently and is inhibited by mycobacterial ThyX inhibitors. Our data raise the possibility that the thymidylate metabolism, required for de novo DNA synthesis, originated in bacteria and has been independently transferred to archaea and eukaryotes. In conclusion, our study has revealed that recent prevalent lateral gene transfer has markedly shaped the evolution of Asgard archaea by allowing them to adapt to specific ecological niches.


2021 ◽  
Author(s):  
Rania Jbir Koubaa ◽  
Mariem Ayadi ◽  
Mohamed Najib Saidi ◽  
Safa Charfeddine ◽  
Radhia Gargouri Bouzid ◽  
...  

Abstract As antioxidant enzymes, catalases (CATs) protect organisms from the oxidative stress caused by the production of reactive oxygen species (ROS). These enzymes play important roles in diverse biological processes. However, little is known about the CAT genes in potato despite the economic importance of this crop worldwide. Abiotic and biotic stresses severely hinder the growth and development of the plants, which affects the production and quality of the crop. To define the possible roles of CAT genes under various stresses, a genome-wide analysis of the CAT gene family was performed in potato. In this study, the gene structure, secondary and 3D protein structure, physicochemical properties, synteny, phylogenetic tree, and expression profiles of StCAT under various developmental and environmental cues were predicted using bioinformatics tools. Expression analysis by RT-PCR was performed using a commercial potato cultivar. Three StCAT genes, each interrupted by seven introns and encoding a protein of 492 aa, were identified in potato. StCAT proteins were found to be localized in the peroxisome, which is considered the main site of cellular H2O2 production during different processes. Many regulatory cis-elements related to stress responses and plant hormone signaling were found in the promoter sequence of each gene. The analysis of motifs and phylogenetic trees showed that StCATs are closest to their homologs in S. lycopersicum and share 41%–95% identity with the CATs of other plants. Expression profiling revealed that StCAT1 is the constitutively expressed member, while StCAT2 and StCAT3 are the stress-responsive members.


Author(s):  
Sakellarios Zairis ◽  
Hossein Khiabanian ◽  
Andrew J. Blumberg ◽  
Raul Rabadan

2020 ◽  
Author(s):  
Dan Wang ◽  
Hui Tang ◽  
Jian-Feng Liu ◽  
Shizhong Xu ◽  
Qin Zhang ◽  
...  

Summary: We have developed a rapid mixed model algorithm for exhaustive genome-wide epistatic association analysis by controlling multiple polygenic effects. Our model can simultaneously handle additive by additive epistasis, dominance by dominance epistasis and additive by dominance epistasis, and account for intrasubject fluctuations due to individuals with repeated records. Furthermore, we suggest a simple but efficient approximate algorithm, which allows all pairwise interactions to be examined remarkably fast, with running time linear in population size. Application to publicly available yeast and human data has shown that the computational efficiency of our mixed model-based method is similar to that of the simple linear model-based Plink. It took less than 40 hours for the pairwise analysis of 5,000 individuals genotyped with roughly 350,000 SNPs with five threads on an Intel Xeon E5 2.6GHz CPU. Availability and implementation: Source code is freely available at https://github.com/chaoning/GMAT.
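To put the reported runtime in perspective, the short calculation below counts the SNP pairs in an exhaustive pairwise scan and derives the implied throughput. It is only back-of-the-envelope arithmetic on the figures quoted in the abstract (roughly 350,000 SNPs, under 40 hours); the helper `pair_count` is a hypothetical name. Because both inputs are approximate, the resulting rate is a rough lower bound, not a benchmark.

```python
def pair_count(n_snps: int) -> int:
    """Number of unordered SNP pairs in an exhaustive scan: n * (n - 1) / 2."""
    return n_snps * (n_snps - 1) // 2

# Figures taken from the abstract; both are approximate ("roughly", "less than").
n_snps = 350_000
hours = 40

pairs = pair_count(n_snps)
pairs_per_second = pairs / (hours * 3600)

if __name__ == "__main__":
    print(f"SNP pairs to test: {pairs:,}")
    print(f"Implied throughput: at least about {pairs_per_second:,.0f} pairs per second")
```

The pair count grows quadratically with the number of SNPs, so a per-pair cost that is linear in population size is what keeps a scan of this size feasible.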


2019 ◽  
Author(s):  
Igor Mačinković ◽  
Ina Theofel ◽  
Tim Hundertmark ◽  
Kristina Kovač ◽  
Stephan Awe ◽  
...  

Abstract CoREST has been identified as a subunit of several protein complexes that generate transcriptionally repressive chromatin structures during development. However, a comprehensive analysis of the CoREST interactome has not been carried out. We use proteomic approaches to define the interactomes of two dCoREST isoforms, dCoREST-L and dCoREST-M, in Drosophila. We identify three distinct histone deacetylase complexes built around a common dCoREST/dRPD3 core: A dLSD1/dCoREST complex, the LINT complex and a dG9a/dCoREST complex. The latter two complexes can incorporate both dCoREST isoforms. By contrast, the dLSD1/dCoREST complex exclusively assembles with the dCoREST-L isoform. Genome-wide studies show that the three dCoREST complexes associate with chromatin predominantly at promoters. Transcriptome analyses in S2 cells and testes reveal that different cell lineages utilize distinct dCoREST complexes to maintain cell-type-specific gene expression programmes: In macrophage-like S2 cells, LINT represses germ line-related genes whereas other dCoREST complexes are largely dispensable. By contrast, in testes, the dLSD1/dCoREST complex prevents transcription of germ line-inappropriate genes and is essential for spermatogenesis and fertility, whereas depletion of other dCoREST complexes has no effect. Our study uncovers three distinct dCoREST complexes that function in a lineage-restricted fashion to repress specific sets of genes thereby maintaining cell-type-specific gene expression programmes.

