A Comparative Genomic and Phylogenetic Analysis of the Origin and Evolution of the CCN Gene Family

2019 ◽  
Vol 2019 ◽  
pp. 1-12 ◽  
Author(s):  
Kuan Hu ◽  
Yiming Tao ◽  
Juanni Li ◽  
Zhuang Liu ◽  
Xinyan Zhu ◽  
...  

CCN gene family members have recently been identified as multifunctional regulators involved in diverse biological functions, especially in vascular and skeletal development. In the present study, a comparative genomic and phylogenetic analysis was performed to show the similarities and differences in structure and function of CCNs from different organisms and to reveal their potential evolutionary relationship. First, CCN homologs from different metazoan species were identified. We then performed multiple sequence alignments, MEME analysis, and functional site prediction, which showed highly conserved structural features among metazoan CCNs. A phylogenetic tree was then constructed, which indicated that CCNs underwent extensive lineage-specific duplication events and lineage-specific expansion during evolution. In addition, comparative analysis of the genomic organization and chromosomal surroundings of CCN genes indicated a clear orthologous relationship among the counterparts in these species. Finally, based on these results, a potential evolutionary scenario was generated to give an overview of the origin and evolution of the CCN gene family.

2019 ◽  
Vol 37 (1) ◽  
pp. 295-299 ◽  
Author(s):  
Sergei L Kosakovsky Pond ◽  
Art F Y Poon ◽  
Ryan Velazquez ◽  
Steven Weaver ◽  
N Lance Hepler ◽  
...  

Abstract HYpothesis testing using PHYlogenies (HyPhy) is a scriptable, open-source package for fitting a broad range of evolutionary models to multiple sequence alignments, and for conducting subsequent parameter estimation and hypothesis testing, primarily in the maximum likelihood statistical framework. It has become a popular choice for characterizing various aspects of the evolutionary process: natural selection, evolutionary rates, recombination, and coevolution. The 2.5 release (available from www.hyphy.org) includes a completely re-engineered computational core and analysis library that introduces new classes of evolutionary models and statistical tests, delivers substantial performance and stability enhancements, improves usability, streamlines end-to-end analysis workflows, makes it easier to develop custom analyses, and is mostly backward compatible with previous HyPhy releases.


2021 ◽  
Author(s):  
Rohit Kunar ◽  
Jagat Kumar Roy

The mRNA decapping proteins (DCPs) hydrolyze the 7-methylguanosine cap at the 5′ end of mRNAs, thereby exposing the transcript to degradation by exonuclease(s), and hence play a pioneering role in the mRNA decay pathway. In Drosophila melanogaster, the mRNA decapping protein 2 (DCP2) is the only catalytically active mRNA decapping enzyme present. Although its presence has been reported across diverse species in the phylogenetic tree, a quantitative index of its sequence conservation has not been reported so far. While structural and mechanistic insights have been explored in the yeasts, the insect DCP2 has never been explored from the perspective of structure or of the indices of conservation of its sequence and/or structure vis-a-vis topological facets. Because DCP2 is an evolutionarily conserved protein, the present work aimed to decipher the evolutionary relationship(s) and the pattern of sequence conservation of DCP2 across the phylogenetic tree, as well as in sibling species of D. melanogaster, through a semi-quantitative approach relying on multiple sequence alignment and analyses of percentage identity matrices. Since NUDIX proteins are functionally diverse, an attempt was made in parallel to identify the other NUDIX proteins (or DCP2 paralogs) in D. melanogaster and to compare and align their structural features with those of DCP2 through in silico approaches. Our observations provide quantitative and structural bases for the observed evolutionary conservation of DCP2 across diverse phyla and also identify and reinforce the structural conservation of the NUDIX family in D. melanogaster.
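The percentage identity matrices mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes one common convention (identical residues divided by columns where both sequences have a residue); alignment tools differ in the denominator they use, and the abstract does not say which convention the authors applied. The function names and the toy sequences are hypothetical and are not taken from the study.

```python
def percent_identity(a, b):
    """Percent identity between two rows of the same alignment.

    Assumed convention: columns where either row has a gap are ignored.
    """
    aligned = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not aligned:
        return 0.0
    matches = sum(x == y for x, y in aligned)
    return 100.0 * matches / len(aligned)


def identity_matrix(msa):
    """Build a nested dict {name: {name: percent identity}} from an aligned dict."""
    names = list(msa)
    return {p: {q: percent_identity(msa[p], msa[q]) for q in names} for p in names}


# Toy aligned sequences (hypothetical, for illustration only).
toy_msa = {
    "seq_a": "MKT-LV",
    "seq_b": "MKTALV",
    "seq_c": "MRT-LI",
}
matrix = identity_matrix(toy_msa)
for name, row in matrix.items():
    print(name, [round(row[other], 1) for other in toy_msa])
```

The matrix is symmetric with 100 on the diagonal, so in practice only one triangle needs to be reported.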


2019 ◽  
Vol 36 (8) ◽  
pp. 1831-1842 ◽  
Author(s):  
Mario A Cerón-Romero ◽  
Xyrus X Maurer-Alcalá ◽  
Jean-David Grattepanche ◽  
Ying Yan ◽  
Miguel M Fonseca ◽  
...  

Abstract Estimating multiple sequence alignments (MSAs) and inferring phylogenies are essential for many aspects of comparative biology. Yet, many bioinformatics tools for such analyses have focused on specific clades, with greatest attention paid to plants, animals, and fungi. The rapid increase in high-throughput sequencing (HTS) data from diverse lineages now provides opportunities to estimate evolutionary relationships and gene family evolution across the eukaryotic tree of life. At the same time, these types of data are known to be error-prone (e.g., substitutions, contamination). To address these opportunities and challenges, we have refined a phylogenomic pipeline, now named PhyloToL, to allow easy incorporation of data from HTS studies, to automate production of both MSAs and gene trees, and to identify and remove contaminants. PhyloToL is designed for phylogenomic analyses of diverse lineages across the tree of life (i.e., at scales of >100 My). We demonstrate the power of PhyloToL by assessing stop codon usage in Ciliophora, identifying contamination in a taxon- and gene-rich database and exploring the evolutionary history of chromosomes in the kinetoplastid parasite Trypanosoma brucei, the causative agent of African sleeping sickness. Benchmarking PhyloToL’s homology assessment against that of OrthoMCL and a published paper on superfamilies of bacterial and eukaryotic organellar outer membrane pore-forming proteins demonstrates the power of our approach for determining gene family membership and inferring gene trees. PhyloToL is highly flexible and allows users to easily explore HTS data, test hypotheses about phylogeny and gene family evolution and combine outputs with third-party tools (e.g., PhyloChromoMap, iGTP).


Protein multiple sequence alignment (MSA) is the process of aligning more than two protein sequences to establish an evolutionary relationship between them. In protein MSA, the biological sequences are aligned so as to identify maximum similarity. As sequencing technologies become more sophisticated, the volume of biological data generated is increasing at an enormous rate. This increase poses a challenge to existing MSA methods, because computational complexity grows and processing speed drops as data volume rises. The accuracy of MSA is also critically important, as many bioinformatics inferences depend on the output of MSA. This paper elaborates on the existing state-of-the-art methods of protein MSA and compares four leading methods, namely MAFFT, Clustal Omega, MUSCLE and ProbCons, on speed and accuracy. BAliBASE version 3.0 (a repository of manually refined multiple sequence alignments) has been used as the benchmark database, and the accuracy of the alignment methods is computed using two widely used criteria, the sum-of-pairs score (SPscore) and the total column score (TCscore). We also recorded the execution time of each method in order to compute its execution speed.
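The two accuracy criteria named above can be sketched in Python as follows. This is a simplified illustration, not the BAliBASE scoring program: it scores every column of the reference, whereas BAliBASE restricts scoring to annotated core blocks. It also assumes both alignments list the same sequences in the same order. The function names and the toy alignments are hypothetical.

```python
from itertools import combinations


def column_indices(alignment):
    """For each column, record the residue index of every sequence (None for a gap)."""
    counters = [0] * len(alignment)
    columns = []
    for col in zip(*alignment):
        entry = []
        for s, ch in enumerate(col):
            if ch == "-":
                entry.append(None)
            else:
                entry.append(counters[s])
                counters[s] += 1
        columns.append(tuple(entry))
    return columns


def aligned_pairs(columns):
    """Set of (seq i, residue in i, seq j, residue in j) placed in the same column."""
    pairs = set()
    for col in columns:
        for i, j in combinations(range(len(col)), 2):
            if col[i] is not None and col[j] is not None:
                pairs.add((i, col[i], j, col[j]))
    return pairs


def sp_score(test, ref):
    """Fraction of residue pairs aligned in the reference that the test also aligns."""
    ref_pairs = aligned_pairs(column_indices(ref))
    test_pairs = aligned_pairs(column_indices(test))
    return len(ref_pairs & test_pairs) / len(ref_pairs)


def tc_score(test, ref):
    """Fraction of reference columns reproduced exactly in the test alignment."""
    ref_cols = column_indices(ref)
    test_cols = set(column_indices(test))
    return sum(col in test_cols for col in ref_cols) / len(ref_cols)


# Toy alignments of the same three sequences (hypothetical data).
reference = ["AC-GT", "ACTGT", "A-TGT"]
candidate = ["ACG-T", "ACTGT", "A-TGT"]
print(round(sp_score(candidate, reference), 3), round(tc_score(candidate, reference), 3))
```

TCscore is the stricter of the two: a single misplaced residue invalidates its whole column, while SPscore still credits the other correctly aligned pairs in that column.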


2021 ◽  
Author(s):  
Konstantin Weissenow ◽  
Michael Heinzinger ◽  
Burkhard Rost

All state-of-the-art (SOTA) protein structure predictions rely on evolutionary information captured in multiple sequence alignments (MSAs), primarily on evolutionary couplings (co-evolution). Such information is not available for all proteins and is computationally expensive to generate. Prediction models based on Artificial Intelligence (AI) using only single sequences as input are easier and cheaper but perform so poorly that speed becomes irrelevant. Here, we described the first competitive AI solution exclusively inputting embeddings extracted from pre-trained protein Language Models (pLMs), namely from the transformer pLM ProtT5, from single sequences into a relatively shallow (few free parameters) convolutional neural network (CNN) trained on inter-residue distances, i.e. protein structure in 2D. The major advance originated from processing the attention heads learned by ProtT5. Although these models required at no point any MSA, they matched the performance of methods relying on co-evolution. Although not reaching the very top, our lean approach came close at substantially lower costs thereby speeding up development and each future prediction. By generating protein-specific rather than family-averaged predictions, these new solutions could distinguish between structural features differentiating members of the same family of proteins with similar structure predicted alike by all other top methods.


2021 ◽  
Vol 12 ◽  
Author(s):  
Saif ur Rehman ◽  
Tong Feng ◽  
Siwen Wu ◽  
Xier Luo ◽  
An Lei ◽  
...  

Buffalo is a rich genetic resource with multiple uses (as a dairy, draft, and meat animal) and economic significance in the tropical and subtropical regions of the globe. Its excellent potential to survive and perform on marginal resources makes buffalo an important source of nutritious products, particularly milk and meat. This study aimed to investigate the evolutionary relationship, physicochemical properties, and comparative genomics of the casein gene family (CSN1S1, CSN2, CSN1S2, and CSN3) in river and swamp buffalo. Phylogenetic, gene structure, motif, and conserved domain analysis revealed the evolutionarily conserved nature of the casein genes in buffalo and other closely related species. Results indicated that casein proteins were unstable, hydrophilic, and thermostable; αs1-CN, β-CN, and κ-CN exhibited acidic properties, whereas αs2-CN was slightly basic. Comparative analysis of amino acid sequences revealed greater variation in the river buffalo breeds than in the swamp buffalo, indicating a possible role of these variations in the regulation of milk traits in buffalo. Furthermore, we identified fewer STAT transcription activator sites and more YY1 repressor sites in swamp buffalo, suggesting an association with lower expression of casein genes that might subsequently affect milk production. The role of the main motifs in controlling the expression of casein genes calls for functional studies to evaluate the effect of these elements on the regulation of casein gene function in buffalo.


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Tingting Li ◽  
Dongxia Liu ◽  
Yadi Yang ◽  
Jiali Guo ◽  
Yujie Feng ◽  
...  

Abstract Corona Virus Disease 2019 (COVID-19), caused by the emerged coronavirus SARS-CoV-2, is spreading globally. The origin of SARS-CoV-2 and its evolutionary relationships are still ambiguous. Several reports attempted to resolve this critical issue by genome-based phylogenetic analysis, yet limited progress was obtained, principally owing to the inability of these methods to reasonably integrate phylogenetic information from all genes of SARS-CoV-2. The supertree method, based on multiple trees, can produce an overall reasonable phylogenetic tree. However, the supertree method has barely been used for phylogenetic analysis of viruses. Here we applied the matrix representation with parsimony (MRP) pseudo-sequence supertree analysis to study the origin and evolution of SARS-CoV-2. Compared with other phylogenetic analysis methods, the supertree method showed more resolving power for phylogenetic analysis of coronaviruses. In particular, the MRP pseudo-sequence supertree analysis firmly disputes that bat coronavirus RaTG13 is the last common ancestor of SARS-CoV-2, which was implied by other phylogenetic tree analyses based on viral genome sequences. Furthermore, evolution and mutation in SARS-CoV-2 were identified by MRP pseudo-sequence supertree analysis. Taken together, the MRP pseudo-sequence supertree provided more information for SARS-CoV-2 evolution inference than the conventional phylogenetic tree based on full-length genomic sequences.
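The matrix representation step of MRP can be sketched in Python under the standard Baum-Ragan coding: each non-root clade of each source tree becomes one binary character, scored 1 for taxa inside the clade, 0 for other taxa in that tree, and ? for taxa missing from that tree. This is a generic illustration, not the authors' pipeline; the nested-tuple tree encoding, the function names, and the toy trees are assumptions. The resulting pseudo-sequences would then be passed to a parsimony program, which is not shown.

```python
def clades(tree):
    """Return (leaf set of this subtree, list of leaf sets for its internal nodes)."""
    if isinstance(tree, str):  # a leaf is just a taxon name
        return frozenset([tree]), []
    leaves = frozenset()
    found = []
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found.extend(child_clades)
    found.append(leaves)  # the clade defined by this internal node
    return leaves, found


def mrp_matrix(trees):
    """Encode source trees as one pseudo-sequence of 0/1/? characters per taxon."""
    all_taxa = sorted(set().union(*(clades(t)[0] for t in trees)))
    rows = {taxon: [] for taxon in all_taxa}
    for tree in trees:
        tree_taxa, tree_clades = clades(tree)
        for clade in tree_clades:
            if clade == tree_taxa:  # the root clade is uninformative
                continue
            for taxon in all_taxa:
                if taxon not in tree_taxa:
                    rows[taxon].append("?")
                else:
                    rows[taxon].append("1" if taxon in clade else "0")
    return {taxon: "".join(chars) for taxon, chars in rows.items()}


# Two toy gene trees with partially overlapping taxa (hypothetical data).
gene_trees = [
    (("A", "B"), ("C", "D")),
    (("A", "C"), "E"),
]
encoded = mrp_matrix(gene_trees)
for taxon, pseudo_sequence in encoded.items():
    print(taxon, pseudo_sequence)
```

Because absent taxa are coded as missing data rather than 0, source trees with different taxon sets can be combined in one matrix, which is what lets a supertree integrate signal from every gene.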


2020 ◽  
Vol 100 (2) ◽  
pp. 359-367
Author(s):  
Tiantian Gu ◽  
Guoqin Li ◽  
Yong Tian ◽  
Li Chen ◽  
Xinsheng Wu ◽  
...  

Melanoma differentiation-associated gene 5 (MDA5) is an important cytoplasmic RNA sensor that detects viral double-stranded RNA in innate immunity. The objective of this study was to characterize the structure and function of the MDA5 gene in the duck. In this study, full-length duck MDA5 (duMDA5) complementary DNA (cDNA) was obtained using the reverse transcription-polymerase chain reaction and rapid amplification of the cDNA ends. The cDNA consisted of a 123 nucleotide 5′ untranslated region (UTR), a 735 nucleotide 3′ UTR, and a 3012 nucleotide open-reading frame, encoding 1003 amino acids. Multiple sequence alignments showed that duMDA5 had 91.18% and 83.11% amino acid sequence similarity with goose and chicken MDA5, respectively, as well as 59.76%–61.26% sequence identity with mammalian homologs. Phylogenetic analysis demonstrated that MDA5 has been highly conserved throughout vertebrate evolution. Quantitative real-time polymerase chain reaction analysis indicated that duMDA5 mRNA is scarcely detected in healthy tissues and that the highest relative transcript level of duMDA5 was induced during poly(I:C) stimulation. Furthermore, knockdown of duMDA5 significantly inhibited the transcription of poly(I:C)-induced beta interferons, nuclear factor kappa-B, interferon regulatory factor 7, translocated intimin receptor domain-containing adaptor protein inducing beta interferons, interferon-induced GTP-binding protein, and signal transducer and activator of transcription 1 and 2 mRNA. Taken together, these results suggest that duMDA5 is an important receptor for inducing antiviral activity in the duck’s innate immune response.


2019 ◽  
Author(s):  
Reeki Emrizal ◽  
Nor Azlan Nor Muhammad

Porphyromonas gingivalis is one of the major bacteria that cause periodontitis. Chronic periodontitis is a severe form of periodontal disease that occurs due to prolonged inflammatory conditions. If left untreated, deterioration of the supporting structures such as gingiva, bone, and ligament can ultimately lead to tooth loss. Virulence factors produced by P. gingivalis that are responsible for the pathophysiology of periodontitis are secreted by the Type IX Secretion System (T9SS). T9SS-acquiring bacteria have been linked to several systemic diseases such as atherosclerosis, aspiration pneumonia, cancer, rheumatoid arthritis, and diabetes mellitus. This study aims to investigate the phylogenetic relationship and taxonomic distribution of putative members of T9SS component protein families. Twenty protein components of T9SS are investigated in this study. We constructed multiple sequence alignments for each component using homologs of those components, and then proceeded to phylogenetic analysis by constructing maximum-likelihood (ML) trees. ML trees for 19 protein components of T9SS exhibit clustering of terminal nodes based on their respective classes under the Bacteroidetes phylum. The ML tree of PorR, an aminotransferase involved in the Wbp pathway that produces the structural sugar of A-LPS, exhibits a different clustering pattern in which the terminal nodes do not cluster based on their respective classes. Hence, PorR might have evolved independently from the other T9SS protein components, which might suggest that PorR was acquired by T9SS-acquiring bacteria through horizontal gene transfer. The part of the P. gingivalis strain ATCC 33277 genome that contains the porR gene was extracted to support the possibility that porR has been horizontally transferred. Through homology searching using NCBI blastx, we found that seven genes (including porR) involved in the biosynthesis of A-LPS, which anchors the virulence factors secreted by T9SS to the bacterial cell surface, are flanked by insertion sequences (ISs) that encode IS5 family transposase. The IS5 transposons contain a single open reading frame that encodes the transposase that cleaves the 12 bp inverted repeats flanking the transposons. Consequently, this can mobilise the intervening DNA segment that contains the porR gene, which contributes to the possibility that porR is subject to conjugative transfer. The taxonomic distribution of T9SS protein components revealed that they can be found across all classes under the Bacteroidetes phylum. Additionally, we have identified species under Chitinophagia, Saprospiria, and unclassified that acquired homologs of T9SS protein components that, to our knowledge, have not been reported. In conclusion, this study provides a better understanding of the phylogeny and taxonomic distribution of T9SS protein components.


2019 ◽  
Author(s):  
Jiaokun Li ◽  
Tianyuan Gu ◽  
Weimin Zeng ◽  
Runlan Yu ◽  
Yuandong Liu ◽  
...  

Abstract Background: Antimonite [Sb(III)]-oxidizing bacteria have great potential in the environmental bioremediation of Sb-polluted sites. Bacillus sp. S3, isolated from antimony-contaminated soil, showed high Sb(III) resistance and Sb(III) oxidation efficiency. However, very little information is available on the genomics and evolutionary relationships of bacterial Sb(III) oxidation. Results: Here, we identified a 5579638 bp chromosome with 40.30% GC content and a 241339 bp plasmid with 36.74% GC content in the complete genome of Bacillus sp. S3. Genomic annotation showed that Bacillus sp. S3 contains a key aioB gene potentially encoding As(III)/Sb(III) oxidase, which is not shared with other Bacillus strains. Further, a series of genes associated with Sb(III) and other heavy metal(loid)s were also identified, reflecting an adaptive advantage for growth in the harsh eco-environment. It is noteworthy that Bacillus sp. S3 is a novel species within the Bacillus genus, as indicated by phylogenetic relationships and average nucleotide identity (ANI) analysis. Genomic plasticity was evidenced by a high number of mobile genetic elements (MGEs), which were mainly distributed on chromosomes within the Bacillus genus. The core genome contained 554 core genes, and many unique genes were identified in the analyzed genomes, indicating a conserved core but an adaptive pan repertoire. Whole-genome alignment indicates that frequent genomic reshuffling and rearrangements, genetic gain and loss, and other recombination events occurred during the evolutionary history of the Bacillus genus. In addition, the origin and evolution analysis of Sb(III)-resistance genes revealed evolutionary relationships and events of horizontal gene transfer (HGT) among the Bacillus genus. The assessment of the functionality of heavy metal(loid) resistance genes emphasized their indispensable roles in the harsh eco-environment of the Bacillus genus. The real-time quantitative PCR (RT-qPCR) results for Sb(III)-related genes indicated that Sb(III) resistance constantly increased under Sb(III) stress. Conclusions: The insights provided in this study shed light on the molecular details of how Bacillus sp. S3 copes with Sb(III), extend our understanding of the evolutionary relationship between Bacillus sp. S3 and other closely related species, and will enrich the genetic data sources for Sb(III) resistance.

