DGINN, an automated and highly-flexible pipeline for the Detection of Genetic INNovations on protein-coding genes

Author(s):  
Lea Picard ◽  
Quentin Ganivet ◽  
Omran Allatif ◽  
Andrea Cimarelli ◽  
Laurent Guéguen ◽  
...  

Abstract Adaptive evolution has shaped major biological processes. Finding the protein-coding genes and the sites that have been subjected to adaptation during evolutionary time is a major endeavor. However, very few methods fully automate the identification of positively selected genes, and widespread sources of genetic innovations such as gene duplication and recombination are absent from most pipelines. Here, we developed DGINN, a highly-flexible and public pipeline to Detect Genetic INNovations and adaptive evolution in protein-coding genes. DGINN automates, from a gene’s sequence, all steps of the evolutionary analyses necessary to detect the aforementioned innovations, including the search for homologues in databases, assignation of orthology groups, identification of duplication and recombination events, as well as detection of positive selection using five different methods to increase precision and ranking of genes when a large panel is analyzed. DGINN was validated on nineteen genes with previously-characterized evolutionary histories in primates, including some engaged in host-pathogen arms-races. The results obtained with DGINN confirm and also expand results from the literature, establishing DGINN as an efficient tool to automatically detect genetic innovations and adaptive evolution in diverse datasets, from the user’s gene of interest to a large gene list in any species range.

2020 ◽  
Vol 48 (18) ◽  
pp. e103-e103 ◽  
Author(s):  
Lea Picard ◽  
Quentin Ganivet ◽  
Omran Allatif ◽  
Andrea Cimarelli ◽  
Laurent Guéguen ◽  
...  

Abstract Adaptive evolution has shaped major biological processes. Finding the protein-coding genes and the sites that have been subjected to adaptation during evolutionary time is a major endeavor. However, very few methods fully automate the identification of positively selected genes, and widespread sources of genetic innovations such as gene duplication and recombination are absent from most pipelines. Here, we developed DGINN, a highly-flexible and public pipeline to Detect Genetic INNovations and adaptive evolution in protein-coding genes. DGINN automates, from a gene's sequence, all steps of the evolutionary analyses necessary to detect the aforementioned innovations, including the search for homologs in databases, assignation of orthology groups, identification of duplication and recombination events, as well as detection of positive selection using five methods to increase precision and ranking of genes when a large panel is analyzed. DGINN was validated on nineteen genes with previously-characterized evolutionary histories in primates, including some engaged in host-pathogen arms-races. Our results confirm and also expand results from the literature, including novel findings on the Guanylate-binding protein family, GBPs. This establishes DGINN as an efficient tool to automatically detect genetic innovations and adaptive evolution in diverse datasets, from the user's gene of interest to a large gene list in any species range.
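
The abstract mentions using several positive-selection methods to rank genes when a large panel is analyzed. A minimal sketch of that idea follows, assuming each method simply returns a yes/no call per gene. The gene names, method labels (`m1` to `m5`), and calls are hypothetical placeholders, not DGINN output or its actual interface.

```python
from typing import Dict, List, Tuple


def rank_genes(calls: Dict[str, Dict[str, bool]]) -> List[Tuple[str, int]]:
    """Rank genes by how many methods report positive selection."""
    # Count the methods that returned True for each gene.
    scores = {gene: sum(by_method.values()) for gene, by_method in calls.items()}
    # Sort by descending support, then by gene name for a deterministic order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical calls: gene -> {method label -> positive selection detected?}
example_calls = {
    "geneA": {"m1": True, "m2": True, "m3": True, "m4": False, "m5": True},
    "geneB": {"m1": False, "m2": False, "m3": True, "m4": False, "m5": False},
    "geneC": {"m1": True, "m2": True, "m3": True, "m4": True, "m5": True},
}

ranking = rank_genes(example_calls)
```

Counting agreeing methods is only one possible ranking rule; a real analysis might weight methods differently or require a minimum level of agreement.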


PeerJ ◽  
2020 ◽  
Vol 8 ◽  
pp. e8450 ◽  
Author(s):  
Sunan Huang ◽  
Xuejun Ge ◽  
Asunción Cano ◽  
Betty Gaby Millán Salazar ◽  
Yunfei Deng

The genus Dicliptera (Justicieae, Acanthaceae) consists of approximately 150 species distributed throughout the tropical and subtropical regions of the world. Newly obtained chloroplast genomes (cp genomes) are reported for five species of Dicliptera (D. acuminata, D. peruviana, D. montana, D. ruiziana and D. mucronata) in this study. These cp genomes have circular structures of 150,689–150,811 bp and exhibit quadripartite organizations made up of a large single copy region (LSC, 82,796–82,919 bp), a small single copy region (SSC, 17,084–17,092 bp), and a pair of inverted repeat regions (IRs, 25,401–25,408 bp). Guanine-Cytosine (GC) content makes up 37.9%–38.0% of the total content. The complete cp genomes contain 114 unique genes, including 80 protein-coding genes, 30 transfer RNA (tRNA) genes, and four ribosomal RNA (rRNA) genes. Comparative analyses of nucleotide variability (Pi) reveal the five most variable regions (trnY-GUA-trnE-UUC, trnG-GCC, psbZ-trnG-GCC, petN-psbM, and rps4-trnL-UUA), which may be used as molecular markers in future taxonomic identification and phylogenetic analyses of Dicliptera. A total of 55–58 simple sequence repeats (SSRs) and 229 long repeats were identified in the cp genomes of the five Dicliptera species. Phylogenetic analysis identified a close relationship between D. ruiziana and D. montana, followed by D. acuminata, D. peruviana, and D. mucronata. Evolutionary analysis of orthologous protein-coding genes within the family Acanthaceae revealed only one gene, ycf15, to be under positive selection, which may contribute to future studies of its adaptive evolution. The completed genomes are useful for future research on species identification, phylogenetic relationships, and the adaptive evolution of the Dicliptera species.
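
In a quadripartite chloroplast genome, the total length equals LSC + SSC + 2 × IR, since the inverted repeat occurs twice. The short sketch below checks that the reported total-length range is compatible with the reported region ranges. Because the abstract gives only ranges across the five species, not per-species values, the check can only bound the total; the function name and variable names are illustrative.

```python
from typing import Tuple

Range = Tuple[int, int]  # (minimum, maximum) in base pairs


def total_length_bounds(lsc: Range, ssc: Range, ir: Range) -> Range:
    """Bound the genome length from region ranges: LSC + SSC + 2 * IR."""
    low = lsc[0] + ssc[0] + 2 * ir[0]
    high = lsc[1] + ssc[1] + 2 * ir[1]
    return low, high


# Ranges as reported in the abstract (bp).
LSC = (82_796, 82_919)
SSC = (17_084, 17_092)
IR = (25_401, 25_408)
REPORTED_TOTAL = (150_689, 150_811)

low, high = total_length_bounds(LSC, SSC, IR)
# The reported totals should fall inside the bounds implied by the regions.
consistent = low <= REPORTED_TOTAL[0] and REPORTED_TOTAL[1] <= high
```

The implied bounds (150,682 to 150,827 bp) contain the reported range of 150,689–150,811 bp, so the figures are mutually consistent.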


Author(s):  
Nicolas Rodrigue ◽  
Thibault Latrille ◽  
Nicolas Lartillot

Abstract In recent years, codon substitution models based on the mutation–selection principle have been extended for the purpose of detecting signatures of adaptive evolution in protein-coding genes. However, the approaches used to date have either focused on detecting global signals of adaptive regimes (across the entire gene) or on contexts where experimentally derived, site-specific amino acid fitness profiles are available. Here, we present a Bayesian site-heterogeneous mutation–selection framework for site-specific detection of adaptive substitution regimes given a protein-coding DNA alignment. We offer implementations, briefly present simulation results, and apply the approach on a few real data sets. Our analyses suggest that the new approach shows greater sensitivity than traditional methods. However, more study is required to assess the impact of potential model violations on the method, and to gain a greater empirical sense of its behavior on a broader range of real data sets. We propose an outline of such a research program.


F1000Research ◽  
2019 ◽  
Vol 8 ◽  
pp. 464 ◽  
Author(s):  
Leos G. Kral ◽  
Sara Watson

Background: Mitochondrial DNA of vertebrates contains genes for 13 proteins involved in oxidative phosphorylation. Some of these genes have been shown to undergo adaptive evolution in a variety of species. This study examines all mitochondrial protein-coding genes in 11 darter species to determine if any of these genes show evidence of positive selection. Methods: The mitogenomes of four darter species were sequenced and annotated. Mitogenome sequences for another seven species were obtained from GenBank. Alignments of each of the protein-coding genes were subjected to codon-based identification of positive selection by Selecton, MEME and FEL. Results: Evidence of positive selection was obtained for six of the genes by at least one of the methods. CYTB was identified as having evolved under positive selection by all three methods at the same codon location. Conclusions: Given the evidence for positive selection of mitochondrial protein-coding genes in darters, a more extensive analysis of mitochondrial gene evolution in all the extant darter species is warranted.


2017 ◽  
Vol 2017 ◽  
pp. 1-13
Author(s):  
Fuquan Chen ◽  
Jiaojiao Ji ◽  
Jian Shen ◽  
Xinyi Lu

Most of the human genome can be transcribed into RNAs, but only a minority of these regions produce protein-coding mRNAs, whereas the remaining regions are transcribed into noncoding RNAs. Long noncoding RNAs (lncRNAs) are known for their influential regulatory roles in multiple biological processes such as imprinting, dosage compensation, transcriptional regulation, and splicing. The physiological functions of protein-coding genes have been extensively characterized through genome editing in pluripotent stem cells (PSCs) in the past 30 years; however, the study of lncRNAs with genome editing technologies has only gained attention in recent years. Here, we summarize recent advancements in dissecting the roles of lncRNAs with genome editing technologies in PSCs and highlight potential genome editing tools useful for examining the functions of lncRNAs in PSCs.


2021 ◽  
Vol 12 ◽  
Author(s):  
Zhipeng Wang ◽  
Yuanyuan Guo ◽  
Shengwei Liu ◽  
Qingli Meng

Copy number variations (CNVs) are important structural variations that can cause significant phenotypic diversity. Reliable CNV mapping can be achieved by identifying CNVs from different genetic backgrounds. Investigating the overlap between CNV regions (CNVRs) and protein-coding genes (CNV genes) or miRNAs (CNV-miRNAs) can reveal the potential mechanisms of their regulation. In this study, we used 50 K SNP arrays to detect CNVs in purebred Duroc pigs. A total of 211 CNVRs were detected with a total length of 118.48 Mb, accounting for 5.23% of the autosomal genome sequence. Of these CNVRs, 32 were gains, 175 were losses, and four contained both types (loss and gain within the same region). The CNVRs we detected were non-randomly distributed in the swine genome and were significantly enriched in segmental duplication and gene-dense regions. Additionally, these CNVRs overlapped with 1,096 protein-coding genes (CNV genes) and 39 miRNAs (CNV-miRNAs). The CNV genes were enriched in the dosage-sensitive gene list. The expression of the CNV genes was significantly higher than that of the non-CNV genes in the adult Duroc prostate. Of all detected CNV genes, 22.99% were tissue-specific (TSI > 0.9). Strong negative selection had been underway in the CNV genes, as the ones that were located entirely within the loss CNVRs appeared to be evolving rapidly as determined by the median dN plus dS values. Non-CNV genes were more likely to be miRNA targets than CNV genes. Furthermore, CNV-miRNAs tended to target more genes compared to non-CNV-miRNAs, and a combination of two CNV-miRNAs preferentially synergistically regulated the same target genes. We also focused our efforts on examining the functions of CNV genes and CNV-miRNAs, some of which were involved in lipid metabolism, including DGAT1, DGAT2, MOGAT2, miR143, miR335, and miRLET7. Further molecular experiments and independent large studies are needed to confirm our findings.
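
The summary figures in this abstract can be cross-checked with simple arithmetic. The sketch below verifies that the gain, loss, and mixed categories add up to the reported CNVR count, and derives the autosomal genome size implied by the stated coverage. The derived values are back-of-the-envelope estimates from the rounded numbers in the abstract, not figures reported by the authors.

```python
# Values as reported in the abstract.
gains, losses, mixed = 32, 175, 4
reported_cnvrs = 211
cnvr_length_mb = 118.48      # total CNVR length in Mb
autosomal_fraction = 0.0523  # 5.23% of the autosomal genome

# The three categories should sum to the reported number of CNVRs.
total_cnvrs = gains + losses + mixed
counts_match = total_cnvrs == reported_cnvrs

# Derived estimates (not stated in the abstract): the autosomal genome
# size implied by the coverage, and the mean CNVR length.
implied_autosomal_mb = cnvr_length_mb / autosomal_fraction
mean_cnvr_length_mb = cnvr_length_mb / total_cnvrs

print(f"CNVR categories sum to {total_cnvrs}; matches report: {counts_match}")
print(f"Implied autosomal genome size: {implied_autosomal_mb:.0f} Mb")
print(f"Mean CNVR length: {mean_cnvr_length_mb:.2f} Mb")
```

The implied autosomal size of roughly 2,265 Mb is only as precise as the two rounded inputs allow.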


2019 ◽  
Author(s):  
Mei Yang ◽  
Lin Gong ◽  
Jixing Sui ◽  
Xinzheng Li

Abstract The deep sea is one of the most extreme environments on earth, with low oxygen, high hydrostatic pressure and high levels of toxins. Species of the family Vesicomyidae are among the dominant chemosymbiotic bivalves found in this harsh habitat. Mitochondria play a vital role in oxygen usage and energy metabolism; thus, they may be under selection during the adaptive evolution of deep-sea vesicomyids. In this study, the mitochondrial genome (mitogenome) of the vesicomyid bivalve Calyptogena marissinica was sequenced with Illumina sequencing. The mitogenome of C. marissinica is 17,374 bp in length and contains 13 protein-coding genes, 2 ribosomal RNA genes (rrnS and rrnL) and 22 transfer RNA genes. All of these genes are encoded on the heavy strand. Some special elements, such as tandem repeat sequences, “G(A)nT” motifs and AT-rich sequences, were observed in the control region of the C. marissinica mitogenome, which is involved in the regulation of replication and transcription of the mitogenome and may be helpful in adjusting the mitochondrial energy metabolism of organisms to adapt to the deep-sea environment. The gene arrangement of protein-coding genes was identical to that of other sequenced vesicomyids. Phylogenetic analyses clustered C. marissinica with previously reported vesicomyid bivalves with high support values. Positive selection analysis revealed evidence of adaptive change in the mitogenome of Vesicomyidae. Ten potentially important adaptive residues were identified, which were located in cox1, cox3, cob, nad2, nad4 and nad5. Overall, this study sheds light on the mitogenomic adaptation of vesicomyid bivalves that inhabit the deep-sea environment.

