Phylogenomics of the Epigenetic Toolkit Reveals Punctate Retention of Genes across Eukaryotes

2020 ◽  
Vol 12 (12) ◽  
pp. 2196-2210
Author(s):  
Agnes K M Weiner ◽  
Mario A Cerón-Romero ◽  
Ying Yan ◽  
Laura A Katz

Abstract Epigenetic processes in eukaryotes play important roles through regulation of gene expression, chromatin structure, and genome rearrangements. The roles of chromatin modification (e.g., DNA methylation and histone modification) and non-protein-coding RNAs have been well studied in animals and plants. With the exception of a few model organisms (e.g., Saccharomyces and Plasmodium), much less is known about epigenetic toolkits across the remainder of the eukaryotic tree of life. Even with limited data, previous work suggested the existence of an ancient epigenetic toolkit in the last eukaryotic common ancestor. We use PhyloToL, our taxon-rich phylogenomic pipeline, to detect homologs of epigenetic genes and evaluate their macroevolutionary patterns among eukaryotes. In addition to data from GenBank, we increase taxon sampling from understudied clades of SAR (Stramenopila, Alveolata, and Rhizaria) and Amoebozoa by adding new single-cell transcriptomes from ciliates, foraminifera, and testate amoebae. We focus on 118 gene families, 94 involved in chromatin modification and 24 involved in non-protein-coding RNA processes based on the epigenetics literature. Our results indicate 1) the presence of a large number of epigenetic gene families in the last eukaryotic common ancestor; 2) differential conservation among major eukaryotic clades, with a notable paucity of genes within Excavata; and 3) punctate distribution of epigenetic gene families between species consistent with rapid evolution leading to gene loss. Together these data demonstrate the power of taxon-rich phylogenomic studies for illuminating evolutionary patterns at scales of >1 billion years of evolution and suggest that macroevolutionary phenomena, such as genome conflict, have shaped the evolution of the eukaryotic epigenetic toolkit.

2017 ◽  
Author(s):  
S.G. Foy ◽  
B.A. Wilson ◽  
J. Bertram ◽  
M.H.J. Cordes ◽  
J. Masel

Abstract To detect a direction to evolution, without the pitfalls of reconstructing ancestral states, we need to compare “more evolved” to “less evolved” entities. But because all extant species have the same common ancestor, none are chronologically more evolved than any other. However, different gene families were born at different times, allowing us to compare young protein-coding genes to those that are older and hence have been evolving for longer. To be retained during evolution, a protein must not only have a function, but must also avoid toxic dysfunction such as protein aggregation. There is conflict between the two requirements; hydrophobic amino acids form the cores of protein folds, but also promote aggregation. Young genes avoid strongly hydrophobic amino acids, which is presumably the simplest solution to the aggregation problem. Here we show that young genes’ few hydrophobic residues are clustered near one another along the primary sequence, presumably to assist folding. The higher aggregation risk created by the higher hydrophobicity of older genes is counteracted by more subtle effects in the ordering of the amino acids, including a reduction in the clustering of hydrophobic residues until they eventually become more interspersed than if distributed randomly. This interspersion has previously been reported to be a general property of proteins, but here we find that it is restricted to old genes. Quantitatively, the index of dispersion delineates a gradual trend, i.e. a decrease in the clustering of hydrophobic amino acids over billions of years.
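As a rough illustration of what an index of dispersion measures, the sketch below splits a protein sequence into fixed-size blocks, counts hydrophobic residues per block, and returns the variance-to-mean ratio of those counts. The residue set `HYDROPHOBIC`, the block size of 6, and the function name `dispersion_index` are assumptions made for this example; the normalization used in the study itself may well differ.

```python
from statistics import mean, pvariance

# Assumed hydrophobic residue set, chosen for illustration only.
HYDROPHOBIC = set("LIVFMW")


def dispersion_index(sequence, block=6):
    """Variance-to-mean ratio of hydrophobic counts in consecutive blocks.

    Higher values mean hydrophobic residues are clustered along the
    sequence; lower values mean they are evenly interspersed.
    """
    # Non-overlapping blocks; a trailing partial block is dropped.
    blocks = [sequence[i:i + block]
              for i in range(0, len(sequence) - block + 1, block)]
    if not blocks:
        raise ValueError("sequence is shorter than one block")
    counts = [sum(res in HYDROPHOBIC for res in b) for b in blocks]
    avg = mean(counts)
    if avg == 0:
        raise ValueError("sequence contains no hydrophobic residues")
    return pvariance(counts) / avg


# Toy sequences (not real proteins) with the same hydrophobic content.
clustered = "LLLLLLSSSSSSLLLLLLSSSSSS"
interspersed = "LSLSLSLSLSLSLSLSLSLSLSLS"
print(dispersion_index(clustered))     # 3.0
print(dispersion_index(interspersed))  # 0.0
```

Both toy sequences are half leucine, so the difference in the ratio reflects only how the hydrophobic residues are arranged, which is the property the abstract tracks across gene ages.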


2020 ◽  
Vol 10 (10) ◽  
pp. 3467-3478 ◽  
Author(s):  
Peter M. Thielen ◽  
Amanda L. Pendleton ◽  
Robert A. Player ◽  
Kenneth V. Bowden ◽  
Thomas J. Lawton ◽  
...  

Setaria viridis (green foxtail) is an important model system for improving cereal crops due to its diploid genome, ease of cultivation, and use of C4 photosynthesis. The S. viridis accession ME034V is exceptionally transformable, but the lack of a sequenced genome for this accession has limited its utility. We present a 397 Mb highly contiguous de novo assembly of ME034V using ultra-long nanopore sequencing technology (read N50 = 41kb). We estimate that this genome is largely complete based on our updated k-mer based genome size estimate of 401 Mb for S. viridis. Genome annotation identified 37,908 protein-coding genes and >300k repetitive elements comprising 46% of the genome. We compared the ME034V assembly with two other previously sequenced Setaria genomes as well as to a diversity panel of 235 S. viridis accessions. We found the genome assemblies to be largely syntenic, but numerous unique polymorphic structural variants were discovered. Several ME034V deletions may be associated with recent retrotransposition of copia and gypsy LTR repeat families, as evidenced by their low genotype frequencies in the sampled population. Lastly, we performed a phylogenomic analysis to identify gene families that have expanded in Setaria, including those involved in specialized metabolism and plant defense response. The high continuity of the ME034V genome assembly validates the utility of ultra-long DNA sequencing to improve genetic resources for emerging model organisms. Structural variation present in Setaria illustrates the importance of obtaining the proper genome reference for genetic experiments. Thus, we anticipate that the ME034V genome will be of significant utility for the Setaria research community.
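The read N50 quoted above is a standard contiguity statistic: the length L such that sequences of length L or longer together contain at least half of all bases. A minimal sketch of that calculation follows; the function name `n50` and the toy lengths are hypothetical and are not taken from the ME034V data.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of all bases.
    """
    if not lengths:
        raise ValueError("need at least one length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downward, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy lengths for illustration only (total = 300).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))  # 70, because 80 + 70 = 150 reaches half of 300
```

The same function applies to reads, contigs, or scaffolds; only the input list changes.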


2019 ◽  
Vol 4 ◽  
pp. 112
Author(s):  
Cordula Boehm ◽  
Mark C. Field

Background: The eukaryotic endomembrane system likely arose via paralogous expansion of genes encoding proteins specifying organelle identity, coat complexes and government of fusion specificity. While the majority of these gene families were established by the time of the last eukaryotic common ancestor (LECA), subsequent evolutionary events molded these systems, likely reflecting adaptations retained for increased fitness. As well as sequence evolution, these adaptations include loss of otherwise canonical subunits, emergence of lineage-specific proteins and paralog expansion. The exocyst complex is involved in late exocytosis, and possibly additional pathways, and is a member of the complexes associated with tethering containing helical rods (CATCHR) tethering complex family, which includes conserved oligomeric Golgi (COG), homotypic fusion and vacuole protein sorting (HOPS), class C core vacuole/endosome tethering (CORVET) and others. The exocyst is integrated into a complex GTPase signaling network in animals, fungi and other lineages. Prompted by discovery of Exo99, a non-canonical subunit in the excavate protist Trypanosoma brucei, and significantly increased genome sequence data, we examined evolution of the exocyst. Methods: We examined evolution of the exocyst by comparative genomics, phylogenetics and structure prediction. Results: The exocyst is highly conserved, but with substantial losses of subunits in the Apicomplexa and expansions in Streptophyta plants and Metazoa. Significantly, few taxa retain a partial complex, suggesting that, in the main, all subunits are required for functionality. Further, the ninth exocyst subunit Exo99 is specific to the Euglenozoa with a distinct architecture compared to the other subunits and which possibly represents a coat system. Conclusions: These data reveal a remarkable degree of evolutionary flexibility within the exocyst complex, suggesting significant diversity in exocytosis mechanisms.


2019 ◽  
Author(s):  
Eric Hugoson ◽  
Tea Ammunét ◽  
Lionel Guy

Abstract Bacteria adapting to living in a host cell caused the most salient events in the evolution of eukaryotes, namely the seminal fusion with an archaeon [1], and the emergence of both the mitochondrion and the chloroplast [2]. A bacterial clade that may hold the key to understanding these events is the deep-branching gammaproteobacterial order Legionellales, containing among others Coxiella and Legionella, of which all known members grow inside eukaryotic cells [3]. Here, by analyzing 35 novel Legionellales genomes mainly acquired through metagenomics, we show that this group is much more diverse than previously thought, and that key host-adaptation events took place very early in its evolution. Crucial virulence factors like the Type IVB secretion (Dot/Icm) system and two shared effector proteins were gained in the last Legionellales common ancestor (LLCA), while many metabolic gene families were lost in LLCA and its immediate descendants. We estimate that LLCA lived circa 2.4 Ga ago, predating the last eukaryotic common ancestor (LECA) by at least 0.5 Ga [4]. These elements strongly indicate that host-adaptation arose only once in Legionellales, and that these bacteria were using advanced molecular machinery to exploit and manipulate host cells very early in eukaryogenesis.


2020 ◽  
Author(s):  
Peter M. Thielen ◽  
Amanda L. Pendleton ◽  
Robert A. Player ◽  
Kenneth V. Bowden ◽  
Thomas J. Lawton ◽  
...  

Abstract Setaria viridis (green foxtail) is an important model system for improving cereal crops due to its diploid genome, ease of cultivation, and use of C4 photosynthesis. The S. viridis cultivar ME034V is exceptionally transformable, but the lack of a sequenced genome for this cultivar has limited its utility. We present a 397 Mb highly contiguous de novo assembly of ME034V using ultra-long nanopore sequencing technology (read N50=41kb). We estimate that this genome is largely complete based on our updated k-mer based genome size estimate of 401 Mb for S. viridis. Genome annotation identified 37,908 protein-coding genes and >300k repetitive elements comprising 46% of the genome. We compared the ME034V assembly with two other previously sequenced Setaria genomes as well as to a diversity panel of 235 S. viridis cultivars. We found the genome assemblies to be largely syntenic, but numerous unique polymorphic structural variants were discovered. Several ME034V deletions may be associated with recent retrotransposition of copia and gypsy LTR repeat families, as evidenced by their low genotype frequencies in the sampled population. Lastly, we performed a phylogenomic analysis to identify gene families that have expanded in Setaria, including those involved in specialized metabolism and plant defense response. The high continuity of the ME034V genome assembly validates the utility of ultra-long DNA sequencing to improve genetic resources for emerging model organisms. Structural variation present in Setaria illustrates the importance of obtaining the proper genome reference for genetic experiments. Thus, we anticipate that the ME034V genome will be of significant utility for the Setaria research community.


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Ying Li ◽  
Pengchuan Sun ◽  
Zhiqiang Lu ◽  
Jinyuan Chen ◽  
Zhenyue Wang ◽  
...  

Abstract Hazelnut is popular for its flavor, and it has also been suggested that hazelnut is beneficial to cardiovascular health because it is rich in oleic acid. Here, we report the first high-quality chromosome-scale genome for the hazelnut species Corylus mandshurica (2n = 22), which has a high concentration of oleic acid in its nuts. The assembled genome is 367.67 Mb in length, and the contig N50 is 14.85 Mb. All contigs were assembled into 11 chromosomes, and 28,409 protein-coding genes were annotated. We reconstructed the evolutionary trajectories of the genomes of Betulaceae species and revealed that the 11 chromosomes of the hazelnut genus were derived from the most ancestral karyotype in Betula pendula, which has 14 protochromosomes, by inferring homology among five Betulaceae genomes. We identified 96 candidate genes involved in oleic acid biosynthesis, and 10 showed rapid evolution or positive selection. These findings will help us to understand the mechanisms of lipid synthesis and storage in hazelnuts. Several gene families related to salicylic acid metabolism and stress responses experienced rapid expansion in this hazelnut species, which may have increased its stress tolerance. The reference genome presented here constitutes a valuable resource for molecular breeding and genetic improvement of the important agronomic properties of hazelnut.


2020 ◽  
Vol 21 (11) ◽  
pp. 1068-1077
Author(s):  
Xiaochao Sun ◽  
Bin Yang ◽  
Qunye Zhang

Many studies have shown that the spatial distribution of genes within a single chromosome exhibits distinct patterns. However, little is known about the characteristics of inter-chromosomal distribution of genes (including protein-coding genes, processed transcripts and pseudogenes) in different genomes. In this study, we explored these issues using the available genomic data of both human and model organisms. We also analyzed the distribution pattern of protein-coding genes that have been associated with 14 common diseases, as well as the insertion/deletion mutations and single nucleotide polymorphisms detected by whole genome sequencing in an acute promyelocytic leukemia patient. We obtained the following novel findings. Firstly, inter-chromosomal distribution of genes displays a nonstochastic pattern and the gene densities in different chromosomes are heterogeneous. This kind of heterogeneity is observed in genomes of both lower and higher species. Secondly, protein-coding genes involved in certain biological processes tend to be enriched in one or a few chromosomes. Our findings add new insights into the spatial distribution of genes and disease-related genes across chromosomes. These results could be useful in improving the efficiency of disease-associated gene screening studies by targeting specific chromosomes.


Forests ◽  
2020 ◽  
Vol 11 (4) ◽  
pp. 384
Author(s):  
Baiba Krivmane ◽  
Ilze Šņepste ◽  
Vilnis Šķipars ◽  
Igor Yakovlev ◽  
Carl Gunnar Fossdal ◽  
...  

MicroRNAs (miRNAs) are non-protein coding RNAs of ~20–24 nucleotides in length that play an important role in many biological and metabolic processes, including the regulation of gene expression, plant growth and developmental processes, as well as responses to stress and pathogens. The aim of this study was to identify and characterize novel and conserved microRNAs expressed in methyl jasmonate-treated Scots pine needles. In addition, potential precursor sequences and target genes of the identified miRNAs were determined by alignment to the Pinus unigene set. Potential precursor sequences were identified using the miRAtool, conserved miRNA precursors were also tested for the ability to form the required stem-loop structure, and the minimal folding free energy indexes were calculated. By comparison with miRBase, 4975 annotated sequences were identified and assigned to 173 miRNA groups, belonging to a total of 60 conserved miRNA families. A total of 1029 potential novel miRNAs, grouped into 34 families were found, and 46 predicted precursor sequences were identified. A total of 136 potential target genes targeted by 28 families were identified. The majority of previously reported highly conserved plant miRNAs were identified in this study, as well as some conserved miRNAs previously reported to be monocot specific. No conserved dicot-specific miRNAs were identified. A number of potential gymnosperm or conifer specific miRNAs were found, shared among a range of conifer species.


2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Xing Wang ◽  
Yi Zhang ◽  
Yufeng Zhang ◽  
Mingming Kang ◽  
Yuanbo Li ◽  
...  

Abstract Earthworms (Annelida: Crassiclitellata) are widely distributed around the world due to their ancient origination as well as adaptation and invasion after introduction into new habitats over the past few centuries. Herein, we report a 1.2 Gb complete genome assembly of the earthworm Amynthas corticis based on a strategy combining third-generation long-read sequencing and Hi-C mapping. A total of 29,256 protein-coding genes are annotated in this genome. Analysis of resequencing data indicates that this earthworm is a triploid species. Furthermore, gene family evolution analysis shows that comprehensive expansion of gene families in the Amynthas corticis genome has produced more defensive functions compared with other species in Annelida. Quantitative proteomic iTRAQ analysis shows that expression of 147 proteins changed in the body of Amynthas corticis and 16S rDNA sequencing shows that abundance of 28 microorganisms changed in the gut of Amynthas corticis when the earthworm was incubated with pathogenic Escherichia coli O157:H7. Our genome assembly provides abundant and valuable resources for the earthworm research community, serving as a first step toward uncovering the mysteries of this species, and may provide molecular level indicators of its powerful defensive functions, adaptation to complex environments and invasion ability.


Insects ◽  
2020 ◽  
Vol 11 (6) ◽  
pp. 326
Author(s):  
Yu-Jun Wang ◽  
Hua-Ling Wang ◽  
Xiao-Wei Wang ◽  
Shu-Sheng Liu

Females and males often differ markedly in morphology and behavior, and the differences between sexes are the result of natural selection and/or sexual selection. To a great extent, the differences between the two sexes are the result of differential gene expression. In haplodiploid insects, this phenomenon is obvious, since males develop from unfertilized eggs and females develop from fertilized eggs. Whiteflies of the Bemisia tabaci species complex are typical haplodiploid insects, and some species of this complex are important pests of many crops worldwide. Here, we report the transcriptome profiles of males and females in three species of this whitefly complex. Between-species comparisons revealed that non-sex-biased genes display higher variation than male-biased or female-biased genes. Sex-biased genes evolve at a slow rate in protein coding sequences and gene expression and have a pattern of evolution that differs from those of social haplodiploid insects and diploid animals. Genes with high evolutionary rates are more related to non-sex-biased traits (such as nutrition, immune system, and detoxification) than to sex-biased traits, indicating that the evolution of protein coding sequences and gene expression has been mainly driven by non-sex-biased traits.

