Functional Characterization of Enhancer Evolution in the Primate Lineage

2018 ◽  
Author(s):  
Jason C. Klein ◽  
Aidan Keith ◽  
Vikram Agarwal ◽  
Timothy Durham ◽  
Jay Shendure

Background: Enhancers play an important role in morphological evolution and speciation by controlling the spatiotemporal expression of genes. Due to technological limitations, previous efforts to understand the evolution of enhancers in primates have typically studied many enhancers at low resolution, or single enhancers at high resolution. Although comparative genomic studies reveal large-scale turnover of enhancers, a specific understanding of the molecular steps by which mammalian or primate enhancers evolve remains elusive. Results: We identified candidate hominoid-specific liver enhancers from H3K27ac ChIP-seq data. After locating orthologs in 11 primates spanning ∼40 million years, we synthesized all orthologs as well as computational reconstructions of 9 ancestral sequences for 348 “active tiles” of 233 putative enhancers. We concurrently tested all sequences (20 per tile) for regulatory activity with STARR-seq in HepG2 cells, with the goal of characterizing the evolutionary-functional trajectories of each enhancer. We observe groups of enhancer tiles with coherent trajectories, most of which can be explained by one or two mutational events per tile. We quantify the correlation between the number of mutations along a branch and the magnitude of change in functional activity. Finally, we identify 57 mutations that correlate with functional changes; these are enriched for cytosine deamination events within CpGs, compared to background events. Conclusions: We characterized the evolutionary-functional trajectories of hundreds of liver enhancers throughout the primate phylogeny. We observe subsets of regulatory sequences that appear to have gained or lost activity at various positions in the primate phylogeny. We use these data to quantify the relationship between sequence and functional divergence, and to identify CpG deamination as a potentially important force in driving changes in enhancer activity during primate evolution.

2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Andreas Lange ◽  
Prajal H. Patel ◽  
Brennen Heames ◽  
Adam M. Damry ◽  
Thorsten Saenger ◽  
...  

Comparative genomic studies have repeatedly shown that new protein-coding genes can emerge de novo from noncoding DNA. Still unknown is how and when the structures of encoded de novo proteins emerge and evolve. Combining biochemical, genetic and evolutionary analyses, we elucidate the function and structure of goddard, a gene which appears to have evolved de novo at least 50 million years ago within the Drosophila genus. Previous studies found that goddard is required for male fertility. Here, we show that Goddard protein localizes to elongating sperm axonemes and that in its absence, elongated spermatids fail to undergo individualization. Combining modelling, NMR and circular dichroism (CD) data, we show that Goddard protein contains a large central α-helix, but is otherwise partially disordered. We find similar results for Goddard’s orthologs from divergent fly species and their reconstructed ancestral sequences. Accordingly, Goddard’s structure appears to have been maintained with only minor changes over millions of years.


2019 ◽  
Author(s):  
Michael J. Bronski ◽  
Ciera C. Martinez ◽  
Holli A. Weld ◽  
Michael B. Eisen

Large groups of species with well-defined phylogenies are excellent systems for testing evolutionary hypotheses. In this paper, we describe the creation of a comparative genomic resource consisting of 23 genomes from the species-rich Drosophila montium species group, 22 of which are presented here for the first time. The montium group is uniquely positioned for comparative studies. Within the montium clade, evolutionary distances are such that large numbers of sequences can be accurately aligned while also recovering strong signals of divergence; and the distance between the montium group and D. melanogaster is short enough so that orthologous sequence can be readily identified. All genomes were assembled from a single, small-insert library using MaSuRCA, before going through an extensive post-assembly pipeline. Estimated genome sizes within the montium group range from 155 Mb to 223 Mb (mean = 196 Mb). The absence of long-distance information during the assembly process resulted in fragmented assemblies, with the scaffold NG50s varying widely based on repeat content and sample heterozygosity (min = 18 kb, max = 390 kb, mean = 74 kb). The total scaffold length for most assemblies is also shorter than the estimated genome size, typically by 5–15%. However, subsequent analysis showed that our assemblies are highly complete. Despite large differences in contiguity, all assemblies contain at least 96% of known single-copy Dipteran genes (BUSCOs, n = 2,799). Similarly, by aligning our assemblies to the D. melanogaster genome and remapping coordinates for a large set of transcriptional enhancers (n = 3,457), we showed that each montium assembly contains orthologs for at least 91% of D. melanogaster enhancers. Importantly, the genic and enhancer contents of our assemblies are comparable to that of far more contiguous Drosophila assemblies. The alignment of our own D. serrata assembly to a previously published PacBio D. serrata assembly also showed that our longest scaffolds (up to 1 Mb) are free of large-scale misassemblies. Our genome assemblies are a valuable resource that can be used to further resolve the montium group phylogeny; study the evolution of protein-coding genes and cis-regulatory sequences; and determine the genetic basis of ecological and behavioral adaptations.


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Guo Wei ◽  
Franziska Eberl ◽  
Xinlu Chen ◽  
Chi Zhang ◽  
Sybille B. Unsicker ◽  
...  

Terpene synthases (TPSs) and trans-isoprenyl diphosphate synthases (IDSs) are among the core enzymes for creating the enormous diversity of terpenoids. Despite having no sequence homology, TPSs and IDSs share a conserved “α terpenoid synthase fold” and a trinuclear metal cluster for catalysis, implying a common ancestry, with TPSs hypothesized to have evolved from IDSs in the ancient past. Here we report on the identification and functional characterization of novel IDS-like TPSs (ILTPSs) in fungi that evolved from IDSs relatively recently, indicating recurrent evolution of TPSs from IDSs. Through large-scale bioinformatic analyses of fungal IDSs, putative ILTPSs that belong to the geranylgeranyl diphosphate synthase (GGDPS) family of IDSs were identified in three species of Melampsora. Among the GGDPS family members of the two Melampsora species experimentally characterized, one enzyme was verified to be a bona fide GGDPS and all others were demonstrated to function as TPSs. Melampsora ILTPSs displayed kinetic parameters similar to those of classic TPSs. Key residues underlying the determination of GGDPS versus ILTPS activity and the functional divergence of ILTPSs were identified. Phylogenetic analysis implies a recent origination of these ILTPSs from a GGDPS progenitor in fungi, after the split of Melampsora from other genera within the class of Pucciniomycetes. For the poplar leaf rust fungus Melampsora larici-populina, the transcripts of its ILTPS genes were detected in infected poplar leaves, suggesting possible involvement of these recently evolved ILTPS genes in the infection process. This study reveals the recurrent evolution of TPSs from IDSs since their ancient occurrence and points to the possibility of a wide distribution of ILTPS genes across the three domains of life.


2021 ◽  
Author(s):  
Andreas Lange ◽  
Prajal H. Patel ◽  
Brennen Heames ◽  
Adam M. Damry ◽  
Thorsten Saenger ◽  
...  

Comparative genomic studies have repeatedly shown that new protein-coding genes can emerge de novo from non-coding DNA. Still unknown is how and when the structures of encoded de novo proteins emerge and evolve. Combining biochemical, genetic and evolutionary analyses, we elucidate the function and structure of goddard, a gene which appears to have evolved de novo at least 50 million years ago within the Drosophila genus. Previous studies found that goddard is required for male fertility. Here, we show that Goddard protein localizes to elongating sperm axonemes and that in its absence, elongated spermatids fail to undergo individualization. Combining modelling, NMR and CD data, we show that Goddard protein contains a large central α-helix, but is otherwise partially disordered. We find similar results for Goddard’s orthologs from divergent fly species and their reconstructed ancestral sequences. Accordingly, Goddard’s structure appears to have been maintained with only minor changes over millions of years.


2020 ◽  
Vol 10 (5) ◽  
pp. 1443-1455 ◽  
Author(s):  
Michael J. Bronski ◽  
Ciera C. Martinez ◽  
Holli A. Weld ◽  
Michael B. Eisen

Large groups of species with well-defined phylogenies are excellent systems for testing evolutionary hypotheses. In this paper, we describe the creation of a comparative genomic resource consisting of 23 genomes from the species-rich Drosophila montium species group, 22 of which are presented here for the first time. The montium group is well-positioned for clade genomics. Within the montium clade, evolutionary distances are such that large numbers of sequences can be accurately aligned while also recovering strong signals of divergence; and the distance between the montium group and D. melanogaster is short enough so that orthologous sequence can be readily identified. All genomes were assembled from a single, small-insert library using MaSuRCA, before going through an extensive post-assembly pipeline. Estimated genome sizes within the montium group range from 155 Mb to 223 Mb (mean = 196 Mb). The absence of long-distance information during the assembly process resulted in fragmented assemblies, with the scaffold NG50s varying widely based on repeat content and sample heterozygosity (min = 18 kb, max = 390 kb, mean = 74 kb). The total scaffold length for most assemblies is also shorter than the estimated genome size, typically by 5–15%. However, subsequent analysis showed that our assemblies are highly complete. Despite large differences in contiguity, all assemblies contain at least 96% of known single-copy Dipteran genes (BUSCOs, n = 2,799). Similarly, by aligning our assemblies to the D. melanogaster genome and remapping coordinates for a large set of transcriptional enhancers (n = 3,457), we showed that each montium assembly contains orthologs for at least 91% of D. melanogaster enhancers. Importantly, the genic and enhancer contents of our assemblies are comparable to that of far more contiguous Drosophila assemblies. The alignment of our own D. serrata assembly to a previously published PacBio D. serrata assembly also showed that our longest scaffolds (up to 1 Mb) are free of large-scale misassemblies. Our genome assemblies are a valuable resource that can be used to further resolve the montium group phylogeny; study the evolution of protein-coding genes and cis-regulatory sequences; and determine the genetic basis of ecological and behavioral adaptations.
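
For readers unfamiliar with the scaffold NG50 statistic quoted in this abstract, the following is a minimal sketch of how it is conventionally computed: scaffolds are sorted from longest to shortest, and NG50 is the length of the scaffold at which the running total first reaches half of the estimated genome size (N50 uses half of the total assembly length instead). The function name `ng50` and the toy scaffold lengths are illustrative assumptions and are not taken from the paper or its pipeline.

```python
def ng50(scaffold_lengths, estimated_genome_size):
    """Return the NG50 of an assembly, or None if it is undefined.

    NG50 is the length of the scaffold at which the cumulative length of
    scaffolds, sorted longest first, reaches 50% of the ESTIMATED genome
    size (not 50% of the assembly length, which would give N50).
    """
    half_genome = estimated_genome_size / 2
    running_total = 0
    for length in sorted(scaffold_lengths, reverse=True):
        running_total += length
        if running_total >= half_genome:
            return length
    # The assembly covers less than half of the estimated genome size.
    return None


# Toy lengths (hypothetical, arbitrary units); the assembly totals 1100.
toy_scaffolds = [400, 300, 200, 100, 50, 50]

# Using the assembly's own total length as the reference gives N50.
print("N50 :", ng50(toy_scaffolds, sum(toy_scaffolds)))
# Using a larger estimated genome size gives a smaller NG50.
print("NG50:", ng50(toy_scaffolds, 1500))
```

Because the abstract reports total scaffold lengths that are typically shorter than the estimated genome sizes, NG50 is the more conservative of the two statistics for these assemblies.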


2006 ◽  
Vol 34 (6) ◽  
pp. 1209-1214 ◽  
Author(s):  
B. Hamberger ◽  
J. Bohlmann

Diterpene resin acids, together with monoterpenes and sesquiterpenes, are the most prominent defence chemicals in conifers. These compounds belong to the large group of structurally diverse terpenoids formed by enzymes known as terpenoid synthases. CYPs (cytochrome P450-dependent mono-oxygenases) can further increase the structural diversity of these terpenoids. While most terpenoids are characterized as specialized or secondary metabolites, some terpenoids, such as the phytohormones GA (gibberellic acid), BRs (brassinosteroids) and ABA (abscisic acid), have essential functions in plant growth and development. To date, very few CYP genes involved in conifer terpenoid metabolism have been functionally characterized, and these are limited to two systems, yew (Taxus) and loblolly pine (Pinus taeda). The characterized yew CYP genes are involved in taxol diterpene biosynthesis, while the only characterized pine terpenoid CYP gene is part of DRA (diterpene resin acid) biosynthesis. These CYPs from yew and pine are members of two apparently conifer-specific CYP families within the larger CYP85 clan, one of four plant CYP multifamily clans. Other CYP families within the CYP85 clan were characterized from a variety of angiosperms with functions in terpenoid phytohormone metabolism of GA, BR, and ABA. The recent development of EST (expressed sequence tag) and FLcDNA (where FL is full-length) sequence databases and cDNA collections for species of two conifers, spruce (Picea) and pine, allows for the discovery of new terpenoid CYPs in gymnosperms by means of large-scale sequence mining, phylogenetic analysis and functional characterization. Here, we present a snapshot of conifer CYP data mining, discovery of new conifer CYPs in all but one family within the CYP85 clan, and suggestions for their functional characterization. This paper will focus on the discovery of conifer CYPs associated with diterpene metabolism and CYPs with possible functions in the formation of GA, BR, and ABA in conifers.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Smritikana Dutta ◽  
Anwesha Deb ◽  
Prasun Biswas ◽  
Sukanya Chakraborty ◽  
Suman Guha ◽  
...  

Bamboos, members of the family Poaceae, present many interesting features with respect to their fast and extended vegetative growth, their unusual yet divergent flowering times across species, and the impact of sudden, large-scale flowering on forest ecology. However, few studies have been conducted at the molecular level to characterize important genes that regulate vegetative and flowering habit in bamboo. In this study, two bamboo FD genes, BtFD1 and BtFD2, which are members of the florigen activation complex (FAC), have been identified by sequence and phylogenetic analyses. Sequence comparisons identified one important amino acid, which is located in the DNA-binding basic region and differs between BtFD1 and BtFD2 (Ala146 of BtFD1 vs. Leu100 of BtFD2). Electrophoretic mobility shift assays revealed that this alteration resulted in ten times higher binding efficiency of BtFD1 than BtFD2 to its target ACGT motif present in the promoter of the APETALA1 gene. Expression analyses in different tissues and seasons indicated the involvement of BtFD1 in flower and vegetative development, while BtFD2 was expressed at very low levels in all the tissues and conditions studied. Finally, a tenfold increase of the AtAP1 transcript level in p35S::BtFD1 Arabidopsis plants compared to wild type confirms a positive regulatory role of BtFD1 in flowering. However, constitutive expression of BtFD1 led to dwarfism and an apparent reduction in the length of the flowering stalk and the number of flowers per plant, whereas no visible phenotype was observed for BtFD2 overexpression. This signifies that timely expression of BtFD1 may be critical for it to perform its programmed developmental role in planta.


Genome ◽  
2004 ◽  
Vol 47 (2) ◽  
pp. 239-245 ◽  
Author(s):  
Yaping Qian ◽  
Li Jin ◽  
Bing Su

The large-insert genomic DNA library is a critical resource for genome-wide genetic dissection of target species. We constructed a high-redundancy bacterial artificial chromosome (BAC) library of a New World monkey species, the black-handed spider monkey (Ateles geoffroyi). A total of 193 152 BAC clones were generated in this library. The average insert size of the BAC clones was estimated to be 184.6 kb, with the small inserts (50-100 kb) accounting for less than 3% and the non-recombinant clones only 1.2%. Assuming a genome size similar to that of humans, the spider monkey BAC library has about 11× genome coverage. In addition, by end sequencing of randomly selected BAC clones, we generated 367 sequence tags for the library. When blasted against the human genome, they showed a good correlation between the number of hit clones and the size of the chromosomes, an indication of unbiased chromosomal distribution of the library. This black-handed spider monkey BAC library would serve as a valuable resource in comparative genomic study and large-scale genome sequencing of nonhuman primates. Key words: black-handed spider monkeys, Ateles geoffroyi, BAC library.
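
The roughly 11× coverage figure in this abstract can be sanity-checked with simple arithmetic: coverage is the number of insert-bearing clones times the mean insert size, divided by the genome size. The sketch below is illustrative only. The clone count, mean insert size, and non-recombinant fraction come from the abstract, while the human haploid genome size of 3200 Mb and the function name `library_coverage` are assumptions made here, not values given in the paper.

```python
def library_coverage(n_clones, mean_insert_kb, genome_size_mb,
                     non_recombinant_fraction=0.0):
    """Estimate the fold genome coverage of a clone library.

    coverage = (clones with inserts * mean insert size) / genome size
    """
    clones_with_insert = n_clones * (1 - non_recombinant_fraction)
    total_insert_mb = clones_with_insert * mean_insert_kb / 1000  # kb -> Mb
    return total_insert_mb / genome_size_mb


# ASSUMPTION: approximate human haploid genome size in Mb, standing in
# for the spider monkey genome as the abstract does.
HUMAN_GENOME_MB = 3200

coverage = library_coverage(
    n_clones=193_152,                 # from the abstract
    mean_insert_kb=184.6,             # from the abstract
    genome_size_mb=HUMAN_GENOME_MB,   # assumed, see above
    non_recombinant_fraction=0.012,   # 1.2% non-recombinant clones
)
print(f"Estimated coverage: {coverage:.1f}x")
```

Under these assumptions the estimate lands close to the reported value; a somewhat different genome size would shift it only by a few tenths.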

