Targeted Enrichment of Large Gene Families for Phylogenetic Inference: Phylogeny and Molecular Evolution of Photosynthesis Genes in the Portullugo (Caryophyllales)

2017 ◽  
Author(s):  
Abigail J. Moore ◽  
Jurriaan M. de Vos ◽  
Lillian P. Hancock ◽  
Eric Goolsby ◽  
Erika J. Edwards

Abstract: Hybrid enrichment is an increasingly popular approach for obtaining hundreds of loci for phylogenetic analysis across many taxa quickly and cheaply. The genes targeted for sequencing are typically single-copy loci, which facilitate a more straightforward sequence assembly and homology assignment process. However, single-copy loci are relatively uncommon elements of most genomes, and as such may provide a biased evolutionary history. Furthermore, this approach limits the inclusion of most genes of functional interest, which often belong to multi-gene families. Here we demonstrate the feasibility of including large gene families in hybrid enrichment protocols for phylogeny reconstruction and subsequent analyses of molecular evolution, using a new set of bait sequences designed for the “portullugo” (Caryophyllales), a moderately sized lineage of flowering plants (~2200 species) that includes the cacti and harbors many evolutionary transitions to C4 and CAM photosynthesis. Including multi-gene families allowed us to simultaneously infer a robust phylogeny and construct a dense sampling of sequences for a major enzyme of C4 and CAM photosynthesis, which revealed the accumulation of adaptive amino acid substitutions associated with C4 and CAM origins in particular paralogs. Our final set of matrices for phylogenetic analyses included 75–218 loci across 74 taxa, with ~50% matrix completeness across datasets. Phylogenetic resolution was greatly improved across the tree, at both shallow and deep levels. Concatenation and coalescent-based approaches both resolve with strong support the sister lineage of the cacti: Anacampserotaceae + Portulacaceae, two lineages of mostly diminutive succulent herbs of warm, arid regions. In spite of this congruence, BUCKy concordance analyses demonstrated strong and conflicting signals across gene trees for the resolution of the sister group of the cacti. Our results add to the growing number of examples illustrating the complexity of phylogenetic signals in genomic-scale data.

2018 ◽  
Author(s):  
Meng Wu ◽  
Jamie L. Kostyun ◽  
Leonie C. Moyle

Abstract: Within the economically important plant family Solanaceae, Jaltomata is a rapidly evolving genus that has extensive diversity in flower size and shape, as well as fruit and nectar color, among its ∼80 species. Here we report the whole-genome sequencing, assembly, and annotation of one representative species (Jaltomata sinuosa) from this genus. Combining PacBio long-reads (25X) and Illumina short-reads (148X) achieved an assembly of approximately 1.45 Gb, spanning ∼96% of the estimated genome. 96% of curated single-copy orthologs in plants were detected in the assembly, supporting a high level of completeness of the genome. Similar to other Solanaceous species, repetitive elements made up a large fraction (∼80%) of the genome, with the most recently active element, Gypsy, expanding across the genome in the last 1-2 million years. Computational gene prediction, in conjunction with a merged transcriptome dataset from 11 tissues, identified 34,725 protein-coding genes. Comparative phylogenetic analyses with six other sequenced Solanaceae species determined that Jaltomata is most likely sister to Solanum, although a large fraction of gene trees supported a conflicting bipartition consistent with substantial introgression between Jaltomata and Capsicum after these species split. We also identified gene family dynamics specific to Jaltomata, including expansion of gene families potentially involved in novel reproductive trait development, and loss of gene families that accompanied the loss of self-incompatibility. This high-quality genome will facilitate studies of phenotypic diversification in this rapidly radiating group, and provide a new point of comparison for broader analyses of genomic evolution across the Solanaceae.


2021 ◽  
Author(s):  
Alberto Cenci ◽  
Mairenys Concepción-Hernández ◽  
Geert Angenon ◽  
Mathieu Rouard

GDSL-type esterase/lipase (GELP) enzymes have multiple functions in plants, ranging from developmental processes to responses to biotic and abiotic stresses. Genes encoding GELP belong to a large gene family with several tens to more than a hundred members per species in angiosperms. Here, we applied iterative phylogenetic analyses to identify 10 main clusters subdivided into 44 expert-curated reference orthogroups (OGs) using three monocot and five dicot genomes. Our results show that some GELP OGs expanded while others were maintained as single-copy genes. This semi-automatic approach proves effective for characterizing large gene families and provides a solid classification framework for the GELP members in angiosperms. The orthogroup-based reference will be useful to perform comparative studies, infer gene functions and better understand the evolutionary history of this gene family.


2019 ◽  
Vol 10 (2) ◽  
pp. 811-826 ◽  
Author(s):  
Albert Erives ◽  
Bernd Fritzsch

The evolutionary diversification of animals is one of Earth’s greatest marvels, yet its earliest steps are shrouded in mystery. Animals, the monophyletic clade known as Metazoa, evolved wildly divergent multicellular life strategies featuring ciliated sensory epithelia. In many lineages epithelial sensoria became coupled to increasingly complex nervous systems. Currently, different phylogenetic analyses of single-copy genes support mutually-exclusive possibilities that either Porifera or Ctenophora is sister to all other animals. Resolving this dilemma would advance the ecological and evolutionary understanding of the first animals and the evolution of nervous systems. Here we describe a comparative phylogenetic approach based on gene duplications. We computationally identify and analyze gene families with early metazoan duplications using an approach that mitigates apparent gene loss resulting from the miscalling of paralogs. In the transmembrane channel-like (TMC) family of mechano-transducing channels, we find ancient duplications that define separate clades for Eumetazoa (Placozoa + Cnidaria + Bilateria) vs. Ctenophora, and one duplication that is shared only by Eumetazoa and Porifera. In the Max-like protein X (MLX and MLXIP) family of bHLH-ZIP regulators of metabolism, we find that all major lineages from Eumetazoa and Porifera (sponges) share a duplicated gene pair that is sister to the single-copy gene maintained in Ctenophora. These results suggest a new avenue for deducing deep phylogeny by choosing rather than avoiding ancient gene paralogies.


2020 ◽  
Author(s):  
Matthew H Van Dam ◽  
James B Henderson ◽  
Lauren Esposito ◽  
Michelle Trautwein

Abstract: Ultraconserved genomic elements (UCEs) are generally treated as independent loci in phylogenetic analyses. The identification pipeline for UCE probes does not require prior knowledge of genetic identity, only selecting loci that are highly conserved, single copy, without repeats, and of a particular length. Here, we characterized UCEs from 11 phylogenomic studies across the animal tree of life, from birds to marine invertebrates. We found that within vertebrate lineages, UCEs are mostly intronic and intergenic, while in invertebrates, the majority are in exons. We then curated four different sets of UCE markers by genomic category from five different studies including birds, mammals, fish, Hymenoptera (ants, wasps, and bees), and Coleoptera (beetles). Of genes captured by UCEs, we find that many are represented by two or more UCEs, corresponding to nonoverlapping segments of a single gene. We considered these UCEs to be nonindependent, merged all UCEs that belonged to a particular gene, constructed gene and species trees, and then evaluated the subsequent effect of merging cogenic UCEs on gene and species tree reconstruction. Average bootstrap support for merged UCE gene trees was significantly improved across all data sets, apparently driven by the increase in loci length. Additionally, we conducted simulations and found that gene trees generated from merged UCEs were more accurate than those generated by unmerged UCEs. As loci length improves gene tree accuracy, this modest degree of UCE characterization and curation impacts downstream analyses and demonstrates the advantages of incorporating basic genomic characterizations into phylogenomic analyses. [Anchored hybrid enrichment; ants; ASTRAL; bait capture; carangimorph; Coleoptera; conserved nonexonic elements; exon capture; gene tree; Hymenoptera; mammal; phylogenomic markers; songbird; species tree; ultraconserved elements; weevils.]
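
The merging step described in this abstract (grouping UCEs that fall in the same gene and treating them as one longer locus) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `merge_cogenic_uces`, the input layout, and the toy sequences are all hypothetical, and the sketch assumes each UCE is already aligned and that a UCE-to-gene mapping is available from some annotation step. UCEs are concatenated in sorted-ID order rather than genomic order, which is a simplification.

```python
from collections import defaultdict

def merge_cogenic_uces(uce_alignments, uce_to_gene):
    """Concatenate aligned UCE loci that map to the same gene.

    uce_alignments: {uce_id: {taxon: aligned_sequence}}  (hypothetical layout)
    uce_to_gene:    {uce_id: gene_id}; UCEs missing from this map stay unmerged.
    Returns {locus_id: {taxon: concatenated_sequence}}.
    """
    # Group UCE ids by gene; an unmapped UCE forms its own single-member group.
    groups = defaultdict(list)
    for uce_id in sorted(uce_alignments):
        groups[uce_to_gene.get(uce_id, uce_id)].append(uce_id)

    merged = {}
    for locus_id, members in groups.items():
        # Union of taxa across all member UCEs of this locus.
        taxa = sorted({t for u in members for t in uce_alignments[u]})
        merged[locus_id] = {t: "" for t in taxa}
        for u in members:
            aln = uce_alignments[u]
            width = len(next(iter(aln.values())))
            for t in taxa:
                # Taxa absent from one UCE are padded with gaps of that UCE's width.
                merged[locus_id][t] += aln.get(t, "-" * width)
    return merged

# Toy data (invented for illustration only).
toy = {
    "uce-1": {"taxonA": "ACGT", "taxonB": "ACGA"},
    "uce-2": {"taxonA": "TT", "taxonC": "TA"},
    "uce-3": {"taxonA": "GGG", "taxonB": "GGC"},
}
gene_map = {"uce-1": "geneX", "uce-2": "geneX"}  # uce-3 has no gene assignment
merged = merge_cogenic_uces(toy, gene_map)
```

In this toy case `uce-1` and `uce-2` become a single six-column locus `geneX`, while `uce-3` is passed through unchanged; the longer merged locus is the kind of input the abstract associates with improved gene tree support.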


2009 ◽  
Vol 34 (1) ◽  
pp. 162-172 ◽  
Author(s):  
Katherine G. Mathews ◽  
Niall Dunne ◽  
Emily York ◽  
Lena Struwe

A phylogenetic study and taxonomic revision of the four currently accepted species of Bartonia (Gentianaceae, subtribe Swertiinae) were conducted in order to test species boundaries and interspecific relationships. Species boundaries were examined based on measurements of key quantitative and qualitative morphological characters as given in the original descriptions. Phylogenetic analyses were performed using molecular data from the nuclear internal transcribed spacer region and chloroplast DNA (trnL intron through the trnL-F spacer), separately and combined using parsimony and Bayesian methodologies, incorporating outgroups from subtribes Swertiinae and Gentianinae. The morphological study revealed that characters of one species, B. texana, represent a subset of the morphological variation found within B. paniculata, but that B. paniculata, B. verna, and B. virginica could all be separated from one another. The molecular phylogenetic analyses all found B. texana to nest in a clade with the two recognized subspecies of B. paniculata (subsp. paniculata and subsp. iodandra), making the latter paraphyletic. Bartonia texana is here reduced to subspecific rank, as Bartonia paniculata subsp. texana. Also, the phylogenetic analyses showed strong support for a sister group relationship between B. verna and B. virginica, as opposed to between B. paniculata and B. virginica as has been previously suggested.


2020 ◽  
Author(s):  
Chendan Wei ◽  
Zhenyi Wang ◽  
Jianyu Wang ◽  
Jia Teng ◽  
Shaoqi Shen ◽  
...  

Abstract: Extensive sequence similarity between duplicated gene pairs produced by paleo-polyploidization may result from illegitimate recombination between homologous chromosomes. The genomes of Asian cultivated rice Xian/indica (XI) and Geng/japonica (GJ) have recently been updated, providing new opportunities for investigating on-going gene conversion events. Using comparative genomics and phylogenetic analyses, we evaluated gene conversion rates between duplicated genes produced by polyploidization 100 million years ago (mya) in GJ and XI. At least 5.19%–5.77% of genes duplicated across three genomes were affected by whole-gene conversion after the divergence of GJ and XI at ~0.4 mya, with more (7.77%–9.53%) showing conversion of only gene portions. Independently converted duplicates surviving in genomes of different subspecies often used the same donor genes. On-going gene conversion frequency was higher near chromosome termini, with a single pair of homoeologous chromosomes 11 and 12 in each genome most affected. Notably, on-going gene conversion has maintained similarity between very ancient duplicates, provided opportunities for further gene conversion, and accelerated rice divergence. Chromosome rearrangement after polyploidization may result in gene loss, providing a basis for on-going gene conversion, and may have contributed directly to restricted recombination/conversion between homoeologous regions. Gene conversion affected biological functions associated with multiple genes, such as catalytic activity, implying opportunities for interaction among members of large gene families, such as NBS-LRR disease-resistance genes, resulting in gene conversion. Duplicated genes in rice subspecies generated by grass polyploidization ~100 mya remain affected by gene conversion at high frequency, with important implications for the divergence of rice subspecies.
One-sentence summary: On-going gene conversion between duplicated genes produced by 100 mya polyploidization contributes to rice subspecies divergence, often involving the same donor genes at chromosome termini.


2019 ◽  
Author(s):  
Matthew H. Van Dam ◽  
James B. Henderson ◽  
Lauren Esposito ◽  
Michelle Trautwein

Abstract: Ultraconserved genomic elements (UCEs) are generally treated as independent loci in phylogenetic analyses. The identification pipeline for UCE probes is agnostic to genetic identity, only selecting loci that are highly conserved, single copy, without repeats, and of a particular length. Here we characterized UCEs from 12 phylogenomic studies across the animal tree of life, from birds to marine invertebrates. We found that within vertebrate lineages, UCEs are mostly intronic and intergenic, while in invertebrates, the majority are in exons. We then curated 4 different sets of UCE markers by genomic category from 5 different studies including birds, mammals, fish, Hymenoptera (ants, wasps and bees) and Coleoptera (beetles). Of genes captured by UCEs, we find that many are represented by 2 or more UCEs, corresponding to non-overlapping segments of a single gene. We considered these UCEs to be non-independent, merged all UCEs that belonged to a particular gene, constructed gene and species trees, and then evaluated the subsequent effect of merging co-genic UCEs on gene and species tree reconstruction. Average bootstrap support for merged UCE gene trees was significantly improved across all datasets. Increased loci length appears to drive this increase in bootstrap support. Additionally, we found that gene trees generated from merged UCEs were more accurate than those generated by unmerged and randomly merged UCEs, based on our simulation study. This modest degree of UCE characterization and curation impacts downstream analyses and demonstrates the advantages of incorporating basic genomic characterizations into phylogenomic analyses.


2019 ◽  
Author(s):  
Albert Erives ◽  
Bernd Fritzsch

The evolutionary diversification of animals is one of Earth’s greatest triumphs, yet its origins are still shrouded in mystery. Animals, the monophyletic clade known as Metazoa, evolved wildly divergent multicellular life strategies featuring ciliated sensory epithelia. In many lineages epithelial sensoria became coupled to increasingly complex nervous systems. Currently, different phylogenetic analyses of single-copy genes support mutually-exclusive possibilities that either Porifera or Ctenophora is sister to all other animals. Resolving this dilemma would advance the ecological and evolutionary understanding of the first animals and the evolution of nervous systems. Here we describe a comparative phylogenetic approach based on gene duplications. We computationally identify and analyze gene families with early metazoan duplications using an approach that mitigates apparent gene loss resulting from the miscalling of paralogs. In the transmembrane channel-like (TMC) family of mechano-transducing channels, we find ancient duplications that define separate clades for Eumetazoa (Placozoa + Cnidaria + Bilateria) versus Ctenophora, and one duplication that is shared only by Eumetazoa and Porifera. In the MLX/MLXIP family of bHLH-ZIP regulators of metabolism, we find that all major lineages from Eumetazoa and Porifera (sponges) share a duplication, absent in Ctenophora. These results suggest a new avenue for deducing deep phylogeny by choosing rather than avoiding ancient gene paralogies.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Mehmet Dayi ◽  
Natsumi Kanzaki ◽  
Simo Sun ◽  
Tatsuya Ide ◽  
Ryusei Tanaka ◽  
...  

Abstract: Caenorhabditis auriculariae, which was morphologically described in 1999, was re-isolated from a Platydema mushroom-associated beetle. Based on the re-isolated materials, some morphological characteristics were re-examined and ascribed to the species. In addition, to clarify phylogenetic relationships with other Caenorhabditis species and biological features of the nematode, the whole genome was sequenced and assembled into 109.5 Mb with 16,279 predicted protein-coding genes. Molecular phylogenetic analyses based on ribosomal RNA and 269 single-copy genes revealed that the species is closely related to C. sonorae and C. monodelphis, placing them in the most basal clade of the genus. C. auriculariae has morphological characteristics that clearly differ from those of the other two species and harbours a number of species-specific gene families, indicating its usefulness as a new outgroup species for Caenorhabditis evolutionary studies. A comparison of carbohydrate-active enzyme (CAZy) repertoires in genomes, which we found useful to speculate about the lifestyle of Caenorhabditis nematodes, suggested that C. auriculariae likely has a life cycle with tight association with insects.


Insects ◽  
2021 ◽  
Vol 12 (12) ◽  
pp. 1049
Author(s):  
Huifeng Zhao ◽  
Ye Chen ◽  
Zitong Wang ◽  
Haifeng Chen ◽  
Yaoguang Qin

The complete mitochondrial genomes of two species of Chalcididae were newly sequenced: Brachymeria lasus and Haltichella nipponensis. The two circular mitogenomes are 15,147 and 15,334 bp in total length, respectively, and each includes 13 protein-coding genes (PCGs), two ribosomal RNA genes (rRNAs), 22 transfer RNA genes (tRNAs), and an A+T-rich region. The nucleotide composition indicated a strong A/T bias. All PCGs of B. lasus and H. nipponensis began with the start codon ATD, except for ND1 in B. lasus, which had the unusual initiation codon TTG. Most PCGs of the two mitogenomes are terminated by the stop codon TAR, and the remaining PCGs by the incomplete stop codon T or TA (ATP6, COX3, and ND4 in both species, plus CYTB in B. lasus). Except for trnS1 and trnF, all tRNAs can be folded into a typical cloverleaf structure. Both mitogenomes had similar control regions, and two repeat units of 135 bp were found in H. nipponensis. Phylogenetic analyses based on two datasets (PCG123 and PCG12) covering Chalcididae and nine families of Chalcidoidea were conducted using two methods (maximum likelihood and Bayesian inference); all the results support Mymaridae as the sister group of the remaining Chalcidoidea, with Chalcididae as the next successive group. Only analyses of PCG123 generated similar topologies of Mymaridae + (Chalcididae + (Agaonidae + remaining Chalcidoidea)) and provided one relatively stable clade as Eulophidae + (Torymidae + (Aphelinidae + Trichogrammatidae)). Our mitogenomic phylogenetic results share one important similarity with earlier molecular phylogenetic efforts: strong support for the monophyly of many families, but a largely unresolved or unstable “backbone” of relationships among families.
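
The codon notation in this abstract uses IUPAC ambiguity codes: D stands for A, G, or T, so ATD covers ATA, ATG, and ATT, and R stands for A or G, so TAR covers TAA and TAG. As a small illustration, the Python sketch below checks codons against such patterns and computes an A+T fraction. The helper names `matches` and `at_content` and the example sequence are hypothetical and are not taken from the study.

```python
# Subset of IUPAC nucleotide ambiguity codes needed for the patterns above.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",   # purine
    "Y": "CT",   # pyrimidine
    "D": "AGT",  # not C
    "N": "ACGT",
}

def matches(codon, pattern):
    """Return True if a concrete codon is covered by an IUPAC pattern such as 'ATD'."""
    codon = codon.upper()
    return len(codon) == len(pattern) and all(
        base in IUPAC[code] for base, code in zip(codon, pattern)
    )

def at_content(seq):
    """Fraction of A and T among unambiguous bases (0.0 for an empty sequence)."""
    seq = seq.upper()
    total = sum(seq.count(b) for b in "ACGT")
    return (seq.count("A") + seq.count("T")) / total if total else 0.0

# TTG (reported for ND1 in B. lasus) falls outside the ATD pattern.
start_ok = [c for c in ("ATA", "ATG", "ATT", "TTG") if matches(c, "ATD")]
stop_ok = [c for c in ("TAA", "TAG", "TGA") if matches(c, "TAR")]
example_at = at_content("AATTAATTGC")  # invented sequence, not real mitogenome data
```

A value of `at_content` well above 0.5 over a whole mitogenome would correspond to the kind of A/T bias the abstract reports.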

