homologizer: Phylogenetic phasing of gene copies into polyploid subgenomes

2020 ◽  
Author(s):  
William A. Freyman ◽  
Matthew G. Johnson ◽  
Carl J. Rothfels

Summary Organisms such as allopolyploids and F1 hybrids contain multiple subgenomes, each potentially with its own evolutionary history. These organisms present a challenge for multilocus phylogenetic inference and other analyses since it is not apparent which gene copies from different loci are from the same subgenome. Here we introduce homologizer, a flexible Bayesian approach that uses a phylogenetic framework to infer the phasing of gene copies across loci into polyploid subgenomes. Through the use of simulation tests we demonstrate that homologizer is robust to a wide range of factors, such as the phylogenetic informativeness of loci and incomplete lineage sorting. Furthermore, we establish the utility of homologizer on real data, by analyzing a multilocus dataset consisting of nine diploids and 19 tetraploids from the fern family Cystopteridaceae. Finally, we describe how homologizer may potentially be used beyond its core phasing functionality to identify non-homologous sequences, such as hidden paralogs, contaminants, or allelic variation that was erroneously modelled as homeologous.

2021 ◽  
Author(s):  
Caitlin Cherryh ◽  
Bui Quang Minh ◽  
Rob Lanfear

Abstract Most phylogenetic analyses assume that the evolutionary history of an alignment (either that of a single locus, or of multiple concatenated loci) can be described by a single bifurcating tree, the so-called treelikeness assumption. Treelikeness can be violated by biological events such as recombination, introgression, or incomplete lineage sorting, and by systematic errors in phylogenetic analyses. The incorrect assumption of treelikeness may then mislead phylogenetic inferences. To quantify and test for treelikeness in alignments, we develop a test statistic which we call the tree proportion. This statistic quantifies the proportion of the edge weights in a phylogenetic network that are represented in a bifurcating phylogenetic tree of the same alignment. We extend this statistic to a statistical test of treelikeness using a parametric bootstrap. We use extensive simulations to compare tree proportion to a range of related approaches. We show that tree proportion successfully identifies non-treelikeness in a wide range of simulation scenarios, and discuss its strengths and weaknesses compared to other approaches. The power of the tree-proportion test to reject non-treelike alignments can be lower than some other approaches, but these approaches tend to be limited in their scope and/or the ease with which they can be interpreted. Our recommendation is to test treelikeness of sequence alignments with both tree proportion and mosaic methods such as 3Seq. The scripts necessary to replicate this study are available at https://github.com/caitlinch/treelikeness
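The tree proportion described above can be illustrated with a minimal sketch. It assumes that the network and the tree are both given as sets of splits (bipartitions of the taxa) and that each network split carries a weight; the function names, the split representation, and the toy weights are hypothetical and are not taken from the authors' scripts. How the tree itself is chosen is not specified in the abstract, so the sketch simply takes the tree's splits as input.

```python
from typing import Dict, FrozenSet, Iterable, Set

Split = FrozenSet[str]


def canonical_split(side: Iterable[str], taxa: Iterable[str]) -> Split:
    """Return one fixed side of a bipartition so that equal splits compare equal."""
    side_set = frozenset(side)
    other = frozenset(taxa) - side_set
    # Pick the smaller side; break ties by alphabetical order of the taxa.
    return min(side_set, other, key=lambda s: (len(s), sorted(s)))


def tree_proportion(network_weights: Dict[Split, float],
                    tree_splits: Set[Split]) -> float:
    """Share of the total network split weight carried by splits also found in the tree."""
    total = sum(network_weights.values())
    if total <= 0:
        raise ValueError("network must have positive total split weight")
    in_tree = sum(w for s, w in network_weights.items() if s in tree_splits)
    return in_tree / total


# Toy example (made-up weights): four taxa, two conflicting internal splits.
taxa = ["A", "B", "C", "D"]
network = {canonical_split([t], taxa): 1.0 for t in taxa}   # trivial splits
network[canonical_split(["A", "B"], taxa)] = 3.0            # AB|CD, kept in the tree
network[canonical_split(["A", "C"], taxa)] = 1.0            # AC|BD, conflicts with the tree

tree = {canonical_split([t], taxa) for t in taxa}
tree.add(canonical_split(["C", "D"], taxa))                 # same split as AB|CD

proportion = tree_proportion(network, tree)
print(proportion)
```

In this toy case the tree accounts for 7 of the 8 units of split weight, so the value is below 1; a perfectly treelike network, in which every split is also in the tree, would give exactly 1.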


Author(s):  
Todd McLay ◽  
Gareth D. Holmes ◽  
Paul I. Forster ◽  
Susan E. Hoebee ◽  
Denise R. Fernando

The rainforest genus Gossia N.Snow & Guymer (Myrtaceae) occurs in Australia, Melanesia and Malesia, and is capable of hyperaccumulating the heavy metal manganese (Mn). Here, we used nuclear ribosomal and plastid spacer DNA-sequence data to reconstruct the phylogeny of 19 Australian species of Gossia and eight New Caledonian taxa. Our results indicated that the relationship between Gossia and Austromyrtus (Nied.) Burret is not fully resolved, and most Australian species were supported as monophyletic. Non-monophyly might be related to incomplete lineage sorting or inaccurate taxonomic classification. Bark type appears to be a morphological synapomorphy separating two groups of species, with more recently derived lineages having smooth and mottled ‘python’ bark. New Caledonian species were well resolved in a single clade, but were not the first diverging Gossia lineage, calling into doubt the results of a recent study that found Zealandia as the ancestral area of tribe Myrteae. Within Australia, the evolution of multiple clades has probably been driven by well-known biogeographic barriers. Some species with more widespread distributions have been able to cross these barriers by having a wide range of soil-substrate tolerances. Novel Mn-hyperaccumulating species were identified, and, although Mn hyperaccumulation was not strongly correlated with phylogenetic position, there appeared to be some difference in accumulation levels among clades. Our study is the first detailed phylogenetic investigation of Gossia and will serve as a reference for future studies seeking to understand the origin and extent of hyperaccumulation within the Myrteae and Myrtaceae more broadly.


2018 ◽  
Author(s):  
Zhi Yan ◽  
Peng Du ◽  
Matthew W. Hahn ◽  
Luay Nakhleh

Abstract The multispecies coalescent (MSC) has emerged as a powerful and desirable framework for species tree inference in phylogenomic studies. Under this framework, the data for each locus is assumed to consist of orthologous, single-copy genes, and heterogeneity across loci is assumed to be due to incomplete lineage sorting (ILS). These assumptions have led biologists who use ILS-aware inference methods, whether based directly on the MSC or proven to be statistically consistent under it (collectively referred to here as MSC-based methods), to exclude all loci that are present in more than a single copy in any of the studied genomes. Furthermore, such analyses entail orthology assignment to avoid the potential of hidden paralogy in the data. The question we seek to answer in this study is: What happens if one runs such species tree inference methods on data where paralogy is present, in addition to or without ILS being present? Through simulation studies and analyses of two biological data sets, we show that running such methods on data with paralogs provides very accurate results, either by treating all gene copies within a family as alleles from multiple individuals or by randomly selecting one copy per species. Our results have significant implications for the use of MSC-based phylogenomic analyses, demonstrating that they do not have to be restricted to single-copy loci, thus greatly increasing the amount of data that can be used. [Multispecies coalescent; incomplete lineage sorting; gene duplication and loss; orthology; paralogy.]


DNA Barcodes ◽  
2015 ◽  
Vol 3 (1) ◽  
Author(s):  
S. Behrens-Chapuis ◽  
F. Herder ◽  
H. R. Esmaeili ◽  
J. Freyhof ◽  
N. A. Hamidan ◽  
...  

Abstract DNA barcoding is a fast and reliable tool for species identification, and has been successfully applied to a wide range of freshwater fishes. The limitations reported were mainly attributed to effects of geographic scale, taxon-sampling, incomplete lineage sorting, or mitochondrial introgression. However, the metrics for the success of assigning unknown samples to species or genera also depend on a suitable taxonomic framework. A simultaneous use of the mitochondrial COI and the nuclear RHO gene turned out to be advantageous for the barcode efficiency in a few previous studies. Here, we examine 14 cyprinid fish genera, with a total of 74 species, where standard DNA barcoding failed to identify closely related species unambiguously. Eight of the genera (Acanthobrama, Alburnus, Chondrostoma, Gobio, Mirogrex, Phoxinus, Scardinius, and Squalius) contain species that exhibit very low interspecific divergence, or haplotype sharing (12 species pairs) with presumed introgression based on mtCOI data. We aimed to test the utility of the nuclear rhodopsin marker to uncover reasons for the high similarity and haplotype sharing in these different groups. The included labeonine species belonging to Crossocheilus, Hemigrammocapoeta, Tylognathus and Typhlogarra were found to be nested within the genus Garra based on mtCOI. This specific taxonomic uncertainty was also addressed by the use of the additional nuclear marker. As a measure of the delineation success we computed barcode gaps, which were present in 75% of the species based on mtCOI, but in only 39% based on nuclear rhodopsin sequences. Most cases where standard barcodes failed to offer unambiguous species identifications could not be resolved by adding the nuclear marker. However, in the labeonine cyprinids included, nuclear rhodopsin data generally supported the lineages as defined by the mitochondrial marker. This suggests that mitochondrial patterns were not misled by introgression, but are caused by an inadequate taxonomy. Our findings support the transfer of the studied species of Crossocheilus, Hemigrammocapoeta, Tylognathus and Typhlogarra to Garra.


2020 ◽  
Author(s):  
Cody Coyotee Howard ◽  
Andrew A. Crowl ◽  
Timothy S. Harvey ◽  
Nico Cellinese

Abstract The Ledebouriinae (Scilloideae, Asparagaceae) are a widespread group of bulbous geophytes found predominantly throughout seasonal climates in sub-Saharan Africa, with a handful of taxa in Madagascar, the Middle East, India and Sri Lanka. However, the phylogenetic relationships within the group have historically been difficult to reconstruct. Here, we provide the first phylogenomic perspective into the Ledebouriinae. We use this renewed phylogenetic framework to hypothesize historical factors that have resulted in the topology recovered. Using the Angiosperms353 targeted enrichment probe set, we consistently recovered four major clades (i.e. two Ledebouria clades, Drimiopsis, and Resnova). The two Ledebouria clades closely align with geography, either consisting almost entirely of sub-Saharan African taxa (Ledebouria Clade A), or East African and non-African taxa (Ledebouria Clade B). Our results also suggest that the Ledebouriinae underwent a rapid radiation leading to rampant incomplete lineage sorting. [Asparagaceae; Drimiopsis; geophytes; Ledebouria; monocots; Resnova; Scilloideae.]


2020 ◽  
Author(s):  
Junyi Ding ◽  
Donglai Hua ◽  
James S. Borrell ◽  
Richard J.A. Buggs ◽  
Luwei Wang ◽  
...  

Summary Molecular markers can allow us to differentiate species that occupy a morphological continuum, and detect patterns of allele sharing that can help us understand the dynamics of geographic zones where they meet. Betula microphylla is a declining wetland species in NW China that forms a continuum of leaf morphology with its relative Betula tianshanica. We use ecological niche models (ENM) to predict the distribution of B. microphylla, B. tianshanica and the more commonly occurring B. platyphylla. We use restriction-site associated DNA sequencing and SSRs to resolve their genetic structure and patterns of allele sharing. ENM predicted an expansion of suitable range of B. tianshanica into B. microphylla since the Last Glacial Maximum and the contraction of B. microphylla’s range in the future. We resolved the species identification of some intermediate morphotypes. We found signatures of bidirectional introgression between B. microphylla and B. tianshanica with SNPs showing more admixture than SSRs. Introgression from B. microphylla into B. tianshanica was greater in the Tianshan Mountains where the two species have occurred in proximity. Unexpectedly, introgression from B. tianshanica into B. microphylla was widespread in the Altay Mountains where there are no records of B. tianshanica occurrence. This presence of B. tianshanica-derived alleles far beyond the species’ current range could be due to unexpectedly high pollen flow, undiscovered populations of B. tianshanica in the region, incomplete lineage sorting, or selection for adaptive introgression in B. microphylla. These different interpretations have contrasting implications for the conservation of B. microphylla.


2019 ◽  
Vol 69 (3) ◽  
pp. 502-520 ◽  
Author(s):  
Frank T Burbrink ◽  
Felipe G Grazziotin ◽  
R Alexander Pyron ◽  
David Cundall ◽  
Steve Donnellan ◽  
...  

Abstract Genomics is narrowing uncertainty in the phylogenetic structure for many amniote groups. For one of the most diverse and species-rich groups, the squamate reptiles (lizards, snakes, and amphisbaenians), an inverse correlation between the number of taxa and loci sampled still persists across all publications using DNA sequence data, and reaching a consensus on the relationships among them has been highly problematic. In this study, we use high-throughput sequence data from 289 samples covering 75 families of squamates to address phylogenetic affinities, estimate divergence times, and characterize residual topological uncertainty in the presence of genome-scale data. Importantly, we address genomic support for the traditional taxonomic groupings Scleroglossa and Macrostomata using novel machine-learning techniques. We interrogate genes using various metrics inherent to these loci, including parsimony-informative sites (PIS), phylogenetic informativeness, length, gaps, number of substitutions, and site concordance to understand why certain loci fail to find previously well-supported molecular clades and how they fail to support species-tree estimates. We show that both incomplete lineage sorting and poor gene-tree estimation (due to a few undesirable gene properties, such as an insufficient number of PIS) may account for most gene and species-tree discordance. We find overwhelming signal for Toxicofera, and also show that none of the loci included in this study supports Scleroglossa or Macrostomata. We comment on the origins and diversification of Squamata throughout the Mesozoic and underscore remaining uncertainties that persist in both deeper parts of the tree (e.g., relationships between Dibamia, Gekkota, and remaining squamates; among the three toxicoferan clades Iguania, Serpentes, and Anguiformes) and within specific clades (e.g., affinities among gekkotan, pleurodont iguanians, and colubroid families).


2021 ◽  
Author(s):  
Federica Valerio ◽  
Nicola Zadra ◽  
Omar Rota Stabelli ◽  
Lino Ometto

Abstract True fruit flies (Tephritidae) include several species that cause extensive damage to agriculture worldwide. Among them, species of the genus Bactrocera are widely studied to understand the traits associated with their invasiveness and ecology. Comparative approaches based on a reliable phylogenetic framework are particularly effective, but, to date, molecular phylogenies of Bactrocera are still controversial. Here, we employed a comprehensive genomic dataset to infer a robust backbone phylogeny of eleven representative Bactrocera species and two outgroups. We further provide the first genome-scale inference of their divergence using a calibrated relaxed clock. The results of our analyses support a closer relationship of B. dorsalis to B. latifrons than to B. tryoni, in contrast to all mitochondrial-based phylogenies. By comparing different evolutionary models, we show that this incongruence likely derives from the fast and recent radiation of these species that occurred around 2 million years ago, which may be associated with incomplete lineage sorting and possibly (ongoing) hybridization. These results can serve as a basis for future comparative analyses and highlight the utility of using large datasets and efficient phylogenetic approaches to study the evolutionary history of species of economic importance.


Botany ◽  
2009 ◽  
Vol 87 (2) ◽  
pp. 164-177 ◽  
Author(s):  
Rosalía Piñeiro ◽  
Andrea Costa ◽  
Javier Fuertes Aguilar ◽  
Gonzalo Nieto Feliner

Low-copy nuclear genes have been suggested as a promising source of independent phylogeographic markers in plants. However, the available studies at the intraspecific level have revealed that extracting information from them is frequently hampered by paralogy and lack of coalescence of alleles. It is thus relevant to test their utility with plants for which solid data from other markers are available. The aim of this study is to retrieve phylogeographically useful information from a low-copy nuclear gene by examining the congruence of the genetic variation with the geography, as well as with previous nuclear ribosomal, plastid, and amplified fragment length polymorphism (AFLP) markers. Seven combinations of primers have been assayed to characterize the structure of GapC (cytosolic glyceraldehyde 3-phosphate dehydrogenase) in Armeria pungens (Link) Hoffmanns. & Link, a linearly distributed Atlantic–Mediterranean disjunct sand-dune species. A matrix of 101 direct sequences from 71 individuals was analysed with statistical parsimony. To check the reliability of direct sequencing, 216 cloned sequences were also generated. Tests of recombination have also been attempted. By comparing nucleotide and amino acid sequences, three different paralogs (1, 2, 3) were identified, of which paralog 2 was sampled for phylogeographic inference. Within this paralog, 13 alleles belonging to three different sequence types (I, II, III) were detected. These types are shown to correspond with lineages from the same locus whose splitting predates the origin of A. pungens, although type III could be a recent paralog. Allelic variation within types I and II followed a clear geographic trend supporting the two main genetic lineages detected in A. pungens with previous markers. This study suggests that information on the population history of a species can be retrieved, even if some uncertainty remains on the source of variation of low-copy nuclear gene sequences, either alleles from the same locus or paralogs.


Author(s):  
David A. Ansley

The coherence of the electron flux of a transmission electron microscope (TEM) limits the direct application of deconvolution techniques which have been used successfully on unmanned spacecraft programs. The theory assumes noncoherent illumination. Deconvolution of a TEM micrograph will, therefore, in general produce spurious detail rather than improved resolution.

A primary goal of our research is to study the performance of several types of linear spatial filters as a function of specimen contrast, phase, and coherence. We have, therefore, developed a one-dimensional analysis and plotting program to simulate a wide range of operating conditions of the TEM, including adjustment of the:
(1) Specimen amplitude, phase, and separation
(2) Illumination wavelength, half-angle, and tilt
(3) Objective lens focal length and aperture width
(4) Spherical aberration, defocus, and chromatic aberration focus shift
(5) Detector gamma, additive, and multiplicative noise constants
(6) Type of spatial filter: linear cosine, linear sine, or deterministic

