A Time-calibrated Firefly (Coleoptera: Lampyridae) Phylogeny: Using Genomic Data for Divergence Time Estimation

2021 ◽  
Author(s):  
Sebastian Hoehna ◽  
Sarah E Lower ◽  
Pablo Duchen ◽  
Ana Catalan

Fireflies (Coleoptera: Lampyridae) consist of over 2,000 described extant species. A well-resolved phylogeny of fireflies is important for the study of their bioluminescence, evolution, and conservation. We used a recently published anchored hybrid enrichment dataset (AHE; 436 loci for 88 Lampyridae species and 10 outgroup species) and state-of-the-art statistical methods (the fossilized birth-death-range process implemented in a Bayesian framework) to estimate a time-calibrated phylogeny of Lampyridae. Unfortunately, estimating calibrated phylogenies using AHE and the latest and most robust time-calibration strategies is not possible because of computational constraints. As a solution, we subset the full dataset and applied three different strategies: using the most complete loci, the most homogeneous loci, and the loci with the highest accuracy to infer the well-established Photinus clade. The topologies estimated from the three data subsets agreed on almost all major clades and showed only minor discordance at less well-supported nodes. The estimated divergence times overlapped for all nodes that are shared between the topologies. Thus, divergence time estimation is robust as long as the topology inference is robust, and any well-selected data subset suffices. Additionally, we observed an unexpected amount of gene tree discordance among the 436 AHE loci. Our assessment of model adequacy showed that standard phylogenetic substitution models are not adequate for any of the 436 AHE loci, which is likely to bias phylogenetic inferences. We performed a simulation study to explore the impact of (a) incomplete lineage sorting, (b) uniformly distributed and systematic missing data, and (c) systematic bias in the position of highly variable and conserved sites. For our simulated data, we observed less gene tree variation, and hence the empirically observed amount of gene tree discordance for the AHE dataset is unexpected.

2021 ◽  
Vol 12 ◽  
Author(s):  
Jeffrey P. Rose ◽  
Ricardo Kriebel ◽  
Larissa Kahan ◽  
Alexa DiNicola ◽  
Jesús G. González-Gallegos ◽  
...  

Next-generation sequencing technologies have facilitated new phylogenomic approaches to help clarify previously intractable relationships while simultaneously highlighting the pervasive nature of incongruence within and among genomes that can complicate definitive taxonomic conclusions. Salvia L., with ∼1,000 species, makes up nearly 15% of the species diversity in the mint family and has attracted great interest from biologists across subdisciplines. Despite the great progress that has been achieved in discerning the placement of Salvia within Lamiaceae and in clarifying its infrageneric relationships through plastid, nuclear ribosomal, and nuclear single-copy genes, the incomplete resolution has left open major questions regarding the phylogenetic relationships among and within the subgenera, as well as to what extent the infrageneric relationships differ across genomes. We expanded a previously published anchored hybrid enrichment dataset of 35 exemplars of Salvia to 179 terminals. We also reconstructed nearly complete plastomes for these samples from off-target reads. We used these data to examine the concordance and discordance among the nuclear loci and between the nuclear and plastid genomes in detail, elucidating both broad-scale and species-level relationships within Salvia. We found that despite the widespread gene tree discordance, nuclear phylogenies reconstructed using concatenated, coalescent, and network-based approaches recover a common backbone topology. Moreover, all subgenera, except for Audibertia, are strongly supported as monophyletic in all analyses. The plastome genealogy is largely resolved and is congruent with the nuclear backbone. However, multiple analyses suggest that incomplete lineage sorting does not fully explain the gene tree discordance. Instead, horizontal gene flow has been important in both the deep and more recent history of Salvia. Our results provide a robust species tree of Salvia across phylogenetic scales and genomes. 
Future comparative analyses in the genus will need to account for the impacts of hybridization/introgression and incomplete lineage sorting in topology and divergence time estimation.


PLoS ONE ◽  
2011 ◽  
Vol 6 (11) ◽  
pp. e27138 ◽  
Author(s):  
Sebastián Duchêne ◽  
Frederick I. Archer ◽  
Julia Vilstrup ◽  
Susana Caballero ◽  
Phillip A. Morin

2020 ◽  
Author(s):  
Michael R. May ◽  
Dori L. Contreras ◽  
Michael A. Sundue ◽  
Nathalie S. Nagalingum ◽  
Cindy V. Looy ◽  
...  

Phylogenetic divergence-time estimation has been revolutionized by two recent developments: 1) total-evidence dating (or “tip-dating”) approaches that allow for the incorporation of fossils as tips in the analysis, with their phylogenetic and temporal relationships to the extant taxa inferred from the data, and 2) the fossilized birth-death (FBD) class of tree models that capture the processes that produce the tree (speciation, extinction, and fossilization), and thus provide a coherent and biologically interpretable tree prior. To explore the behaviour of these methods, we apply them to marattialean ferns, a group that was dominant in Carboniferous landscapes prior to declining to its modest extant diversity of slightly over 100 species. We show that tree models have a dramatic influence on estimates of both divergence times and topological relationships. This influence is driven by the strong, counter-intuitive informativeness of the uniform tree prior and the inherent nonidentifiability of divergence-time models. In contrast to the strong influence of the tree models, we find minor effects of differing the morphological transition model or the morphological clock model. We compare the performance of a large pool of candidate models using a combination of posterior-predictive simulation and Bayes factors. Notably, an FBD model with epoch-specific speciation and extinction rates was strongly favored by Bayes factors. Our best-fitting model infers stem and crown divergences for the Marattiales in the Middle Devonian and Upper Cretaceous, respectively, with elevated speciation rates in the Mississippian and elevated extinction rates in the Cisuralian leading to a peak diversity of ∼2800 species at the end of the Carboniferous, representing the heyday of the Psaroniaceae. 
This peak is followed by the rapid decline and ultimate extinction of the Psaroniaceae, with their descendants, the Marattiaceae, persisting at approximately stable levels of diversity until the present. This general diversification pattern appears to be insensitive to potential biases in the fossil record; despite the preponderance of available fossils being from Pennsylvanian coal balls, incorporating fossilization-rate variation does not improve model fit. In addition, by incorporating temporal data directly within the model and allowing for the inference of the phylogenetic position of the fossils, our study makes the surprising inference that the clade of extant Marattiales is relatively young, younger than any of the fossils historically thought to be congeneric with extant species. This result is a dramatic demonstration of the dangers of node-based approaches to divergence-time estimation, where fossils are assigned to particular clades a priori (earlier node-based studies that constrained the minimum ages of extant genera based on these fossils resulted in much older age estimates than in our study), and of the utility of explicit models of morphological evolution and lineage diversification.


2019 ◽  
Vol 69 (4) ◽  
pp. 660-670 ◽  
Author(s):  
Tom Carruthers ◽  
Michael J Sanderson ◽  
Robert W Scotland

Rate variation adds considerable complexity to divergence time estimation in molecular phylogenies. Here, we evaluate the impact of lineage-specific rates—which we define as among-branch-rate-variation that acts consistently across the entire genome. We compare its impact to residual rates—defined as among-branch-rate-variation that shows a different pattern of rate variation at each sampled locus, and gene-specific rates—defined as variation in the average rate across all branches at each sampled locus. We show that lineage-specific rates lead to erroneous divergence time estimates, regardless of how many loci are sampled. Further, we show that stronger lineage-specific rates lead to increasing error. This contrasts with residual rates and gene-specific rates, where sampling more loci significantly reduces error. If divergence times are inferred in a Bayesian framework, we highlight that error caused by lineage-specific rates significantly reduces the probability that the 95% highest posterior density includes the correct value, and leads to sensitivity to the prior. Use of a more complex rate prior—which has recently been proposed to model rate variation more accurately—does not affect these conclusions. Finally, we show that the scale of lineage-specific rates used in our simulation experiments is comparable to that of an empirical data set for the angiosperm genus Ipomoea. Taken together, our findings demonstrate that lineage-specific rates cause error in divergence time estimates, and that this error is not overcome by analyzing genomic scale multilocus data sets. [Divergence time estimation; error; rate variation.]
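The coverage criterion mentioned in this abstract (whether the 95% highest posterior density includes the correct value) can be illustrated with a short sketch. It assumes a roughly unimodal posterior summarized by MCMC samples of a node age, and takes the HPD as the shortest window containing the requested share of the sorted samples. The function names `hpd_interval` and `covers` are hypothetical and are not taken from the paper or from any dating program.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Shortest interval containing `mass` of the samples (unimodal case)."""
    xs = sorted(samples)
    n = len(xs)
    # Number of samples the interval must contain; the small epsilon
    # guards against floating-point noise in mass * n.
    k = max(1, math.ceil(mass * n - 1e-9))
    # Slide a window of k consecutive sorted samples and keep the narrowest.
    best = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[best], xs[best + k - 1]


def covers(interval, true_value):
    """True if the interval includes the known (e.g. simulated) value."""
    low, high = interval
    return low <= true_value <= high
```

In a simulation study, `covers` would be evaluated once per replicate against the simulated node age, and the share of replicates returning `True` gives the empirical coverage.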


Zootaxa ◽  
2017 ◽  
Vol 4250 (6) ◽  
pp. 577 ◽  
Author(s):  
MICHAEL J. GHEDOTTI ◽  
MATTHEW P. DAVIS

The fossil species †Fundulus detillae, †F. lariversi, and †F. nevadensis from localities in the western United States are represented by well-preserved material with date estimations. We combined morphological data for these fossil taxa with morphological and DNA-sequence data to conduct a phylogenetic analysis and a tip-based divergence-time estimation for the family Fundulidae. The resultant phylogeny is largely concordant with the prior total-evidence phylogeny. The fossil species do not form a monophyletic group, and do not represent a discrete western radiation of Fundulus as previously proposed. The genus Fundulus diverged into subgeneric clades likely in the Eocene or Oligocene (mean age 34.6 mya, 53–23 mya), and all subgeneric and most species-group clades had evolved by the middle Miocene. †Fundulus lariversi is a member of subgenus Fundulus in which all extant species are found only in eastern North America, demonstrating that fundulids had a complicated biogeographic history. We confirmed †Fundulus detillae as a member of the subgenus Plancterus. †F. nevadensis is not classified in a subgenus but likely is related to the subgenera Plancterus and Wileyichthys.


2017 ◽  
Author(s):  
Stephen A. Smith ◽  
Joseph W. Brown ◽  
Joseph F. Walker

Phylogenomic datasets have been successfully used to address questions involving evolutionary relationships, patterns of genome structure, signatures of selection, and gene and genome duplications. However, despite the recent explosion in genomic and transcriptomic data, the utility of these data sources for efficient divergence-time inference remains unexamined. Phylogenomic datasets pose two distinct problems for divergence-time estimation: (i) the volume of data makes inference of the entire dataset intractable, and (ii) the extent of underlying topological and rate heterogeneity across genes makes model mis-specification a real concern. “Gene shopping”, wherein a phylogenomic dataset is winnowed to a set of genes with desirable properties, represents an alternative approach that holds promise in alleviating these issues. We implemented an approach for phylogenomic datasets (available in SortaDate) that filters genes by three criteria: (i) clock-likeness, (ii) reasonable tree length (i.e., discernible information content), and (iii) least topological conflict with a focal species tree (presumed to have already been inferred). Such a winnowing procedure ensures that errors associated with model (both clock and topology) mis-specification are minimized, therefore reducing error in divergence-time estimation. We demonstrated the efficacy of this approach through simulation and applied it to published animal (Aves, Diplopoda, and Hymenoptera) and plant (carnivorous Caryophyllales, broad Caryophyllales, and Vitales) phylogenomic datasets. By quantifying rate heterogeneity across both genes and lineages we found that every empirical dataset examined included genes with clock-like, or nearly clock-like, behavior. Moreover, many datasets had genes that were clock-like, exhibited reasonable evolutionary rates, and were mostly compatible with the species tree. 
We identified overlap in age estimates when analyzing these filtered genes under strict clock and uncorrelated lognormal (UCLN) models. However, this overlap was often due to imprecise estimates from the UCLN model. We find that “gene shopping” can be an efficient approach to divergence-time inference for phylogenomic datasets that may otherwise be characterized by extensive gene tree heterogeneity.
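The three filtering criteria named in this abstract can be sketched as a simple filter-and-rank step. This is an illustrative sketch only, not SortaDate's actual code or interface: the `GeneStats` record, the `shop_genes` function, the tree-length threshold, and the sort priority (species-tree agreement first, then clock-likeness, then tree length) are all assumptions, and the per-gene statistics are presumed to have been computed elsewhere from each gene tree.

```python
from dataclasses import dataclass


@dataclass
class GeneStats:
    # Hypothetical per-gene summary; field names are illustrative.
    name: str
    root_to_tip_var: float       # lower variance = more clock-like
    tree_length: float           # total branch length of the gene tree
    bipartition_support: float   # share of species-tree bipartitions recovered (0..1)


def shop_genes(genes, min_tree_length=0.1, top_n=2):
    """Drop uninformative genes, then rank the rest by the three criteria."""
    # Criterion (ii): discard genes with too little discernible signal.
    informative = [g for g in genes if g.tree_length >= min_tree_length]
    # Criteria (iii) and (i): least topological conflict first, then
    # most clock-like; longer trees break any remaining ties.
    ranked = sorted(
        informative,
        key=lambda g: (-g.bipartition_support, g.root_to_tip_var, -g.tree_length),
    )
    return ranked[:top_n]
```

Changing the order of the sort key changes which criterion dominates, so the ranking should be checked against the goals of the particular dating analysis.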


2015 ◽  
Vol 282 (1798) ◽  
pp. 20141013 ◽  
Author(s):  
Rachel C. M. Warnock ◽  
James F. Parham ◽  
Walter G. Joyce ◽  
Tyler R. Lyson ◽  
Philip C. J. Donoghue

Calibration is the rate-determining step in every molecular clock analysis and, hence, considerable effort has been expended in the development of approaches to distinguish good from bad calibrations. These can be categorized into a priori evaluation of the intrinsic fossil evidence, and a posteriori evaluation of congruence through cross-validation. We contrasted these competing approaches and explored the impact of different interpretations of the fossil evidence upon Bayesian divergence time estimation. The results demonstrate that a posteriori approaches can lead to the selection of erroneous calibrations. Bayesian posterior estimates are also shown to be extremely sensitive to the probabilistic interpretation of temporal constraints. Furthermore, the effective time priors implemented within an analysis differ for individual calibrations when employed alone and in differing combination with others. This compromises the implicit assumption of all calibration consistency methods, that the impact of an individual calibration is the same when used alone or in unison with others. Thus, the most effective means of establishing the quality of fossil-based calibrations is through a priori evaluation of the intrinsic palaeontological, stratigraphic, geochronological and phylogenetic data. However, effort expended in establishing calibrations will not be rewarded unless they are implemented faithfully in divergence time analyses.


2011 ◽  
Vol 8 (1) ◽  
pp. 156-159 ◽  
Author(s):  
Rachel C. M. Warnock ◽  
Ziheng Yang ◽  
Philip C. J. Donoghue

Calibration is a critical step in every molecular clock analysis but it has been the least considered. Bayesian approaches to divergence time estimation make it possible to incorporate the uncertainty in the degree to which fossil evidence approximates the true time of divergence. We explored the impact of different approaches in expressing this relationship, using arthropod phylogeny as an example for which we established novel calibrations. We demonstrate that the parameters distinguishing calibration densities have a major impact upon the prior and posterior of the divergence times, and it is critically important that users evaluate the joint prior distribution of divergence times used by their dating programmes. We illustrate a procedure for deriving calibration densities in Bayesian divergence dating through the use of soft maximum constraints.


2004 ◽  
Vol 359 (1450) ◽  
pp. 1477-1483 ◽  
Author(s):  
Thomas J. Near ◽  
Michael J. Sanderson

Estimates of species divergence times using DNA sequence data are playing an increasingly important role in studies of evolution, ecology and biogeography. Most work has centred on obtaining appropriate kinds of data and developing optimal estimation procedures, whereas somewhat less attention has focused on the calibration of divergences using fossils. Case studies with multiple fossil calibration points provide important opportunities to examine the divergence time estimation problem in new ways. We discuss two cross–validation procedures that address different aspects of inference in divergence time estimation. ‘Fossil cross–validation’ is a procedure used to identify the impact of different individual calibrations on overall estimation. This can identify fossils that have an exceptionally large error effect and may warrant further scrutiny. ‘Fossil–based model cross–validation’ is an entirely different procedure that uses fossils to identify the optimal model of molecular evolution in the context of rate smoothing or other inference methods. Both procedures were applied to two recent studies: an analysis of monocot angiosperms with eight fossil calibrations and an analysis of placental mammals with nine fossil calibrations. In each case, fossil calibrations could be ranked from most to least influential, and in one of the two studies, the fossils provided decisive evidence about the optimal molecular evolutionary model.
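One way to picture the fossil cross-validation idea is a toy calculation in which each fossil in turn serves as the sole calibration and the ages it implies for the other fossil-dated nodes are compared with their fossil ages. The sketch below rests on strong simplifying assumptions that are not from the paper: a strict clock, known relative node depths, and a sum of squared differences as the score. The function name `cross_validate` and all numbers in the usage note are hypothetical.

```python
def cross_validate(fossil_ages, relative_depths):
    """Score each calibration by how badly it predicts the other fossils.

    fossil_ages: node name -> fossil-based age (e.g. in Myr)
    relative_depths: node name -> node depth in arbitrary clock units
    Returns node name -> sum of squared (estimated - fossil) ages.
    """
    scores = {}
    for cal, cal_age in fossil_ages.items():
        # Under a strict clock, a single calibration fixes the time scale.
        scale = cal_age / relative_depths[cal]
        scores[cal] = sum(
            (relative_depths[node] * scale - age) ** 2
            for node, age in fossil_ages.items()
            if node != cal
        )
    return scores
```

For example, with fossil ages `{"A": 30.0, "B": 20.0, "C": 30.0, "D": 40.0}` and depths `{"A": 1.0, "B": 2.0, "C": 3.0, "D": 4.0}`, calibration A receives by far the largest score, marking it as the fossil that may warrant further scrutiny.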


2022 ◽  
Vol 12 ◽  
Author(s):  
Chia-Lun Hsieh ◽  
Chih-Chieh Yu ◽  
Yu-Lan Huang ◽  
Kuo-Fang Chung

The early-diverging eudicot family Berberidaceae is composed of a morphologically diverse assemblage of disjunctly distributed genera long praised for their great horticultural and medicinal values. However, despite century-long studies, generic delimitation of Berberidaceae remains controversial and its tribal classification has never been formally proposed under a rigorous phylogenetic context. Currently, the number of accepted genera in Berberidaceae ranges consecutively from 13 to 19, depending on whether to define Berberis, Jeffersonia, and Podophyllum broadly, or to segregate these three genera further and recognize Alloberberis, Mahonia, and Moranothamnus, Plagiorhegma, and Dysosma, Diphylleia, and Sinopodophyllum, respectively. To resolve Berberidaceae’s taxonomic disputes, we newly assembled 23 plastomes and, together with 85 plastomes from the GenBank, completed the generic sampling of the family. With 4 problematic and 14 redundant plastome sequences excluded, robust phylogenomic relationships were reconstructed based on 93 plastomes representing all 19 genera of Berberidaceae and three outgroups. Maximum likelihood phylogenomic relationships corroborated with divergence time estimation support the recognition of three subfamilies Berberidoideae, Nandinoideae, and Podophylloideae, with tribes Berberideae and Ranzanieae, Leonticeae and Nandineae, and Podophylleae, Achlydeae, Bongardieae tr. nov., Epimedieae, and Jeffersonieae tr. nov. in the former three subfamilies, respectively. By applying specifically stated criteria, our phylogenomic data also support the classification of 19 genera, recognizing Alloberberis, Mahonia, and Moranothamnus, Plagiorhegma, and Diphylleia, Dysosma, and Sinopodophyllum that are morphologically and evolutionarily distinct from Berberis, Jeffersonia, and Podophyllum, respectively. 
Comparison of plastome structures across Berberidaceae confirms inverted repeat expansion in the tribe Berberideae and reveals substantial length variation in accD gene caused by repeated sequences in Berberidoideae. Comparison of plastome tree with previous studies and nuclear ribosomal DNA (nrDNA) phylogeny also reveals considerable conflicts at different phylogenetic levels, suggesting that incomplete lineage sorting and/or hybridization had occurred throughout the evolutionary history of Berberidaceae and that Alloberberis and Moranothamnus could have resulted from reciprocal hybridization between Berberis and Mahonia in ancient times prior to the radiations of the latter two genera.

