Phylogenomics supported by geometric morphometrics reveals delimitation of sexual species within the polyploid apomictic Ranunculus auricomus complex (Ranunculaceae)

Author(s):  
Kevin Karbstein ◽  
Salvatore Tomasello ◽  
Ladislav Hodac ◽  
Franz G. Dunkel ◽  
Mareike Daubert ◽  
...  

Abstract Species are the basic units of biodiversity and evolution. Nowadays, they are widely considered as ancestor-descendant lineages. Their definition remains a persistent challenge for taxonomists due to lineage evolutionary role and circumscription, i.e., persistence in time and space, ecological niche or a shared phenotype of a lineage. Recognizing and delimiting species is particularly methodologically challenging in fast-evolving, evolutionarily young species complexes often characterized by low genetic divergence, hybrid origin, introgression and incomplete lineage sorting (ILS). Ranunculus auricomus is a large Eurasian apomictic polyploid complex that probably arose from the hybridization of a few sexual progenitor species. However, even the delimitation and relationships of the diploid sexual progenitors have remained unclear, with estimates ranging from two to twelve species. Here, we present an innovative workflow combining phylogenomic methods based on 86,782 parameter-optimized RADseq loci and target enrichment of 663 nuclear genes together with geometric morphometrics to delimit sexual species in this evolutionarily young complex (< 1 Mya). For the first time, we revealed a fully resolved and well-supported maximum likelihood (ML) tree phylogeny congruent with neighbor-net network and STRUCTURE results based on RADseq data. In a few clades, we found evidence of discordant patterns indicated by quartet sampling (QS) and reticulation events in the neighbor-net network, probably caused by introgression and ILS. Together with coalescent-based species delimitation approaches based on target enrichment data, we found five main genetic lineages, with an allopatric distribution in Central and Southern Europe. A concatenated geometric morphometric data set including basal and stem leaves, as well as receptacles, revealed the same five main clusters. We accept those five morphologically differentiated, geographically isolated, genetic main lineages as species: R. cassubicifolius s.l. (incl. R. carpaticola), R. flabellifolius, R. envalirensis s.l. (incl. R. cebennensis), R. marsicus and R. notabilis s.l. (incl. R. austroslovenicus, R. calapius, R. mediocompositus, R. peracris and R. subcarniolicus). Our comprehensive workflow combining phylogenomic methods supported by geometric morphometrics proved to be successful in delimiting closely related sexual taxa and applying an evolutionary species concept, which is also transferable to other evolutionarily young species complexes.

PeerJ ◽  
2019 ◽  
Vol 7 ◽  
pp. e6476 ◽  
Author(s):  
Andrinajoro R. Rakotoarivelo ◽  
Paul O’Donoghue ◽  
Michael W. Bruford ◽  
Yoshan Moodley

Background The bushbuck, Tragelaphus scriptus, is a widespread and ecologically diverse ungulate species complex within the spiral-horned antelopes. This species was recently found to consist of two genetically divergent but monophyletic lineages, which are paraphyletic at mitochondrial (mt)DNA owing to an ancient interspecific hybridization event. The Scriptus lineage (T. s. scriptus) inhabits the north-western half of the African continent while Sylvaticus (T. s. sylvaticus) is found in the south-eastern half. Here we test hypotheses of historical demography and adaptation in bushbuck using a higher-resolution framework, with four nuclear (MGF, PRKCI, SPTBN, and THY) and three new mitochondrial markers (cytochrome b, 12S rRNA, and 16S rRNA). Methods Genealogies were reconstructed for the mitochondrial and nuclear data sets, with the latter dated using fossil calibration points. We also inferred the demographic history of Scriptus and Sylvaticus using coalescent-based methods. To obtain an overview of the origins and ancestral colonisation routes of ancestral bushbuck sequences across geographic space, we conducted discrete Bayesian phylogeographic and statistical dispersal-vicariance analyses on our nuclear DNA data set. Results Both nuclear DNA and mtDNA support previous findings of two genetically divergent Sylvaticus and Scriptus lineages. The three mtDNA loci confirmed 15 of the previously defined haplogroups, including those with convergent phenotypes. However, the nuclear tree showed less phylogenetic resolution at the more derived parts of the genealogy, possibly due to incomplete lineage sorting of the slower evolving nuclear genome. The only exception to this was the montane Menelik’s bushbuck (Sylvaticus) of the Ethiopian highlands, which formed a monophyletic group at three of four nuclear DNA loci. We dated the coalescence of the two lineages to a common ancestor ∼2.54 million years ago. 
Both marker sets revealed similar demographic histories of constant population size over time. We show that the bushbuck likely originated in East Africa, with Scriptus dispersing to colonise suitable habitats west of the African Rift and Sylvaticus radiating from east of the Rift into southern Africa via a series of mainly vicariance events. Discussion Despite lower levels of genetic structure at nuclear loci, we confirmed the independent evolution of the Menelik’s bushbuck relative to the phenotypically similar montane bushbuck in East Africa, adding further weight to previous suggestions of convergent evolution within the bushbuck complex. Perhaps the most surprising result of our analysis was that both Scriptus and Sylvaticus populations remained relatively constant throughout the Pleistocene, which is remarkable given that this was a period of major climatic and tectonic change in Africa, and responsible for driving the evolution of much of the continent’s extant large mammalian diversity.


Author(s):  
Daniel Lukic ◽  
Jonas Eberle ◽  
Jana Thormann ◽  
Carolus Holzschuh ◽  
Dirk Ahrens

DNA-barcoding and DNA-based species delimitation are major tools in DNA taxonomy. Sampling has been a central debate in this context, because the geographical composition of samples affects the accuracy and performance of DNA-barcoding. Performance of complex DNA-based species delimitation is to be tested under simpler conditions in absence of geographic sampling bias. Here, we present an empirical data set sampled from a single locality in a Southeast-Asian biodiversity hotspot (Laos: Phou Pan mountain). We investigate the performance of various species delimitation approaches on a megadiverse assemblage of herbivorous chafer beetles (Coleoptera: Scarabaeidae) to infer whether species delimitation suffers in the same way from exaggerated infraspecific variation despite the lack of geographic genetic variation that led to inconsistencies between entities from DNA-based and morphology-based species inference in previous studies. For this purpose, a 658 bp fragment of the mitochondrial cytochrome c oxidase subunit 1 (cox1) was analysed for a total of 186 individuals of 56 morphospecies. Tree-based and distance-based species delimitation methods were used. All approaches showed a rather limited match ratio (max. 77%) with morphospecies. PTP and TCS predominantly over-split morphospecies, while 3% clustering and ABGD also lumped several species into one entity. ABGD revealed the highest congruence between molecular operational taxonomic units (MOTUs) and morphospecies. Disagreements between morphospecies and MOTUs were discussed in the context of historically acquired geographic genetic differentiation, incomplete lineage sorting, and hybridization. The study once again highlights how important morphology still is in order to correctly interpret the results of molecular species delimitation.
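
The distance-based "3% clustering" mentioned above can be illustrated with a minimal sketch. This is not the authors' pipeline: the use of uncorrected p-distances, single-linkage joining, the function names, and the toy 100 bp sequences are all assumptions made for illustration only.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def threshold_clusters(seqs, threshold=0.03):
    """Single-linkage clustering: link sequences with p-distance <= threshold.

    `seqs` maps a (hypothetical) specimen name to its aligned sequence.
    Returns a sorted list of clusters, each a sorted list of names.
    """
    parent = {name: name for name in seqs}

    def find(x):
        # Follow parent pointers to the cluster representative.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if p_distance(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in seqs:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Toy alignment (invented data): two pairs of near-identical sequences.
toy = {
    "ind1": "A" * 100,
    "ind2": "C" + "A" * 99,
    "ind3": "G" * 10 + "A" * 90,
    "ind4": "G" * 10 + "T" + "A" * 89,
}

print(threshold_clusters(toy))
```

With a single fixed threshold, whether two morphospecies are lumped or one is split depends entirely on where the cut-off falls relative to the observed distances, which is one reason such clusters may disagree with morphology.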


AoB Plants ◽  
2020 ◽  
Vol 12 (3) ◽  
Author(s):  
Nannie L Persson ◽  
Ingrid Toresen ◽  
Heidi Lie Andersen ◽  
Jenny E E Smedmark ◽  
Torsten Eriksson

Abstract The genus Potentilla (Rosaceae) has been subjected to several phylogenetic studies, but resolving its evolutionary history has proven challenging. Previous analyses recovered six, informally named, groups: the Argentea, Ivesioid, Fragarioides, Reptans, Alba and Anserina clades, but the relationships among some of these clades differ between data sets. The Reptans clade, which includes the type species of Potentilla, has been noticed to shift position between plastid and nuclear ribosomal data sets. We studied this incongruence by analysing four low-copy nuclear markers, in addition to chloroplast and nuclear ribosomal data, with a set of Bayesian phylogenetic and Multispecies Coalescent (MSC) analyses. A selective taxon removal strategy demonstrated that the included representatives from the Fragarioides clade, P. dickinsii and P. fragarioides, were the main sources of the instability seen in the trees. The Fragarioides species showed different relationships in each gene tree, and were only supported as a monophyletic group in a single marker when the Reptans clade was excluded from the analysis. The incongruences could not be explained by allopolyploidy, but rather by homoploid hybridization, incomplete lineage sorting or taxon sampling effects. When P. dickinsii and P. fragarioides were removed from the data set, a fully resolved, supported backbone phylogeny of Potentilla was obtained in the MSC analysis. Additionally, indications of autopolyploid origins of the Reptans and Ivesioid clades were discovered in the low-copy gene trees.


Author(s):  
Diego F Morales-Briones ◽  
Gudrun Kadereit ◽  
Delphine T Tefarikis ◽  
Michael J Moore ◽  
Stephen A Smith ◽  
...  

Abstract Gene tree discordance in large genomic data sets can be caused by evolutionary processes such as incomplete lineage sorting and hybridization, as well as model violation, and errors in data processing, orthology inference, and gene tree estimation. Species tree methods that identify and accommodate all sources of conflict are not available, but a combination of multiple approaches can help tease apart alternative sources of conflict. Here, using a phylotranscriptomic analysis in combination with reference genomes, we test a hypothesis of ancient hybridization events within the plant family Amaranthaceae s.l. that was previously supported by morphological, ecological, and Sanger-based molecular data. The data set included seven genomes and 88 transcriptomes, 17 generated for this study. We examined gene-tree discordance using coalescent-based species trees and network inference, gene tree discordance analyses, site pattern tests of introgression, topology tests, synteny analyses, and simulations. We found that a combination of processes might have generated the high levels of gene tree discordance in the backbone of Amaranthaceae s.l. Furthermore, we found evidence that three consecutive short internal branches produce anomalous trees contributing to the discordance. Overall, our results suggest that Amaranthaceae s.l. might be a product of an ancient and rapid lineage diversification, and remains, and probably will remain, unresolved. This work highlights the potential problems of identifiability associated with the sources of gene tree discordance including, in particular, phylogenetic network methods. Our results also demonstrate the importance of thoroughly testing for multiple sources of conflict in phylogenomic analyses, especially in the context of ancient, rapid radiations. We provide several recommendations for exploring conflicting signals in such situations. 
[Amaranthaceae; gene tree discordance; hybridization; incomplete lineage sorting; phylogenomics; species network; species tree; transcriptomics.]


2020 ◽  
Author(s):  
Michael J. Sanderson ◽  
Michelle M. McMahon ◽  
Mike Steel

Abstract Terraces in phylogenetic tree space are sets of trees with identical optimality scores for a given data set, arising from missing data. These were first described for multilocus phylogenetic data sets in the context of maximum parsimony inference and maximum likelihood inference under certain model assumptions. Here we show how the mathematical properties that lead to terraces extend to gene tree - species tree problems in which the gene trees are incomplete. Inference of species trees from either sets of gene family trees subject to duplication and loss, or allele trees subject to incomplete lineage sorting, can exhibit terraces in their solution space. First, we show conditions that lead to a new kind of terrace, which stems from subtree operations that appear in reconciliation problems for incomplete trees. Then we characterize when terraces of both types can occur when the optimality criterion for tree search is based on duplication, loss or deep coalescence scores. Finally, we examine the impact of assumptions about the causes of losses: whether they are due to imperfect sampling or true evolutionary deletion.


2021 ◽  
Author(s):  
Simone Cardoni ◽  
Roberta Piredda ◽  
Thomas Denk ◽  
Guido W. Grimm ◽  
Aristotelis C. Papageorgiou ◽  
...  

Standard models of speciation assume strictly dichotomous genealogies in which a species, the ancestor, is replaced by two offspring species. The reality is more complex: plant species can evolve from other species via isolation when genetic drift exceeds gene flow; lineage mixing can give rise to new species (hybrid taxa such as nothospecies and allopolyploids). The multi-copy, potentially multi-locus 5S rDNA is one of few gene regions conserving signal from dichotomous and reticulate evolutionary processes down to the level of intra-genomic recombination. Here, we provide the first high-throughput sequencing (HTS) 5S intergenic spacer (5S-IGS) data for a lineage of wind-pollinated subtropical to temperate trees, the Fagus crenata – F. sylvatica s.l. lineage, and its distant relative F. japonica. The observed 4,963 unique 5S-IGS variants reflect a long history of repeated incomplete lineage sorting and lineage mixing since the early Cenozoic of two or more paralogous-homoeologous 5S rDNA lineages. Extant species of Fagus are genetic mosaics and, at least in part, of hybrid origin.


2017 ◽  
Author(s):  
Graham Jones

Abstract This paper focuses on the problem of estimating a species tree from multilocus data in the presence of incomplete lineage sorting and migration. We develop a mathematical model similar to IMa2 (Hey 2010) for the relevant evolutionary processes which allows both the population size parameters and the migration rates between pairs of species tree branches to be integrated out. We then describe a BEAST2 package, DENIM, which is based on this model and which uses an approximation to sample from the posterior. The approximation is based on the assumption that migrations are rare, and it only samples from certain regions of the posterior which seem likely given this assumption. The method breaks down if there is a lot of migration. Using simulations, Leaché et al. (2014) showed that migration causes problems for species tree inference using the multispecies coalescent when migration is present but ignored. We re-analyze these simulated data to explore DENIM’s performance, and demonstrate substantial improvements over *BEAST. We also re-analyze an empirical data set. [isolation-with-migration; incomplete lineage sorting; multispecies coalescent; species tree; phylogenetic analysis; Bayesian; Markov chain Monte Carlo]


2018 ◽  
Author(s):  
Andrinajoro R Rakotoarivelo ◽  
Yoshan Moodley

Background. The bushbuck, Tragelaphus scriptus, is the most widespread and ecologically diverse ungulate species complex within the spiral-horned antelopes, occurring in approximately 73% of the total land area of sub-Saharan Africa. This species was found to consist of two genetically divergent lineages based on the mitochondrial (mt)DNA control region. One lineage inhabited the north-western half of the African continent (T. scriptus) while the other lineage (T. sylvaticus) was found in the south-eastern half. The complex was also found to comprise an unprecedented example of 23 phylogenetically distinct groups (‘ecotypes’), with montane and desert phenotypes potentially resulting from convergent evolution. The current study aims to test hypotheses regarding historical demography and adaptation of bushbuck using a higher-resolution framework, with faster evolving nuclear markers (MGF, PRKCI, SPTBN, and THY) as well as three further mitochondrial markers (cytochrome b, 12S rRNA, and 16S rRNA). Methods. Genealogies were reconstructed for the nuclear and mitochondrial data sets and for each gene independently to test the non-monophyly of the bushbuck complex in a multilocus framework. In addition, we reconstructed the phylogeographic history of the bushbuck complex using a Bayesian discrete phylogeographic approach on our nucDNA data set to investigate its geographic diffusion and ancestral sequence location. Results. We uncovered two evolutionarily divergent and geographically restricted lineages (Sylvaticus and Scriptus) of bushbuck using phylogenetics. Molecular dating indicates that these lineages last shared a common ancestor ∼2.54 million years ago. Summary statistics and analysis of the frequency distributions of DNA polymorphisms provide no support for population expansion. Both BSPs and EBSPs indicate that the Scriptus and Sylvaticus lineages have remained relatively stable during the last 225-450 kya. Discussion. Both nucDNA and mtDNA support previous findings of two genetically divergent Sylvaticus and Scriptus lineages, despite them coming into secondary contact in several geographic regions. The three mtDNA loci confirmed 15 of the previously defined ecotypes, including those with convergent phenotypes. However, the nuclear tree showed less phylogenetic resolution at the more derived parts of the genealogy, possibly due to incomplete lineage sorting of the slower evolving nuclear genome. The only exception to this was the montane ecotype meneliki of the Ethiopian highlands, which formed a monophyletic group at three of the four nucDNA loci. The independent evolution of this group relative to phenotypically similar montane ecotypes in Africa confirms previous suggestions of convergence within the bushbuck complex.


2019 ◽  
Author(s):  
Zhen Cao ◽  
Xinhao Liu ◽  
Huw A. Ogilvie ◽  
Zhi Yan ◽  
Luay Nakhleh

Abstract Phylogenetic networks extend trees to enable simultaneous modeling of both vertical and horizontal evolutionary processes. PhyloNet is a software package that has been under constant development for over 10 years and includes a wide array of functionalities for inferring and analyzing phylogenetic networks. These functionalities differ in terms of the input data they require, the criteria and models they employ, and the types of information they allow one to infer about the networks beyond their topologies. Furthermore, PhyloNet includes functionalities for simulating synthetic data on phylogenetic networks, quantifying the topological differences between phylogenetic networks, and evaluating evolutionary hypotheses given in the form of phylogenetic networks. In this paper, we use a simulated data set to illustrate the use of several of PhyloNet’s functionalities and make recommendations on how to analyze data sets and interpret the results when using these functionalities. All inference methods that we illustrate are incomplete lineage sorting (ILS) aware; that is, they account for the potential of ILS in the data while inferring the phylogenetic network. While the models do not include gene duplication and loss, we discuss how the methods can be used to analyze data in the presence of polyploidy. The concept of species is irrelevant for the computational analyses enabled by PhyloNet in that species-individuals mappings are user-defined. Consequently, none of the functionalities in PhyloNet deals with the task of species delimitation. In this sense, the data being analyzed could come from different individuals within a single species, in which case population structure along with potential gene flow is inferred (assuming the data has sufficient signal), or from different individuals sampled from different species, in which case the species phylogeny is being inferred.


PeerJ ◽  
2021 ◽  
Vol 9 ◽  
pp. e11843
Author(s):  
Carlos Prieto ◽  
Christophe Faynel ◽  
Robert Robbins ◽  
Axel Hausmann

Background With about 1,000 species in the Neotropics, the Eumaeini (Theclinae) are one of the most diverse butterfly tribes. Correct morphology-based identifications are challenging in many genera due to relatively small interspecific differences in wing patterns. Geographic infraspecific variation is sometimes more substantial than variation between species. In this paper we present a large DNA barcode dataset of South American Lycaenidae. We analyze how well DNA barcode BINs match morphologically delimited species. Methods We compare morphology-based species identifications with the clustering of molecular operational taxonomic units (MOTUs) delimited by the RESL algorithm in BOLD, which assigns Barcode Index Numbers (BINs). We examine intra- and interspecific divergences for genera represented by at least four morphospecies. We discuss the existence of local barcode gaps in a genus by genus analysis. We also note differences in the percentage of species with barcode gaps in groups of lowland and high mountain genera. Results We identified 2,213 specimens and obtained 1,839 sequences of 512 species in 90 genera. Overall, the mean intraspecific divergence value of COI sequences was 1.20%, while the mean interspecific divergence between nearest congeneric neighbors was 4.89%, demonstrating the presence of a barcode gap. However, the gap seemed to disappear from the entire set when comparing the maximum intraspecific distance (8.40%) with the minimum interspecific distance (0.40%). Clear barcode gaps are present in many genera but absent in others. From the set of specimens that yielded COI fragment lengths of at least 650 bp, 75% of the a priori morphology-based identifications were unambiguously assigned to a single Barcode Index Number (BIN). However, after a taxonomic a posteriori review, the percentage of matched identifications rose to 85%. BIN splitting was observed for 17% of the species and BIN sharing for 9%. We found that genera that contain primarily lowland species show higher percentages of local barcode gaps and congruence between BINs and morphology than genera that contain exclusively high montane species. The divergence values to the nearest neighbors were significantly lower in high Andean species while the intraspecific divergence values were significantly lower in the lowland species. These results raise questions regarding the causes of the observed low inter- and high intraspecific genetic variation. We discuss incomplete lineage sorting and hybridization as the most likely causes of this phenomenon, as the montane species concerned are relatively young and hybridization is probable. The release of our data set represents an essential baseline for a reference library for biological assessment studies of butterflies in megadiverse countries using modern high-throughput technologies and highlights the necessity of taxonomic revisions for various genera combining both molecular and morphological data.
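
The contrast drawn above, a barcode gap that is visible on mean divergences but vanishes when the maximum intraspecific distance is compared with the minimum interspecific distance, can be sketched as follows. This is an illustration under stated assumptions, not the BOLD/RESL procedure: it uses uncorrected p-distances, pools all interspecific pairs rather than nearest congeneric neighbors, and the function names and toy sequences are invented.

```python
from itertools import combinations
from statistics import mean


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def barcode_gap_summary(records):
    """Summarise intra- vs interspecific distances.

    `records` is a list of (species_label, aligned_sequence) pairs.
    Assumes at least one within-species and one between-species pair.
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    return {
        "mean_intra": mean(intra),
        "max_intra": max(intra),
        "mean_inter": mean(inter),
        "min_inter": min(inter),
        # Gap judged on averages (the lenient criterion).
        "gap_on_means": mean(intra) < mean(inter),
        # Gap judged on extremes (the strict criterion).
        "gap_on_extremes": max(intra) < min(inter),
    }


# Toy data (invented): species A contains one divergent individual
# that sits closer to species B than to its own conspecifics.
toy_records = [
    ("A", "A" * 20),
    ("A", "A" * 20),
    ("A", "C" * 4 + "A" * 16),
    ("B", "C" * 5 + "A" * 15),
    ("B", "C" * 5 + "A" * 15),
]

summary = barcode_gap_summary(toy_records)
print(summary)
```

In this toy case the gap holds on the means but fails on the extremes, because a single divergent individual is enough to make the largest within-species distance exceed the smallest between-species distance.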

