Comparative analyses of Mikania (Asteraceae: Eupatorieae) plastomes and impact of data partitioning and inference methods on phylogenetic relationships

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Verônica A. Thode ◽  
Caetano T. Oliveira ◽  
Benoît Loeuille ◽  
Carolina M. Siniscalchi ◽  
José R. Pirani

Abstract We assembled new plastomes of 19 species of Mikania and of Ageratina fastigiata, Litothamnus nitidus, and Stevia collina, all belonging to tribe Eupatorieae (Asteraceae). We analyzed the structure and content of the assembled plastomes and used the newly generated sequences to infer phylogenetic relationships and study the effects of different data partitions and inference methods on the topologies. Most phylogenetic studies with plastomes ignore that processes like recombination and biparental inheritance can occur in this organelle, using the whole genome as a single locus. Our study sought to compare this approach with multispecies coalescent methods that assume that different parts of the genome evolve at different rates. We found that the overall gene content, structure, and orientation are highly conserved in all plastomes of the studied species. As observed in other Asteraceae, the 22 plastomes assembled here contain two nested inversions in the LSC region. The plastomes show similar length and the same gene content. The two most variable regions within Mikania are rpl32-ndhF and rpl16-rps3, while the three genes with the highest percentage of variable sites are ycf1, rpoA, and psbT. We generated six phylogenetic trees using concatenated maximum likelihood and multispecies coalescent methods and three data partitions: coding sequences, non-coding sequences, and both combined. All trees strongly support that the sampled Mikania species form a monophyletic group, which is further subdivided into three clades. The internal relationships within each clade are sensitive to the data partitioning and inference methods employed. The trees resulting from concatenated analysis are more similar to each other than to the corresponding tree generated with the same data partition but a different method. The multispecies coalescent analyses indicate a high level of incongruence between species and gene trees. The lack of resolution and congruence among trees can be explained by the sparse sampling (~0.45% of the currently accepted species) and by the low number of informative characters present in the sequences. Our study sheds light on the impact of data partitioning and inference methods on phylogenetic resolution and brings relevant information for the study of Mikania diversity and evolution, as well as for the Asteraceae family as a whole.

Author(s):  
Jaruloj Chongstitvatana ◽  
Methus Bhirakit

Similarity join is necessary for many applications, such as text search and data preparation. Measuring the similarity between two strings is expensive because inexact matches are allowed and strings in databases are long. To reduce the cost of similarity join, the filter-and-verify approach reduces the number of string pairs that require the computation of the similarity function. Prefix filtering is a filter-and-verify method that filters out dissimilar strings by examining only their prefixes. The effectiveness of prefix filtering depends on the length of the examined prefix. An adaptive method is proposed to find a suitable prefix length for filtering. Based on this concept, we propose to divide a dataset into partitions, and a suitable prefix length is determined for each partition. It also allows similarity join to run in parallel for each data partition. In our experiments, the proposed method achieves higher performance because the number of candidates is reduced and the program can execute in parallel. Moreover, the performance of the proposed method depends on the number of data partitions. If the data partition is too small, the chosen prefix length for each partition may not be optimal.
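The filter-and-verify idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' adaptive, partitioned method: it assumes whitespace tokenisation, Jaccard similarity, a threshold in (0, 1], and the standard fixed prefix length |x| - ceil(t|x|) + 1. The function names `jaccard` and `prefix_filter_join` are hypothetical.

```python
from collections import Counter, defaultdict
from math import ceil


def jaccard(a, b):
    """Jaccard similarity of two token sets."""
    return len(a & b) / len(a | b) if (a or b) else 1.0


def prefix_filter_join(records, threshold):
    """Return index pairs (i, j), i < j, with Jaccard similarity >= threshold.

    Assumes 0 < threshold <= 1 and non-empty records (illustrative only).
    """
    token_sets = [set(r.split()) for r in records]
    # Global token order: rare tokens first, so prefixes are selective.
    freq = Counter(tok for s in token_sets for tok in s)
    ordered = [sorted(s, key=lambda t: (freq[t], t)) for s in token_sets]

    index = defaultdict(list)  # prefix token -> ids of records seen so far
    candidates = set()
    for i, toks in enumerate(ordered):
        # Standard Jaccard prefix length; the epsilon guards against
        # floating-point round-up that would make the prefix too short.
        prefix_len = len(toks) - ceil(threshold * len(toks) - 1e-9) + 1
        for tok in toks[:prefix_len]:
            for j in index[tok]:
                candidates.add((j, i))  # filter step: shared prefix token
            index[tok].append(i)

    # Verify step: compute the real similarity only for the candidates.
    return sorted(p for p in candidates
                  if jaccard(token_sets[p[0]], token_sets[p[1]]) >= threshold)
```

Pairs that share no prefix token are never passed to `jaccard`, which is where the saving comes from. A per-partition variant would run the same loop separately on each partition with its own prefix length.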


2017 ◽  
Author(s):  
Guillaume Bernard ◽  
Paul Greenfield ◽  
Mark A. Ragan ◽  
Cheong Xin Chan

Abstract Alignment-free (AF) methods have recently been adopted to infer phylogenetic trees. However, the evolutionary relationships among microbes, impacted by common phenomena such as lateral genetic transfer and rearrangement, cannot be adequately captured in a strictly tree-like structure. Bacterial and archaeal genomes consist of highly conserved regions, e.g. ribosomal RNA genes (commonly used as phylogenetic markers), more-variable regions and extrachromosomal elements, i.e. plasmids (that contain genes critical under a selective condition, e.g. antibiotic resistance). The impact of these elements on genome-scale inference of microbial phylogeny remains little known. Here, using an AF approach, we inferred phylogenomic networks of microbial life based on 2785 completely sequenced bacterial and archaeal genomes, and systematically assessed the impact of ribosomal RNA genes and plasmid sequences in this network. Our results indicate that k-mer similarity can correlate with taxonomic rank of microbes. Using a relational database approach, we linked the implicated k-mers to annotated genomic regions (thus functions), and defined core functions in specific phyletic groups and genera. We found that, in most phyla, highly conserved functions are often related to Amino acid metabolism and transport, and Energy production and conversion. Our findings indicate that AF phylogenomics can be used to infer reticulate relationships in a scalable manner and provide a new perspective on microbial biology and evolution.
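As a rough illustration of what k-mer similarity between two sequences can mean, the sketch below compares the sets of k-mers of two sequences with a Jaccard index. This is an assumption chosen for simplicity: the abstract does not state which AF statistic the authors used, and the names `kmers` and `kmer_jaccard` are hypothetical.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings (k-mers) of seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_jaccard(seq_a, seq_b, k=4):
    """Jaccard index of the two k-mer sets; 0.0 if neither has any k-mer."""
    a, b = kmers(seq_a, k), kmers(seq_b, k)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)
```

Because no alignment is computed, the cost grows with sequence length rather than with the product of the lengths, which is one reason such measures scale to thousands of genomes.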


Author(s):  
Hesam Montazeri ◽  
Susan Little ◽  
Mozhgan Mozaffarilegha ◽  
Niko Beerenwinkel ◽  
Victor DeGruttola

Abstract Genetic sequence data of pathogens are increasingly used to investigate transmission dynamics in both endemic diseases and disease outbreaks. Such research can aid in the development of appropriate interventions and in the design of studies to evaluate them. Several computational methods have been proposed to infer transmission chains from sequence data; however, existing methods generally do not reliably reconstruct transmission trees because genetic sequence data or inferred phylogenetic trees from such data contain insufficient information for accurate estimation of transmission chains. Here, we show by simulation studies that incorporating infection times, even when they are uncertain, can greatly improve the accuracy of reconstruction of transmission trees. To achieve this improvement, we propose a Bayesian inference method using Markov chain Monte Carlo that directly draws samples from the space of transmission trees under the assumption of complete sampling of the outbreak. The likelihood of each transmission tree is computed by a phylogenetic model by treating its internal nodes as transmission events. By a simulation study, we demonstrate that the accuracy of the reconstructed transmission trees depends mainly on the amount of information available on times of infection; we show the superiority of the proposed method to two alternative approaches when infection times are known up to specified degrees of certainty. In addition, we illustrate the use of a multiple imputation framework to study features of epidemic dynamics, such as the relationship between characteristics of nodes and average number of outbound edges or inbound edges, signifying possible transmission events from and to nodes. We apply the proposed method to a transmission cluster in San Diego and to a dataset from the 2014 Sierra Leone Ebola virus outbreak and investigate the impact of biological, behavioral, and demographic factors.


Insects ◽  
2021 ◽  
Vol 12 (8) ◽  
pp. 668
Author(s):  
Tinghao Yu ◽  
Yalin Zhang

More studies are using mitochondrial genomes of insects to explore the sequence variability, evolutionary traits, monophyly of groups and phylogenetic relationships. Controversies remain on the classification of the Mileewinae, and the phylogenetic relationships between Mileewinae and other subfamilies remain ambiguous. In this study, we present two newly completed mitogenomes of Mileewinae (Mileewa rufivena Cai and Kuoh 1997 and Ujna puerana Yang and Meng 2010) and conduct comparative mitogenomic analyses based on several different factors. These species have quite similar features, including their nucleotide content, codon usage of protein genes and the secondary structure of tRNA. Gene arrangement is identical and conserved, the same as the putative ancestral pattern of insects. All protein-coding genes of U. puerana began with the start codon ATN, while 5 Mileewa species had the abnormal initiation codon TTG in ND5 and ATP8. Moreover, M. rufivena had an intergenic spacer of 17 bp that could not be found in other mileewine species. Phylogenetic analysis based on three datasets (PCG123, PCG12 and AA) with two methods (maximum likelihood and Bayesian inference) recovered the Mileewinae as a monophyletic group with strong support values. All results in our study indicate that Mileewinae has a closer phylogenetic relationship to Typhlocybinae than to Cicadellinae. Additionally, six species within Mileewini revealed the relationship (U. puerana + (M. ponta + (M. rufivena + M. alara) + (M. albovittata + M. margheritae))) in most of our phylogenetic trees. These results contribute to the study of the taxonomic status and phylogenetic relationships of Mileewinae.


2019 ◽  
Vol 1 (1) ◽  
Author(s):  
D C Blackburn ◽  
G Giribet ◽  
D E Soltis ◽  
E L Stanley

Abstract Although our inventory of Earth’s biodiversity remains incomplete, we still require analyses using the Tree of Life to understand evolutionary and ecological patterns. Because incomplete sampling may bias our inferences, we must evaluate how future additions of newly discovered species might impact analyses performed today. We describe an approach that uses taxonomic history and phylogenetic trees to characterize the impact of past species discoveries on phylogenetic knowledge using patterns of branch-length variation, tree shape, and phylogenetic diversity. This provides a framework for assessing the relative completeness of taxonomic knowledge of lineages within a phylogeny. To demonstrate this approach, we use recent large phylogenies for amphibians, reptiles, flowering plants, and invertebrates. Well-known clades exhibit a decline in the mean and range of branch lengths that are added each year as new species are described. With increased taxonomic knowledge over time, deep lineages of well-known clades become known such that most recently described new species are added close to the tips of the tree, reflecting changing tree shape over the course of taxonomic history. The same analyses reveal other clades to be candidates for future discoveries that could dramatically impact our phylogenetic knowledge. Our work reveals that species are often added non-randomly to the phylogeny over multiyear time-scales in a predictable pattern of taxonomic maturation. Our results suggest that we can make informed predictions about how new species will be added across the phylogeny of a given clade, thus providing a framework for accommodating unsampled undescribed species in evolutionary analyses.


Cladistics ◽  
2017 ◽  
Vol 34 (1) ◽  
pp. 57-77 ◽  
Author(s):  
Limin Lu ◽  
Cymon J. Cox ◽  
Sarah Mathews ◽  
Wei Wang ◽  
Jun Wen ◽  
...  

2022 ◽  
Author(s):  
XiaoXu Pang ◽  
Da-Yong Zhang

The species studied in any evolutionary investigation generally constitute a very small proportion of all the species currently existing or that have gone extinct. It is therefore likely that introgression, which is widespread across the tree of life, involves "ghosts," i.e., unsampled, unknown, or extinct lineages. However, the impact of ghost introgression on estimations of species trees has been rarely studied and is thus poorly understood. In this study, we use mathematical analysis and simulations to examine the robustness of species tree methods based on a multispecies coalescent model under gene flow sourcing from an extant or ghost lineage. We found that very low levels of extant or ghost introgression can result in anomalous gene trees (AGTs) on three-taxon rooted trees if accompanied by strong incomplete lineage sorting (ILS). In contrast, even massive introgression, with more than half of the recipient genome descending from the donor lineage, may not necessarily lead to AGTs. In cases involving an ingroup lineage (defined as one that diverged no earlier than the most basal species under investigation) acting as the donor of introgression, the time of root divergence among the investigated species was either underestimated or remained unaffected, but for the cases of outgroup ghost lineages acting as donors, the divergence time was generally overestimated. Under many conditions of ingroup introgression, the stronger the ILS was, the higher was the accuracy of estimating the time of root divergence, although the topology of the species tree is more prone to be biased by the effect of introgression.


2020 ◽  
Author(s):  
Christopher Kay ◽  
Tom A Williams ◽  
Wendy Gibson

Abstract Background: Trypanosomes are single-celled eukaryotic parasites characterised by the unique biology of their mitochondrial DNA (mtDNA). African livestock trypanosomes impose a major burden on agriculture across sub-Saharan Africa, but are poorly understood compared to those that cause sleeping sickness and Chagas disease in humans. Here we explore the potential of trypanosome mtDNA to study the evolutionary history of trypanosomes and the molecular evolution of their mtDNAs. Results: We used long-read sequencing to completely assemble mtDNAs from four previously uncharacterized African trypanosomes, and leveraged these assemblies to scaffold and assemble a further 103 trypanosome mtDNAs from published short-read data. While synteny was largely conserved, there were repeated, independent losses of Complex I genes. Comparison of edited and non-edited genes revealed the impact of RNA editing on nucleotide composition, with non-edited genes approaching the limits of GC loss. African tsetse-transmitted trypanosomes showed high levels of RNA editing compared to other trypanosomes. Whole mtDNA coding regions were used to construct time-resolved phylogenetic trees, revealing deep divergence events among isolates of the pathogens Trypanosoma brucei and T. congolense. Conclusions: Our mtDNA data represent a new resource for experimental and evolutionary analyses of trypanosome phylogeny, molecular evolution and function. Molecular clock analyses yielded a timescale for trypanosome evolution congruent with major biogeographical events in Africa and revealed the recent emergence of Trypanosoma brucei gambiense and T. equiperdum, major human and animal pathogens.


PeerJ ◽  
2020 ◽  
Vol 8 ◽  
pp. e8450 ◽  
Author(s):  
Sunan Huang ◽  
Xuejun Ge ◽  
Asunción Cano ◽  
Betty Gaby Millán Salazar ◽  
Yunfei Deng

The genus Dicliptera (Justicieae, Acanthaceae) consists of approximately 150 species distributed throughout the tropical and subtropical regions of the world. Newly obtained chloroplast genomes (cp genomes) are reported for five species of Dicliptera (D. acuminata, D. peruviana, D. montana, D. ruiziana and D. mucronata) in this study. These cp genomes have circular structures of 150,689–150,811 bp and exhibit quadripartite organizations made up of a large single copy region (LSC, 82,796–82,919 bp), a small single copy region (SSC, 17,084–17,092 bp), and a pair of inverted repeat regions (IRs, 25,401–25,408 bp). Guanine-Cytosine (GC) content makes up 37.9%–38.0% of the total content. The complete cp genomes contain 114 unique genes, including 80 protein-coding genes, 30 transfer RNA (tRNA) genes, and four ribosomal RNA (rRNA) genes. Comparative analyses of nucleotide variability (Pi) reveal the five most variable regions (trnY-GUA-trnE-UUC, trnG-GCC, psbZ-trnG-GCC, petN-psbM, and rps4-trnL-UUA), which may be used as molecular markers in future taxonomic identification and phylogenetic analyses of Dicliptera. A total of 55–58 simple sequence repeats (SSRs) and 229 long repeats were identified in the cp genomes of the five Dicliptera species. Phylogenetic analysis identified a close relationship between D. ruiziana and D. montana, followed by D. acuminata, D. peruviana, and D. mucronata. Evolutionary analysis of orthologous protein-coding genes within the family Acanthaceae revealed only one gene, ycf15, to be under positive selection, which may contribute to future studies of its adaptive evolution. The completed genomes are useful for future research on species identification, phylogenetic relationships, and the adaptive evolution of the Dicliptera species.
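Two of the summary statistics named in this abstract, GC content and nucleotide variability (Pi), can be sketched as follows. This is a simplified illustration rather than the authors' pipeline: it assumes sequences that are already aligned to equal length, treats every character (including gaps) as an ordinary state, and computes Pi over the whole alignment instead of in sliding windows. The function names are hypothetical.

```python
from itertools import combinations


def gc_content(seq):
    """Fraction of G and C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def nucleotide_diversity(aligned):
    """Pi: mean number of pairwise differences per site.

    `aligned` is a list of at least two equal-length, non-empty sequences.
    """
    if len(aligned) < 2:
        raise ValueError("need at least two sequences")
    length = len(aligned[0])
    if length == 0 or any(len(s) != length for s in aligned):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(combinations(aligned, 2))
    # Count mismatching sites for every pair of sequences.
    total_diffs = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs)
    return total_diffs / (len(pairs) * length)
```

Applying `nucleotide_diversity` to successive windows of an alignment and ranking the windows by their value is one way to locate highly variable regions of the kind listed above.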


2012 ◽  
Vol 3 (1) ◽  
pp. 55-73 ◽  
Author(s):  
Ismail Ali ◽  
Sandro Moiron ◽  
Martin Fleury ◽  
Mohammed Ghanbari

This paper examines the impact of data partitioning form on wireless network access control and proposes a selective dropping scheme based on dropping the partition carrying intra-coded macroblocks. Data partitioning is an error resiliency technique that allows unequal error protection for transmission over ‘lossy’ channels. Including a per-picture, cyclic intra-refresh macroblock line guards against temporal error propagation. The authors show that when congestion occurs, it is possible to gain up to 2 dB in video quality over assigning a stream to a single IEEE 802.11e access category. The scheme is consistently advantageous in indoor and outdoor wireless scenarios over other ways of assigning the partitioned data packets to different access categories. This counter-intuitive scheme for access control purposes reverses the priority usually given to partition-B data packets over that of partition-C.

