substitution saturation
Recently Published Documents


2022 ◽  
Vol 16 (1) ◽  
pp. 102
Author(s):  
Nur Alifah Ilyana Mohamad Naim ◽  
Nabihah Raihanah Tajul Anuar ◽  
Lyena Watty Zuraine Ahmad ◽  
Roziah Kambol ◽  
Sharifah Aminah Syed Mohamad ◽  
...  

The 16S rRNA gene is a housekeeping genetic marker that is present in almost all bacterial species and is used in bacterial phylogeny and taxonomy studies. In many studies, the 16S rRNA gene is used to identify particular bacterial species. Because it is a less conserved genetic marker, some studies have found it a useful tool for inferring genome-wide similarity levels among closely related prokaryotic organisms. Thus, this study aimed to compare the variation in the 16S rRNA partial region of Burkholderia spp. that infect the panicle of rice from eight different geographical areas. 58 sequences with a total of 688 base pairs (bp) of the 16S rRNA gene in B. glumae and B. gladioli were retrieved from a public database for several countries, namely the United States, Panama, Ecuador, Thailand, China, India, Korea and Malaysia. The sequences were then analysed using MEGAX and validated using ABGD software. The phylogenetic tree confirmed that B. glumae and B. gladioli were the species present in panicle blight of rice. However, Data Analysis in Molecular Biology and Evolution (DAMBE) and Automatic Barcode Gap Discovery (ABGD) software did not detect substitution saturation or divergence, respectively, between B. glumae and B. gladioli based on the 58 sequences of the 16S rRNA partial region. Hence, these results indicate that the 16S rRNA gene is an ineffective genetic marker for differentiating closely related bacterial species within the same genus.


Animals ◽  
2022 ◽  
Vol 12 (2) ◽  
pp. 148
Author(s):  
Watcharaporn Thapana ◽  
Nattakan Ariyaraphong ◽  
Parinya Wongtienchai ◽  
Nararat Laopichienpong ◽  
Worapong Singchat ◽  
...  

Duplicate control regions (CRs) have been observed in the mitochondrial genomes (mitogenomes) of most varanids. Duplicate CRs have evolved through either concerted or independent evolution in vertebrates, but whether an evolutionary pattern exists in varanids remains unknown. Therefore, we conducted this study to analyze the evolutionary patterns and phylogenetic utility of duplicate CRs in 72 individuals of Varanus salvator macromaculatus and other varanids. Sequence analyses and phylogenetic relationships revealed that divergence between orthologous copies from different individuals was lower than that between paralogous copies from the same individual, suggesting independent evolution of the two CRs. Distinct trees and recombination testing derived from CR1 and CR2 suggested that recombination events occurred between CRs during the evolutionary process. A comparison of substitution saturation showed the potential of CR2 as a phylogenetic marker. By contrast, duplicate CRs of the four examined varanids had similar sequences within species, suggesting typical characteristics of concerted evolution. The results provide a better understanding of the molecular evolutionary processes related to the mitogenomes of the varanid lineage.


2022 ◽  
Vol 69 (1) ◽  
pp. 1-18
Author(s):  
Xin-Ran Li

In spite of big data and new techniques, the phylogeny and timing of cockroaches remain in dispute. Apart from sequencing more species, an alternative way to improve the phylogenetic inference and time estimation is to improve the quality of data, calibrations and analytical procedure. This study emphasizes the completeness of data, the reliability of genes (judged via alignment ambiguity and substitution saturation), and the justification for fossil calibrations. Based on published mitochondrial genomes, the Bayesian phylogeny of cockroaches and termites is recovered as: Corydiinae + (((Cryptocercidae + Isoptera) + ((Anaplectidae + Lamproblattidae) + (Tryonicidae + Blattidae))) + (Pseudophyllodromiinae + (Ectobiinae + (Blattellinae + Blaberidae)))). With two fossil calibrations, namely, Valditermes brenanae and Piniblattella yixianensis, this study dates the crown Dictyoptera to early Jurassic, and crown Blattodea to middle Jurassic. Using the ambiguous ‘roachoid’ fossils to calibrate Dictyoptera+sister pushes these times back to Permian and Triassic. This study also shows that appropriate fossil calibrations are rarer than considered in previous studies.


2021 ◽  
Author(s):  
David A Duchêne ◽  
Niklas Mather ◽  
Cara Van Der Wal ◽  
Simon Y W Ho

The historical signal in nucleotide sequences becomes eroded over time by substitutions occurring repeatedly at the same sites. This phenomenon, known as substitution saturation, is recognized as one of the primary obstacles to deep-time phylogenetic inference using genome-scale data sets. We present a new test of substitution saturation and demonstrate its performance in simulated and empirical data. For some of the 36 empirical phylogenomic data sets that we examined, we detect substitution saturation in around 50% of loci. We found that saturation tends to be flagged as problematic in loci with highly discordant phylogenetic signals across sites. Within each data set, the loci with smaller numbers of informative sites are more likely to be flagged as containing problematic levels of saturation. The entropy saturation test proposed here is sensitive to high evolutionary rates relative to the evolutionary timeframe, while also being sensitive to several factors known to mislead phylogenetic inference, including short internal branches relative to external branches, short nucleotide sequences, and tree imbalance. Our study demonstrates that excluding loci with substitution saturation can be an effective means of mitigating the negative impact of multiple substitutions on phylogenetic inferences. [Phylogenetic model performance; phylogenomics; substitution model; substitution saturation; test statistics.]
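
To make the erosion of signal concrete, the following is a minimal sketch of how observed sequence divergence plateaus as repeated substitutions hit the same sites. It assumes the Jukes-Cantor model, which is chosen here only for illustration; it is not the entropy saturation test described in the abstract, and the function names are hypothetical.

```python
import math


def expected_p_distance(d):
    """Expected proportion of differing sites between two sequences
    separated by d substitutions per site, under the Jukes-Cantor model
    (an illustrative assumption, not the abstract's method)."""
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))


def jc_corrected_distance(p):
    """Invert the relation above: estimate substitutions per site from an
    observed proportion of differing sites p. Undefined once p reaches
    the saturation ceiling of 0.75."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75); the sequences are saturated")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    # The observed divergence flattens towards 0.75 while the true
    # distance keeps growing: that flattening is saturation.
    for d in (0.1, 0.5, 1.0, 2.0, 5.0):
        print(f"true distance {d:.1f} -> observed p {expected_p_distance(d):.3f}")
```

Under this assumed model, two sequences five substitutions per site apart look almost the same as two sequences two substitutions per site apart, which is why distance corrections become unreliable for deep divergences.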


2020 ◽  
Author(s):  
Edyth Parker ◽  
Alvin Han ◽  
Lieke Brouwer ◽  
Katja Wolthers ◽  
Kimberley Benschop ◽  
...  

Human parechoviruses (PeV-A) can cause severe sepsis and neurological syndromes in neonates and children and are currently classified into 19 genotypes based on genetic divergence in the VP1 gene. However, the genotyping system has notable limitations, including an arbitrary distance threshold and reliance on insufficiently robust phylogenetic reconstruction approaches, leading to inconsistent genotype definitions. In order to improve the genotyping system, we investigated the molecular epidemiology of human parechoviruses, including the evolutionary history of the different PeV-A lineages as far as is possible. We found that PeV-A lineages suffer from severe substitution saturation in the VP1 gene, which limits the inference of deep evolutionary timescales among the extant PeV-A, and suggest that the degree of evolutionary divergence among current PeV-A lineages has been substantially underestimated, further confounding the current genotyping system. We propose an alternative nomenclature system based on robust, amino-acid level phylogenetic reconstruction and clustering with the PhyCLIP algorithm, which delineates highly divergent currently designated genotypes more informatively. We also describe a dynamic nomenclature framework that combines PhyCLIP’s progressive clustering with phylogenetic placement for genotype assignment.


2019 ◽  
Author(s):  
L. Thibério Rangel ◽  
Gregory P. Fournier

The trimming of fast-evolving sites, often known as “slow-fast” analysis, is broadly used in microbial phylogenetic reconstruction under the assumption that fast-evolving sites do not retain accurate phylogenetic signal due to substitution saturation. Therefore, removing sites that have experienced multiple substitutions would improve the signal-to-noise ratio in phylogenetic analyses, with the remaining slower-evolving sites preserving a more reliable record of evolutionary relationships. Here we show that, contrary to this assumption, even the fastest evolving sites, present in conserved proteins often used in Tree of Life studies, contain reliable and valuable phylogenetic information, and that the trimming of such sites can negatively impact the accuracy of phylogenetic reconstruction. Simulated alignments modeled after ribosomal protein datasets used in Tree of Life studies consistently show that slow-evolving sites are less likely to recover true bipartitions than even the fastest-evolving sites. Furthermore, site-specific substitution rates are positively correlated with the frequency of accurately recovered short-branched bipartitions, as slowly evolving sites are less likely to have experienced substitutions along these intervals. Using published Tree of Life sequence alignment datasets, we additionally show that both slow- and fast-evolving sites contain similarly inconsistent phylogenetic signals, and that, for fast-evolving sites, this inconsistency can be attributed to poor alignment quality. Furthermore, trimming fast sites, slow sites, or both is shown to have substantial impact on phylogenetic reconstruction across multiple evolutionary models. This is perhaps most evident in the resulting placements of Eukarya and Asgardarchaeota groups, which are especially sensitive to the implementation of different trimming schemes.

Significance Statement: It is common practice among comprehensive microbial phylogenetic studies to trim fast-evolving sites from the source alignment in the expectation of increasing the signal-to-noise ratio. Here we show that although fast-evolving sites are more sensitive to parameter misspecifications than mid-rate evolving sites, such sensitivity is comparable to, if not smaller than, what we observe among slow-evolving sites. Through the use of both empirical and simulated datasets we also show that, besides the lack of evidence regarding the noisy nature of fast-evolving sites, such sites are of core importance for the reliable reconstruction of short-branched bipartitions. Such points are exemplified by the variations in the Eukarya+Archaea Tree of Life when subjective alignment trimming strategies are employed.
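
For readers unfamiliar with the trimming step under discussion, here is a minimal sketch of what removing the fastest-evolving sites from an alignment amounts to. It assumes per-site rates have already been estimated by some external tool; the function name, the dictionary-based alignment, and the `fraction` parameter are all hypothetical, and the abstract does not specify how any particular study implements trimming.

```python
def trim_fastest_sites(alignment, site_rates, fraction):
    """Drop the given fraction of alignment columns with the highest rates.

    alignment  : dict mapping taxon name -> aligned sequence (equal lengths)
    site_rates : list of per-site rate estimates, one per column
                 (assumed to be precomputed elsewhere)
    fraction   : share of columns to remove, between 0 and 1

    Returns the trimmed alignment and the indices of the retained columns.
    """
    n_sites = len(site_rates)
    if any(len(seq) != n_sites for seq in alignment.values()):
        raise ValueError("every sequence must have one rate per column")
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")

    n_drop = int(n_sites * fraction)
    # Rank columns from fastest to slowest and mark the fastest for removal.
    ranked = sorted(range(n_sites), key=lambda i: site_rates[i], reverse=True)
    dropped = set(ranked[:n_drop])
    kept = [i for i in range(n_sites) if i not in dropped]

    trimmed = {
        name: "".join(seq[i] for i in kept) for name, seq in alignment.items()
    }
    return trimmed, kept


if __name__ == "__main__":
    toy = {"taxon_a": "ACGT", "taxon_b": "ACGA"}
    rates = [0.1, 2.0, 0.5, 3.0]  # made-up rates for illustration only
    print(trim_fastest_sites(toy, rates, 0.5))
```

The abstract's argument is that this operation can discard informative columns, so the sketch is meant to show the mechanics of trimming rather than to recommend it.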


2019 ◽  
Author(s):  
Alexandra M. Hernandez ◽  
Joseph F. Ryan

Six-state amino acid recoding strategies are commonly applied to combat the effects of compositional heterogeneity and substitution saturation in phylogenetic analyses. While these methods have been endorsed from a theoretical perspective, their performance has never been extensively tested. Here, we test the effectiveness of 6-state recoding approaches by comparing the performance of analyses on recoded and non-recoded datasets that have been simulated under gradients of compositional heterogeneity or saturation. In all of our simulation analyses, non-recoding approaches greatly outperformed 6-state recoding approaches. Our results suggest that 6-state recoding strategies are not effective in the face of high saturation. Further, while recoding strategies do buffer the effects of compositional heterogeneity, the loss of information that accompanies 6-state recoding outweighs its benefits, even in the most compositionally heterogeneous datasets. In addition, we evaluate recoding schemes with 9, 12, 15, and 18 states and show that these all outperform 6-state recoding. Our results have important implications for the more than 70 published papers that have incorporated 6-state recoding, many of which have significant bearing on relationships across the tree of life.
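
As an illustration of what six-state recoding does to a sequence, the sketch below collapses the 20 amino acids into six classes. The abstract does not say which scheme was tested, so the widely used Dayhoff-style grouping is assumed here; the digit labels and the function name are arbitrary choices made for this example.

```python
# Assumed grouping (Dayhoff-style six classes); the labels 0-5 are arbitrary.
DAYHOFF6_GROUPS = {
    "0": "AGPST",
    "1": "DENQ",
    "2": "HKR",
    "3": "ILMV",
    "4": "FWY",
    "5": "C",
}

# Invert the table so each amino acid maps to its class label.
RECODE = {
    aa: label for label, members in DAYHOFF6_GROUPS.items() for aa in members
}


def recode_dayhoff6(sequence):
    """Recode an amino acid sequence into six classes.

    Gaps ('-') are preserved; any other unrecognised character
    (for example 'X') becomes '?'.
    """
    recoded = []
    for aa in sequence.upper():
        if aa == "-":
            recoded.append("-")
        else:
            recoded.append(RECODE.get(aa, "?"))
    return "".join(recoded)


if __name__ == "__main__":
    print(recode_dayhoff6("MKV-C"))
```

Substitutions within a class become invisible after recoding, which is the intended buffer against saturation and also the source of the information loss the abstract reports.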


2019 ◽  
Author(s):  
Mingrui Wang ◽  
Dapeng Wang ◽  
Jun Yu ◽  
Shi Huang

Proteins were first used in the early 1960s to discover the molecular clock dating method and remain in common usage today in phylogenetic inferences based on neutral variations. To avoid substitution saturation, it is necessary to use slow evolving genes. However, it remains unclear whether fixed and standing missense changes in such genes may qualify as neutral. Here, based on the evolutionary rates as inferred from identity scores between orthologs in human and Macaca monkey, we found that the fraction of conservative amino acid mismatches between species was significantly higher in slow evolving proteins. We also examined the single nucleotide polymorphisms (SNPs) by using the 1000 Genomes Project data and found that missense SNPs in slow evolving proteins also had a higher fraction of conservative changes, especially for common SNPs, consistent with more natural selection for SNPs, particularly rare ones, in fast evolving proteins. These results suggest that fixed and standing missense variations in slow evolving proteins are more likely to be neutral and hence better qualified for use in phylogenetic inferences.


2019 ◽  
Author(s):  
Chong He ◽  
Dan Liang ◽  
Peng Zhang

The neutral theory of molecular evolution suggests that the constancy of the molecular clock relies on the neutral condition. Thus, purifying selection, the most common type of natural selection, could influence the constancy of the molecular clock, and the use of genes/sites under purifying selection may produce less reliable molecular dating results. However, in current practices of species-level molecular dating, some researchers prefer to select slowly evolving genes/sites to avoid the potential impact of substitution saturation. These genes/sites are generally under a strong influence of purifying selection. Here, from the data of 23 published mammal genomes, we constructed datasets under various selective constraints. We compared the differences in branch lengths and time estimates among these datasets to investigate the impact of purifying selection on species-level molecular dating. We found that as the selective constraint increases, terminal branches are extended, which introduces biases into the result of species-level molecular dating. This result suggests that in species-level molecular dating, the impact of purifying selection should be taken into consideration, and researchers should be more cautious with the use of genes/sites under purifying selection.

