Optimal Rates for Phylogenetic Inference and Experimental Design in the Era of Genome-Scale Data Sets

2018, Vol 68 (1), pp. 145-156
Author(s): Alex Dornburg, Zhuo Su, Jeffrey P Townsend

2021
Author(s): David A. Duchêne, Niklas Mather, Cara Van Der Wal, Simon Y.W. Ho

Abstract: The historical signal in nucleotide sequences becomes eroded over time by substitutions occurring repeatedly at the same sites. This phenomenon, known as substitution saturation, is recognized as one of the primary obstacles to deep-time phylogenetic inference using genome-scale data sets. We present a new test of substitution saturation and demonstrate its performance in simulated and empirical data. For some of the 36 empirical phylogenomic data sets that we examined, we detect substitution saturation in around 50% of loci. We found that saturation tends to be flagged as problematic in loci with highly discordant phylogenetic signals across sites. Within each data set, the loci with smaller numbers of informative sites are more likely to be flagged as containing problematic levels of saturation. The entropy saturation test proposed here is sensitive to high evolutionary rates relative to the evolutionary timeframe, while also being sensitive to several factors known to mislead phylogenetic inference, including short internal branches relative to external branches, short nucleotide sequences, and tree imbalance. Our study demonstrates that excluding loci with substitution saturation can be an effective means of mitigating the negative impact of multiple substitutions on phylogenetic inferences.


IMA Fungus, 2020, Vol 11 (1)
Author(s): Felix Grewe, Claudio Ametrano, Todd J. Widhelm, Steven Leavitt, Isabel Distefano, ...

Abstract: Parmeliaceae is the largest family of lichen-forming fungi with a worldwide distribution. We used a target enrichment data set and a qualitative selection method for 250 out of 350 genes to infer the phylogeny of the major clades in this family including 81 taxa, with both subfamilies and all seven major clades previously recognized in the subfamily Parmelioideae. The reduced genome-scale data set was analyzed using concatenated-based Bayesian inference and two different Maximum Likelihood analyses, and a coalescent-based species tree method. The resulting topology was strongly supported with the majority of nodes being fully supported in all three concatenated-based analyses. The two subfamilies and each of the seven major clades in Parmelioideae were strongly supported as monophyletic. In addition, most backbone relationships in the topology were recovered with high nodal support. The genus Parmotrema was found to be polyphyletic and consequently, it is suggested to accept the genus Crespoa to accommodate the species previously placed in Parmotrema subgen. Crespoa. This study demonstrates the power of reduced genome-scale data sets to resolve phylogenetic relationships with high support. Due to lower costs, target enrichment methods provide a promising avenue for phylogenetic studies including larger taxonomic/specimen sampling than whole genome data would allow.


2010, Vol 25 (5), pp. 372-380
Author(s): Michael E. Hughes, John B. Hogenesch, Karl Kornacker

1993, Vol 20 (3), pp. 188-190
Author(s): John F. Walsh

An SAS program permits instructors to provide students with simulated questionnaire-style data for use in courses in statistics and experimental design. The program produces data that approximate the distributions of nominal or categorical variables, rating scale data, and normally distributed scores. The data sets provide a simulated research experience that can be used to incorporate the development of research items as well as their statistical analysis and interpretation.
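
The three kinds of data named in the abstract above (categorical variables, rating-scale items, and normally distributed scores) can be illustrated with a minimal Python sketch. This is not the SAS program described in the article; the function name `simulate_questionnaire`, the field names, the category labels, the weights, and the mean and standard deviation are all assumptions chosen for illustration.

```python
import random


def simulate_questionnaire(n_respondents, seed=0):
    """Return a list of simulated respondent records (hypothetical layout)."""
    rng = random.Random(seed)  # seeded so each data set is reproducible
    rows = []
    for i in range(n_respondents):
        rows.append({
            "id": i + 1,
            # Nominal variable: drawn from assumed category proportions.
            "group": rng.choices(["A", "B", "C"], weights=[0.5, 0.3, 0.2])[0],
            # Rating-scale item: 1-5 scale with an assumed central tendency.
            "rating": rng.choices([1, 2, 3, 4, 5],
                                  weights=[0.1, 0.2, 0.4, 0.2, 0.1])[0],
            # Normally distributed score: assumed mean 100, SD 15.
            "score": round(rng.gauss(100, 15), 1),
        })
    return rows


if __name__ == "__main__":
    for row in simulate_questionnaire(5):
        print(row)
```

Giving each student a different `seed` would produce a distinct but reproducible data set, which is one plausible way to use such a generator in a course.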


2007, Vol 05 (04), pp. 977-986
Author(s): Beatriz Stransky, Junior Barrera, Lucila Ohno-Machado, Sandro J. de Souza

The last 10 years have seen the rise of many technologies that produce an unprecedented amount of genome-scale data from many organisms. Although the research community has been successful in exploring these data, many challenges still persist. One of them is the effective integration of such data sets directly into approaches based on mathematical modeling of biological systems. Applications in cancer are a good example. The bridge between information and modeling in cancer can be achieved by two major types of complementary strategies. First, there is a bottom–up approach, in which data generates information about structure and relationship between components of a given system. In addition, there is a top–down approach, where cybernetic and systems–theoretical knowledge are used to create models that describe mechanisms and dynamics of the system. These approaches can also be linked to yield multi-scale models combining detailed mechanism and wide biological scope. Here we give an overall picture of this field and discuss possible strategies to approach the major challenges ahead.


2002, Vol 12 (10), pp. 1564-1573
Author(s): M. Werner-Washburne

2006, Vol 55 (3), pp. 426-440
Author(s): J. Gordon Burleigh, Amy C. Driskell, Michael J. Sanderson

2016
Author(s): K. Jun Tong, Nathan Lo, Simon Y W Ho

Reconstructing the timescale of the Tree of Life is one of the principal aims of evolutionary biology. This has been greatly aided by the development of the molecular clock, which enables evolutionary timescales to be estimated from genetic data. In recent years, high-throughput sequencing technology has led to an increase in the feasibility and availability of genome-scale data sets. These represent a rich source of biological information, but they also bring a set of analytical challenges. In this review, we provide an overview of phylogenomic dating and describe the challenges associated with analysing genome-scale data. We also report on recent phylogenomic estimates of the evolutionary timescales of mammals, birds, and insects.

