Extracting phylogenetic signal and accounting for bias in whole-genome data sets supports the Ctenophora as sister to remaining Metazoa

BMC Genomics ◽  
2015 ◽  
Vol 16 (1) ◽  
Author(s):  
Marek L. Borowiec ◽  
Ernest K. Lee ◽  
Joanna C. Chiu ◽  
David C. Plachetzki

Transcriptome-enabled phylogenetic analyses have dramatically improved our understanding of metazoan phylogeny in recent years, although several important questions remain. The branching order near the base of the tree is one such outstanding issue. To address this question we assemble a novel data set comprising 1,080 orthologous loci derived from 36 publicly available genomes and dissect the phylogenetic signal present in each individual partition. The size of this data set allows for a closer look at the potential biases and sources of non-phylogenetic signal. We assessed a range of measures for each data partition, including information content, saturation, rate of evolution, long-branch score, and taxon occupancy, and explored how each of these characteristics impacts phylogeny estimation. We then used these data to prepare a reduced set of partitions that fit an optimal set of criteria and are amenable to the most appropriate and computationally intensive analyses using site-heterogeneous models of sequence evolution. We also employed several strategies to examine the potential for long-branch attraction to bias our inferences. All of our analyses support Ctenophora as the sister lineage to other Metazoa, although support for this relationship varies among analyses. We find no support for the traditional view uniting the ctenophores and Cnidaria (jellies, anemones, corals, and kin). We also examine phylogenetic placement of myriapods (centipedes and millipedes) and find it more sensitive to the type of analysis and data used. Our study provides a workflow for minimizing systematic bias in whole genome-based phylogenetic analyses.
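
One of the per-partition measures named above, taxon occupancy, can be illustrated with a minimal sketch. The data layout, the locus and taxon names, and the 0.75 cutoff are assumptions made for illustration; the abstract does not state the authors' criteria or thresholds, and the other measures (saturation, long-branch score, and so on) are not modelled here.

```python
from typing import Dict, List, Optional

# Hypothetical layout: partition name -> {taxon name -> aligned sequence or None}.
Partition = Dict[str, Optional[str]]


def taxon_occupancy(partition: Partition) -> float:
    """Fraction of taxa that have at least one non-gap, non-missing residue."""
    if not partition:
        return 0.0
    present = sum(
        1 for seq in partition.values() if seq and set(seq) - set("-?")
    )
    return present / len(partition)


def select_partitions(
    partitions: Dict[str, Partition], min_occupancy: float = 0.75
) -> List[str]:
    """Names of partitions whose occupancy meets the (assumed) cutoff."""
    return sorted(
        name
        for name, part in partitions.items()
        if taxon_occupancy(part) >= min_occupancy
    )


# Toy data, not taken from the study.
partitions = {
    "locus_A": {"taxon1": "MKV", "taxon2": "MRV", "taxon3": "MKI", "taxon4": "M-V"},
    "locus_B": {"taxon1": "MKV", "taxon2": None, "taxon3": "---", "taxon4": "MKV"},
}
kept = select_partitions(partitions, min_occupancy=0.75)
print(kept)
```

In this toy input, `locus_B` has sequence data for only two of four taxa and is dropped, while `locus_A` is retained.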


2003 ◽  
Vol 358 (1429) ◽  
pp. 223-230 ◽  
Author(s):  
Jason Raymond ◽  
Olga Zhaxybayeva ◽  
J. Peter Gogarten ◽  
Robert E. Blankenship

Reconstructing the early evolution of photosynthesis has been guided in part by the geological record, but the complexity and great antiquity of these early events require molecular genetic techniques as the primary tools of inference. Recent genome sequencing efforts have made whole genome data available from representatives of each of the five phyla of bacteria with photosynthetic members, allowing extensive phylogenetic comparisons of these organisms. Here, we have undertaken whole genome comparisons using maximum likelihood to compare 527 unique sets of orthologous genes from all five photosynthetic phyla. Substantiating recent whole genome analyses of other prokaryotes, our results indicate that horizontal gene transfer (HGT) has played a significant part in the evolution of these organisms, resulting in genomes with mosaic evolutionary histories. A small plurality phylogenetic signal was observed, which may be a core of remnant genes not subject to HGT, or may result from a propensity for gene exchange between two or more of the photosynthetic organisms compared.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Tao Zhao ◽  
Arthur Zwaenepoel ◽  
Jia-Yu Xue ◽  
Shu-Min Kao ◽  
Zhen Li ◽  
...  

Plant genomes vary greatly in size, organization, and architecture. Such structural differences may be highly relevant for inference of genome evolution dynamics and phylogeny. Indeed, microsynteny (the conservation of local gene content and order) is recognized as a valuable source of phylogenetic information, but its use for the inference of large phylogenies has been limited. Here, by combining synteny network analysis, matrix representation, and maximum likelihood phylogenetic inference, we provide a way to reconstruct phylogenies based on microsynteny information. Both simulations and use of empirical data sets show our method to be accurate, consistent, and widely applicable. As an example, we focus on the analysis of a large-scale whole-genome data set for angiosperms, including more than 120 available high-quality genomes, representing more than 50 different plant families and 30 orders. Our ‘microsynteny-based’ tree is largely congruent with phylogenies proposed based on more traditional sequence alignment-based methods and current phylogenetic classifications but differs for some long-contested and controversial relationships. For instance, our synteny-based tree finds Vitales as early diverging eudicots, Saxifragales within superasterids, and magnoliids as sister to monocots. We discuss how synteny-based phylogenetic inference can complement traditional methods and could provide additional insights into some long-standing controversial phylogenetic relationships.
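
The matrix-representation step can be pictured with a small sketch. It assumes a simple presence/absence coding in which each synteny cluster becomes one binary character per genome; the abstract does not specify the actual encoding, so the cluster identifiers, genome labels, and coding scheme below are hypothetical.

```python
from typing import Dict, List, Set


def synteny_matrix(
    clusters: Dict[str, Set[str]], genomes: List[str]
) -> Dict[str, str]:
    """Encode each genome as a string of 0/1 characters, one per cluster.

    A genome scores 1 for a cluster when it is listed as a member of it.
    Clusters are ordered by identifier so that columns line up across genomes.
    """
    cluster_ids = sorted(clusters)
    return {
        genome: "".join(
            "1" if genome in clusters[cid] else "0" for cid in cluster_ids
        )
        for genome in genomes
    }


# Toy clusters, not taken from the study.
clusters = {
    "c1": {"genomeA", "genomeB"},
    "c2": {"genomeB", "genomeC"},
    "c3": {"genomeA", "genomeB", "genomeC"},
}
matrix = synteny_matrix(clusters, ["genomeA", "genomeB", "genomeC"])
for genome, row in matrix.items():
    print(genome, row)
```

A binary matrix of this kind could then be passed to a maximum likelihood program that supports binary character models; that step is outside the scope of the sketch.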


2009 ◽  
Vol 39 (8) ◽  
pp. 1231-1235 ◽  
Author(s):  
R. Keers ◽  
A. E. Farmer ◽  
K. J. Aitchison

There is significant unmet need for more effective treatments for bipolar disorder. The drug discovery process is becoming prohibitively expensive. Hence, biomarker clues to assist or shortcut this process are now widely sought. Using the publicly available data from the whole genome association study conducted by the Wellcome Trust Case Control Consortium, we sought to identify groups of genetic markers (single nucleotide polymorphisms) in which each marker was independently associated with bipolar disorder, with a less stringent threshold than that set by the original investigators (p ≤ 1 × 10⁻⁴). We identified a group of markers occurring within the CACNA1C gene (encoding the alpha subunit of the calcium channel Cav1.2). We then ascertained that this locus had been previously associated with the disorder in both a smaller and a whole genome study, and that a number of drugs blocking this channel (including verapamil and diltiazem) had been trialled in the treatment of bipolar disorder. The dihydropyridine-based blockers such as nimodipine that bind specifically to Cav1.2 and are more penetrant to the central nervous system have shown some promising early results; however, further trials are indicated. In addition, migraine is commonly seen in affective disorder, and calcium channel antagonists are successfully used in the treatment of migraine. One such agent, flunarizine, is structurally related to other first-generation derivatives of antihistamines such as antipsychotics. This implies that flunarizine could be useful in the treatment of bipolar disorder, and, furthermore, that other currently licensed drugs should be investigated for antagonism of Cav1.2.
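
The marker-grouping idea can be sketched as a threshold filter followed by grouping per gene. The record layout, the placeholder SNP and gene names, the p-values, and the requirement of at least two markers per group are all assumptions for illustration; the sketch also does not test whether markers are statistically independent of one another, which the study required.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# Hypothetical record: (marker id, gene the marker falls in or None, p-value).
Association = Tuple[str, Optional[str], float]


def marker_groups(
    associations: List[Association],
    threshold: float = 1e-4,
    min_markers: int = 2,
) -> Dict[str, List[str]]:
    """Group markers passing the p-value threshold by the gene they fall in."""
    by_gene: Dict[str, List[str]] = defaultdict(list)
    for marker, gene, p_value in associations:
        if gene is not None and p_value <= threshold:
            by_gene[gene].append(marker)
    # Keep only genes with enough passing markers to count as a group.
    return {g: m for g, m in by_gene.items() if len(m) >= min_markers}


# Placeholder values, not results from the study.
associations = [
    ("snp_1", "GENE_X", 5e-5),
    ("snp_2", "GENE_X", 1e-4),
    ("snp_3", "GENE_Y", 2e-5),
    ("snp_4", "GENE_X", 3e-2),
    ("snp_5", None, 1e-6),
]
groups = marker_groups(associations)
print(groups)
```

Here only `GENE_X` forms a group: two of its markers pass the threshold, whereas `GENE_Y` has a single passing marker and `snp_5` lies outside any gene.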


2017 ◽  
Author(s):  
Ross Mounce

In this thesis I attempt to gather together a wide range of cladistic analyses of fossil and extant taxa representing a diverse array of phylogenetic groups. I use this data to quantitatively compare the effect of fossil taxa relative to extant taxa in terms of support for relationships, number of most parsimonious trees (MPTs) and leaf stability. In line with previous studies I find that the effects of fossil taxa are seldom different to extant taxa, although I highlight some interesting exceptions. I also use this data to compare the phylogenetic signal within vertebrate morphological data sets, by choosing to compare cranial data to postcranial data. Comparisons between molecular data and morphological data have been previously well explored, as have signals between different molecular loci. But comparative signal within morphological data sets is much less commonly characterized and certainly not across a wide array of clades. With this analysis I show that there are many studies in which the evidence provided by cranial data appears to be significantly incongruent with the postcranial data, more than one would expect to see just by the effect of chance and noise alone. I devise and implement a modification to a rarely used measure of homoplasy that will hopefully encourage its wider usage. Previously it had some undesirable bias associated with the distribution of missing data in a dataset, but my modification controls for this. I also undertake an in-depth and extensive review of the ILD test, noting it is often misused or reported poorly, even in recent studies. Finally, in attempting to collect data and metadata on a large scale, I uncovered inefficiencies in the research publication system that obstruct re-use of data and scientific progress. I highlight the importance of replication and reproducibility: even simple reanalysis of high profile papers can turn up some very different results.
Data is highly valuable and thus it must be retained and made available for further re-use to maximize the overall return on research investment.


2021 ◽  
Vol 111 (1) ◽  
pp. 8-11
Author(s):  
Remco Stam ◽  
Pierre Gladieux ◽  
Boris A. Vinatzer ◽  
Erica M. Goss ◽  
Neha Potnis ◽  
...  

Population genetics has been a key discipline in phytopathology for many years. The recent rise of cost-effective, high-throughput DNA sequencing technologies allows sequencing of dozens, if not hundreds, of specimens, turning population genetics into population genomics and opening up new, exciting opportunities as described in this Focus Issue. Without the limitations of genetic markers, and with the availability of whole or near whole-genome data, population genomics can give new insights into the biology, evolution and adaptation, and dissemination patterns of plant-associated microbes.


2021 ◽  
Author(s):  
Helgi Hilmarsson ◽  
Arvind S. Kumar ◽  
Richa Rastogi ◽  
Carlos D. Bustamante ◽  
Daniel Mas Montserrat ◽  
...  

As genome-wide association studies and genetic risk prediction models are extended to globally diverse and admixed cohorts, ancestry deconvolution has become an increasingly important tool. Also known as local ancestry inference (LAI), this technique identifies the ancestry of each region of an individual’s genome, thus permitting downstream analyses to account for genetic effects that vary between ancestries. Since existing LAI methods were developed before the rise of massive, whole genome biobanks, they are computationally burdened by these large next generation datasets. Current LAI algorithms also fail to harness the potential of whole genome sequences, falling well short of the accuracy that such high variant densities can enable. Here we introduce Gnomix, a set of algorithms that address each of these points, achieving higher accuracy and swifter computational performance than any existing LAI method, while also enabling portable models that are particularly useful when training data are not shareable due to privacy or other restrictions. We demonstrate Gnomix (and its swift phase correction counterpart Gnofix) on worldwide whole-genome data from both humans and canids and utilize its high resolution accuracy to identify the location of ancient New World haplotypes in the Xoloitzcuintle, dating back over 100 generations. Code is available at https://github.com/AI-sandbox/gnomix.


2016 ◽  
pp. gkw955 ◽  
Author(s):  
Nicolas Dierckxsens ◽  
Patrick Mardulyn ◽  
Guillaume Smits
