Comparative Cladistics: Fossils, Morphological Data Partitions and Lost Branches in the Fossil Tree of Life

2017 ◽  
Author(s):  
Ross Mounce

In this thesis I attempt to gather together a wide range of cladistic analyses of fossil and extant taxa representing a diverse array of phylogenetic groups. I use this data to quantitatively compare the effect of fossil taxa relative to extant taxa in terms of support for relationships, number of most parsimonious trees (MPTs) and leaf stability. In line with previous studies I find that the effects of fossil taxa are seldom different to extant taxa – although I highlight some interesting exceptions. I also use this data to compare the phylogenetic signal within vertebrate morphological data sets, by choosing to compare cranial data to postcranial data. Comparisons between molecular data and morphological data have been previously well explored, as have signals between different molecular loci. But comparative signal within morphological data sets is much less commonly characterized and certainly not across a wide array of clades. With this analysis I show that there are many studies in which the evidence provided by cranial data appears to be significantly incongruent with the postcranial data – more than one would expect to see just by the effect of chance and noise alone. I devise and implement a modification to a rarely used measure of homoplasy that will hopefully encourage its wider usage. Previously it had some undesirable bias associated with the distribution of missing data in a dataset, but my modification controls for this. I also take an in-depth and extensive review of the ILD test, noting it is often misused or reported poorly, even in recent studies. Finally, in attempting to collect data and metadata on a large scale, I uncovered inefficiencies in the research publication system that obstruct re-use of data and scientific progress. I highlight the importance of replication and reproducibility – even simple reanalysis of high profile papers can turn up some very different results.
Data is highly valuable and thus it must be retained and made available for further re-use to maximize the overall return on research investment.

Author(s):  
Nicolás Mongiardino Koch ◽  
Jeffrey R Thompson

Abstract Phylogenomic and paleontological data constitute complementary resources for unraveling the phylogenetic relationships and divergence times of lineages, yet few studies have attempted to fully integrate them. Several unique properties of echinoids (sea urchins) make them especially useful for such synthesizing approaches, including a remarkable fossil record that can be incorporated into explicit phylogenetic hypotheses. We revisit the phylogeny of crown group Echinoidea using a total-evidence dating approach that combines the largest phylogenomic data set for the clade, a large-scale morphological matrix with a dense fossil sampling, and a novel compendium of tip and node age constraints. To this end, we develop a novel method for subsampling phylogenomic data sets that selects loci with high phylogenetic signal, low systematic biases, and enhanced clock-like behavior. Our results demonstrate that combining different data sources increases topological accuracy and helps resolve conflicts between molecular and morphological data. Notably, we present a new hypothesis for the origin of sand dollars, and restructure the relationships between stem and crown echinoids in a way that implies a long stretch of undiscovered evolutionary history of the crown group in the late Paleozoic. Our efforts help bridge the gap between phylogenomics and phylogenetic paleontology, providing a model example of the benefits of combining the two. [Echinoidea; fossils; paleontology; phylogenomics; time calibration; total evidence.]


2021 ◽  
Author(s):  
Andrew J Kavran ◽  
Aaron Clauset

Abstract Background: Large-scale biological data sets are often contaminated by noise, which can impede accurate inferences about underlying processes. Such measurement noise can arise from endogenous biological factors like cell cycle and life history variation, and from exogenous technical factors like sample preparation and instrument variation. Results: We describe a general method for automatically reducing noise in large-scale biological data sets. This method uses an interaction network to identify groups of correlated or anti-correlated measurements that can be combined or “filtered” to better recover an underlying biological signal. Similar to the process of denoising an image, a single network filter may be applied to an entire system, or the system may be first decomposed into distinct modules and a different filter applied to each. Applied to synthetic data with known network structure and signal, network filters accurately reduce noise across a wide range of noise levels and structures. Applied to a machine learning task of predicting changes in human protein expression in healthy and cancerous tissues, network filtering prior to training increases accuracy up to 43% compared to using unfiltered data. Conclusions: Network filters are a general way to denoise biological data and can account for both correlation and anti-correlation between different measurements. Furthermore, we find that partitioning a network prior to filtering can significantly reduce errors in networks with heterogeneous data and correlation patterns, and this approach outperforms existing diffusion-based methods. Our results on proteomics data indicate the broad potential utility of network filters to applications in systems biology.
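As a rough illustration of the idea of combining measurements across an interaction network, the sketch below replaces each node's value with the mean of itself and its network neighbours. This is a deliberately simplified smoothing filter written for this listing, not the authors' published definition; the function name `smooth_filter`, the toy values, and the toy network are all hypothetical, and anti-correlated edges and module-wise filtering are not handled.

```python
from statistics import mean


def smooth_filter(values, neighbors):
    """Replace each node's value with the mean of itself and its neighbours.

    values:    dict mapping node name -> measured value
    neighbors: dict mapping node name -> list of neighbouring node names
    Illustrative only; assumes every edge links positively correlated nodes.
    """
    filtered = {}
    for node, value in values.items():
        group = [value] + [values[n] for n in neighbors.get(node, [])]
        filtered[node] = mean(group)
    return filtered


# Hypothetical toy data: a fully connected triangle (a, b, c) plus an isolated node d.
values = {"a": 1.0, "b": 2.0, "c": 3.0, "d": 10.0}
neighbors = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"], "d": []}

filtered = smooth_filter(values, neighbors)
print(filtered)
```

In this toy case the three connected nodes are pulled to their shared mean, while the isolated node is left unchanged because it has no neighbours to average with.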


Author(s):  
Sacha J. van Albada ◽  
Jari Pronold ◽  
Alexander van Meegen ◽  
Markus Diesmann

Abstract We are entering an age of ‘big’ computational neuroscience, in which neural network models are increasing in size and in numbers of underlying data sets. Consolidating the zoo of models into large-scale models simultaneously consistent with a wide range of data is only possible through the effort of large teams, which can be spread across multiple research institutions. To ensure that computational neuroscientists can build on each other’s work, it is important to make models publicly available as well-documented code. This chapter describes such an open-source model, which relates the connectivity structure of all vision-related cortical areas of the macaque monkey with their resting-state dynamics. We give a brief overview of how to use the executable model specification, which employs NEST as simulation engine, and show its runtime scaling. The solutions found serve as an example for organizing the workflow of future models from the raw experimental data to the visualization of the results, expose the challenges, and give guidance for the construction of an ICT infrastructure for neuroscience.


2021 ◽  
Author(s):  
Theresa A Harbig ◽  
Sabrina Nusrat ◽  
Tali Mazor ◽  
Qianwen Wang ◽  
Alexander Thomson ◽  
...  

Molecular profiling of patient tumors and liquid biopsies over time with next-generation sequencing technologies and new immuno-profile assays is becoming part of standard research and clinical practice. With the wealth of new longitudinal data, there is a critical need for visualizations for cancer researchers to explore and interpret temporal patterns not just in a single patient but across cohorts. To address this need we developed OncoThreads, a tool for the visualization of longitudinal clinical and cancer genomics and other molecular data in patient cohorts. The tool visualizes patient cohorts as temporal heatmaps and Sankey diagrams that support the interactive exploration and ranking of a wide range of clinical and molecular features. This allows analysts to discover temporal patterns in longitudinal data, such as the impact of mutations on response to a treatment, e.g. emergence of resistant clones. We demonstrate the functionality of OncoThreads using a cohort of 23 glioma patients sampled at 2-4 timepoints. OncoThreads is freely available at http://oncothreads.gehlenborglab.org and implemented in JavaScript using the cBioPortal web API as a backend.


2018 ◽  
Vol 2 ◽  
pp. e26539 ◽  
Author(s):  
Paul J. Morris ◽  
James Hanken ◽  
David Lowery ◽  
Bertram Ludäscher ◽  
James Macklin ◽  
...  

As curators of biodiversity data in natural science collections, we are deeply concerned with data quality, but quality is an elusive concept. An effective way to think about data quality is in terms of fitness for use (Veiga 2016). To use data to manage physical collections, the data must be able to accurately answer questions such as what objects are in the collections, where they are, and where they are from. Some research uses aggregate data across collections, which involves exchange of data using standard vocabularies. Some research uses require accurate georeferences, collecting dates, and current identifications. It is well understood that the costs of data capture and data quality improvement increase with increasing time from the original observation. These factors point towards two engineering principles for software that is intended to maintain or enhance data quality: build small modular data quality tests that can be easily assembled in suites to assess the fitness for use of data for some particular need; and produce tools that can be applied by users with a wide range of technical skill levels at different points in the data life cycle. In the Kurator project, we have produced code (e.g. Wieczorek et al. 2017, Morris 2016) which consists of small modules that can be incorporated into data management processes as small libraries that address particular data quality tests. These modules can be combined into customizable data quality scripts, which can be run on single computers or scalable architecture and can be incorporated into other software, run as command line programs, or run as suites of canned workflows through a web interface. Kurator modules can be integrated into early stage data capture applications, run to help prepare data for aggregation by matching it to standard vocabularies, run for quality control or quality assurance on data sets, and used to report on data quality in terms of a fitness-for-use framework (Veiga et al. 2017).
One of our goals is simple tests usable by anyone anywhere.
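The principle of small modular tests assembled into suites might be sketched as follows. This is a hypothetical illustration only and does not use or reproduce the Kurator code; the function names (`check_latitude`, `check_event_date`, `run_suite`) and the sample record are invented here, and the field names are assumed to follow Darwin Core terms.

```python
from datetime import date


def check_latitude(record):
    """Pass if a latitude is present and falls within the valid range."""
    lat = record.get("decimalLatitude")
    return lat is not None and -90 <= lat <= 90


def check_event_date(record):
    """Pass if the collecting date is a real date that is not in the future."""
    d = record.get("eventDate")
    return isinstance(d, date) and d <= date.today()


def run_suite(record, tests):
    """Run each small test on the record and report pass/fail by test name."""
    return {test.__name__: test(record) for test in tests}


# Hypothetical specimen record with an out-of-range latitude.
record = {"decimalLatitude": 95.0, "eventDate": date(1901, 5, 3)}

report = run_suite(record, [check_latitude, check_event_date])
print(report)
```

Because each test is an independent function, a different suite for a different use can be assembled simply by passing a different list of tests to `run_suite`.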


2009 ◽  
Vol 34 (3) ◽  
pp. 580-594 ◽  
Author(s):  
Anthony R. Magee ◽  
Ben-Erik van Wyk ◽  
Patricia M. Tilney ◽  
Stephen R. Downie

Generic circumscriptions and phylogenetic relationships of the Cape genera Capnophyllum, Dasispermum, and Sonderina are explored through parsimony and Bayesian inference analyses of nrDNA ITS and cpDNA rps16 intron sequences, morphology, and combined molecular and morphological data. The relationship of these genera with the North African genera Krubera and Stoibrax is also assessed. Analyses of both molecular data sets place Capnophyllum, Dasispermum, Sonderina, and the only southern African species of Stoibrax (S. capense) within the newly recognized Lefebvrea clade of tribe Tordylieae. Capnophyllum is strongly supported as monophyletic and is distantly related to Krubera. The monotypic genus Dasispermum and Stoibrax capense are embedded within a paraphyletic Sonderina. This complex is distantly related to the North African species of Stoibrax in tribe Apieae, in which the type species, Stoibrax dichotomum, occurs. Consequently, Dasispermum is expanded to include both Sonderina and Stoibrax capense. New combinations are formalized for Dasispermum capense, D. hispidum, D. humile, and D. tenue. An undescribed species from the Tanqua Karoo in South Africa is also closely related to Capnophyllum and the Dasispermum–Sonderina complex. The genus Scaraboides is described herein to accommodate the new species, S. manningii. This monotypic genus shares the dorsally compressed fruit and involute marginal wings with Capnophyllum, but is easily distinguished by its erect branching habit, green leaves, scabrous umbels, and fruit with indistinct median and lateral ribs, additional solitary vittae in each marginal wing, and parallel, closely spaced commissural vittae. Despite the marked fruit similarities with Capnophyllum, analyses of DNA sequence data place Scaraboides closer to the Dasispermum–Sonderina complex, with which it shares the erect habit, green (nonglaucous) leaves, and scabrous umbels.


2017 ◽  
Vol 44 (2) ◽  
pp. 203-229 ◽  
Author(s):  
Javier D Fernández ◽  
Miguel A Martínez-Prieto ◽  
Pablo de la Fuente Redondo ◽  
Claudio Gutiérrez

The publication of semantic web data, commonly represented in Resource Description Framework (RDF), has experienced outstanding growth over the last few years. Data from all fields of knowledge are shared publicly and interconnected in active initiatives such as Linked Open Data. However, despite the increasing availability of applications managing large-scale RDF information such as RDF stores and reasoning tools, little attention has been given to the structural features emerging in real-world RDF data. Our work addresses this issue by proposing specific metrics to characterise RDF data. We specifically focus on revealing the redundancy of each data set, as well as common structural patterns. We evaluate the proposed metrics on several data sets, which cover a wide range of designs and models. Our findings provide a basis for more efficient RDF data structures, indexes and compressors.


Zootaxa ◽  
2004 ◽  
Vol 680 (1) ◽  
pp. 1 ◽  
Author(s):  
ARNE NYGREN

Autolytinae is revised based on available types, and newly collected specimens. Out of 170 nominal species, 18 are considered as incertae sedis, 43 are regarded as junior synonyms, and 25 are referred to as nomina dubia. The relationships of Autolytinae are assessed from 51 morphological characters and 211 states for 76 ingroup-taxa, and 460 molecular characters from mitochondrial 16S rDNA and nuclear 18S rDNA for 31 ingroup-taxa; outgroups include 12 non-autolytine syllid polychaetes. Two analyses are provided, one including morphological data only, and one with combined morphological and molecular data sets. The resulting strict consensus tree from the combined data is chosen for a reclassification. Three main clades are identified: Procerini trib. n., Autolytini Grube, 1850, and Epigamia gen. n. Proceraea Ehlers, 1864 and Myrianida Milne Edwards, 1845 are referred to as nomen protectum, while Scolopendra Slabber, 1781, Podonereis Blainville, 1818, Amytis Savigny, 1822, Polynice Savigny, 1822, and Nereisyllis Blainville, 1828 are considered


Zootaxa ◽  
2007 ◽  
Vol 1423 (1) ◽  
pp. 1-26 ◽  
Author(s):  
JEFFREY H. SKEVINGTON ◽  
CHRISTIAN KEHLMAIER ◽  
GUNILLA STÅHLS

Sequence data from 658 base pairs of mitochondrial cytochrome c oxidase I (cox1) were analysed for 28 described species of Pipunculidae (Diptera) in an effort to test the concept of DNA Barcoding on this family. Two recently revised but distantly related pipunculid lineages with presumed different evolutionary histories were used for the test (Clistoabdominalis Skevington, 2001 and Nephrocerus Zetterstedt, 1838). An effort was made to test the concept using sister taxa and morphologically similar sibling species swarms in these two genera. Morphological species concepts for Clistoabdominalis taxa were either supported by cox1 data or found to be too broad. Most of the discordance could be accounted for after reassessing morphological characters. In these cases, the molecular data were invaluable in assisting taxonomic decision-making. The radiation of Nearctic species of Nephrocerus could not be diagnosed using cox1. The ability of cox1 to recover phylogenetic signal was also tested on Clistoabdominalis. Morphological data for Clistoabdominalis were combined with the molecular data set. The pipunculid phylogeny from molecular data closely resembles the published phylogeny based on morphology. Partitioned Bremer support is used to localize areas of conflict between the datasets.


2021 ◽  
Author(s):  
Robin M. D. Beck ◽  
Robert Voss ◽  
Sharon Jansa

The current literature on marsupial phylogenetics includes numerous studies based on analyses of morphological data with relatively limited sampling of Recent and fossil taxa, and many studies based on analyses of molecular data that include a dense sampling of Recent taxa, but relatively few that combine both data types. Another dichotomy in the marsupial phylogenetic literature is between studies that focus on New World taxa and others that focus on Sahulian taxa. To date, there has been no attempt to assess the phylogenetic relationships of the global marsupial fauna, based on combined analyses of morphology and molecular sequences, for a dense sampling of Recent and fossil taxa. For this report, we compiled morphological and molecular data from an unprecedented number of Recent and fossil marsupials. Our morphological data consist of 180 craniodental characters that we scored for 97 species representing every currently recognized Recent genus, 42 additional ingroup (crown-clade marsupial) taxa represented by well-preserved fossils, and 5 outgroups (non-marsupial metatherians). Our molecular data comprise 24.5 kb of DNA sequences from whole-mitochondrial genomes and six nuclear loci (APOB, BRCA1, GHR, RAG1, RBP3 and VWF) for 97 marsupial terminals (the same Recent taxa scored for craniodental morphology) and several placental and monotreme outgroups. The results of separate and combined analyses of these data using a wide range of phylogenetic methods support many currently accepted hypotheses of ingroup (marsupial) relationships, but they also underscore the difficulty of placing fossils with key missing data (e.g., †Evolestes), and the unique difficulty of placing others that exhibit mosaics of plesiomorphic and autapomorphic traits (e.g., †Yalkaparidon).
Unique contributions of our study are (1) critical discussions and illustrations of marsupial craniodental morphology, including descriptions and illustrations of some features never previously coded for phylogenetic analysis; (2) critical assessments of relative support for many suprageneric clades; (3) estimates of divergence times derived from tip-and-node dating based on uniquely taxon-dense analyses; and (4) a revised, higher-order classification of marsupials accompanied by lists of supporting craniodental synapomorphies. Far from the last word on these topics, this report lays the foundation for future research that may be enabled by the discovery of new fossil taxa, better-preserved material of previously described taxa, novel morphological characters, and improved methods of phylogenetic analysis.

