Supplementary material to "crestr: An R package to perform probabilistic climate reconstructions using fossil proxies"

Author(s):  
Manuel Chevalier
2021 ◽  

Abstract. Statistical climate reconstruction techniques are practical tools to study past climate variability from fossil proxy data. In particular, the methods based on probability density functions (PDFs) are powerful at producing robust results from various environments and proxies. However, accessing and curating the necessary calibration data, as well as the complexity of interpreting probabilistic results, often limit their use in palaeoclimatological studies. To address these problems, I present a new R package (crestr) to apply the CREST method (Climate REconstruction SofTware) on diverse palaeoecological datasets. crestr includes a globally curated calibration dataset for six common climate proxies (i.e. plants, beetles, chironomids, rodents, foraminifera, and dinoflagellate cysts) that enables its use in most terrestrial and marine regions. The package can also be used with private data collections instead of, or in combination with, the provided dataset. It also includes a suite of graphical diagnostic tools to represent the data at each step of the reconstruction process and provide insights into the effect of the different modelling assumptions and external factors that underlie a reconstruction. With this R package, the CREST method can now be used in a scriptable environment, thus simplifying its use and integration in existing workflows. It is hoped that crestr will contribute to producing the much-needed quantified records from the many regions where climate reconstructions are currently lacking, despite the existence of suitable fossil records.


2018 ◽  
Author(s):  
Daniel Commenges ◽  
Chariff Alkhassim ◽  
Raphael Gottardo ◽  
Boris Hejblum ◽  
Rodolphe Thiébaut

Abstract. Motivation: Flow cytometry is a powerful technology that allows the high-throughput quantification of dozens of surface and intracellular proteins at the single-cell level. It has become the most widely used technology for immunophenotyping of cells over the past three decades. With the increasing complexity of cytometry experiments (more cells and more markers), traditional manual flow cytometry data analysis has become untenable because of its subjectivity and time-consuming nature. Results: We present a new unsupervised algorithm called “cytometree” to perform automated population discovery (aka gating) in flow cytometry. cytometree is based on the construction of a binary tree, the nodes of which are subpopulations of cells. At each node, the marker distributions are modeled by mixtures of normal distributions. Node splitting is done according to a normalized difference of Akaike information criteria (AIC) between the two models. Post-processing of the tree structure and derived populations allows us to complete the annotation of the derived populations. The algorithm is shown to perform better than the state-of-the-art unsupervised algorithms previously proposed on panels introduced by the Flow Cytometry: Critical Assessment of Population Identification Methods (FlowCAP I) project. The algorithm is also applied to a T-cell panel proposed by the Human Immunology Project Consortium (HIPC) program; it also outperforms the best available open-source unsupervised algorithm while requiring the shortest computation time. Availability: An R package named “cytometree” is available on CRAN. Contact: [email protected]; [email protected]. Supplementary information: Supplementary data are available.
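
The node-splitting idea described in this abstract can be illustrated with a minimal sketch. It is not the cytometree implementation: it assumes that the "two models" are a single normal and a two-component normal mixture fitted to one marker, that the AIC difference is normalized by twice the number of cells, and that a node is split when this value exceeds a threshold. The function names, the normalization, and the threshold value of 0.1 are all assumptions made for illustration.

```python
import math
import random


def normal_logpdf(x, mean, std):
    """Log-density of x under a normal distribution."""
    return -0.5 * math.log(2 * math.pi * std * std) - (x - mean) ** 2 / (2 * std * std)


def weighted_fit(values, weights):
    """Weighted maximum-likelihood mean and standard deviation."""
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    var = sum(w * (v - mean) ** 2 for w, v in zip(weights, values)) / total
    return mean, math.sqrt(var) + 1e-6  # small floor avoids a zero std


def mixture_log_terms(x, mix, comps):
    """Log of (mixing weight * component density) for both components."""
    return [math.log(mix[k]) + normal_logpdf(x, *comps[k]) for k in range(2)]


def logsumexp2(a, b):
    """Numerically stable log(exp(a) + exp(b))."""
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def aic_one_normal(values):
    """AIC of a single normal: 2 free parameters (mean, std)."""
    mean, std = weighted_fit(values, [1.0] * len(values))
    loglik = sum(normal_logpdf(v, mean, std) for v in values)
    return 2 * 2 - 2 * loglik


def aic_two_normals(values, n_iter=100):
    """AIC of a two-component normal mixture fitted by EM: 5 free parameters."""
    ordered = sorted(values)
    half = len(ordered) // 2
    # Initialise each component from one half of the sorted data.
    comps = [weighted_fit(ordered[:half], [1.0] * half),
             weighted_fit(ordered[half:], [1.0] * (len(ordered) - half))]
    mix = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each value.
        resp = [[], []]
        for v in values:
            terms = mixture_log_terms(v, mix, comps)
            norm = logsumexp2(*terms)
            for k in range(2):
                resp[k].append(math.exp(terms[k] - norm))
        # M-step: update mixing weights, means and standard deviations.
        mix = [sum(r) / len(values) for r in resp]
        comps = [weighted_fit(values, r) for r in resp]
    loglik = sum(logsumexp2(*mixture_log_terms(v, mix, comps)) for v in values)
    return 2 * 5 - 2 * loglik


def normalized_aic_difference(values):
    """(AIC of one normal - AIC of two normals) / (2 n); assumed normalization."""
    return (aic_one_normal(values) - aic_two_normals(values)) / (2 * len(values))


SPLIT_THRESHOLD = 0.1  # hypothetical threshold, chosen for illustration

rng = random.Random(0)
bimodal = [rng.gauss(-3, 1) for _ in range(300)] + [rng.gauss(3, 1) for _ in range(300)]
unimodal = [rng.gauss(0, 1) for _ in range(600)]

d_bimodal = normalized_aic_difference(bimodal)
d_unimodal = normalized_aic_difference(unimodal)
print("split bimodal marker:", d_bimodal > SPLIT_THRESHOLD)
print("split unimodal marker:", d_unimodal > SPLIT_THRESHOLD)
```

Under these assumptions, a marker with two well-separated modes gives a large positive difference and the node would be split on it, whereas a unimodal marker gives a value near zero and the node would be left as a leaf for that marker.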


2018 ◽  
Author(s):  
Martin Pirkl ◽  
Niko Beerenwinkel

Abstract. Motivation: New technologies allow for the elaborate measurement of different traits of single cells. These data promise to elucidate intra-cellular networks in unprecedented detail and further help to improve treatment of diseases like cancer. However, cell populations can be very heterogeneous. Results: We developed a mixture of Nested Effects Models (M&NEM) for single-cell data to simultaneously identify different cellular sub-populations and their corresponding causal networks to explain the heterogeneity in a cell population. For inference, we assign each cell to a network with a certain probability and iteratively update the optimal networks and cell probabilities in an Expectation Maximization scheme. We validate our method in the controlled setting of a simulation study and apply it to three data sets of pooled CRISPR screens generated previously by two novel experimental techniques, namely Crop-Seq and Perturb-Seq. Availability: The mixture Nested Effects Model (M&NEM) is available as the R package mnem at https://github.com/cbgethz/mnem/. Contact: [email protected], [email protected]. Supplementary information: Supplementary data are available online.


2018 ◽  
Author(s):  
Andrew M. Dolman ◽  
Thomas Laepple

Abstract. Climate reconstructions based on proxy records recovered from marine sediments, such as alkenone records or geochemical parameters measured on foraminifera, play an important role in our understanding of the climate system. They provide information about the state of the ocean ranging back hundreds to millions of years and form the backbone of paleo-oceanography. However, there are many sources of uncertainty associated with the signal recovered from sediment-archived proxies. These include seasonal or depth habitat biases in the recorded signal, a frequency-dependent reduction in the amplitude of the recorded signal due to bioturbation of the sediment, aliasing of high-frequency climate variation onto a nominally annual, decadal or centennial resolution signal, and additional sample processing and measurement error introduced when the proxy signal is recovered. Here we present a forward model for sediment-archived proxies that jointly models the above processes, so that the magnitude of their separate and combined effects can be investigated. Applications include the interpretation and analysis of uncertainty in existing proxy records, parameter sensitivity analysis to optimize future studies, and the generation of pseudo-proxy records that can be used to test reconstruction methods. We provide examples, such as the simulation of individual foraminifera records, that demonstrate the usefulness of the forward model for paleoclimate studies. The model is implemented as a user-friendly R package, sedproxy, the use of which we hope will contribute to a better understanding of both the limitations and potential of marine sediment proxies to inform about past climate.


2017 ◽  
Author(s):  
Bo Wang ◽  
Daniele Ramazzotti ◽  
Luca De Sano ◽  
Junjie Zhu ◽  
Emma Pierson ◽  
...  

Abstract. Motivation: We here present SIMLR (Single-cell Interpretation via Multi-kernel LeaRning), an open-source tool that implements a novel framework to learn a cell-to-cell similarity measure from single-cell RNA-seq data. SIMLR can be effectively used to perform tasks such as dimension reduction, clustering, and visualization of heterogeneous populations of cells. SIMLR was benchmarked against state-of-the-art methods for these three tasks on several public datasets, showing it to be scalable and capable of greatly improving clustering performance, as well as providing valuable insights by making the data more interpretable via a better visualization. Availability and Implementation: SIMLR is available on GitHub in both R and MATLAB implementations. It is also available as an R package on [email protected] or [email protected]. Supplementary Information: Supplementary data are available at Bioinformatics online.


2018 ◽  
Author(s):  
Carlos Martínez-Mira ◽  
Ana Conesa ◽  
Sonia Tarazona

Abstract. Motivation: As new integrative methodologies are being developed to analyse multi-omic experiments, validation strategies are required for benchmarking. In silico approaches such as simulated data are popular as they are fast and cheap. However, few tools are available for creating synthetic multi-omic data sets. Results: MOSim is a new R package for easily simulating multi-omic experiments consisting of gene expression data, other regulatory omics and the regulatory relationships between them. MOSim supports different experimental designs including time series data. Availability: The package is freely available under the GPL-3 license from the Bitbucket repository (https://bitbucket.org/ConesaLab/mosim/). Contact: [email protected]. Supplementary information: Supplementary material is available at bioRxiv online.


2019 ◽  
Vol 35 (19) ◽  
pp. 3870-3872 ◽  
Author(s):  
Nathan D Olson ◽  
Nidhi Shah ◽  
Jayaram Kancherla ◽  
Justin Wagner ◽  
Joseph N Paulson ◽  
...  

Abstract. Summary: We developed the metagenomeFeatures R Bioconductor package along with annotation packages for three 16S rRNA databases (Greengenes, RDP and SILVA) to facilitate working with 16S rRNA databases and marker-gene survey feature data. The metagenomeFeatures package defines two classes, MgDb for working with 16S rRNA sequence databases, and mgFeatures for marker-gene survey feature data. The associated annotation packages provide a consistent interface to the different databases facilitating database comparison and exploration. The mgFeatures-class represents a crucial step in the development of a common data structure for working with 16S marker-gene survey data in R. Availability and implementation: https://bioconductor.org/packages/release/bioc/html/metagenomeFeatures.html. Supplementary information: Supplementary material is available at Bioinformatics online.

