Sedproxy: a forward model for sediment archived climate proxies

Author(s):  
Andrew M. Dolman ◽  
Thomas Laepple

Abstract. Climate reconstructions based on proxy records recovered from marine sediments, such as alkenone records or geochemical parameters measured on foraminifera, play an important role in our understanding of the climate system. They provide information about the state of the ocean ranging back hundreds to millions of years and form the backbone of paleo-oceanography. However, there are many sources of uncertainty associated with the signal recovered from sediment archived proxies. These include seasonal or depth habitat biases in the recorded signal, a frequency dependent reduction in the amplitude of the recorded signal due to bioturbation of the sediment, aliasing of high frequency climate variation onto a nominally annual, decadal or centennial resolution signal, and additional sample processing and measurement error introduced when the proxy signal is recovered. Here we present a forward model for sediment archived proxies that jointly models the above processes, so that the magnitude of their separate and combined effects can be investigated. Applications include the interpretation and analysis of uncertainty in existing proxy records, parameter sensitivity analysis to optimize future studies, and the generation of pseudo-proxy records that can be used to test reconstruction methods. We provide examples, such as the simulation of individual foraminifera records, that demonstrate the usefulness of the forward model for paleoclimate studies. The model is implemented as a user-friendly R package, sedproxy, the use of which we hope will contribute to a better understanding of both the limitations and potential of marine sediment proxies to inform about past climate.

2018 ◽  
Vol 14 (12) ◽  
pp. 1851-1868 ◽  
Author(s):  
Andrew M. Dolman ◽  
Thomas Laepple

Abstract. Climate reconstructions based on proxy records recovered from marine sediments, such as alkenone records or geochemical parameters measured on foraminifera, play an important role in our understanding of the climate system. They provide information about the state of the ocean ranging back hundreds to millions of years and form the backbone of paleo-oceanography. However, there are many sources of uncertainty associated with the signal recovered from sediment-archived proxies. These include seasonal or depth-habitat biases in the recorded signal; a frequency-dependent reduction in the amplitude of the recorded signal due to bioturbation of the sediment; aliasing of high-frequency climate variation onto a nominally annual, decadal, or centennial resolution signal; and additional sample processing and measurement error introduced when the proxy signal is recovered. Here we present a forward model for sediment-archived proxies that jointly models the above processes so that the magnitude of their separate and combined effects can be investigated. Applications include the interpretation and analysis of uncertainty in existing proxy records, parameter sensitivity analysis to optimize future studies, and the generation of pseudo-proxy records that can be used to test reconstruction methods. We provide examples, such as the simulation of individual foraminifera records, that demonstrate the usefulness of the forward model for paleoclimate studies. The model is implemented as an open-source R package, sedproxy, to which we welcome collaborative contributions. We hope that use of sedproxy will contribute to a better understanding of both the limitations and potential of marine sediment proxies to inform researchers about earth's past climate.
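The chain of processes listed in this abstract (smoothing by bioturbation, coarse sampling, measurement error) can be illustrated with a toy forward model. This is a minimal sketch and not the sedproxy API: the function names, the one-sided exponential mixing kernel, and all parameter names are assumptions made for illustration, and seasonal or habitat biases are left out.

```python
import math
import random

def bioturbate(climate, mixing_years):
    """Smooth an annual series with an exponential kernel (assumed mixing model)."""
    n_lags = int(5 * mixing_years) + 1
    weights = [math.exp(-lag / mixing_years) for lag in range(n_lags)]
    total = sum(weights)
    weights = [w / total for w in weights]  # normalise so a constant signal is unchanged
    smoothed = []
    for i in range(len(climate)):
        # Indices before the start of the series reuse the first value (edge padding).
        value = sum(w * climate[max(i - lag, 0)] for lag, w in enumerate(weights))
        smoothed.append(value)
    return smoothed

def pseudo_proxy(climate, mixing_years, sample_every, noise_sd, rng):
    """Hypothetical forward model: smooth, subsample, then add Gaussian measurement noise."""
    smoothed = bioturbate(climate, mixing_years)
    sampled = smoothed[::sample_every]
    return [value + rng.gauss(0.0, noise_sd) for value in sampled]

# Example: a 50-year oscillation is strongly damped by a 20-year mixing scale.
climate = [math.sin(2 * math.pi * year / 50) for year in range(1000)]
record = pseudo_proxy(climate, mixing_years=20, sample_every=10,
                      noise_sd=0.1, rng=random.Random(1))
```

Varying `mixing_years`, `sample_every`, and `noise_sd` one at a time gives a rough feel for the kind of sensitivity analysis the abstract mentions.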


2021 ◽  
Vol 22 (3) ◽  
pp. 1399
Author(s):  
Salim Ghannoum ◽  
Waldir Leoncio Netto ◽  
Damiano Fantini ◽  
Benjamin Ragan-Kelley ◽  
Amirabbas Parizadeh ◽  
...  

The growing attention toward the benefits of single-cell RNA sequencing (scRNA-seq) is leading to a myriad of computational packages for the analysis of different aspects of scRNA-seq data. For researchers without advanced programming skills, it is very challenging to combine several packages in order to perform the desired analysis in a simple and reproducible way. Here we present DIscBIO, an open-source, multi-algorithmic pipeline for easy, efficient and reproducible analysis of cellular sub-populations at the transcriptomic level. The pipeline integrates multiple scRNA-seq packages and allows biomarker discovery with decision trees and gene enrichment analysis in a network context using single-cell sequencing read counts through clustering and differential analysis. DIscBIO is freely available as an R package. It can be run either in command-line mode or through a user-friendly computational pipeline using Jupyter notebooks. We showcase all pipeline features using two scRNA-seq datasets. The first dataset consists of circulating tumor cells from patients with breast cancer. The second one is a cell cycle regulation dataset in myxoid liposarcoma. All analyses are available as notebooks that integrate R code with explanatory text, output data, and images in a sequential narrative. R users can use the notebooks to understand the different steps of the pipeline and as a guide to exploring their own scRNA-seq data. We also provide a cloud version using Binder that allows the execution of the pipeline without the need to download R, Jupyter or any of the packages used by the pipeline. The cloud version can serve as a tutorial for training purposes, especially for those who are not R users or have limited programming skills. However, in order to do meaningful scRNA-seq analyses, all users will need to understand the implemented methods and their possible options and limitations.


2021 ◽  
Vol 22 (S6) ◽  
Author(s):  
Yasmine Mansour ◽  
Annie Chateau ◽  
Anna-Sophie Fiston-Lavier

Abstract. Background: Meiotic recombination is a vital biological process that plays an essential role in the structural and functional dynamics of genomes. Genomes exhibit highly variable recombination profiles along chromosomes, associated with several chromatin states. However, eu-heterochromatin boundaries are neither available nor easily obtained for non-model organisms, especially newly sequenced ones. Hence, we lack the accurate local recombination rates necessary to address evolutionary questions. Results: Here, we propose an automated computational tool, based on the Marey maps method, that identifies heterochromatin boundaries along chromosomes and estimates local recombination rates. Our method, called BREC (heterochromatin Boundaries and RECombination rate estimates), is non-genome-specific, running even on non-model genomes as long as genetic and physical maps are available. BREC is based on pure statistics and is data-driven, implying that good input data quality remains a strong requirement. Therefore, a data pre-processing module (data quality control and cleaning) is provided. Experiments show that BREC handles different marker density and distribution issues. Conclusions: BREC's heterochromatin boundaries have been validated against cytological equivalents experimentally generated on the fruit fly Drosophila melanogaster genome, for which BREC returns congruent values. Also, BREC's recombination rates have been compared with previously reported estimates. Based on the promising results, we believe our tool has the potential to help bring data science into the service of genome biology and evolution. We introduce BREC as an R package and a user-friendly Shiny web application, yielding a fast, easy-to-use, and broadly accessible resource. The BREC R package is available at the GitHub repository https://github.com/GenomeStructureOrganization.
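A Marey map plots each marker's genetic position (cM) against its physical position (Mb), and the local recombination rate is the slope of that curve (cM/Mb). The sketch below illustrates the idea with plain finite differences between adjacent markers. It is not BREC's implementation: the function name is hypothetical, and the markers are assumed to be sorted with distinct physical positions.

```python
def local_recombination_rates(physical_mb, genetic_cm):
    """Slope of the Marey map between adjacent markers, in cM/Mb.

    Assumes markers are sorted by physical position with no duplicates.
    """
    rates = []
    for i in range(1, len(physical_mb)):
        d_phys = physical_mb[i] - physical_mb[i - 1]
        d_gen = genetic_cm[i] - genetic_cm[i - 1]
        rates.append(d_gen / d_phys)
    return rates

# Toy chromosome arm: the curve flattens toward the last markers,
# which is the pattern expected near low-recombination regions.
physical = [0.0, 1.0, 2.0, 4.0]
genetic = [0.0, 2.0, 2.5, 2.5]
rates = local_recombination_rates(physical, genetic)
```

Real data are noisy, so a practical tool would typically smooth or fit the curve before differentiating rather than rely on raw differences.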


2018 ◽  
Vol 2 ◽  
pp. e25564
Author(s):  
Tomer Gueta ◽  
Vijay Barve ◽  
Thiloshon Nagarajah ◽  
Ashwin Agrawal ◽  
Yohay Carmel

A new R package for biodiversity data cleaning, 'bdclean', was initiated in the Google Summer of Code (GSoC) 2017 and is available on GitHub. Several R packages have great data validation and cleaning functions, but 'bdclean' provides features to manage a complete pipeline for biodiversity data cleaning: from data quality explorations, to cleaning procedures and reporting. Users are able to go through the quality control process in a very structured, intuitive, and effective way. A modular approach to data cleaning functionality should make this package extensible for many biodiversity data cleaning needs. Under GSoC 2018, 'bdclean' will go through a comprehensive upgrade. New features will be highlighted in the demonstration.


2019 ◽  
Author(s):  
Cheynna Crowley ◽  
Yuchen Yang ◽  
Yunjiang Qiu ◽  
Benxia Hu ◽  
Armen Abnousi ◽  
...  

Abstract. Hi-C experiments have been widely adopted to study chromatin spatial organization, which plays an essential role in genome function. We have recently identified frequently interacting regions (FIREs) and found that they are closely associated with cell-type-specific gene regulation. However, computational tools for detecting FIREs from Hi-C data are still lacking. In this work, we present FIREcaller, a stand-alone, user-friendly R package for detecting FIREs from Hi-C data. FIREcaller takes raw Hi-C contact matrices as input, performs within-sample and cross-sample normalization, and outputs continuous FIRE scores, dichotomous FIREs, and super-FIREs. Applying FIREcaller to Hi-C data from various human tissues, we demonstrate that the FIREs and super-FIREs identified in a tissue-specific manner are closely related to gene regulation, are enriched for enhancer-promoter (E-P) interactions, tend to overlap with regions exhibiting epigenomic signatures of cis-regulatory roles, and aid the interpretation of GWAS variants. The FIREcaller package is implemented in R and freely available at https://yunliweb.its.unc.edu/FIREcaller.

Highlights
– Frequently Interacting Regions (FIREs) can be used to identify tissue and cell-type-specific cis-regulatory regions.
– An R software package, FIREcaller, has been developed to identify FIREs and cluster FIREs into super-FIREs.


2021 ◽  
Author(s):  
Manuel Chevalier

Abstract. Statistical climate reconstruction techniques are practical tools to study past climate variability from fossil proxy data. In particular, the methods based on probability density functions (PDFs) are powerful at producing robust results from various environments and proxies. However, accessing and curating the necessary calibration data, as well as the complexity of interpreting probabilistic results, often limit their use in palaeoclimatological studies. To address these problems, I present a new R package (crestr) to apply the CREST method (Climate REconstruction SofTware) on diverse palaeoecological datasets. crestr includes a globally curated calibration dataset for six common climate proxies (i.e. plants, beetles, chironomids, rodents, foraminifera, and dinoflagellate cysts) that enables its use in most terrestrial and marine regions. The package can also be used with private data collections instead of, or in combination with, the provided dataset. It also includes a suite of graphical diagnostic tools to represent the data at each step of the reconstruction process and provide insights into the effect of the different modelling assumptions and external factors that underlie a reconstruction. With this R package, the CREST method can now be used in a scriptable environment, thus simplifying its use and integration in existing workflows. It is hoped that crestr will contribute to producing the much-needed quantified records from the many regions where climate reconstructions are currently lacking, despite the existence of suitable fossil records.


2021 ◽  
Author(s):  
Magnus Dehli Vigeland ◽  
Thore Egeland

Abstract. We address computational and statistical aspects of DNA-based identification of victims in the aftermath of disasters. Current methods and software for such identification typically consider each victim individually, leading to suboptimal power of identification and potential inconsistencies in the statistical summary of the evidence. We resolve these problems by performing joint identification of all victims, using the complete genetic data set. Individual identification probabilities, conditional on all available information, are derived from the joint solution in the form of posterior pairing probabilities. A closed formula is obtained for the a priori number of possible joint solutions to a given DVI problem. This number increases quickly with the number of victims and missing persons, posing computational challenges for brute force approaches. We address this complexity with a preparatory sequential step aiming to reduce the search space. The examples show that realistic cases are handled efficiently. User-friendly implementations of all methods are provided in the R package dvir, freely available on all platforms.


Author(s):  
Matthew Carlucci ◽  
Algimantas Kriščiūnas ◽  
Haohan Li ◽  
Povilas Gibas ◽  
Karolis Koncevičius ◽  
...  

Abstract. Motivation: Biological rhythmicity is fundamental to almost all organisms on Earth and plays a key role in health and disease. Identification of oscillating signals could lead to novel biological insights, yet its investigation is impeded by the extensive computational and statistical knowledge required to perform such analysis. Results: To address this issue, we present DiscoRhythm (Discovering Rhythmicity), a user-friendly application for characterizing rhythmicity in temporal biological data. DiscoRhythm is available as a web application or an R/Bioconductor package for estimating phase, amplitude, and statistical significance using four popular approaches to rhythm detection (Cosinor, JTK Cycle, ARSER, and Lomb-Scargle). We optimized these algorithms for speed, improving their execution times up to 30-fold to enable rapid analysis of -omic-scale datasets in real-time. Informative visualizations, interactive modules for quality control, dimensionality reduction, periodicity profiling, and incorporation of experimental replicates make DiscoRhythm a thorough toolkit for analyzing rhythmicity. Availability and Implementation: The DiscoRhythm R package is available on Bioconductor (https://bioconductor.org/packages/DiscoRhythm), with source code available on GitHub (https://github.com/matthewcarlucci/DiscoRhythm) under a GPL-3 license. The web application is securely deployed over HTTPS (https://disco.camh.ca) and is freely available for use worldwide. Local instances of the DiscoRhythm web application can be created using the R package or by deploying the publicly available Docker container (https://hub.docker.com/r/mcarlucci/discorhythm). Supplementary information: Supplementary data are available at Bioinformatics online.
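Of the four approaches named above, Cosinor is the simplest to illustrate: it fits a cosine of known period and reads amplitude and phase from the fitted coefficients. The sketch below is not DiscoRhythm's code. The function name is hypothetical, and it assumes evenly spaced samples covering a whole number of periods, in which case the least-squares coefficients reduce to simple sums; significance testing is omitted.

```python
import math

def cosinor_fit(times, values, period):
    """Fit values ~ mesor + a*cos(wt) + b*sin(wt) for a known period.

    Assumes evenly spaced samples spanning a whole number of periods,
    so the regressors are orthogonal and the fit has a closed form.
    """
    n = len(values)
    mesor = sum(values) / n
    angles = [2 * math.pi * t / period for t in times]
    a = 2 / n * sum(v * math.cos(w) for v, w in zip(values, angles))
    b = 2 / n * sum(v * math.sin(w) for v, w in zip(values, angles))
    amplitude = math.hypot(a, b)
    # Time of the fitted peak, expressed within one period.
    peak_time = (math.atan2(b, a) * period / (2 * math.pi)) % period
    return mesor, amplitude, peak_time

# Hourly samples over two days of a 24 h rhythm peaking at hour 6.
hours = list(range(48))
signal = [5 + 2 * math.cos(2 * math.pi * (t - 6) / 24) for t in hours]
mesor, amplitude, peak_time = cosinor_fit(hours, signal, period=24)
```

For unevenly sampled data the same model would instead be fitted by general linear regression.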


PLoS ONE ◽  
2019 ◽  
Vol 14 (5) ◽  
pp. e0216471 ◽  
Author(s):  
Davide Bolognini ◽  
Niccolò Bartalucci ◽  
Alessandra Mingrino ◽  
Alessandro Maria Vannucchi ◽  
Alberto Magi

2015 ◽  
Vol 2015 ◽  
pp. 1-23 ◽  
Author(s):  
Bo Bi ◽  
Bo Han ◽  
Weimin Han ◽  
Jinping Tang ◽  
Li Li

Diffuse optical tomography is a novel molecular imaging technology for small animal studies. Most known reconstruction methods use the diffusion equation (DA) as the forward model, although the validity of DA breaks down in certain situations. In this work, we use the radiative transfer equation as the forward model, which provides an accurate description of light propagation within biological media, and investigate the potential of sparsity constraints in solving the diffuse optical tomography inverse problem. The feasibility of the sparsity reconstruction approach is evaluated using boundary angular-averaged measurement data and internal angular-averaged measurement data. Simulation results demonstrate that in most of the test cases the reconstructions with sparsity regularization are both qualitatively and quantitatively more reliable than those with standard L2 regularization. Results also show the competitive performance of the split Bregman algorithm for DOT image reconstruction with sparsity regularization compared with other existing L1 algorithms.
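The contrast between sparsity (L1) and standard L2 regularization can be seen in the simplest possible setting, a pointwise denoising problem where both penalties have closed-form solutions. The sketch below is only an illustration of that contrast. It is not the paper's reconstruction method, it does not implement split Bregman or the radiative transfer model, and the function names are hypothetical.

```python
def soft_threshold(values, lam):
    """Minimiser of 0.5*(x - y)**2 + lam*abs(x) for each y (L1 penalty)."""
    result = []
    for y in values:
        if y > lam:
            result.append(y - lam)
        elif y < -lam:
            result.append(y + lam)
        else:
            result.append(0.0)  # small entries are set exactly to zero
    return result

def ridge_shrink(values, lam):
    """Minimiser of 0.5*(x - y)**2 + 0.5*lam*x**2 for each y (L2 penalty)."""
    return [y / (1.0 + lam) for y in values]

noisy = [3.0, -0.2, 0.5, -2.0]
sparse = soft_threshold(noisy, 1.0)  # large entries kept, small ones zeroed
dense = ridge_shrink(noisy, 1.0)     # every entry shrunk, none exactly zero
```

The L1 penalty removes small entries entirely while the L2 penalty only scales them, which is why L1 penalties suit targets that are expected to be sparse.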

