SNPfiltR: an R package for interactive and reproducible SNP filtering

Author(s):  
Devon DeRaad

Here I describe the novel R package SNPfiltR and demonstrate its functionalities as the backbone of a customizable, reproducible SNP filtering pipeline implemented exclusively via the widely adopted R programming language. SNPfiltR extends existing SNP filtering functionalities by automating the visualization of key parameters such as depth, quality, and missing data, then allowing users to set filters based on optimized thresholds, all within a single, cohesive working environment. All SNPfiltR functions require a vcfR object as input, which can be easily generated by reading a SNP dataset stored as a standard vcf file into an R working environment using the function read.vcfR() from the R package vcfR. Performance benchmarking reveals that for moderately sized SNP datasets (up to 50M genotypes with associated quality information), SNPfiltR performs filtering with comparable efficiency to current state-of-the-art command-line-based programs. These benchmarking results indicate that for most reduced-representation genomic datasets, SNPfiltR is an ideal choice for investigating, visualizing, and filtering SNPs as part of a cohesive and easily documentable bioinformatic pipeline. The SNPfiltR package can be downloaded from CRAN with the command install.packages("SNPfiltR"), and a development version is available from GitHub at github.com/DevonDeRaad/SNPfiltR. Additionally, thorough documentation for SNPfiltR, including multiple comprehensive vignettes, is available at the website devonderaad.github.io/SNPfiltR/.

2018 ◽  
Author(s):  
Richèl J.C. Bilderbeek ◽  
Rampal S. Etienne

Summary: In the field of phylogenetics, BEAST2 is one of the most widely used software tools. It comes with the graphical user interfaces BEAUti 2, DensiTree and Tracer, to create BEAST2 configuration files and to interpret BEAST2’s output files. However, when many different alignments or model setups are required, a workflow of graphical user interfaces is cumbersome. Here, we present a free, libre and open-source package, babette: ‘BEAUti 2, BEAST2 and Tracer for R’, for the R programming language. babette creates BEAST2 input files, runs BEAST2 and parses its results, all from an R function call. We describe babette’s usage and the novel functionality it provides compared to the original tools and we give some examples. As babette is designed to be of high quality and extendable, we conclude by describing the further development of the package.


Author(s):  
Edoardo Barba ◽  
Luigi Procopio ◽  
Caterina Lacerra ◽  
Tommaso Pasini ◽  
Roberto Navigli

Recently, generative approaches have been used effectively to provide definitions of words in their context. However, the opposite, i.e., generating a usage example given one or more words along with their definitions, has not yet been investigated. In this work, we introduce the novel task of Exemplification Modeling (ExMod), along with a sequence-to-sequence architecture and a training procedure for it. Starting from a set of (word, definition) pairs, our approach is capable of automatically generating high-quality sentences which express the requested semantics. As a result, we can drive the creation of sense-tagged data which cover the full range of meanings in any inventory of interest, and their interactions within sentences. Human annotators agree that the sentences generated are as fluent and semantically-coherent with the input definitions as the sentences in manually-annotated corpora. Indeed, when employed as training data for Word Sense Disambiguation, our examples enable the current state of the art to be outperformed, and higher results to be achieved than when using gold-standard datasets only. We release the pretrained model, the dataset and the software at https://github.com/SapienzaNLP/exmod.


Aerospace ◽  
2019 ◽  
Vol 6 (9) ◽  
pp. 98 ◽  
Author(s):  
Zachary Lewis ◽  
Joshua Ten Eyck ◽  
Kyle Baker ◽  
Eryn Culton ◽  
Jonathan Lang ◽  
...  

The novel contribution in this manuscript is an expansion of the current state-of-the-art in the geometric installation of control moment gyroscopes beyond the benchmark symmetric skewed arrays and the four asymmetric arrays presented in recent literature. The benchmark pyramid symmetrically skewed at 54.73 degrees mandates significant attention to singularity avoidance, escape, and penetration, while the most recent four asymmetric arrays are strictly useful in instances where space is available to mount at least one gyro orthogonal to the others. Skewed arrays of gyros and the research-benchmark are introduced, followed by the present-day box-90 and “roof” configurations, where the roof configuration is the first prevalently used asymmetric geometry. Six other asymmetric options in the most recent literature are introduced, where four of the six options are obviously quite useful. From this inspiration, several dozen discrete options for asymmetric installations are critically evaluated using two figures of merit: maximum momentum (saturation) and maximum singularity-free momentum. Furthermore, continuous surface plots are presented to provide readers with countless (infinite) options for geometric installations. The manuscript firmly establishes many useful options for engineers who learn that the physical space on their spacecraft is insufficient to permit standard installations.


Author(s):  
Jesper Beltoft Lund ◽  
Weilong Li ◽  
Afsaneh Mohammadnejad ◽  
Shuxia Li ◽  
Jan Baumbach ◽  
...  

Summary: Epigenome-Wide Association Study (EWAS) has become a powerful approach to identify epigenetic variations associated with diseases or health traits. Sex is an important variable to include in EWAS to ensure unbiased data processing and statistical analysis. We introduce the R-package EWASex, which allows for fast and highly accurate sex-estimation using DNA methylation data on a small set of CpG sites located on the X-chromosome under stable X-chromosome inactivation in females. Results: We demonstrate that EWASex outperforms the current state-of-the-art tools by using different EWAS datasets. With EWASex, we offer an efficient way to predict and to verify sex that can be easily implemented in any EWAS using blood samples or even other tissue types. It comes with pre-trained weights to work without prior sex labels and without requiring access to RAW data, which is a necessity for all currently available methods. Availability and implementation: The EWASex R-package along with tutorials, documentation and source code are available at https://github.com/Silver-Hawk/EWASex. Supplementary information: Supplementary data are available at Bioinformatics online.


2019 ◽  
Vol 32 (12) ◽  
pp. 1574-1576 ◽  
Author(s):  
Austin G. McCoy ◽  
Zachary Noel ◽  
Adam H. Sparks ◽  
Martin Chilvers

Phytophthora sojae is a significant pathogen of soybean worldwide. Pathotype surveys for Phytophthora sojae are conducted to monitor resistance gene efficacy and determine if new resistance genes are needed. Valuable measurements for pathotype analysis include the distribution of susceptible reactions, pathotype complexity, pathotype frequency, and diversity indices for pathotype distributions. Previously the Habgood-Gilmour Spreadsheet (HaGiS), written in Microsoft Excel, was used for data analysis. However, the growing popularity of the R programming language in plant pathology and desire for reproducible research made HaGiS a prime candidate for conversion into an R package. Here we report on the development and use of an R package, hagis, that can be used to produce all outputs from the HaGiS Excel sheet for P. sojae or other gene-for-gene pathosystem studies.


2021 ◽  
Author(s):  
Daniel Lüdecke ◽  
Indrajeet Patil ◽  
Mattan S. Ben-Shachar ◽  
Brenton M. Wiernik ◽  
Philip Waggoner ◽  
...  

The see package is embedded in the easystats ecosystem, a collection of R packages that operate in synergy to provide a consistent and intuitive syntax when working with statistical models in the R programming language (R Core Team, 2021). Most easystats packages return comprehensive numeric summaries of model parameters and performance. The see package complements these numeric summaries with a host of functions and tools to produce a range of publication-ready visualizations for model parameters, predictions, and performance diagnostics. As a core pillar of easystats, the see package helps users to utilize visualization for more informative, communicable, and well-rounded scientific reporting.


2020 ◽  
Author(s):  
Maxime Meylan ◽  
Etienne Becht ◽  
Catherine Sautès-Fridman ◽  
Aurélien de Reyniès ◽  
Wolf H. Fridman ◽  
...  

Summary: We previously reported MCP-counter and mMCP-counter, methods that allow precise estimation of the immune and stromal composition of human and murine samples from bulk transcriptomic data, but they were only distributed as R packages. Here, we report webMCP-counter, a user-friendly web interface to allow all users to use these methods, regardless of their proficiency in the R programming language. Availability and Implementation: Freely available from http://134.157.229.105:3838/webMCP/. Website developed with the R package shiny. Source code available from GitHub: https://github.com/FPetitprez/webMCP-counter.


2020 ◽  
Vol 12 (20) ◽  
pp. 3285
Author(s):  
Alessandro Battaglia ◽  
Giulia Panegrossi

The quantification of global snowfall by the current observing system remains challenging, with the CloudSat 94 GHz Cloud Profiling Radar (CPR) providing the current state-of-the-art snow climatology, especially at high latitudes. This work explores the potential of the novel Level-2 CloudSat 94 GHz Brightness Temperature Product (2B-TB94), developed in recent years by processing the noise floor data contained in the 1B-CPR product; the focus of the study is on the characterization of snow systems over the ice-free ocean, which has well-constrained emissivity and backscattering properties. When used in combination with the path integrated attenuation (PIA), the radiometric mode can provide crucial information on the presence/amount of supercooled layers and on the contribution of the ice to the total attenuation. Radiative transfer simulations show that the location of the supercooled layers and the snow density are important factors affecting the warming caused by supercooled emission and the cooling induced by ice scattering. Over the ice-free ocean, the addition of the 2B-TB94 observations to the standard CPR observables (reflectivity profile and PIA) is recommended, should more sophisticated attenuation corrections be implemented in the snow CloudSat product to mitigate its well-known underestimation at large snowfall rates. Similar approaches will also be applicable to the upcoming EarthCARE mission. The findings of this paper are relevant for the design of future missions targeting precipitation in the polar regions.


2021 ◽  
pp. 204589402110372
Author(s):  
Francois Potus ◽  
Andrea Frump ◽  
Soban Umar ◽  
Rebecca Vanderpool ◽  
Imad Al Ghouleh ◽  
...  

Each year the American Thoracic Society (ATS) Conference brings together scientists who conduct basic, translational and clinical research to present on the recent advances in the field of respirology. Due to the Coronavirus Disease of 2019 (COVID-19) pandemic, the ATS2020 Conference was held online in a series of virtual meetings. In this review, we focus on the breakthroughs in pulmonary hypertension (PH) research. We have selected ten of the best basic science abstracts which were presented at the ATS2020 Assembly on Pulmonary Circulation mini-symposium “What’s new in Pulmonary Arterial Hypertension (PAH) and Right Ventricular (RV) Signaling: Lessons from the Best Abstracts”, reflecting the current state-of-the-art and associated challenges in PH. Particular emphasis is placed on understanding the mechanisms underlying RV failure, the regulation of inflammation, and the novel therapeutic targets that emerged from preclinical research. The pathologic interactions between PH, RV function and COVID-19 are also discussed.


F1000Research ◽  
2019 ◽  
Vol 8 ◽  
pp. 139 ◽  
Author(s):  
Zhian N. Kamvar ◽  
Jun Cai ◽  
Juliet R.C. Pulliam ◽  
Jakob Schumacher ◽  
Thibaut Jombart

The epidemiological curve (epicurve) is one of the simplest yet most useful tools used by field epidemiologists, modellers, and decision makers for assessing the dynamics of infectious disease epidemics. Here, we present the free, open-source package incidence for the R programming language, which allows users to easily compute, handle, and visualise epicurves from unaggregated linelist data. This package was built in accordance with the development guidelines of the R Epidemics Consortium (RECON), which aim to ensure robustness and reliability through extensive automated testing, documentation, and good coding practices. As such, it fills an important gap in the toolbox for outbreak analytics using the R software, and provides a solid building block for further developments in infectious disease modelling. incidence is available from https://www.repidemicsconsortium.org/incidence.

