PIMBA: a PIpeline for MetaBarcoding Analysis

2021
Author(s): Renato R. M. Oliveira, Raissa L S Silva, Gisele L. Nunes, Guilherme Oliveira

DNA metabarcoding is an emerging monitoring method capable of assessing biodiversity from environmental DNA (eDNA) samples. The growing volume of next-generation sequencing data has required advances in computational tools. Tools for DNA metabarcoding analysis, such as MOTHUR, QIIME, Obitools, and mBRAVE, have been widely used in ecological studies. However, difficulties arise when custom databases are needed. Here we present PIMBA, a PIpeline for MetaBarcoding Analysis, which allows the use of customized databases as well as the reference databases used by the tools mentioned above. PIMBA is an open-source, user-friendly pipeline that consolidates all analyses in just three command lines.

2018, Vol 116 (3), pp. 950-959
Author(s): Patrick Maffucci, Benedetta Bigio, Franck Rapaport, Aurélie Cobat, Alessandro Borghesi, ...

Computational analyses of human patient exomes aim to filter out as many nonpathogenic genetic variants (NPVs) as possible, without removing the true disease-causing mutations. This involves comparing the patient’s exome with public databases to remove reported variants inconsistent with disease prevalence, mode of inheritance, or clinical penetrance. However, variants frequent in a given exome cohort, but absent or rare in public databases, have also been reported and treated as NPVs, without rigorous exploration. We report the generation of a blacklist of variants frequent within an in-house cohort of 3,104 exomes. This blacklist did not remove known pathogenic mutations from the exomes of 129 patients and decreased the number of NPVs remaining in the 3,104 individual exomes by a median of 62%. We validated this approach by testing three other independent cohorts of 400, 902, and 3,869 exomes. The blacklist generated from any given cohort removed a substantial proportion of NPVs (11–65%). We analyzed the blacklisted variants computationally and experimentally. Most of the blacklisted variants corresponded to false signals generated by incomplete reference genome assembly, location in low-complexity regions, bioinformatic misprocessing, or limitations inherent to cohort-specific private alleles (e.g., due to sequencing kits and genetic ancestries). Finally, we provide our precalculated blacklists, together with ReFiNE, a program for generating customized blacklists from any medium-sized or large in-house cohort of exome (or other next-generation sequencing) data via a user-friendly public web server. This work demonstrates the power of extracting variant blacklists from private databases as a specific in-house but broadly applicable tool for optimizing exome analysis.
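The core idea of a cohort-derived blacklist can be illustrated with a minimal Python sketch. This is not ReFiNE's implementation: the variant representation, the function names, and the `min_fraction` cutoff are all assumptions made for illustration, and the abstract does not state which frequency threshold the authors used.

```python
from collections import Counter
from typing import Iterable, Set, Tuple

# Hypothetical variant key: (chromosome, position, reference allele, alternate allele).
Variant = Tuple[str, int, str, str]


def build_blacklist(cohort: Iterable[Set[Variant]], min_fraction: float = 0.01) -> Set[Variant]:
    """Return variants carried by at least `min_fraction` of the cohort's exomes.

    `min_fraction` is an illustrative cutoff, not a value taken from the paper.
    """
    exomes = list(cohort)
    if not exomes:
        return set()
    # Each exome is a set, so a variant is counted at most once per individual.
    counts = Counter(variant for exome in exomes for variant in exome)
    cutoff = min_fraction * len(exomes)
    return {variant for variant, n in counts.items() if n >= cutoff}


def apply_blacklist(exome: Set[Variant], blacklist: Set[Variant]) -> Set[Variant]:
    """Drop blacklisted variants from one exome's candidate list."""
    return exome - blacklist


# Toy cohort: one variant is shared by every exome, the others are private.
common = ("chr1", 12345, "A", "G")
private_a = ("chr2", 500, "C", "T")
private_b = ("chr3", 900, "G", "A")
toy_cohort = [{common, private_a}, {common}, {common, private_b}, {common}]

toy_blacklist = build_blacklist(toy_cohort, min_fraction=0.5)
remaining = apply_blacklist(toy_cohort[0], toy_blacklist)
```

In this toy cohort only the shared variant passes the cutoff, so filtering the first exome leaves its private variant as the sole candidate. A real workflow would also need to confirm, as the authors did, that known pathogenic mutations are not removed.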


2017
Author(s): Hyun-Hwan Jeong, Seon Young Kim, Maxime WC Rosseaux, Huda Y Zoghbi, Zhandong Liu

We present a user-friendly, cloud-based data analysis pipeline for the deconvolution of pooled screening data. This tool, termed SAVE for Screening Analysis Visual Explorer, serves a dual purpose: extracting, clustering, and analyzing raw next-generation sequencing files derived from pooled screening experiments, while presenting the results in a user-friendly way on a secure web-based platform. Moreover, SAVE serves as a useful web-based pipeline for reanalysis of pooled CRISPR screening datasets. Taken together, the framework described in this study is expected to accelerate the development of web-based bioinformatics tools for studies that include next-generation sequencing data. SAVE is available at http://save.nrihub.org.


2016
Author(s): Siwei Chen, Juan Felipe Beltrán, Xiaomu Wei, Steven Lipkin, Clara Esteban-Jurado, ...

Integrative analysis of whole-genome/exome-sequencing data has been challenging, especially for the non-programming research community, as it requires leveraging an inordinate number of computational tools. Even computational biologists find it unexpectedly difficult to reproduce results from others or optimize their own strategies in an end-to-end workflow. We introduce Germline Mutation Scoring Tool fOr Next-generation sEquencing data (GeMSTONE), a cloud-based variant prioritization tool with high-level customization and a comprehensive collection of bioinformatics tools and data libraries (http://gemstone.yulab.org/). GeMSTONE generates and readily accepts a sharable 'recipe' file for each run to either replicate existing results or analyze new data with identical parameters.


Author(s): Anne Krogh Nøhr, Kristian Hanghøj, Genis Garcia Erill, Zilong Li, Ida Moltke, ...

Estimation of relatedness between pairs of individuals is important in many genetic research areas. When estimating relatedness, it is important to account for admixture if it is present. However, the methods that can account for admixture all take genotype data as input, which is a problem for low-depth next-generation sequencing (NGS) data, from which genotypes are called with high uncertainty. Here we present a software tool, NGSremix, for maximum likelihood estimation of relatedness between pairs of admixed individuals from low-depth NGS data, which takes the uncertainty of the genotypes into account via genotype likelihoods. Using both simulated and real NGS data for admixed individuals with an average depth of 4x or below, we show that our method works well and clearly outperforms the commonly used state-of-the-art relatedness estimation methods PLINK, KING, relateAdmix, and ngsRelate, all of which perform quite poorly. Hence, NGSremix is a useful new tool for estimating relatedness in admixed populations from low-depth NGS data. NGSremix is implemented in C/C++ as multi-threaded software and is freely available on GitHub at https://github.com/KHanghoj/NGSremix.
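To illustrate what working with genotype likelihoods instead of called genotypes can look like, here is a minimal Python sketch. It is not the NGSremix model: it only shows the standard step of combining three genotype likelihoods with a Hardy-Weinberg prior to obtain posterior genotype probabilities and an expected allele dosage. The function names and the use of a single allele frequency (rather than admixture-aware frequencies) are simplifying assumptions.

```python
from typing import List, Sequence


def genotype_posteriors(likelihoods: Sequence[float], alt_freq: float) -> List[float]:
    """Posterior probabilities for genotypes carrying 0, 1, or 2 alternate alleles.

    `likelihoods` holds P(reads | genotype) for the three genotypes; the prior
    assumes Hardy-Weinberg proportions at allele frequency `alt_freq`
    (a simplification that ignores admixture).
    """
    if len(likelihoods) != 3:
        raise ValueError("expected three genotype likelihoods")
    ref_freq = 1.0 - alt_freq
    priors = (ref_freq ** 2, 2.0 * ref_freq * alt_freq, alt_freq ** 2)
    weighted = [lik * prior for lik, prior in zip(likelihoods, priors)]
    total = sum(weighted)
    if total == 0.0:
        raise ValueError("likelihoods and prior have no overlapping support")
    return [w / total for w in weighted]


def expected_dosage(posteriors: Sequence[float]) -> float:
    """Expected number of alternate alleles under the posterior."""
    return posteriors[1] + 2.0 * posteriors[2]


# Uninformative reads: the posterior falls back to the Hardy-Weinberg prior.
flat_post = genotype_posteriors([1.0, 1.0, 1.0], alt_freq=0.5)
flat_dosage = expected_dosage(flat_post)
```

With uninformative likelihoods the posterior equals the prior, which is the low-depth situation in which a hard genotype call would discard the uncertainty that likelihood-based methods retain.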

