Varstation: a complete and efficient tool to support NGS data analysis

2019 ◽  
Author(s):  
ACO Faria ◽  
MP Caraciolo ◽  
RM Minillo ◽  
TF Almeida ◽  
SM Pereira ◽  
...  

Abstract. Summary: Varstation is a cloud-based NGS data processor and analyzer for human genetic variation. It provides a customizable, centralized, safe and clinically validated environment that aims to improve and optimize the flow of NGS analyses and reports in clinical and research genetics. Availability and implementation: Varstation is freely available at http://varstation.com for academic use. Supplementary information: Supplementary data are available at Bioinformatics online.

2019 ◽  
Author(s):  
Jouni Sirén ◽  
Erik Garrison ◽  
Adam M. Novak ◽  
Benedict Paten ◽  
Richard Durbin

Abstract. Motivation: The variation graph toolkit (VG) represents genetic variation as a graph. Although each path in the graph is a potential haplotype, most paths are nonbiological, unlikely recombinations of true haplotypes. Results: We augment the VG model with haplotype information to identify which paths are more likely to exist in nature. For this purpose, we develop a scalable implementation of the graph extension of the positional Burrows–Wheeler transform (GBWT). We demonstrate the scalability of the new implementation by building a whole-genome index of the 5,008 haplotypes of the 1000 Genomes Project, and an index of all 108,070 TOPMed Freeze 5 chromosome 17 haplotypes. We also develop an algorithm for simplifying variation graphs for k-mer indexing without losing any k-mers in the haplotypes. Availability: Our software is available at https://github.com/vgteam/vg, https://github.com/jltsiren/gbwt, and https://github.com/jltsiren/[…]. Supplementary information: Supplementary data are available.
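
The positional Burrows–Wheeler transform that the GBWT extends to graphs can be illustrated on a plain panel of biallelic haplotypes. The following is a minimal Python sketch of the ordinary (non-graph) PBWT ordering step, not the GBWT implementation in vg; the function name `pbwt_orders` and the toy panel are assumptions made for illustration only.

```python
from typing import List, Sequence


def pbwt_orders(haplotypes: Sequence[Sequence[int]]) -> List[List[int]]:
    """Return the haplotype order before each site and after the last one.

    orders[k] lists haplotype indices sorted by their reversed prefix
    (sites k-1, k-2, ..., 0), which is the ordering the PBWT maintains.
    Alleles are assumed to be 0 or 1.
    """
    if not haplotypes:
        return [[]]
    order = list(range(len(haplotypes)))
    orders = [order[:]]
    for site in range(len(haplotypes[0])):
        # Stable partition by the allele at this site: 0s first, then 1s.
        zeros = [h for h in order if haplotypes[h][site] == 0]
        ones = [h for h in order if haplotypes[h][site] == 1]
        order = zeros + ones
        orders.append(order[:])
    return orders


# Toy panel (hypothetical data): 4 haplotypes, 3 biallelic sites.
panel = [
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [0, 1, 0],
]
orders = pbwt_orders(panel)
print(orders[-1])
```

Haplotypes that share a long suffix up to a given site end up adjacent in the order, which is what makes haplotype matching against the panel efficient.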


2021 ◽  
Vol 36 (Supplement_1) ◽  
Author(s):  
A Biricik ◽  
V Bianchi ◽  
F Lecciso ◽  
M Surdo ◽  
M Manno ◽  
...  

Abstract. Study question: To explore ploidy concordance between invasive PGT-A and non-invasive PGT-A (niPGT-A) at different embryo culture times.
Summary answer: A high concordance rate for ploidy and sex (>84%), sensitivity (>88%), and specificity (76%) were obtained for both day 6/7 samples and day 5 samples.
What is known already: Analysis of the embryo cell-free DNA (cfDNA) released into culture media during in vitro embryo development has the potential to evaluate embryo ploidy status. However, obtaining cfDNA of sufficient quality and quantity is essential to achieve interpretable results for niPGT-A. Longer culture time is expected to release proportionally more cfDNA, but embryo culture time is limited by the in vitro survival potential of the embryo. It is therefore important to estimate the culture duration that yields the maximum cfDNA without adversely affecting embryo development.
Study design, size, duration: A total of 105 spent culture media (SCM) samples from day 5 to day 7 blastocyst-stage embryos were included in this cohort study. The cfDNA of the SCM samples was amplified and analyzed for niPGT-A by NGS. The SCM samples were divided into two subgroups according to embryo culture time (day 5 group and day 6/7 group). DNA concentration, informativity and euploidy results were then compared with those of the corresponding embryos after trophectoderm (TE) biopsy and PGT-A analysis by NGS.
Participants/materials, setting, methods: Embryos cultured until day 3 were washed and cultured again in 20 µl of fresh culture media until embryo biopsy on day 5, 6, or 7. After biopsy, SCM samples were immediately collected in PCR tubes and stored at –20 °C until whole genome amplification by MALBAC® (Yicon Genomics). The TE and SCM samples were analyzed by next-generation sequencing (NGS) using the Illumina MiSeq® System. NGS data analysis was performed with Bluefuse Multi Software 4.5 (Illumina) for both SCM and TE samples.
Main results and the role of chance: Only SCM samples whose corresponding embryo had a conclusive result were included in this cohort (n = 105). Overall, 97.1% (102/105) of SCM samples gave a successful DNA amplification, with concentrations ranging from 32.4 to 128.5 ng/µl. Non-informative (NI) results, including chaotic profiles (>5 chromosome aneuploidies), were observed in 17 samples, so 83.3% (85/102) of SCM samples were informative for NGS data analysis. The ploidy concordance rate with the corresponding TE biopsies (euploid vs euploid, aneuploid vs aneuploid) was 84.7% (72/85). Sensitivity and specificity were 92.8% and 76.7%, respectively, with no significant difference in any parameter between day 6/7 samples and day 5 samples. The false-negative rate was 3.5% (3/85), and the false-positive rate was 11.7% (10/85).
Limitations, reasons for caution: The sample size is relatively small, and larger prospective studies are needed. As this is a single-center study, the impact of variations in embryo culture conditions may be underestimated. Maternal DNA contamination cannot be detected in SCM, so the use of molecular markers would increase reliability.
Wider implications of the findings: Non-invasive analysis of embryo cfDNA in spent culture media demonstrates high concordance with TE biopsy results at both early and late culture times. A non-invasive approach to aneuploidy screening offers important advantages, such as avoiding invasive embryo biopsy and reducing cost, potentially increasing accessibility for a wider patient population.
Trial registration number: Not applicable.
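
The rates reported in the main results can be checked with a few lines of arithmetic. The Python sketch below uses only counts stated in the abstract, with one exception: the abstract does not give the split of the 72 concordant calls into true positives and true negatives, so the value of 39 true positives is an assumption, chosen because it reproduces the reported sensitivity and specificity to within rounding.

```python
# Counts stated in the abstract.
informative = 85        # SCM samples informative for NGS analysis
concordant = 72         # ploidy calls agreeing with the TE biopsy
false_negatives = 3
false_positives = 10


def pct(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage."""
    return 100.0 * numerator / denominator


concordance = pct(concordant, informative)
fn_rate = pct(false_negatives, informative)
fp_rate = pct(false_positives, informative)

# ASSUMPTION: the TP/TN split is not stated in the abstract; 39/33 is inferred.
true_positives = 39
true_negatives = concordant - true_positives

sensitivity = pct(true_positives, true_positives + false_negatives)
specificity = pct(true_negatives, true_negatives + false_positives)

for name, value in [
    ("concordance", concordance),
    ("false-negative rate", fn_rate),
    ("false-positive rate", fp_rate),
    ("sensitivity", sensitivity),
    ("specificity", specificity),
]:
    print(f"{name}: {value:.2f}%")
```

Note that the false-negative and false-positive rates in the abstract are expressed as fractions of all 85 informative samples, whereas sensitivity and specificity use the aneuploid and euploid subsets as denominators.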


2017 ◽  
Author(s):  
Baekdoo Kim ◽  
Thahmina Ali ◽  
Carlos Lijeron ◽  
Enis Afgan ◽  
Konstantinos Krampis

Abstract. Background: Processing of Next-Generation Sequencing (NGS) data requires significant technical skills, involving installation, configuration, and execution of bioinformatics data pipelines, in addition to specialized post-analysis visualization and data mining software. To address some of these challenges, developers have leveraged virtualization containers for seamless deployment of preconfigured bioinformatics software and pipelines on any computational platform. Findings: We present an approach for abstracting the complex data operations of multi-step bioinformatics pipelines for NGS data analysis. As examples, we have deployed two pipelines for RNAseq and CHIPseq, pre-configured within Docker virtualization containers we call Bio-Docklets. Each Bio-Docklet exposes a single data input and output endpoint, and from a user perspective, running the pipelines is as simple as running a single bioinformatics tool. This is achieved through a "meta-script" that automatically starts the Bio-Docklets and controls the pipeline execution through the BioBlend software library and the Galaxy Application Programming Interface (API). The pipeline output is post-processed using the Visual Omics Explorer (VOE) framework, providing interactive data visualizations that users can access through a web browser. Conclusions: The goal of our approach is to enable easy access to NGS data analysis pipelines for non-bioinformatics experts, on any computing environment, whether a laboratory workstation, university computer cluster, or a cloud service provider. Besides end-users, the Bio-Docklets also enable developers to programmatically deploy and run a large number of pipeline instances for concurrent analysis of multiple datasets.
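
The idea of a container that exposes a single input and a single output endpoint can be sketched as a small launcher that builds a `docker run` command. This is not the authors' meta-script, and it does not touch BioBlend or the Galaxy API; the image name, the container mount points `/data/input` and `/data/output`, and the function name `docklet_command` are all hypothetical.

```python
import os
import shlex
from typing import List


def docklet_command(image: str, input_dir: str, output_dir: str) -> List[str]:
    """Build a `docker run` command with one input and one output mount.

    The container paths /data/input and /data/output are assumed names,
    not endpoints defined by the Bio-Docklets project.
    """
    return [
        "docker", "run", "--rm",
        # Input is mounted read-only so the pipeline cannot modify raw data.
        "-v", f"{os.path.abspath(input_dir)}:/data/input:ro",
        "-v", f"{os.path.abspath(output_dir)}:/data/output",
        image,
    ]


# Hypothetical image name, used only to show the shape of the command.
cmd = docklet_command("example/rnaseq-docklet:latest", "/tmp/reads", "/tmp/results")
print(shlex.join(cmd))
```

Passing the resulting list to `subprocess.run(cmd, check=True)` would start the container; building the command separately keeps it easy to log and to test without Docker installed.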


2018 ◽  
Author(s):  
Leandro Gabriel Roser ◽  
Fernán Agüero ◽  
Daniel Oscar Sánchez

Abstract. Background: Exploration and processing of FASTQ files are the first steps in state-of-the-art data analysis workflows for Next Generation Sequencing (NGS) platforms. The large amount of data generated by these technologies poses a challenge for rapid analysis and visualization of sequencing information. Recent integration of the R data analysis platform with web visual frameworks has stimulated the development of user-friendly, powerful, and dynamic NGS data analysis applications. Results: This paper presents FastqCleaner, a Bioconductor visual application for both quality control (QC) and pre-processing of FASTQ files. The interface shows diagnostic information for the input and output data and allows the user to select a series of filtering and trimming operations in an interactive framework. FastqCleaner combines the technology of Bioconductor for NGS data analysis with the data visualization advantages of a web environment. Conclusions: FastqCleaner is a user-friendly, offline-capable tool that enables access to advanced Bioconductor infrastructure. The novel concept of a Bioconductor interactive application that can be used without programming skills makes FastqCleaner a valuable resource for NGS data analysis.
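
The kind of trimming and filtering operation described above can be illustrated with a short, self-contained Python sketch. It is not FastqCleaner's code (which is an R/Bioconductor application); it assumes four-line FASTQ records with Phred+33 quality encoding, and the function names and thresholds are illustrative choices.

```python
from typing import Iterable, Iterator, Tuple

Record = Tuple[str, str, str]  # (header, sequence, quality string)


def read_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Yield records from four-line FASTQ text, skipping blank lines."""
    stripped = (line.strip() for line in lines)
    nonblank = (line for line in stripped if line)
    # Consume the same iterator four times per record; the '+' line is ignored.
    for header, seq, _plus, qual in zip(nonblank, nonblank, nonblank, nonblank):
        yield header, seq, qual


def trim_3prime(seq: str, qual: str, min_q: int = 20, offset: int = 33) -> Tuple[str, str]:
    """Remove trailing bases whose Phred score is below min_q."""
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


def clean(records: Iterable[Record], min_q: int = 20, min_len: int = 4) -> Iterator[Record]:
    """Trim each read at the 3' end, then drop reads shorter than min_len."""
    for header, seq, qual in records:
        seq, qual = trim_3prime(seq, qual, min_q)
        if len(seq) >= min_len:
            yield header, seq, qual


# Two toy reads: 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
fastq_text = "@r1\nACGTACGT\n+\nIIIIII##\n@r2\nACGT\n+\n####\n"
cleaned = list(clean(read_fastq(fastq_text.splitlines())))
print(cleaned)
```

In this toy input the first read loses its two low-quality trailing bases and the second read is discarded because nothing remains after trimming.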


2022 ◽  
Author(s):  
Andreas B Diendorfer ◽  
Kseniya Khamina ◽  
Marianne Pultar

miND is an NGS data analysis pipeline for small RNA sequencing data. In this protocol, the pipeline is set up and run on an AWS EC2 instance with example data from a public repository. Please see the publication on F1000 for more details on the pipeline and how to use it.


2019 ◽  
Vol 36 (5) ◽  
pp. 1647-1648 ◽  
Author(s):  
Bilal Wajid ◽  
Hasan Iqbal ◽  
Momina Jamil ◽  
Hafsa Rafique ◽  
Faria Anwar

Abstract. Motivation: Metabolomics is a field of data analysis and interpretation that studies the functions of small molecules within an organism. Consequently, metabolomics requires life-science researchers to be comfortable downloading, installing and scripting software that is mostly not user-friendly and lacks basic GUIs. As researchers struggle with these skills, there is a pressing need for software packages that automatically install software pipelines, shortening the time needed to build software workstations. This paper therefore presents MetumpX, a software package that eases the installation of 103 software tools by automatically resolving their individual dependencies, while also allowing users to choose which software works best for them. Results: MetumpX is an Ubuntu-based software package that facilitates easy download and installation of 103 tools spread across the standard metabolomics pipeline. As far as the authors know, MetumpX is the only solution of its kind that focuses on automating the development of software workstations. Availability and implementation: https://github.com/hasaniqbal777/MetumpX-bin. Supplementary information: Supplementary data are available at Bioinformatics online.
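
Automatic dependency resolution of the kind mentioned above is commonly modeled as a topological sort over a dependency graph. The sketch below shows that general technique with Python's standard `graphlib` module (Python 3.9 or later); it is not MetumpX's implementation, and the package names and the `install_order` helper are hypothetical.

```python
from graphlib import TopologicalSorter
from typing import Dict, Iterable, List, Set

# Hypothetical dependency table: each package maps to what it requires.
dependencies: Dict[str, Set[str]] = {
    "tool-a": {"lib-x", "lib-y"},
    "tool-b": {"lib-y"},
    "lib-y": {"lib-z"},
    "lib-x": set(),
    "lib-z": set(),
}


def install_order(deps: Dict[str, Set[str]], selected: Iterable[str]) -> List[str]:
    """Return an order that installs every dependency before its dependents.

    Only the selected packages and their transitive dependencies are included.
    """
    needed: Set[str] = set()
    stack = list(selected)
    while stack:
        pkg = stack.pop()
        if pkg not in needed:
            needed.add(pkg)
            stack.extend(deps.get(pkg, ()))
    graph = {pkg: deps.get(pkg, set()) for pkg in needed}
    # static_order() raises graphlib.CycleError if dependencies are circular.
    return list(TopologicalSorter(graph).static_order())


order = install_order(dependencies, ["tool-b"])
print(order)
```

Restricting the graph to the user's selection mirrors the idea of letting users choose which tools to install while still pulling in whatever those tools require.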


2015 ◽  
Vol 51 ◽  
pp. 2859-2863 ◽  
Author(s):  
Milko Krachunov ◽  
Dimitar Vassilev ◽  
Maria Nisheva ◽  
Ognyan Kulev ◽  
Valeriya Simeonova ◽  
...  

2017 ◽  
Author(s):  
Robert J. Vickerstaff ◽  
Richard J. Harrison

Abstract. Summary: Crosslink is genetic mapping software for outcrossing species, designed to run efficiently on large datasets by combining the best of existing tools with novel approaches. Tests show it runs much faster than several comparable programs whilst retaining similar accuracy. Availability and implementation: Available under the GNU General Public License version 2 from https://github.com/eastmallingresearch/[…]. Supplementary information: Supplementary data are available at Bioinformatics online and from https://github.com/eastmallingresearch/crosslink/releases/tag/v0.5.


2018 ◽  
Author(s):  
John A Lees ◽  
Marco Galardini ◽  
Stephen D Bentley ◽  
Jeffrey N Weiser ◽  
Jukka Corander

Abstract. Summary: Genome-wide association studies (GWAS) in microbes face different challenges from those in eukaryotes, and these have been addressed by a number of different methods. pyseer brings these techniques together in one package tailored to microbial GWAS, allows greater flexibility in the input data used, and adds new methods to interpret the association results. Availability and implementation: pyseer is written in Python and is freely available at https://github.com/mgalardini/pyseer, or can be installed through pip. Documentation and a tutorial are available at http://[…]. Supplementary information: Supplementary data are available online.

