Bio-Docklets: Virtualization Containers for Single-Step Execution of NGS Pipelines

2017 ◽  
Author(s):  
Baekdoo Kim ◽  
Thahmina Ali ◽  
Carlos Lijeron ◽  
Enis Afgan ◽  
Konstantinos Krampis

ABSTRACT. Background: Processing of Next-Generation Sequencing (NGS) data requires significant technical skills, involving installation, configuration, and execution of bioinformatics data pipelines, in addition to specialized post-analysis visualization and data mining software. To address some of these challenges, developers have leveraged virtualization containers for seamless deployment of preconfigured bioinformatics software and pipelines on any computational platform. Findings: We present an approach for abstracting the complex data operations of multi-step bioinformatics pipelines for NGS data analysis. As examples, we have deployed two pipelines for RNAseq and CHIPseq, pre-configured within Docker virtualization containers we call Bio-Docklets. Each Bio-Docklet exposes a single data input and output endpoint, and from a user perspective, running the pipelines is as simple as running a single bioinformatics tool. This is achieved through a “meta-script” that automatically starts the Bio-Docklets and controls the pipeline execution through the BioBlend software library and the Galaxy Application Programming Interface (API). The pipeline output is post-processed using the Visual Omics Explorer (VOE) framework, providing interactive data visualizations that users can access through a web browser. Conclusions: The goal of our approach is to enable easy access to NGS data analysis pipelines for non-bioinformatics experts, on any computing environment, whether a laboratory workstation, university computer cluster, or a cloud service provider. Besides end-users, the Bio-Docklets also enable developers to programmatically deploy and run a large number of pipeline instances for concurrent analysis of multiple datasets.
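The "meta-script" idea described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' script: the image name `example/rnaseq-docklet`, the container paths `/data/input` and `/data/output`, the container port 80, the history name, and the workflow input index `"0"` are all hypothetical placeholders. The BioBlend function needs the third-party `bioblend` package and a running Galaxy server, so it is defined here but not called.

```python
import shlex


def build_docker_run(image, input_dir, output_dir, host_port=8080):
    """Build a `docker run` command that starts one pipeline container.

    Image name, container paths and container port are hypothetical;
    they are not taken from the Bio-Docklets paper.
    """
    return [
        "docker", "run", "--detach",
        "--publish", f"{host_port}:80",              # assumed Galaxy web/API port
        "--volume", f"{input_dir}:/data/input:ro",   # single input endpoint
        "--volume", f"{output_dir}:/data/output",    # single output endpoint
        image,
    ]


def run_workflow(galaxy_url, api_key, workflow_id, fastq_path):
    """Upload one file and invoke a Galaxy workflow through BioBlend.

    Assumes the workflow takes its dataset at input index "0".
    """
    from bioblend.galaxy import GalaxyInstance  # third-party, imported lazily

    gi = GalaxyInstance(url=galaxy_url, key=api_key)
    history = gi.histories.create_history(name="bio-docklet-run")
    upload = gi.tools.upload_file(fastq_path, history["id"])
    dataset_id = upload["outputs"][0]["id"]
    inputs = {"0": {"id": dataset_id, "src": "hda"}}
    return gi.workflows.invoke_workflow(
        workflow_id, inputs=inputs, history_id=history["id"]
    )


cmd = build_docker_run(
    "example/rnaseq-docklet", "/home/user/reads", "/home/user/results"
)
print(shlex.join(cmd))
```

In a real script the command list would be handed to `subprocess.run(cmd, check=True)`, and `run_workflow` would be called once the Galaxy instance inside the container responds.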

2021 ◽  
Vol 36 (Supplement_1) ◽  
Author(s):  
A Biricik ◽  
V Bianchi ◽  
F Lecciso ◽  
M Surdo ◽  
M Manno ◽  
...  

Abstract. Study question: To explore ploidy concordance between invasive PGT-A and non-invasive PGT-A (niPGT-A) at different embryo culture times. Summary answer: A high concordance rate (>84%) for ploidy and sex, together with high sensitivity (>88%) and specificity (76%), was obtained for both day 6/7 samples and day 5 samples. What is known already: The analysis of embryo cell-free DNA (cfDNA) released into the culture medium during in vitro embryo development has the potential to evaluate embryo ploidy status. However, obtaining cfDNA of sufficient quality and quantity is essential to achieve interpretable results for niPGT-A. The amount of cfDNA released is expected to be directly proportional to culture time, but embryo culture time is limited by in vitro embryo survival potential. Therefore, it is important to estimate the culture duration that will provide the maximum cfDNA obtainable without adversely affecting the development of the embryo. Study design, size, duration: A total of 105 spent culture media (SCM) samples from day 5 to day 7 blastocyst-stage embryos were included in this cohort study. The cfDNA of the SCM samples was amplified and analyzed for niPGT-A by NGS analysis. The SCM samples were divided into 2 subgroups according to the embryo culture hours (day 5 and day 6/7 groups). The DNA concentration, informativity and euploidy results were then compared with those of their corresponding embryos after trophectoderm (TE) biopsy and PGT-A analysis by NGS. Participants/materials, setting, methods: Embryos cultured until day 3 were washed and cultured again in 20 µl of fresh culture medium until embryo biopsy on day 5, 6, or 7. After biopsy, SCM samples were immediately collected in PCR tubes and stored at –20 °C until whole genome amplification by MALBAC® (Yicon Genomics). The TE and SCM samples were analyzed by next-generation sequencing (NGS) using the Illumina MiSeq® System. NGS data analysis was done with Bluefuse Multi Software 4.5 (Illumina) for SCM and TE samples. Main results and the role of chance: Only the SCM samples whose corresponding embryo had a conclusive result were included in this cohort (n = 105). Overall, 97.1% (102/105) of SCM samples gave a successful DNA amplification, with a concentration ranging 32.4–128.5 ng/µl. Non-informative (NI) results, including a chaotic profile (>5 chromosome aneuploidies), were observed in 17 samples, so 83.3% (85/102) of SCM samples were informative for NGS data analysis. The ploidy concordance rate with the corresponding TE biopsies (euploid vs euploid, aneuploid vs aneuploid) was 84.7% (72/85). Sensitivity and specificity were 92.8% and 76.7%, respectively, with no significant difference for any parameter between day 6/7 samples and day 5 samples. The false-negative rate was 3.5% (3/85), and the false-positive rate was 11.7% (10/85). Limitations, reasons for caution: The sample size is relatively small. Larger prospective studies are needed. As this is a single-center study, the impact of variations in embryo culture conditions may be underestimated. Maternal DNA contamination cannot be detected in SCM, therefore the use of molecular markers would increase the reliability. Wider implications of the findings: Non-invasive analysis of embryo cfDNA in spent culture media demonstrates high concordance with TE biopsy results at both early and late culture times. A non-invasive approach for aneuploidy screening offers important advantages such as avoiding invasive embryo biopsy and decreased cost, potentially increasing accessibility for a wider patient population. Trial registration number: Not applicable.
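The reported rates can be checked with a short calculation. The abstract states only the totals (85 informative samples, 72 concordant, 3 false negatives, 10 false positives); the split of the 72 concordant samples into 39 true positives and 33 true negatives below is an inference that matches the reported sensitivity and specificity, not a figure given by the authors. Treat it as an assumption.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return concordance, sensitivity, specificity and error shares.

    The false-negative and false-positive shares are expressed relative
    to all informative samples, as in the abstract (3/85 and 10/85).
    """
    total = tp + fn + tn + fp
    return {
        "concordance": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "false_negative_share": fn / total,
        "false_positive_share": fp / total,
    }


# tp and tn are inferred (assumption); fn, fp and the total come from the abstract.
metrics = diagnostic_metrics(tp=39, fn=3, tn=33, fp=10)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Under these counts the results agree with the abstract to within last-digit rounding (for example, 39/42 is about 92.86% and 10/85 is about 11.76%).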


2018 ◽  
Author(s):  
Leandro Gabriel Roser ◽  
Fernán Agüero ◽  
Daniel Oscar Sánchez

Abstract. Background: Exploration and processing of FASTQ files are the first steps in state-of-the-art data analysis workflows of Next Generation Sequencing (NGS) platforms. The large amount of data generated by these technologies poses a challenge for rapid analysis and visualization of sequencing information. Recent integration of the R data analysis platform with web visual frameworks has stimulated the development of user-friendly, powerful, and dynamic NGS data analysis applications. Results: This paper presents FastqCleaner, a Bioconductor visual application for both quality control (QC) and pre-processing of FASTQ files. The interface shows diagnostic information for the input and output data and allows users to select a series of filtering and trimming operations in an interactive framework. FastqCleaner combines the technology of Bioconductor for NGS data analysis with the data visualization advantages of a web environment. Conclusions: FastqCleaner is a user-friendly, offline-capable tool that enables access to advanced Bioconductor infrastructure. The novel concept of a Bioconductor interactive application that can be used without programming skills makes FastqCleaner a valuable resource for NGS data analysis.
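To make "filtering and trimming operations" concrete, here is a minimal sketch of one common pair of operations: trimming low-quality bases from the 3' end of each read, then dropping reads that become too short. It is written in plain Python for illustration and is not FastqCleaner's implementation (which is built on R/Bioconductor). The Phred+33 encoding, the thresholds, and the two example reads are assumptions.

```python
def phred_scores(quality, offset=33):
    """Decode a FASTQ quality string (Phred+33 assumed) into integers."""
    return [ord(ch) - offset for ch in quality]


def trim_3prime(sequence, quality, min_quality=20):
    """Remove trailing bases whose quality is below `min_quality`."""
    scores = phred_scores(quality)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_quality:
        end -= 1
    return sequence[:end], quality[:end]


def parse_fastq(lines):
    """Yield (header, sequence, quality) from 4-line FASTQ records."""
    for i in range(0, len(lines) - 3, 4):
        header, seq, _plus, qual = lines[i:i + 4]
        yield header, seq, qual


def clean(lines, min_quality=20, min_length=5):
    """Trim each read, then keep only reads of at least `min_length`."""
    kept = []
    for header, seq, qual in parse_fastq(lines):
        seq, qual = trim_3prime(seq, qual, min_quality)
        if len(seq) >= min_length:
            kept.append((header, seq, qual))
    return kept


# Made-up reads: 'I' encodes Phred 40, '#' encodes Phred 2.
example = [
    "@read1", "ACGTACGTAC", "+", "IIIIIIII##",
    "@read2", "ACGT", "+", "####",
]
kept = clean(example)
print(kept)
```

Here the first read loses its two low-quality trailing bases and the second read is discarded because nothing of sufficient quality remains.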


2022 ◽  
Author(s):  
Andreas B Diendorfer ◽  
Kseniya Khamina ◽  
Marianne Pultar

miND is an NGS data analysis pipeline for smallRNA sequencing data. In this protocol, the pipeline is set up and run on an AWS EC2 instance with example data from a public repository. Please see the publication on F1000 for more details on the pipeline and how to use it.


2015 ◽  
Vol 51 ◽  
pp. 2859-2863 ◽  
Author(s):  
Milko Krachunov ◽  
Dimitar Vassilev ◽  
Maria Nisheva ◽  
Ognyan Kulev ◽  
Valeriya Simeonova ◽  
...  

Pathogens ◽  
2021 ◽  
Vol 10 (8) ◽  
pp. 1026
Author(s):  
Jane Shen-Gunther ◽  
Qingqing Xia ◽  
Hong Cai ◽  
Yufeng Wang

Next-generation sequencing (NGS) has enabled human papillomavirus (HPV) virome profiling for in-depth investigation of viral evolution and pathogenesis. However, viral computational analysis remains a bottleneck due to semantic discrepancies between computational tools and curated reference genomes. To address this, we developed and tested automated workflows for HPV taxonomic profiling and visualization using a customized papillomavirus database in the CLC Microbial Genomics Module. HPV genomes from Papilloma Virus Episteme were customized and incorporated into CLC “ready-to-use” workflows for stepwise data processing, comprising: (1) Taxonomic Analysis, (2) Estimate Alpha/Beta Diversities, and (3) Map Reads to Reference. Low-grade (n = 95) and high-grade (n = 60) Pap smears were tested with the following collective runtimes: Taxonomic Analysis (36 min); Alpha/Beta Diversities (5 s); Map Reads (45 min). Converting tabular output to visualizations required 1–2 keystrokes. Biodiversity analysis comparing low-grade (LSIL) and high-grade squamous intraepithelial lesions (HSIL) revealed loss of species richness and gain of dominance by HPV-16 in HSIL. Integrating clinically relevant, taxonomized HPV reference genomes within automated workflows proved to be an ultra-fast method of virome profiling. The entire process, named “HPV DeepSeq”, provides a simple, accurate and practical means of NGS data analysis for a broad range of applications in viral research.
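The notions of "species richness" and "dominance" used in this abstract can be illustrated with a small alpha-diversity calculation. This is a generic sketch, not the CLC Microbial Genomics Module's method: the abstract does not say which indices were used, so richness, the Shannon index and the Berger-Parker dominance index are chosen here as common examples. The read counts are invented for illustration and are not data from the study.

```python
import math


def alpha_diversity(counts):
    """Compute simple alpha-diversity measures from per-type read counts."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("counts must contain at least one read")
    proportions = [c / total for c in counts.values() if c > 0]
    return {
        "richness": len(proportions),                            # types observed
        "shannon": -sum(p * math.log(p) for p in proportions),   # evenness-aware
        "dominance": max(proportions),                           # Berger-Parker
    }


# Invented toy counts (NOT study data): a mixed sample and an HPV-16-dominated one.
lsil = alpha_diversity({"HPV-16": 40, "HPV-18": 30, "HPV-31": 20, "HPV-52": 10})
hsil = alpha_diversity({"HPV-16": 90, "HPV-18": 10})
print(lsil)
print(hsil)
```

In this toy case the second sample has fewer types and a larger share held by its most abundant type, which is the pattern the abstract describes for HSIL.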


2020 ◽  
Vol 21 (11) ◽  
pp. 3828
Author(s):  
Omer An ◽  
Kar-Tong Tan ◽  
Ying Li ◽  
Jia Li ◽  
Chan-Shuo Wu ◽  
...  

Next-generation sequencing (NGS) has been a widely used technology in biomedical research for understanding the role of molecular genetics of cells in health and disease. A variety of computational tools have been developed to analyse the rapidly growing volume of NGS data, and these often require bioinformatics skills, tedious work and a significant amount of time. To facilitate data processing and bridge the gap between biologists and bioinformaticians, we developed CSI NGS Portal, an online platform which gathers established bioinformatics pipelines to provide fully automated NGS data analysis and sharing in a user-friendly website. The portal currently provides 16 standard pipelines for analysing data from DNA, RNA, smallRNA, ChIP, RIP, 4C, SHAPE, circRNA, eCLIP, Bisulfite and scRNA sequencing, and can be expanded with new pipelines. Users can upload raw data in FASTQ format and submit jobs in a few clicks, and the results are accessible via the portal to view, download or share in real time. The output can be readily used as the final report or as input for other tools, depending on the pipeline. Overall, CSI NGS Portal helps researchers rapidly analyse their NGS data and share results with colleagues without the aid of a bioinformatician. The portal is freely available at: https://csibioinfo.nus.edu.sg/csingsportal.


2012 ◽  
Vol 17 (B) ◽  
pp. 15
Author(s):  
Jose R. Valverde ◽  
Jose M. Rodríguez ◽  
Alexandro Rodriguez-Rojas ◽  
Alejandro Couce ◽  
Jesus Blazquez

2020 ◽  
Vol 17 (1) ◽  
Author(s):  
Valeria Caputo ◽  
Roberta Antonia Diotti ◽  
Enzo Boeri ◽  
Hamid Hasson ◽  
Michela Sampaolo ◽  
...  

2020 ◽  
Vol 244 ◽  
pp. 19-20
Author(s):  
Nikhil Sahajpal ◽  
Ashis Mondal ◽  
Allan Njau ◽  
Meenakshi Ahluwalia ◽  
Nwogbo Okechukwu ◽  
...  
