scholarly journals FastqCleaner: an interactive Bioconductor application for quality-control, filtering and trimming of FASTQ files

2018 ◽  
Author(s):  
Leandro Gabriel Roser ◽  
Fernán Agüero ◽  
Daniel Oscar Sánchez

AbstractBackgroundExploration and processing of FASTQ files are the first steps in state-of-the-art data analysis workflows of Next Generation Sequencing (NGS) platforms. The large amount of data generated by these technologies has put a challenge in terms of rapid analysis and visualization of sequencing information. Recent integration of the R data analysis platform with web visual frameworks has stimulated the development of user-friendly, powerful, and dynamic NGS data analysis applications.ResultsThis paper presents FastqCleaner, a Bioconductor visual application for both quality-control (QC) and pre-processing of FASTQ files. The interface shows diagnostic information for the input and output data and allows to select a series of filtering and trimming operations in an interactive framework. FastqCleaner combines the technology of Bioconductor for NGS data analysis with the data visualization advantages of a web environment.ConclusionsFastqCleaner is an user-friendly, offline-capable tool that enables access to advanced Bioconductor infrastructure. The novel concept of a Bioconductor interactive application that can be used without the need for programming skills, makes FastqCleaner a valuable resource for NGS data analysis.

2020 ◽  
Vol 21 (11) ◽  
pp. 3828
Author(s):  
Omer An ◽  
Kar-Tong Tan ◽  
Ying Li ◽  
Jia Li ◽  
Chan-Shuo Wu ◽  
...  

Next-generation sequencing (NGS) has been a widely-used technology in biomedical research for understanding the role of molecular genetics of cells in health and disease. A variety of computational tools have been developed to analyse the vastly growing NGS data, which often require bioinformatics skills, tedious work and a significant amount of time. To facilitate data processing steps minding the gap between biologists and bioinformaticians, we developed CSI NGS Portal, an online platform which gathers established bioinformatics pipelines to provide fully automated NGS data analysis and sharing in a user-friendly website. The portal currently provides 16 standard pipelines for analysing data from DNA, RNA, smallRNA, ChIP, RIP, 4C, SHAPE, circRNA, eCLIP, Bisulfite and scRNA sequencing, and is flexible to expand with new pipelines. The users can upload raw data in FASTQ format and submit jobs in a few clicks, and the results will be self-accessible via the portal to view/download/share in real-time. The output can be readily used as the final report or as input for other tools depending on the pipeline. Overall, CSI NGS Portal helps researchers rapidly analyse their NGS data and share results with colleagues without the aid of a bioinformatician. The portal is freely available at: https://csibioinfo.nus.edu.sg/csingsportal.


Author(s):  
Ömer An ◽  
Kar-Tong Tan ◽  
Ying Li ◽  
Jia Li ◽  
Chan-Shuo Wu ◽  
...  

Next-generation sequencing (NGS) has been a widely-used technology in biomedical research for understanding the role of molecular genetics of cells in health and disease. A variety of computational tools have been developed to analyse the vastly growing NGS data, however, they often require bioinformatics skills and tedious work to handle with. Moreover, processing raw data such as genome alignment and expression quantification consume a significant amount of time before having the data ready for downstream analyses. To facilitate data processing steps minding the gap between biologists and bioinformaticians, we developed CSI NGS Portal, an online platform which gathers established bioinformatics pipelines to provide fully automated NGS data analysis and sharing in a user-friendly website, developed in PHP and JavaScript with an integrated MariaDB database running on a dedicated Linux server. The portal currently provides 14 standard pipelines for analysing NGS data from DNA, RNA, smallRNA, ChIP, RIP, 4C, SHAPE, circRNA, eCLIP and Bisulfite sequencing, and is flexible to expand with new and customised pipelines. The users can upload raw data in fastq format and submit jobs for the desired analyses in a few clicks, and the results will be self-accessible via the portal to view/download/share in real-time. The output can be readily used as the final report or as an input for other tools depending on the pipeline. Overall, CSI NGS Portal helps researchers rapidly analyse their NGS data, share with colleagues and keep it organised without the aid of a bioinformatician. The website is freely available at: https://csibioinfo.nus.edu.sg/csingsportal


Author(s):  
Ömer An ◽  
Kar-Tong Tan ◽  
Ying Li ◽  
Jia Li ◽  
Chan-Shuo Wu ◽  
...  

Next-generation sequencing (NGS) has been a widely-used technology in biomedical research for understanding the role of molecular genetics of cells in health and disease. A variety of computational tools have been developed to analyse the vastly growing NGS data, which often require bioinformatics skills, tedious work and significant amount of time. To facilitate data processing steps minding the gap between biologists and bioinformaticians, we developed CSI NGS Portal, an online platform which gathers established bioinformatics pipelines to provide fully automated NGS data analysis and sharing in a user-friendly website. The portal currently provides 16 standard pipelines for analysing data from DNA, RNA, smallRNA, ChIP, RIP, 4C, SHAPE, circRNA, eCLIP, Bisulfite and scRNA sequencing, and is flexible to expand with new pipelines. The users can upload raw data in fastq format and submit jobs in a few clicks, and the results will be self-accessible via the portal to view/download/share in real-time. The output can be readily used as the final report or as input for other tools depending on the pipeline. Overall, CSI NGS Portal helps researchers rapidly analyse their NGS data and share results with colleagues without the aid of a bioinformatician. The portal is freely available at: https://csibioinfo.nus.edu.sg/csingsportal


2018 ◽  
Author(s):  
Yi Zhang ◽  
Mohith Manjunath ◽  
Yeonsung Kim ◽  
Joerg Heintz ◽  
Jun S. Song

AbstractNext-generation sequencing (NGS) techniques are revolutionizing biomedical research by providing powerful methods for generating genomic and epigenomic profiles. The rapid progress is posing an acute challenge to students and researchers to stay acquainted with the numerous available methods. We have developed an interactive online educational resource called SequencEnG (acronym for Sequencing Techniques Engine for Genomics) to provide a tree-structured knowledge base of 66 different sequencing techniques and step-by-step NGS data analysis pipelines comparing popular tools. SequencEnG is designed to facilitate barrier-free learning of current NGS techniques and provides a user-friendly interface for searching through experimental and analysis methods. SequencEnG is part of the project KnowEnG (Knowledge Engine for Genomics) and is freely available at http://education.knoweng.org/sequenceng/.


PLoS ONE ◽  
2020 ◽  
Vol 15 (12) ◽  
pp. e0243241
Author(s):  
Sebastian Hupfauf ◽  
Mohammad Etemadi ◽  
Marina Fernández-Delgado Juárez ◽  
María Gómez-Brandón ◽  
Heribert Insam ◽  
...  

In recent years, there has been a veritable boost in next-generation sequencing (NGS) of gene amplicons in biological and medical studies. Huge amounts of data are produced and need to be analyzed adequately. Various online and offline analysis tools are available; however, most of them require extensive expertise in computer science or bioinformatics, and often a Linux-based operating system. Here, we introduce “CoMA–Comparative Microbiome Analysis” as a free and intuitive analysis pipeline for amplicon-sequencing data, compatible with any common operating system. Moreover, the tool offers various useful services including data pre-processing, quality checking, clustering to operational taxonomic units (OTUs), taxonomic assignment, data post-processing, data visualization, and statistical appraisal. The workflow results in highly esthetic and publication-ready graphics, as well as output files in standardized formats (e.g. tab-delimited OTU-table, BIOM, NEWICK tree) that can be used for more sophisticated analyses. The CoMA output was validated by a benchmark test, using three mock communities with different sample characteristics (primer set, amplicon length, diversity). The performance was compared with that of Mothur, QIIME and QIIME2-DADA2, popular packages for NGS data analysis. Furthermore, the functionality of CoMA is demonstrated on a practical example, investigating microbial communities from three different soils (grassland, forest, swamp). All tools performed well in the benchmark test and were able to reveal the majority of all genera in the mock communities. Also for the soil samples, the results of CoMA were congruent to those of the other pipelines, in particular when looking at the key microbial players.


Sign in / Sign up

Export Citation Format

Share Document