CoGe LoadExp+: A web-based suite that integrates next-gen sequencing data analysis workflows and visualization

2017 ◽  
Author(s):  
Jeffrey W. Grover ◽  
Matthew Bomhoff ◽  
Sean Davey ◽  
Brian D. Gregory ◽  
Rebecca A. Mosher ◽  
...  

Abstract: To make genomic and epigenomic analyses more widely available to the biological research community, we have created LoadExp+, a suite of bioinformatics workflows integrated with the web-based comparative genomics platform, CoGe. LoadExp+ allows users to perform transcriptomic (RNA-seq), epigenomic (bisulfite-seq), chromatin-binding (ChIP-seq), variant identification (SNPs), and population genetics analyses against any genome in CoGe, including genomes integrated by users themselves. Through LoadExp+’s integration with CoGe’s existing features, all analyses are available for visualization and additional downstream processing, and are available for export to CyVerse’s data management and analysis platforms. LoadExp+ provides easy-to-use functionality to manage genomics and epigenomics data throughout its entire lifecycle and facilitates greater accessibility of genomics analyses to researchers of all skill levels. LoadExp+ can be accessed at https://genomevolution.org.


2019 ◽  
Author(s):  
Ayman Yousif ◽  
Nizar Drou ◽  
Jillian Rowe ◽  
Mohammed Khalfan ◽  
Kristin C Gunsalus

Abstract: Background. As high-throughput sequencing applications continue to evolve, the rapid growth in quantity and variety of sequence-based data calls for the development of new software libraries and tools for data analysis and visualization. Often, effective use of these tools requires computational skills beyond those of many researchers. To ease this computational barrier, we have created a dynamic web-based platform, NASQAR (Nucleic Acid SeQuence Analysis Resource). Results. NASQAR offers a collection of custom and publicly available open-source web applications that make extensive use of a variety of R packages to provide interactive data analysis and visualization. The platform is publicly accessible at http://nasqar.abudhabi.nyu.edu/. Open-source code is on GitHub at https://github.com/nasqar/NASQAR, and the system is also available as a Docker image at https://hub.docker.com/r/aymanm/nasqarall. NASQAR is a collaboration between the core bioinformatics teams of the NYU Abu Dhabi and NYU New York Centers for Genomics and Systems Biology. Conclusions. NASQAR empowers non-programming experts with a versatile and intuitive toolbox to easily and efficiently explore, analyze, and visualize their Transcriptomics data interactively. Popular tools for a variety of applications are currently available, including Transcriptome Data Preprocessing, RNA-seq Analysis (including Single-cell RNA-seq), Metagenomics, and Gene Enrichment.



2017 ◽  
Author(s):  
Li Chen ◽  
James Reeve ◽  
Lujun Zhang ◽  
Shengbing Huang ◽  
Jun Chen

Normalization is the first critical step in microbiome sequencing data analysis and is used to account for variable library sizes. Current RNA-Seq based normalization methods that have been adapted for microbiome data fail to consider the unique characteristics of microbiome data, which contain a vast number of zeros due to the physical absence or under-sampling of the microbes. Normalization methods that specifically address the zero inflation remain largely undeveloped. Here we propose GMPR, a simple but effective normalization method for zero-inflated sequencing data such as microbiome data. Simulation studies and analyses of real datasets demonstrate that the proposed method is more robust than competing methods, leading to more powerful detection of differentially abundant taxa and higher reproducibility of the relative abundances of taxa.



BMC Genomics ◽  
2016 ◽  
Vol 17 (1) ◽  
Author(s):  
Weizhong Li ◽  
R. Alexander Richter ◽  
Yunsup Jung ◽  
Qiyun Zhu ◽  
Robert W. Li


Author(s):  
Konstantinos Krampis ◽  
Claudia Wultsch

Abstract: Research in biology has entered a digital era, where next-generation sequencing instruments generate multiple terabytes of data but are equipped with minimal computational and storage capacity that is not sufficient for large-scale, post-sequencing data analysis. Therefore, scientific value cannot be obtained from investment in a sequencing instrument unless it is also combined with a significant expense for informatics infrastructure. An alternative option for laboratories is to outsource their informatics infrastructure by leasing computational cycles and storage capacity from cloud computing services. Development of cloud-based bioinformatics tool suites can provide users with access to pre-configured software and on-demand computing resources for genomic data analysis, while at the same time lowering the barrier for working with sequencing datasets, leading to broader adoption of genomic technologies for basic biological research. We conclude that, along with the democratization of genome sequencing through the availability of low-cost, bench-top sequencers, cloud computing can in turn democratize access to the computational capacity and informatics infrastructure required for sequencing data analysis.



Cancers ◽  
2021 ◽  
Vol 13 (13) ◽  
pp. 3175
Author(s):  
Luiza Handschuh ◽  
Pawel Wojciechowski ◽  
Maciej Kazmierczak ◽  
Krzysztof Lewandowski

The expression of apoptosis-related BCL2 family genes, fine-tuned in normal cells, is dysregulated in many neoplasms. In acute myeloid leukemia (AML), this problem has not been studied comprehensively. To address this issue, RNA-seq data were used to analyze the expression of 26 BCL2 family members in 27 AML FAB M1 and M2 patients, divided into subgroups that responded differently to chemotherapy. Correlation analysis, analysis of variance, and Kaplan-Meier analysis were applied to associate the expression of particular genes with the expression of other genes, clinical features, and the presence of mutations detected by exome sequencing. The expression of BCL2 family genes was dysregulated in AML compared to healthy controls. An upregulation of anti-apoptotic genes and a downregulation of pro-apoptotic genes were observed, though only the decreases in BMF, BNIP1, and HRK were statistically significant. In the group of patients resistant to chemotherapy, BCL2L1 was overexpressed. In agreement with the literature data, our results reveal that BCL2L1 is one of the key players in apoptosis regulation in different types of tumors. Exome sequencing data analysis indicates that BCL2 family genes are not mutated in AML, but their expression is correlated with the mutational status of other genes, including splicing-related genes and those recurrently mutated in AML. High levels of some BCL2 family members, in particular BIK and BCL2L13, were associated with poor outcome.



2019 ◽  
Author(s):  
Kuan-Hao Chao ◽  
Yi-Wen Hsiao ◽  
Yi-Fang Lee ◽  
Chien-Yueh Lee ◽  
Liang-Chuan Lai ◽  
...  

RNA-Seq analysis has revolutionized researchers' understanding of the transcriptome in biological research. Assessing the differences in transcriptomic profiles between tissue samples or patient groups enables researchers to explore the underlying biological impact of transcription. RNA-Seq analysis requires multiple processing steps and substantial computational resources. There are many well-developed R packages for individual steps; however, few R/Bioconductor packages integrate existing software tools into a comprehensive RNA-Seq analysis and provide fundamental end-to-end results in a pure R environment, so that researchers can quickly and easily extract fundamental information from large sequencing datasets. To address this need, we have developed the open-source R/Bioconductor package RNASeqR. It allows users to run an automated RNA-Seq analysis in only six steps, producing essential tabular and graphical results for further biological interpretation. The features of RNASeqR include a six-step analysis, comprehensive visualization, a background execution version, and the integration of both R and command-line software. RNASeqR provides a fast, lightweight, and easy-to-run RNA-Seq analysis pipeline in a pure R environment. It allows users to efficiently utilize popular software tools, including both R/Bioconductor and command-line tools, without predefining the resources or environments. RNASeqR is freely available for Linux and macOS operating systems from Bioconductor (https://bioconductor.org/packages/release/bioc/html/RNASeqR.html).



2018 ◽  
Author(s):  
Li Chen ◽  
James Reeve ◽  
Lujun Zhang ◽  
Shengbing Huang ◽  
Xuefeng Wang ◽  
...  

Normalization is the first critical step in microbiome sequencing data analysis and is used to account for variable library sizes. Current RNA-Seq based normalization methods that have been adapted for microbiome data fail to consider the unique characteristics of microbiome data, which contain a vast number of zeros due to the physical absence or under-sampling of the microbes. Normalization methods that specifically address the zero inflation remain largely undeveloped. Here we propose GMPR, a simple but effective normalization method for zero-inflated sequencing data such as microbiome data. Simulation studies and analyses of real datasets demonstrate that the proposed method is more robust than competing methods, leading to more powerful detection of differentially abundant taxa and higher reproducibility of the relative abundances of taxa.



2021 ◽  
Vol 12 ◽  
Author(s):  
Ryan Musich ◽  
Lance Cadle-Davidson ◽  
Michael V. Osier

Aligning short-read sequences is the foundational step in most genomic and transcriptomic analyses, but not all tools perform equally, and choosing among the growing body of available tools can be daunting. Here, in order to increase awareness in the research community, we discuss the merits of common algorithms and programs in a way that should be approachable to biologists with limited experience in bioinformatics. We consider only in passing the effects of data cleanup, a precursor analysis to most alignment tools, and give no consideration to downstream processing of the aligned fragments. To compare aligners [Bowtie2, Burrows-Wheeler Aligner (BWA), HISAT2, MUMmer4, STAR, and TopHat2], an RNA-seq dataset was used containing data from 48 geographically distinct samples of the grapevine powdery mildew fungus Erysiphe necator. Based on alignment rate and gene coverage, all aligners performed well with the exception of TopHat2, which HISAT2 superseded. BWA perhaps had the best performance in these metrics, except for longer transcripts (>500 bp), for which HISAT2 and STAR performed well. HISAT2 was ~3-fold faster than the next fastest aligner in runtime, which we consider a secondary factor in most alignments. In the end, this direct comparison of commonly used aligners illustrates key considerations when choosing which tool to use for the specific sequencing data and objectives. No single tool meets all needs for every user, and there are many quality aligners available.



2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Ayman Yousif ◽  
Nizar Drou ◽  
Jillian Rowe ◽  
Mohammed Khalfan ◽  
Kristin C. Gunsalus
