A highly parallel next-generation DNA sequencing data analysis pipeline in Hadoop

Author(s):  
Kareem S. Aggour ◽  
Vijay S. Kumar ◽  
Dipen P. Sangurdekar ◽  
Lee A. Newberg ◽  
Chinnappa D. Kodira
2017 ◽  
Vol 45 (21) ◽  
pp. 12140-12151 ◽  
Author(s):  
Xiaogang Wu ◽  
Taek-Kyun Kim ◽  
David Baxter ◽  
Kelsey Scherler ◽  
Aaron Gordon ◽  
...  

PeerJ ◽  
2015 ◽  
Vol 3 ◽  
pp. e1419 ◽  
Author(s):  
Jose E. Kroll ◽  
Jihoon Kim ◽  
Lucila Ohno-Machado ◽  
Sandro J. de Souza

Motivation. Alternative splicing events (ASEs) are prevalent in the transcriptome of eukaryotic species and are known to influence many biological phenomena. The identification and quantification of these events are crucial for a better understanding of biological processes. Next-generation DNA sequencing technologies have allowed deep characterization of transcriptomes and made it possible to address these issues. ASE analysis, however, is a challenging task, especially when many different samples need to be compared. Some popular tools for the analysis of ASEs are known to report thousands of events without annotations and/or graphical representations. A new tool for the identification and visualization of ASEs is described here, which can be used by biologists without a solid bioinformatics background. Results. A software suite named Splicing Express was created to perform ASE analysis from transcriptome sequencing data derived from next-generation DNA sequencing platforms. Its major goal is to serve the needs of biomedical researchers who do not have bioinformatics skills. Splicing Express performs automatic annotation of transcriptome data (GTF files) using gene coordinates available from the UCSC genome browser and allows the analysis of data from all available species. The identification of ASEs is done by a known algorithm previously implemented in another tool named Splooce. As a final result, Splicing Express creates a set of HTML files composed of graphics and tables designed to describe the expression profile of ASEs among all analyzed samples. By using RNA-Seq data from the Illumina Human Body Map and the Rat Body Map, we show that Splicing Express is able to perform all tasks in a straightforward way, identifying well-known specific events. Availability and Implementation. Splicing Express is written in Perl and runs only on UNIX-like systems. More details can be found at: http://www.bioinformatics-brazil.org/splicingexpress.


2022 ◽  
Author(s):  
Andreas B Diendorfer ◽  
Kseniya Khamina ◽  
Marianne Pultar

miND is an NGS data analysis pipeline for small RNA sequencing data. In this protocol, the pipeline is set up and run on an AWS EC2 instance with example data from a public repository. Please see the publication on F1000 for more details on the pipeline and how to use it.


2014 ◽  
Vol 7 (1) ◽  
pp. 314 ◽  
Author(s):  
Getiria Onsongo ◽  
Jesse Erdmann ◽  
Michael D Spears ◽  
John Chilton ◽  
Kenneth B Beckman ◽  
...  
