Atropos: specific, sensitive, and speedy trimming of sequencing reads

Author(s):  
John P Didion ◽  
Francis S Collins

A key step in the transformation of raw sequencing reads into biological insights is the trimming of adapter sequences and low-quality bases. Read trimming has been shown to increase the quality and reliability while decreasing the computational requirements of downstream analyses. Many read trimming software tools are available; however, no tool simultaneously provides the accuracy, computational efficiency, and feature set required to handle the types and volumes of data generated in modern sequencing-based experiments. Here we introduce Atropos and show that it trims reads with high sensitivity and specificity while maintaining leading-edge speed. Compared to other state-of-the-art read trimming tools, Atropos achieves a four-fold increase in trimming accuracy and a decrease in execution time of ~50% (using 16 parallel execution threads). Furthermore, Atropos maintains high accuracy even when trimming simulated data with a high rate of error. The accuracy, high performance, and broad feature set offered by Atropos make it an appropriate choice for the pre-processing of most current-generation sequencing data sets. Atropos is open source and free software written in Python and available at https://github.com/jdidion/atropos.
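As a rough illustration of the two operations the abstract names (adapter trimming and low-quality base trimming), here is a minimal Python sketch. It is not Atropos's algorithm or API: the function names `trim_3prime_adapter` and `trim_low_quality_tail`, their parameters, and the thresholds are all hypothetical, and the matching is a simple mismatch count with no insertions or deletions. The adapter string is the common Illumina adapter prefix, used only as an example input.

```python
def trim_3prime_adapter(read, adapter, max_error_rate=0.1, min_overlap=3):
    """Remove a 3' adapter from `read` (hypothetical helper, not Atropos code).

    Scans left to right for the first position where the adapter (or its
    prefix, if it runs off the end of the read) matches with at most
    floor(max_error_rate * overlap) mismatches. Indels are not considered.
    """
    for start in range(len(read) - min_overlap + 1):
        overlap = min(len(adapter), len(read) - start)
        window = read[start:start + overlap]
        mismatches = sum(1 for a, b in zip(window, adapter) if a != b)
        if mismatches <= int(max_error_rate * overlap):
            return read[:start]
    return read  # no adapter found


def trim_low_quality_tail(read, quals, min_quality=20):
    """Drop trailing bases whose quality score is below `min_quality`."""
    end = len(read)
    while end > 0 and quals[end - 1] < min_quality:
        end -= 1
    return read[:end], quals[:end]


if __name__ == "__main__":
    adapter = "AGATCGGAAGAGC"  # example adapter prefix (assumed input)
    read = "ACGTACGTAC" + "AGATCGGAAG"  # insert followed by partial adapter
    print(trim_3prime_adapter(read, adapter))
    print(trim_low_quality_tail("ACGTAC", [30, 30, 30, 30, 10, 5]))
```

A real trimmer has to balance sensitivity (finding short or error-containing adapter fragments) against specificity (not cutting genuine sequence that happens to resemble the adapter); in this sketch that trade-off is controlled only by the assumed `max_error_rate` and `min_overlap` parameters.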

2017 ◽  
Author(s):  
John P Didion ◽  
Marcel Martin ◽  
Francis S Collins

A key step in the transformation of raw sequencing reads into biological insights is the trimming of adapter sequences and low-quality bases. Read trimming has been shown to increase the quality and reliability while decreasing the computational requirements of downstream analyses. Many read trimming software tools are available; however, no tool simultaneously provides the accuracy, computational efficiency, and feature set required to handle the types and volumes of data generated in modern sequencing-based experiments. Here we introduce Atropos and show that it trims reads with high sensitivity and specificity while maintaining leading-edge speed. Compared to other state-of-the-art read trimming tools, Atropos achieves a four-fold increase in trimming accuracy and a decrease in execution time of ~50% (using 16 parallel execution threads). Furthermore, Atropos maintains high accuracy even when trimming simulated data with a high rate of error. The accuracy, high performance, and broad feature set offered by Atropos make it an appropriate choice for the pre-processing of most current-generation sequencing data sets. Atropos is open source and free software written in Python and available at https://github.com/jdidion/atropos.


2016 ◽  
Author(s):  
John P Didion ◽  
Francis S Collins

A key step in the transformation of raw sequencing reads into biological insights is the trimming of adapter sequences and low-quality bases. Read trimming has been shown to increase the quality and reliability while decreasing the computational requirements of downstream analyses. Many read trimming software tools are available; however, no tool simultaneously provides the accuracy, computational efficiency, and feature set required to handle the types and volumes of data generated in modern sequencing-based experiments. Here we introduce Atropos and show that it trims reads with high sensitivity and specificity while maintaining leading-edge speed. Compared to other state-of-the-art read trimming tools, Atropos achieves a four-fold increase in trimming accuracy and a decrease in execution time of ~50% (using 16 parallel execution threads). Furthermore, Atropos maintains high accuracy even when trimming simulated data with a high rate of error. The accuracy, high performance, and broad feature set offered by Atropos make it an appropriate choice for the pre-processing of most current-generation sequencing data sets. Atropos is open source and free software written in Python and available at https://github.com/jdidion/atropos.


2020 ◽  
Vol 2 (1) ◽  
Author(s):  
Sen Zhao ◽  
Andreas M Hoff ◽  
Rolf I Skotheim

Bioinformatics tools for fusion transcript detection from RNA-sequencing data are in general developed for identification of novel fusions, which demands a high number of supporting reads and strict filters to avoid false discoveries. As our knowledge of bona fide fusion genes becomes more saturated, there is a need to establish their prevalence with high sensitivity. We present ScaR, a tool that uses a supervised scaffold realignment approach for sensitive fusion detection in RNA-seq data. ScaR detects a set of 130 synthetic fusion transcripts from simulated data at a higher sensitivity compared to established fusion finders. Applied to fusion transcripts potentially involved in testicular germ cell tumors (TGCTs), ScaR detects the fusions RCC1-ABHD12B and CLEC6A-CLEC4D in 9% and 28% of 150 TGCTs, respectively. The fusions were not detected in any of 198 normal testis tissues. Thus, we demonstrate the high prevalence of RCC1-ABHD12B and CLEC6A-CLEC4D in TGCTs, and their cancer-specific features. Further, we find that RCC1-ABHD12B and CLEC6A-CLEC4D are predominantly expressed in the seminoma and embryonal carcinoma histological subtypes of TGCTs, respectively. In conclusion, ScaR is useful for establishing the frequency of known and validated fusion transcripts in larger data sets and detecting clinically relevant fusion transcripts with high sensitivity.


PeerJ ◽  
2017 ◽  
Vol 5 ◽  
pp. e3720 ◽  
Author(s):  
John P. Didion ◽  
Marcel Martin ◽  
Francis S. Collins

A key step in the transformation of raw sequencing reads into biological insights is the trimming of adapter sequences and low-quality bases. Read trimming has been shown to increase the quality and reliability while decreasing the computational requirements of downstream analyses. Many read trimming software tools are available; however, no tool simultaneously provides the accuracy, computational efficiency, and feature set required to handle the types and volumes of data generated in modern sequencing-based experiments. Here we introduce Atropos and show that it trims reads with high sensitivity and specificity while maintaining leading-edge speed. Compared to other state-of-the-art read trimming tools, Atropos achieves significant increases in trimming accuracy while remaining competitive in execution times. Furthermore, Atropos maintains high accuracy even when trimming data with elevated rates of sequencing errors. The accuracy, high performance, and broad feature set offered by Atropos make it an appropriate choice for the pre-processing of Illumina, ABI SOLiD, and other current-generation short-read sequencing datasets. Atropos is open source and free software written in Python (3.3+) and available at https://github.com/jdidion/atropos.


2017 ◽  
Author(s):  
John P Didion ◽  
Marcel Martin ◽  
Francis S Collins

A key step in the transformation of raw sequencing reads into biological insights is the trimming of adapter sequences and low-quality bases. Read trimming has been shown to increase the quality and reliability while decreasing the computational requirements of downstream analyses. Many read trimming software tools are available; however, no tool simultaneously provides the accuracy, computational efficiency, and feature set required to handle the types and volumes of data generated in modern sequencing-based experiments. Here we introduce Atropos and show that it trims reads with high sensitivity and specificity while maintaining leading-edge speed. Compared to other state-of-the-art read trimming tools, Atropos achieves significant increases in trimming accuracy while remaining competitive in execution times. Furthermore, Atropos maintains high accuracy even when trimming data with elevated rates of sequencing errors. The accuracy, high performance, and broad feature set offered by Atropos make it an appropriate choice for the pre-processing of Illumina, ABI SOLiD, and other current-generation short-read sequencing datasets. Availability: Atropos is open source and free software written in Python (3.3+) and available at https://github.com/jdidion/atropos.


2019 ◽  
Author(s):  
Sen Zhao ◽  
Andreas M. Hoff ◽  
Rolf I. Skotheim

Bioinformatics tools for fusion transcript detection from RNA-sequencing data are in general developed for identification of novel fusions, which demands a high number of supporting reads and strict filters to avoid false discoveries. As our knowledge of bona fide fusion genes becomes more saturated, there is a need to establish their prevalence with high sensitivity. We present ScaR, a tool that uses a scaffold realignment approach for sensitive fusion detection in RNA-seq data. ScaR detects a set of 50 synthetic fusion transcripts from simulated data at a higher sensitivity compared to established fusion finders. Applied to fusion transcripts potentially involved in testicular germ cell tumors (TGCTs), ScaR detects the fusions RCC1-ABHD12B and CLEC6A-CLEC4D in 9% and 28% of 150 TGCTs, respectively. The fusions were not detected in any of 198 normal testis tissues. Thus, we demonstrate the high prevalence of RCC1-ABHD12B and CLEC6A-CLEC4D in TGCTs, and their cancer-specific features. Further, we find that RCC1-ABHD12B and CLEC6A-CLEC4D are predominantly expressed in the seminoma and embryonal carcinoma histological subtypes of TGCTs, respectively. In conclusion, ScaR is useful for establishing the frequency of known fusion transcripts in larger data sets and detecting clinically relevant fusion transcripts with high sensitivity. Availability: https://github.com/senzhaocode/ScaR


2014 ◽  
Vol 13s1 ◽  
pp. CIN.S13890 ◽  
Author(s):  
Changjin Hong ◽  
Solaiappan Manimaran ◽  
William Evan Johnson

Quality control and read preprocessing are critical steps in the analysis of data sets generated from high-throughput genomic screens. In the most extreme cases, improper preprocessing can negatively affect downstream analyses and may lead to incorrect biological conclusions. Here, we present PathoQC, a streamlined toolkit that seamlessly combines the benefits of several popular quality control software approaches for preprocessing next-generation sequencing data. PathoQC provides a variety of quality control options appropriate for most high-throughput sequencing applications. PathoQC is primarily developed as a module in the PathoScope software suite for metagenomic analysis. However, PathoQC is also available as an open-source Python module that can run as a stand-alone application or can be easily integrated into any bioinformatics workflow. PathoQC achieves high performance by supporting parallel computation and is an effective tool that removes technical sequencing artifacts and facilitates robust downstream analysis. The PathoQC software package is available at http://sourceforge.net/projects/PathoScope/.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Dat Thanh Nguyen ◽  
Quang Thinh Trac ◽  
Thi-Hau Nguyen ◽  
Ha-Nam Nguyen ◽  
Nir Ohad ◽  
...  

Background: Circular RNA (circRNA) is an emerging class of RNA molecules attracting researchers due to its potential for serving as markers for diagnosis, prognosis, or therapeutic targets of cancer, cardiovascular, and autoimmune diseases. Current methods for detection of circRNA from RNA sequencing (RNA-seq) focus mostly on improving mapping quality of reads supporting the back-splicing junction (BSJ) of a circRNA to eliminate false positives (FPs). We show that mapping information alone often cannot predict if a BSJ-supporting read is derived from a true circRNA or not, thus increasing the rate of FP circRNAs. Results: We have developed Circall, a novel circRNA detection method from RNA-seq. Circall controls the FPs using a robust multidimensional local false discovery rate method based on the length and expression of circRNAs. It is computationally highly efficient by using a quasi-mapping algorithm for fast and accurate RNA read alignments. We applied Circall to two simulated datasets and three experimental datasets of human cell lines. The results show that Circall achieves high sensitivity and precision in the simulated data. In the experimental datasets it performs well against current leading methods. Circall is also substantially faster than the other methods, particularly for large datasets. Conclusions: With its better performance in the detection of circRNAs and in computational time, Circall facilitates the analyses of circRNAs in large numbers of samples. Circall is implemented in C++ and R, and available for use at https://www.meb.ki.se/sites/biostatwiki/circall and https://github.com/datngu/Circall.


2021 ◽  
Vol 11 (12) ◽  
pp. 5734
Author(s):  
Ammar Armghan

This paper investigates the use of a complementary metaresonator for the evaluation of vegetable oils in the C and X bands. Rapidly advancing technology demands the exploration of complementary metaresonators for high performance in these bands. This research probes the complementary mirror-symmetric S resonator (CMSSR), which can operate in two bands with compact size and high sensitivity. The prime motivation behind the proposed technique is to utilize the dual notch resonance to estimate the dielectric constant of the oil under test (OUT). The proposed sensor is designed on a compact 30 × 25 mm², 1.6 mm thick FR-4 substrate. A 50 Ω microstrip transmission line is printed on one side, while a unit cell of the CMSSR is etched on the other side of the substrate to achieve dual notch resonance. A Teflon container is attached to the CMSSR in the ground plane to act as a pool for the OUT. According to the simulated transmission spectrum, the proposed design exhibits dual notch resonance at 7.21 GHz (C band) and 8.97 GHz (X band). A prototype of the complementary metaresonator sensor is fabricated and tested using a CEYEAR AV3672D vector network analyzer. The comparison of measured and simulated data shows a difference of 0.01 GHz in the first resonance frequency and 0.04 GHz in the second. Furthermore, a mathematical model is developed for the complementary metaresonator sensor to evaluate the dielectric constant of the OUT in terms of the relevant resonant frequency.

