paired read
Recently Published Documents


GigaScience ◽  
2020 ◽  
Vol 9 (4) ◽  
Author(s):  
Aleksandr Morgulis ◽  
Richa Agarwala

Abstract

Background: Alignment of sequence reads generated by next-generation sequencing is an integral part of most pipelines analyzing next-generation sequencing data. A number of tools designed to quickly align a large volume of sequences are already available. However, most existing tools lack explicit guarantees about their output. They also do not support searching genome assemblies, such as the human genome assembly GRCh38, that include primary and alternate sequences and placement information for alternate sequences to primary sequences in the assembly.

Findings: This paper describes SRPRISM (Single Read Paired Read Indel Substitution Minimizer), an alignment tool for aligning reads without splices. SRPRISM has features not available in most tools, such as (i) support for searching genome assemblies with alternate sequences, (ii) partial alignment of reads with a specified region of reads to be included in the alignment, (iii) choice of ranking schemes for alignments, and (iv) explicit criteria for search sensitivity. We compare the performance of SRPRISM to GEM, Kart, STAR, BWA-MEM, Bowtie2, Hobbes, and Yara using benchmark sets for paired and single reads of lengths 100 and 250 bp generated using DWGSIM. SRPRISM found the best results for most benchmark sets with error rate of up to ∼2.5% and GEM performed best for higher error rates. SRPRISM was also more sensitive than other tools even when sensitivity was reduced to improve run time performance.

Conclusions: We present SRPRISM as a flexible read mapping tool that provides explicit guarantees on results.


2015 ◽  
Author(s):  
Kristoffer Sahlin ◽  
Mattias Frånberg ◽  
Lars Arvestad

Insert size distributions from paired read protocols are used for inference in bioinformatic applications such as genome assembly and structural variation detection. However, many of the models in use are subject to bias. This bias arises when we assume that all insert sizes within a distribution are equally likely to be observed, when in fact, size matters. These systematic errors exist in popular software even when the assumptions made about data are true. We have previously shown that bias occurs for scaffolders in genome assembly. Here, we generalize the theory and demonstrate that it is applicable in other contexts. We provide examples of bias in state-of-the-art software and improve them using our model. One key application of our theory is structural variation detection using read pairs. We show that an incorrect null hypothesis is commonly used in popular tools and can be corrected using our theory. Furthermore, we approximate the smallest size of indels that are possible to discover given an insert size distribution. Two other applications are inference of insert size distribution on de novo genome assemblies and error correction of genome assemblies using mated reads. Our theory is implemented in a tool called GetDistr (https://github.com/ksahlin/GetDistr).
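
To make the "size matters" point concrete, here is a minimal sketch under a simplifying assumption that is not taken from the abstract: a read pair is observed only if the whole insert fits on a single contig, so an insert of size x has (contig_length - x + 1) possible placements. The function names and the toy distribution are hypothetical, and this is not the GetDistr implementation.

```python
def observed_insert_pmf(true_pmf, contig_length):
    """Reweight a true insert size distribution by the number of placements.

    Assumption (illustrative only): an insert of size x can start at
    contig_length - x + 1 positions on the contig, so longer inserts are
    observed less often than their true frequency suggests.
    """
    weighted = {
        size: prob * (contig_length - size + 1)
        for size, prob in true_pmf.items()
        if size <= contig_length
    }
    total = sum(weighted.values())
    if total == 0:
        raise ValueError("no insert size fits on the contig")
    return {size: weight / total for size, weight in weighted.items()}


def mean(pmf):
    """Expected value of a discrete distribution given as {value: probability}."""
    return sum(value * prob for value, prob in pmf.items())


# Toy library: true mean insert size is 200 bp.
true_pmf = {100: 0.25, 200: 0.5, 300: 0.25}
observed = observed_insert_pmf(true_pmf, contig_length=400)

print(mean(true_pmf))   # true mean
print(mean(observed))   # smaller: long inserts are under-represented
```

On a short contig the observed mean falls below the true mean, which is the kind of systematic shift a model that treats all insert sizes as equally observable would miss.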


2015 ◽  
Author(s):  
Mitchell J Sullivan ◽  
Scott A Beatson

Over the last decade, the emergence of high-throughput sequencing has led to an increase in both the size and scope of genome sequencing projects. Although genome sequencing and analysis has changed dramatically during this time, the way read alignments are visualised has remained largely unchanged. To address the problem of visualising growing sequencing datasets, we have developed DiscoPlot, a tool for visualising read alignments using a two-dimensional scatterplot. DiscoPlot allows the user to quickly identify genomic rearrangements, misassemblies and sequencing artefacts by providing a scalable method for visualising large sections of the genome. It reads single-end or paired read alignments in SAM, BAM or standard BLAST tab format and creates a scatter plot of opaque crosses representing the alignments to a reference. DiscoPlot is freely available (under a GPL license) for download (Mac OS X, Unix and Windows) at https://mjsull.github.io/DiscoPlot.
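
As a rough illustration of the kind of data such a scatterplot is built from, the sketch below pulls one (read position, mate position) point per paired alignment out of SAM-formatted text. It relies only on the standard SAM column layout (FLAG in column 2, POS in column 4, RNEXT in column 7, PNEXT in column 8); the function name and the example records are hypothetical, and this is not DiscoPlot's own code.

```python
def paired_coordinates(sam_lines):
    """Return (POS, PNEXT) for each mapped read whose mate maps to the same reference."""
    points = []
    for line in sam_lines:
        if line.startswith("@"):          # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:              # SAM records have 11 mandatory columns
            continue
        flag = int(fields[1])
        rname, pos = fields[2], int(fields[3])
        rnext, pnext = fields[6], int(fields[7])
        is_paired = flag & 0x1
        read_unmapped = flag & 0x4
        mate_unmapped = flag & 0x8
        if not is_paired or read_unmapped or mate_unmapped:
            continue
        if rnext not in ("=", rname):     # mate lies on a different reference
            continue
        points.append((pos, pnext))
    return points


# Hypothetical records: one properly paired read pair and one unmapped read.
example = [
    "@SQ\tSN:chr1\tLN:1000",
    "r1\t99\tchr1\t100\t60\t50M\t=\t300\t250\tACGT\tIIII",
    "r1\t147\tchr1\t300\t60\t50M\t=\t100\t-250\tACGT\tIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
print(paired_coordinates(example))
```

Each returned pair could then be drawn as one point in a two-dimensional scatterplot with any plotting library; pairs far from the diagonal are the ones that would hint at rearrangements or misassemblies.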


Genome ◽  
2010 ◽  
Vol 53 (11) ◽  
pp. 1017-1023 ◽  
Author(s):  
Chris Duran ◽  
Dominic Eales ◽  
Daniel Marshall ◽  
Michael Imelfort ◽  
Jiri Stiller ◽  
...  

Association mapping currently relies on the identification of genetic markers. Several technologies have been adopted for genetic marker analysis, with single nucleotide polymorphisms (SNPs) being the most popular where a reasonable quantity of genome sequence data is available. We describe several tools we have developed for the discovery, annotation, and visualization of molecular markers for association mapping. These include autoSNPdb for SNP discovery from assembled sequence data; TAGdb for the identification of gene-specific paired read Illumina GAII data; CMap3D for the comparison of mapped genetic and physical markers; and BAC and Gene Annotator for the online annotation of genes and genomic sequences.

