LSCplus: a fast solution for improving long read accuracy by short read alignment

2016 ◽  
Vol 17 (1) ◽  
Author(s):  
Ruifeng Hu ◽  
Guibo Sun ◽  
Xiaobo Sun
PLoS ONE ◽  
2012 ◽  
Vol 7 (10) ◽  
pp. e46679 ◽  
Author(s):  
Kin Fai Au ◽  
Jason G. Underwood ◽  
Lawrence Lee ◽  
Wing Hung Wong

2021 ◽  
Author(s):  
Ryan R Wick ◽  
Kathryn E Holt

Long-read-only bacterial genome assemblies usually contain residual errors, most commonly homopolymer-length errors. Short-read polishing tools can use short reads to fix these errors, but most rely on short-read alignment, which is unreliable in repeat regions. Errors in such regions are therefore challenging to fix and often remain after short-read polishing. Here we introduce Polypolish, a new short-read polisher that uses all-per-read alignments to repair errors in repeat sequences that other polishers cannot. In benchmarking tests using both simulated and real reads, we find that Polypolish performs well, and that the best results are achieved by using Polypolish in combination with other short-read polishers.
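
The idea of polishing from all-per-read alignments can be illustrated with a small pileup sketch. This is not Polypolish's actual implementation: the function name `polish`, the thresholds `min_depth` and `min_fraction`, and the restriction to ungapped alignments (substitutions only, no homopolymer-length indels) are assumptions made for illustration. The point it shows is that when every placement of a read is kept, each copy of a repeat receives pileup support.

```python
from collections import Counter, defaultdict


def polish(assembly, alignments, min_depth=3, min_fraction=0.8):
    """Substitution-only consensus polishing (hypothetical sketch).

    alignments: iterable of (start, read_sequence) ungapped placements.
    A read that matches several repeat copies appears once per placement,
    which mimics keeping all alignments per read.
    """
    # Count the bases that the reads place at each assembly position.
    pileup = defaultdict(Counter)
    for start, seq in alignments:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(assembly):
                pileup[pos][base] += 1

    polished = list(assembly)
    for pos, counts in pileup.items():
        base, count = counts.most_common(1)[0]
        depth = sum(counts.values())
        # Only change a base when the pileup is deep and nearly unanimous.
        if depth >= min_depth and count / depth >= min_fraction:
            polished[pos] = base
    return "".join(polished)


# Two copies of a 4-base repeat; the second copy carries an error (C for G).
assembly = "ACGTTTACCT"
# Each of three identical reads is placed on both repeat copies.
alignments = [(0, "ACGT"), (6, "ACGT")] * 3
print(polish(assembly, alignments))  # ACGTTTACGT
```

If each read were placed only once (for example, always on the first copy), the second copy would have no pileup support and its error would remain.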


2021 ◽  
Author(s):  
William J Bolosky ◽  
Arun Subramaniyan ◽  
Matei Zaharia ◽  
Ravi Pandya ◽  
Taylor Sittler ◽  
...  

Much genomic data comes in the form of paired-end reads: two reads that represent genetic material with a small gap between. We present a new algorithm for aligning both reads in a pair simultaneously by fuzzily intersecting the sets of candidate alignment locations for each read. This algorithm is often much faster and produces alignments that result in variant calls having roughly the same concordance as the best competing aligners.
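
A minimal sketch of what fuzzily intersecting two candidate-location sets could look like is given here. It is not the authors' algorithm: the function name `fuzzy_intersect`, the `max_gap` parameter, and the use of plain integer coordinates on a single strand are assumptions for illustration. It keeps only those pairs of candidate positions, one per read, that lie within a maximum distance of each other.

```python
from bisect import bisect_left, bisect_right


def fuzzy_intersect(candidates_a, candidates_b, max_gap):
    """Return (a, b) pairs with |a - b| <= max_gap (hypothetical sketch).

    candidates_a, candidates_b: candidate alignment positions for the two
    reads of a pair, as integer genome coordinates.
    """
    sorted_b = sorted(candidates_b)
    pairs = []
    for a in sorted(candidates_a):
        # Binary search for the window of b positions close enough to a.
        lo = bisect_left(sorted_b, a - max_gap)
        hi = bisect_right(sorted_b, a + max_gap)
        pairs.extend((a, b) for b in sorted_b[lo:hi])
    return pairs


read1_hits = [100, 5000, 90000]
read2_hits = [350, 42000, 90400]
print(fuzzy_intersect(read1_hits, read2_hits, max_gap=500))
# [(100, 350), (90000, 90400)]
```

Candidates with no nearby partner (5000 and 42000 in this example) are discarded before any expensive scoring, which is one plausible reason such an intersection can save time.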


2018 ◽  
Vol 17 (2) ◽  
pp. 237-240 ◽  
Author(s):  
Farzaneh Zokaee ◽  
Hamid R. Zarandi ◽  
Lei Jiang

2020 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Maryam AlJame ◽  
Imtiaz Ahmad

The evolution of technologies has unleashed a wealth of challenges by generating massive amounts of data. Recently, biological data has increased exponentially, which has introduced several computational challenges. DNA short-read alignment is an important problem in bioinformatics. The exponential growth in the number of short reads has increased the need for an ideal platform to accelerate the alignment process. Apache Spark is a cluster-computing framework that provides data parallelism and fault tolerance. In this article, we propose a Spark-based algorithm, called Spark-DNAligning, to accelerate DNA short-read alignment. Spark-DNAligning exploits Apache Spark's performance optimizations such as broadcast variables, join after partitioning, caching, and in-memory computation. Spark-DNAligning is evaluated in terms of performance by comparing it with the SparkBWA tool and a MapReduce-based algorithm called CloudBurst. All the experiments are conducted on Amazon Web Services (AWS). Results demonstrate that Spark-DNAligning outperforms both tools by providing a speedup in the range of 101–702 in aligning gigabytes of short reads to the human genome. Empirical evaluation reveals that Apache Spark offers promising solutions to the DNA short-read alignment problem.


2011 ◽  
Vol 27 (10) ◽  
pp. 1351-1358 ◽  
Author(s):  
Jochen Blom ◽  
Tobias Jakobi ◽  
Daniel Doppmeier ◽  
Sebastian Jaenicke ◽  
Jörn Kalinowski ◽  
...  

Author(s):  
James Arram ◽  
Thomas Kaplan ◽  
Wayne Luk ◽  
Peiyong Jiang
