mosaicFlye: Resolving long mosaic repeats using long error-prone reads

Author(s):  
Anton Bankevich ◽  
Pavel Pevzner

Abstract Long-read technologies revolutionized genome assembly and enabled resolution of bridged repeats (i.e., repeats that are spanned by some reads) in various genomes. However, the problem of resolving unbridged repeats (such as long segmental duplications in the human genome) remains largely unsolved, making it a major obstacle towards achieving the goal of complete genome assemblies. Moreover, the challenge of resolving unbridged repeats is not limited to eukaryotic genomes but also impairs assemblies of bacterial genomes and metagenomes. We describe the mosaicFlye algorithm for resolving complex unbridged repeats based on differences between various repeat copies and show how it improves assemblies of the human genome as well as bacterial genomes and metagenomes. In particular, we show that mosaicFlye results in a complete assembly of both arms of the human chromosome 6.

2021 ◽  
Author(s):  
Arang Rhie ◽  
Ann Mc Cartney ◽  
Kishwar Shafin ◽  
Michael Alonge ◽  
Andrey Bzikadze ◽  
...  

Abstract Advances in long-read sequencing technologies and genome assembly methods have enabled the recent completion of the first Telomere-to-Telomere (T2T) human genome assembly, which resolves complex segmental duplications and large tandem repeats, including centromeric satellite arrays in a complete hydatidiform mole (CHM13). Though derived from highly accurate sequencing, evaluation revealed that the initial T2T draft assembly had evidence of small errors and structural misassemblies. To correct these errors, we designed a novel repeat-aware polishing strategy that made accurate assembly corrections in large repeats without overcorrection, ultimately fixing 51% of the existing errors and improving the assembly QV to 73.9. By comparing our results to standard automated polishing tools, we outline common polishing errors and offer practical suggestions for genome projects with limited resources. We also show how sequencing biases in both PacBio HiFi and Oxford Nanopore Technologies reads cause signature assembly errors that can be corrected with a diverse panel of sequencing technologies.


2021 ◽  
Author(s):  
Ann M Mc Cartney ◽  
Kishwar Shafin ◽  
Michael Alonge ◽  
Andrey V Bzikadze ◽  
Giulio Formenti ◽  
...  



2000 ◽  
Vol 279 (3) ◽  
pp. 879-883 ◽  
Author(s):  
Per-Anders Olsson ◽  
Beat C. Bornhauser ◽  
Laura Korhonen ◽  
Dan Lindholm

DNA Sequence ◽  
1997 ◽  
Vol 8 (3) ◽  
pp. 151-154 ◽  
Author(s):  
A. J. Mungall ◽  
S. J. Humphray ◽  
S. A. Ranby ◽  
C. A. Edwards ◽  
R. W. Heathcott ◽  
...  

2014 ◽  
Vol 44 (9) ◽  
pp. 2571-2576 ◽  
Author(s):  
Felipe Riaño ◽  
Mohindar M. Karunakaran ◽  
Lisa Starick ◽  
Jianqiang Li ◽  
Claus J. Scholz ◽  
...  

Genomics ◽  
1992 ◽  
Vol 14 (2) ◽  
pp. 225-231 ◽  
Author(s):  
Suk P. Oh ◽  
Reginald W. Taylor ◽  
Donald R. Gerecke ◽  
Julie M. Rochelle ◽  
Michael F. Seldin ◽  
...  

2002 ◽  
Vol 53 (4) ◽  
pp. 495-498
Author(s):  
L. Keresztury ◽  
A. Lászik ◽  
A. Falus ◽  
et al
