An optimized approach for local de novo assembly of overlapping paired-end RAD reads from multiple individuals

2018, Vol 5 (2), pp. 171589
Author(s): Yu-Long Li, Dong-Xiu Xue, Bai-Dong Zhang, Jin-Xian Liu

Restriction site-associated DNA (RAD) sequencing is revolutionizing studies in ecological, evolutionary and conservation genomics. However, assembling paired-end RAD reads with randomly sheared ends remains challenging, especially for non-model species with high genetic variance. Here, we present an optimized approach, implemented in the pipeline software RADassembler, which makes full use of paired-end RAD reads with randomly sheared ends from multiple individuals to assemble RAD contigs. RADassembler integrates algorithms for choosing the optimal number of mismatches within and across individuals at the clustering stage, and then uses a two-step approach at the assembly stage. It also uses data reduction and parallelization strategies to improve efficiency. In comparisons with other tools on both simulated and real RAD datasets, RADassembler consistently assembled the appropriate number of high-quality contigs, and more read pairs mapped properly to the assembled contigs. This approach provides an optimal tool for handling the complexity of assembling paired-end RAD reads with randomly sheared ends for non-model species in ecological, evolutionary and conservation studies. RADassembler is available at https://github.com/lyl8086/RADscripts.

2012, Vol 24 (2), pp. 660-675
Author(s): Anna Stengel, Irene L. Gügel, Daniel Hilger, Birgit Rengstl, Heinrich Jung, ...

2021, Vol 18 (2), pp. 170-175
Author(s): Haoyu Cheng, Gregory T. Concepcion, Xiaowen Feng, Haowen Zhang, Heng Li

Author(s): Guangtu Gao, Susana Magadan, Geoffrey C Waldbieser, Ramey C Youngblood, Paul A Wheeler, ...

Abstract: Currently, there is still a need to improve the contiguity of the rainbow trout reference genome and to use multiple genetic backgrounds that represent the genetic diversity of this species. The Arlee doubled haploid line originated from a domesticated hatchery strain that was originally collected from the northern California coast. The Canu pipeline was used to generate a de novo assembly of the Arlee line genome from high-coverage PacBio long-read sequence data. The assembly was further improved with Bionano optical maps and Hi-C proximity ligation sequence data to generate 32 major scaffolds corresponding to the karyotype of the Arlee line (2N = 64). The assembly is composed of 938 scaffolds with an N50 of 39.16 Mb and a total length of 2.33 Gb, of which ∼95% is in the 32 chromosome sequences, with only 438 gaps between contigs and scaffolds. In rainbow trout, the haploid chromosome number can vary from 29 to 32. In the Arlee karyotype, the haploid chromosome number is 32 because chromosomes Omy04, 14 and 25 are divided into six acrocentric chromosomes. Additional structural variations identified in the Arlee genome include the major inversions on chromosomes Omy05 and Omy20 and 15 additional smaller inversions that will require further validation. This is also the first rainbow trout genome assembly that includes a scaffold with the sex-determination gene (sdY) in the chromosome Y sequence. The utility of this genome assembly is demonstrated through the improved annotation of the duplicated genome loci that harbor the IGH genes on chromosomes Omy12 and Omy13.


2018, Vol 19 (2), pp. 520
Author(s): Le Zhao, Xinmei Zhang, Zhongying Qiu, Yuan Huang
