Highly Contiguous Nanopore Genome Assembly of Chlamydomonas reinhardtii CC-1690

2020 ◽  
Vol 9 (37) ◽  
Author(s):  
Samuel O’Donnell ◽  
Frederic Chaux ◽  
Gilles Fischer

ABSTRACT The current Chlamydomonas reinhardtii reference genome remains fragmented due to gaps stemming from large repetitive regions. To overcome the vast majority of these gaps, publicly available Oxford Nanopore Technology data were used to create a new reference-quality de novo genome assembly containing only 21 contigs, 30/34 telomeric ends, and a genome size of 111 Mb.

2021 ◽  
Author(s):  
Xinxin Yi ◽  
Jing Liu ◽  
Shengcai Chen ◽  
Hao Wu ◽  
Min Liu ◽  
...  

Cultivated soybean (Glycine max) is an important source of protein and oil. Many elite cultivars with different traits have been developed for different conditions. Each soybean strain has its own genetic diversity, and the availability of more high-quality soybean genomes can enhance comparative genomic analysis for identifying the genetic underpinnings of its unique traits. In this study, we constructed a high-quality de novo assembly of the elite soybean cultivar Jidou 17 (JD17) with chromosome contiguity and high accuracy. We annotated 52,840 gene models and reconstructed 74,054 high-quality full-length transcripts. We performed a genome-wide comparative analysis of the JD17 reference genome against three published soybean genomes (WM82, ZH13 and W05), which identified five large inversions and two large translocations specific to JD17, 20,984–46,912 presence/absence variants (PAVs) spanning 13.1–46.9 Mb in size, and 5–53 large PAV clusters larger than 500 kb. Between JD17 and these three genomes, 1,695,741–3,664,629 SNPs and 446,689–800,489 indels were identified and annotated. Symbiotic nitrogen fixation (SNF) genes were identified and the effects of these variants on them were further evaluated. The coding sequences of 9 nitrogen fixation-related genes were found to be greatly affected. The high-quality genome assembly of JD17 can serve as a valuable reference for soybean functional genomics research.


2021 ◽  
Vol 99 (Supplement_3) ◽  
pp. 23-24
Author(s):  
Kimberly M Davenport ◽  
Derek M Bickhart ◽  
Kim Worley ◽  
Shwetha C Murali ◽  
Noelle Cockett ◽  
...  

Abstract Sheep are an important agricultural species used for both food and fiber in the United States and globally. A high-quality reference genome enhances the ability to discover genetic and biological mechanisms influencing important traits, such as meat and wool quality. The rapid advances in genome assembly algorithms and emergence of increasingly long sequence read length provide the opportunity for an improved de novo assembly of the sheep reference genome. Tissue was collected postmortem from an adult Rambouillet ewe selected by USDA-ARS for the Ovine Functional Annotation of Animal Genomes project. Short-read (55x coverage), long-read PacBio (75x coverage), and Hi-C data from this ewe were retrieved from public databases. We generated an additional 50x coverage of Oxford Nanopore data and assembled the combined long-read data with canu v1.9. The assembled contigs were polished with Nanopolish v0.12.5 and scaffolded using Hi-C data with Salsa v2.2. Gaps were filled with PBsuite v15.8.24 and polished with Nanopolish v0.12.5 followed by removal of duplicate contigs with PurgeDups v1.0.1. Chromosomes were oriented by identifying centromeres and telomeres with RepeatMasker v4.1.1, indicating a need to reverse the orientation of chromosome 11 relative to Oar_rambouillet_v1.0. Final polishing was performed with two rounds of a pipeline which consisted of freebayes v1.3.1 to call variants, Merfin to validate them, and BCFtools to generate the consensus fasta. The ARS-UI_Ramb_v2.0 assembly has improved continuity (contig N50 of 43.19 Mb) with a 19-fold and 38-fold decrease in the number of scaffolds compared with Oar_rambouillet_v1.0 and Oar_v4.0. ARS-UI_Ramb_v2.0 has greater per-base accuracy and fewer insertions and deletions identified from mapped RNA sequence than previous assemblies. 
This significantly improved reference assembly, publicly available at NCBI GenBank under accession number GCA_016772045, will optimize the functional annotation of the sheep genome and facilitate improved mapping accuracy of genetic variant and expression data for traits relevant to the sheep industry.


2019 ◽  
Vol 8 (30) ◽  
Author(s):  
Claudia R. Maynard ◽  
Kyle S. MacLea

The pigmented bacterium Deinococcus radiophilus, which is highly resistant to radiation exposure, was first isolated from irradiated lizardfish. We report a genome assembly of D. radiophilus UWO 1055T (=ATCC 27603T), with a predicted genome size of 2.7 Mbp (62.66% G+C content). A number of CRISPR-associated proteins and two CRISPR arrays were identified.


2020 ◽  
Author(s):  
Kishor Gaikwad ◽  
G. Ramakrishna ◽  
Harsha Srivastava ◽  
Swati Saxena ◽  
Tanvi Kaila ◽  
...  

Abstract Clusterbean (Cyamopsis tetragonoloba (L.) Taub.), also known as guar, is a widely cultivated dryland legume of Western India and parts of Africa. Apart from being a vegetable crop, it is also an abundant source of a natural hetero-polysaccharide called guar gum or galactomannan, which is widely used in cosmetics, pharmaceuticals, food processing, shale gas drilling, etc. Here, we report for the first time a chromosome-scale reference genome assembly of clusterbean, from the popular high-galactomannan guar cultivar RGC-936, obtained by combining sequenced reads from Illumina, 10x Chromium and Oxford Nanopore technologies. An initial assembly of 1580 scaffolds with an N50 value of 7.12 Mbp was generated. The final genome assembly was then obtained by anchoring these scaffolds to a high-density SNP map, yielding 550.31 Mbp in 7 pseudomolecules corresponding to 7 chromosomes, with a very high N50 of 78.27 Mbp. We predicted 34,680 protein-coding genes in the guar genome. The high-quality chromosome-scale clusterbean genome assembly will facilitate understanding of the molecular basis of galactomannan biosynthesis and aid in genomics-assisted breeding of superior cultivars.
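The N50 values quoted in several of these abstracts follow the usual definition: the length of the sequence at which the running total, taken from longest to shortest, first reaches half of the total assembly length. The following is a minimal sketch of that calculation. The function name `n50` and the toy lengths are hypothetical and are not taken from any of the listed assemblies.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths.

    Lengths are sorted from longest to shortest and accumulated; the N50 is
    the length of the sequence at which the running total first reaches at
    least half of the total assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Compare 2 * running with total to stay in integer arithmetic.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable: running total always reaches the total")


# Hypothetical toy assembly (arbitrary units), for illustration only.
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
toy_n50 = n50(toy_lengths)
```

Because N50 depends only on the length distribution, it measures contiguity and says nothing about correctness; a misjoined assembly can have a high N50.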


2019 ◽  
Vol 8 (46) ◽  
Author(s):  
Theresa Kuhl ◽  
Marius Felder ◽  
Thomas Nussbaumer ◽  
Doreen Fischer ◽  
Susanne Kublik ◽  
...  

Rhodococcus qingshengii RL1 was isolated from surface-sterilized leaves of Eruca sativa Mill. and shows plant growth-promoting (PGP) properties. The de novo genome assembly consists of one chromosome with 6,253,838 bp and two plasmids with 144,038 bp and 448,745 bp. Many genes could be identified reflecting its PGP potential.


Plants ◽  
2019 ◽  
Vol 8 (8) ◽  
pp. 270 ◽  
Author(s):  
Yun Gyeong Lee ◽  
Sang Chul Choi ◽  
Yuna Kang ◽  
Kyeong Min Kim ◽  
Chon-Sik Kang ◽  
...  

Whole-genome sequencing (WGS) has become a crucial tool in understanding genome structure and genetic variation. MinION sequencing from Oxford Nanopore Technologies (ONT) is an excellent approach for performing WGS, and it has advantages over other Next-Generation Sequencing (NGS) methods: it is relatively inexpensive, portable, has simple library preparation, can be monitored in real time, and has no theoretical limit on read length. Sorghum bicolor (L.) Moench is diploid (2n = 2x = 20) with a genome size of about 730 Mb, and its genome sequence is available in the Phytozome database. Sorghum can therefore serve as a good reference. However, plant species have complex and large genomes compared with animals or microorganisms, which makes complete genome sequencing difficult. MinION sequencing, which produces long reads, can be an excellent tool for overcoming the weak assembly of short reads generated by NGS, by minimizing gaps and spanning the repetitive sequences found in plant genomes. Here, we sequenced the genome of S. bicolor cv. BTx623 on the MinION platform and obtained 895,678 reads and 17.9 gigabases (Gb) of long-read sequence data (ca. 25× coverage of the reference). De novo assembly with two different tools produced 6124 contigs (covering 45.9%) from Canu and 2661 contigs (covering 50%) from Minimap and Miniasm with Racon, and the assembled contigs were mapped against the sorghum reference genome. Our results provide an optimal workflow for long-read sequencing analysis of plant species on the MinION platform, and a guide for determining the total sequencing scale needed for optimal coverage based on genome size.
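The "ca. 25× coverage" figure in the sorghum abstract can be checked with the standard depth estimate: total sequenced bases divided by genome size. Here is a minimal sketch, assuming the 17.9 Gb yield refers to bases and using the approximate 730 Mb genome size from the abstract; the function and variable names are hypothetical.

```python
def sequencing_depth(total_bases: float, genome_size: float) -> float:
    """Estimate average sequencing depth (fold coverage).

    Depth is the total number of sequenced bases divided by the genome size,
    both expressed in the same unit (here, base pairs).
    """
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


# Figures quoted in the abstract; both are rounded values.
sorghum_yield_bp = 17.9e9   # 17.9 Gb of long-read data
sorghum_genome_bp = 730e6   # genome size of about 730 Mb

sorghum_depth = sequencing_depth(sorghum_yield_bp, sorghum_genome_bp)
```

With these rounded inputs the estimate comes out slightly under 25×, consistent with the abstract's approximate figure. The same function can be inverted to plan a run: multiplying a target depth by the genome size gives the yield required.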


2019 ◽  
Vol 8 (32) ◽  
Author(s):  
Yen-Te Liao ◽  
Yujie Zhang ◽  
Alexandra Salvador ◽  
Vivian C. H. Wu

Escherichia phage vB_EcoM-Sa45lw, a new member of the T4-like phages, was isolated from surface water in a produce-growing area. The phage, containing double-stranded DNA with a genome size of 167,353 bp and 282 predicted open reading frames (ORFs), is able to infect generic Escherichia coli and Shiga toxin-producing E. coli O45 and O157 strains.


2020 ◽  
Vol 9 (34) ◽  
Author(s):  
Anina Buchmann ◽  
Harald Gross

ABSTRACT We report an improved de novo draft genome sequence of the human-pathogenic strain Nocardia terpenica IFM 0706T. Resequencing revealed that the genome is larger than anticipated and significantly reduced the number of contigs, providing a basis for comparison with the closely related strain N. terpenica IFM 0406.


2019 ◽  
Vol 11 (11) ◽  
pp. 3144-3157 ◽  
Author(s):  
Yutaka Satou ◽  
Ryohei Nakamura ◽  
Deli Yu ◽  
Reiko Yoshida ◽  
Mayuko Hamada ◽  
...  

Abstract Since its initial publication in 2002, the genome of Ciona intestinalis type A (Ciona robusta), the first genome sequence of an invertebrate chordate, has provided a valuable resource for a wide range of biological studies, including developmental biology, evolutionary biology, and neuroscience. The genome assembly was updated in 2008, and it included 68% of the sequence information in 14 pairs of chromosomes. However, a more contiguous genome is required for analyses of higher order genomic structure and of chromosomal evolution. Here, we provide a new genome assembly for an inbred line of this animal, constructed with short and long sequencing reads and Hi-C data. In this latest assembly, over 95% of the 123 Mb of sequence data was included in the chromosomes. Short sequencing reads predicted a genome size of 114–120 Mb; therefore, it is likely that the current assembly contains almost the entire genome, although this estimate of genome size was smaller than previous estimates. Remapping of the Hi-C data onto the new assembly revealed a large inversion in the genome of the inbred line. Moreover, a comparison of this genome assembly with that of Ciona savignyi, a different species in the same genus, revealed many chromosomal inversions between these two Ciona species, suggesting that such inversions have occurred frequently and have contributed to chromosomal evolution of Ciona species. Thus, the present assembly greatly improves an essential resource for genome-wide studies of ascidians.


BMC Genomics ◽  
2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Gokhan Yavas ◽  
Huixiao Hong ◽  
Wenming Xiao

Abstract Background Accurate de novo genome assembly has become reality with the advancements in sequencing technology. With the ever-increasing number of de novo genome assembly tools, assessing the quality of assemblies has become of great importance in genome research. Although many quality metrics have been proposed and software tools for calculating those metrics have been developed, the existing tools do not produce a unified measure to reflect the overall quality of an assembly. Results To address this issue, we developed the de novo Assembly Quality Evaluation Tool (dnAQET) that generates a unified metric for benchmarking the quality assessment of assemblies. Our framework first calculates individual quality scores for the scaffolds/contigs of an assembly by aligning them to a reference genome. Next, it computes a quality score for the assembly using its overall reference genome coverage, the quality score distribution of its scaffolds and the redundancy identified in it. Using synthetic assemblies randomly generated from the latest human genome build, various builds of the reference genomes for five organisms and six de novo assemblies for sample NA24385, we tested dnAQET to assess its capability for benchmarking quality evaluation of genome assemblies. For synthetic data, our quality score increased with decreasing number of misassemblies and redundancy and increasing average contig length and coverage, as expected. For genome builds, dnAQET quality score calculated for a more recent reference genome was better than the score for an older version. To compare with some of the most frequently used measures, 13 other quality measures were calculated. The quality score from dnAQET was found to be better than all other measures in terms of consistency with the known quality of the reference genomes, indicating that dnAQET is reliable for benchmarking quality assessment of de novo genome assemblies. 
Conclusions dnAQET is a scalable framework designed to evaluate a de novo genome assembly based on the aggregated quality of its scaffolds (or contigs). Our results demonstrated that the dnAQET quality score is reliable for benchmarking quality assessment of genome assemblies. dnAQET can help researchers identify the most suitable assembly tools and select the high-quality assemblies they generate.

