Improved Gossypium raimondii Genome Using a Hi-C-based Proximity-Guided Assembly

2021 ◽  
Author(s):  
Guoli Song ◽  
Qiuhong Yang ◽  
Dongyun Zuo ◽  
Hailiang Cheng ◽  
Youping Zhang ◽  
...  

Abstract Introduction: Genome sequences play an important role in both basic and applied studies. Gossypium raimondii is the putative contributor of the D subgenome of Upland cotton (G. hirsutum), so there is a need to improve the quality of its genome rapidly and efficiently. Methods: We performed Hi-C sequencing of G. raimondii and reassembled its genome from a set of new Hi-C data and previously published scaffolds. We also compared the reassembled genome sequence with the previously published G. raimondii genomes for gene and genome sequence collinearity. Result: A total of 98.42% of the scaffold sequence was clustered successfully; 99.72% of the clustered sequence was ordered, and 99.92% of the ordered sequence was oriented with high quality. Further evaluation by heat map and collinearity analysis revealed that the reassembled genome is significantly improved over the previous one (Wang et al. 2012). Conclusion: This improved G. raimondii genome not only provides a better reference genome to increase study efficiency but also offers a new way to assemble cotton genomes. Furthermore, the Hi-C data of G. raimondii may be used for 3D structure research or regulatory analysis.
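The three percentages in the Result are nested: each is a share of the previous subset, not of the whole scaffold sequence. As a minimal sketch, assuming that reading of the abstract's wording, the shares can be compounded to express everything relative to the total scaffold sequence. The variable names are illustrative and do not come from the paper.

```python
# Percentages quoted in the abstract, read as nested fractions (an assumption).
clustered = 0.9842   # share of scaffold sequence assigned to a chromosome group
ordered = 0.9972     # share of the clustered sequence that was ordered
oriented = 0.9992    # share of the ordered sequence that was oriented

# Compound the nested shares so each is relative to all scaffold sequence.
ordered_overall = clustered * ordered
oriented_overall = ordered_overall * oriented

print(f"ordered:  {ordered_overall:.2%} of scaffold sequence")
print(f"oriented: {oriented_overall:.2%} of scaffold sequence")
```

Under this reading, roughly 98% of the scaffold sequence ends up both ordered and oriented.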

2020 ◽  
Author(s):  
Huoli Song ◽  
Qiuhong Yang ◽  
Dongyun Zuo ◽  
Qiaolian Wang ◽  
Hailiang Cheng ◽  
...  

Abstract Genome sequences play an important role in both basic and applied studies. Gossypium raimondii is the putative contributor of the D subgenome of Upland cotton (G. hirsutum), so there is a need to improve the quality of its genome rapidly and efficiently. Here, we performed Hi-C sequencing of G. raimondii and reassembled its genome from new Hi-C data and previously published scaffolds. We identified and corrected errors in the initial scaffolds before reassembling them into chromosomes. In total, 98.42% of the sequence was clustered successfully; 99.72% of the clustered sequence was ordered, and 99.92% of the ordered sequence was oriented with high quality. Further evaluation by heat map and collinearity analysis revealed that the reassembled genome is significantly improved over the previous one. This improved G. raimondii genome not only provides a better reference genome to increase study efficiency but also offers a new way to assemble cotton genomes. Furthermore, the Hi-C data of G. raimondii may be used for 3D structure research or regulatory analysis.


2020 ◽  
Author(s):  
Guoli Song ◽  
Qiuhong Yang ◽  
Dongyun Zuo ◽  
Qiaolian Wang ◽  
Hailiang Cheng ◽  
...  

Abstract Background: Genome sequences play an important role in both basic and applied studies. Gossypium raimondii is the putative contributor of the D subgenome of Upland cotton (Gossypium hirsutum), so there is a need to improve the quality of its genome rapidly and efficiently. Methods: We performed Hi-C sequencing of Gossypium raimondii and reassembled its genome from a set of new Hi-C data and previously published scaffolds. We identified and corrected errors in the initial scaffolds before reassembling them into chromosomes. Result: A total of 98.42% of the sequence was clustered successfully; 99.72% of the clustered sequence was ordered, and 99.92% of the ordered sequence was oriented with high quality. Further evaluation by heat map and collinearity analysis revealed that the reassembled genome is significantly improved over the previous one. Conclusion: This improved Gossypium raimondii genome not only provides a better reference genome to increase study efficiency but also offers a new way to assemble cotton genomes. Furthermore, the Hi-C data of Gossypium raimondii may be used for 3D structure research or regulatory analysis.


2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Qiuhong YANG ◽  
Dongyun ZUO ◽  
Hailiang CHENG ◽  
Youping ZHANG ◽  
Qiaolian WANG ◽  
...  

Abstract Introduction Genome sequences play an important role in both basic and applied studies. Gossypium raimondii is the putative contributor of the D subgenome of upland cotton (G. hirsutum), so there is a need to improve the quality of its genome rapidly and efficiently. Methods We performed Hi-C sequencing of G. raimondii and reassembled its genome from a set of new Hi-C data and previously published scaffolds. We also compared the reassembled genome sequence with the previously published G. raimondii genomes for gene and genome sequence collinearity. Result A total of 98.42% of scaffold sequences were clustered successfully; 99.72% of the clustered sequences were ordered, and 99.92% of the ordered sequences were oriented with high quality. Further evaluation by heat map and collinearity analysis revealed that the reassembled genome is significantly improved over the previous one (Nat Genet 44:1098–1103, 2012). Conclusion This improved G. raimondii genome not only provides a better reference to increase study efficiency but also offers a new way to assemble cotton genomes. Furthermore, the Hi-C data of G. raimondii may be used for 3D structure research or regulatory analysis.


2021 ◽  
Author(s):  
Jie Wang ◽  
Shiming Li ◽  
Lei Lan ◽  
Mushan Xie ◽  
Shu Cheng ◽  
...  

Abstract Background: Setaria italica is the second most widely planted millet species in the world and an important model grain crop for research on C4 photosynthesis and abiotic stress tolerance. Three genome assembly and annotation efforts have been reported, but all were based on next-generation sequencing technology, which limited genome continuity. Results: Here we report a high-quality whole genome of the new cultivar Huagu11, assembled using single-molecule real-time sequencing and high-throughput chromosome conformation capture (Hi-C) mapping technologies. The total assembly size of the Huagu11 genome was 408.37 Mb, with a scaffold N50 of 45.89 Mb. Compared with the three previously reported millet genomes based on next-generation sequencing technology, the Huagu11 genome had the highest continuity. Intraspecies comparison showed that about 94.97% and 94.66% of the Yugu1 and Huagu11 genomes, respectively, could be aligned as one-to-one blocks, with four chromosomal inversions. The Huagu11 genome contained approximately 19.43 Mb of presence/absence variation (PAV) sequence with 627 protein-coding transcripts, while the Yugu1 genome had 20.53 Mb of PAV sequence encoding 737 proteins. Overall, 969,596 single-nucleotide polymorphisms (SNPs) and 156,282 insertion-deletions (InDels) were identified between the two genomes. The comparison between Huagu11 and Yugu1 should reflect, to a certain extent, the genetic identity and variation among cultivars of foxtail millet. The Ser-626-Aln substitution in acetohydroxy acid synthase (AHAS) was found to be related to the imazethapyr tolerance of Huagu11. Conclusions: A new, improved, high-quality reference genome sequence of Setaria italica was assembled, and intraspecies genome comparison determined the genetic identity and variation among cultivars of foxtail millet. Based on the genome sequence, it was found that the Ser-626-Aln substitution in AHAS was responsible for the imazethapyr tolerance of Huagu11. The improved reference genome of Setaria italica will promote genic and genomic studies of this species and benefit cultivar improvement.


Plant Disease ◽  
2020 ◽  
Author(s):  
Chengming Yu ◽  
Yufei Diao ◽  
Quan Lu ◽  
Jiaping Zhao ◽  
Shengnan Cui ◽  
...  

Botryosphaeria dothidea is an important latent fungal pathogen of a wide range of woody plants. Fruit ring rot caused by B. dothidea is a major disease of apple in China. This study establishes a high-quality, nearly complete, and well-annotated genome sequence of B. dothidea strain sdau11-99. The findings provide a reference genome resource for further research on the fruit ring rot pathogen on apple and other hosts.


GigaScience ◽  
2020 ◽  
Vol 9 (9) ◽  
Author(s):  
Gina M Pham ◽  
John P Hamilton ◽  
Joshua C Wood ◽  
Joseph T Burke ◽  
Hainan Zhao ◽  
...  

Abstract Background Worldwide, the cultivated potato, Solanum tuberosum L., is the No. 1 vegetable crop and a critical food security crop. The genome sequence of DM1–3 516 R44, a doubled monoploid clone of S. tuberosum Group Phureja, was published in 2011 using a whole-genome shotgun sequencing approach with short-read sequence data. Current advanced sequencing technologies now permit generation of near-complete, high-quality chromosome-scale genome assemblies at minimal cost. Findings Here, we present an updated version of the DM1–3 516 R44 genome sequence (v6.1) using Oxford Nanopore Technologies long reads coupled with proximity-by-ligation scaffolding (Hi-C), yielding a chromosome-scale assembly. The new (v6.1) assembly represents 741.6 Mb of sequence (87.8%) of the estimated 844 Mb genome, of which 741.5 Mb is non-gapped with 731.2 Mb anchored to the 12 chromosomes. Use of Oxford Nanopore Technologies full-length complementary DNA sequencing enabled annotation of 32,917 high-confidence protein-coding genes encoding 44,851 gene models that had a significantly improved representation of conserved orthologs compared with the previous annotation. The new assembly has improved contiguity with a 595-fold increase in N50 contig size, 99% reduction in the number of contigs, a 44-fold increase in N50 scaffold size, and an LTR Assembly Index score of 13.56, placing it in the category of reference genome quality. The improved assembly also permitted annotation of the centromeres via alignment to sequencing reads derived from CENH3 nucleosomes. Conclusions Access to advanced sequencing technologies and improved software permitted generation of a high-quality, long-read, chromosome-scale assembly and improved annotation dataset for the reference genotype of potato that will facilitate research aimed at improving agronomic traits and understanding genome evolution.


2021 ◽  
Vol 10 (21) ◽  
Author(s):  
Jason E. Stajich ◽  
Andrea L. Vu ◽  
Howard S. Judelson ◽  
Gregory M. Vogel ◽  
Michael A. Gore ◽  
...  

The oomycete Phytophthora capsici is a destructive pathogen of a wide range of vegetable hosts, especially peppers and cucurbits. A 94.17-Mb genome assembly was constructed using PacBio and Illumina data and annotated with support from transcriptome sequencing (RNA-Seq) reads.


2020 ◽  
Vol 13 (3) ◽  
Author(s):  
Zhenqiao Song ◽  
Caicai Lin ◽  
Piyi Xing ◽  
Yuanyuan Fen ◽  
Hua Jin ◽  
...  

Plant Disease ◽  
2021 ◽  
Author(s):  
Yu-Ting Sheng Sheng ◽  
Xiao-Li Yu ◽  
Ting-Ting Mao ◽  
Juan Zhang ◽  
Xiao-Tong Guo ◽  
...  

Peanut scorch spot, caused by Leptosphaerulina arachidicola, is one of the most severe leaf diseases of peanut and causes significant yield loss. Here, we report the first high-quality genome sequence of L. arachidicola JB313, isolated from an infected peanut leaf in China. The genome size is 47.66 Mb, consisting of 65 scaffolds (N50 length = 1.58 Mb) with a G+C content of 49.05%. This report provides a reference genome for future studies of the peanut scorch spot pathogen.
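Several abstracts in this list report a scaffold N50. As a minimal sketch of the usual definition (the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly), the function below computes N50 from a list of lengths. The function name and the toy lengths are hypothetical and are not taken from any of the listed assemblies.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)  # longest scaffolds first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that brings the running sum to at least
        # half of the total assembly length defines the N50.
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy scaffold lengths in Mb (hypothetical values for illustration only).
toy_scaffolds_mb = [8, 8, 4, 3, 3, 2, 2, 2]
print(n50(toy_scaffolds_mb))
```

A larger N50 for a similar total size means the assembly is split into fewer, longer pieces, which is the sense in which these abstracts use it as a continuity measure.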


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Jie Wang ◽  
Shiming Li ◽  
Lei Lan ◽  
Mushan Xie ◽  
Shu Cheng ◽  
...  

Abstract Background Setaria italica is the second most widely planted millet species in the world and an important model grain crop for research on C4 photosynthesis and abiotic stress tolerance. Three genome assembly and annotation efforts have been reported, but all were based on next-generation sequencing technology, which limited genome continuity. Results Here we report a high-quality whole genome of the new cultivar Huagu11, assembled using single-molecule real-time sequencing and high-throughput chromosome conformation capture (Hi-C) mapping technologies. The total assembly size of the Huagu11 genome was 408.37 Mb, with a scaffold N50 of 45.89 Mb. Compared with the three previously reported millet genomes based on next-generation sequencing technology, the Huagu11 genome had the highest continuity. Intraspecies comparison showed that about 94.97% and 94.66% of the Yugu1 and Huagu11 genomes, respectively, could be aligned as one-to-one blocks, with four chromosomal inversions. The Huagu11 genome contained approximately 19.43 Mb of presence/absence variation (PAV) sequence with 627 protein-coding transcripts, while the Yugu1 genome had 20.53 Mb of PAV sequence encoding 737 proteins. Overall, 969,596 single-nucleotide polymorphisms (SNPs) and 156,282 insertion-deletions (InDels) were identified between the two genomes. The comparison between Huagu11 and Yugu1 should reflect, to a certain extent, the genetic identity and variation among cultivars of foxtail millet. The Ser-626-Aln substitution in acetohydroxy acid synthase (AHAS) was found to be related to the imazethapyr tolerance of Huagu11. Conclusions A new, improved, high-quality reference genome sequence of Setaria italica was assembled, and intraspecies genome comparison determined the genetic identity and variation among cultivars of foxtail millet. Based on the genome sequence, it was inferred that the Ser-626-Aln substitution in AHAS was responsible for the imazethapyr tolerance of Huagu11. The improved reference genome of Setaria italica will promote genic and genomic studies of this species and benefit cultivar improvement.

