Improved Gossypium raimondii Genome Using a Hi-C-based Proximity-Guided Assembly
Abstract Background: Genome sequences play an important role in both basic and applied research. Gossypium raimondii is the putative contributor of the D subgenome of Upland cotton (Gossypium hirsutum), so there is a need to improve the quality of its genome assembly rapidly and efficiently. Methods: We performed Hi-C sequencing of Gossypium raimondii and reassembled its genome from the new Hi-C data and previously published scaffolds. Errors in the initial scaffolds were identified and corrected before the scaffolds were reassembled into chromosomes. Results: A total of 98.42% of the sequence was clustered successfully; 99.72% of the clustered sequence was ordered, and 99.92% of the ordered sequence was oriented with high quality. Further evaluation by heat map and collinearity analysis showed that the reassembled genome is markedly improved compared with the previous one. Conclusion: The improved Gossypium raimondii genome not only provides a better reference genome that can increase research efficiency, but also offers a new approach to assembling cotton genomes. Furthermore, the Hi-C data of Gossypium raimondii may be used for studies of 3D genome structure or for regulatory analysis.
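The three percentages in the Results are nested: each is stated relative to the preceding stage, not to the whole genome. As a minimal sketch of how they combine, the snippet below multiplies them to estimate the share of the total sequence that reaches each stage. The variable and function names are illustrative assumptions, and the calculation assumes the percentages are sequence-length fractions that compose multiplicatively; it is not part of the assembly pipeline itself.

```python
# Stage-wise fractions as reported in the abstract.
# Each value is relative to the output of the previous stage.
CLUSTERED_OF_TOTAL = 0.9842      # clustered / total sequence
ORDERED_OF_CLUSTERED = 0.9972    # ordered / clustered sequence
ORIENTED_OF_ORDERED = 0.9992     # oriented / ordered sequence


def cumulative_fractions(stage_fractions):
    """Return the running product of nested stage fractions.

    Element i is the estimated share of the *total* sequence that
    survives through stage i (assuming the fractions compose
    multiplicatively).
    """
    running = 1.0
    totals = []
    for fraction in stage_fractions:
        running *= fraction
        totals.append(running)
    return totals


clustered_total, ordered_total, oriented_total = cumulative_fractions(
    [CLUSTERED_OF_TOTAL, ORDERED_OF_CLUSTERED, ORIENTED_OF_ORDERED]
)

print(f"clustered: {clustered_total:.2%} of total sequence")
print(f"ordered:   {ordered_total:.2%} of total sequence")
print(f"oriented:  {oriented_total:.2%} of total sequence")
```

Under these assumptions, roughly 98.1% of the total sequence ends up clustered, ordered, and oriented.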