Improved Gossypium raimondii Genome Using a Hi-C-based Proximity-Guided Assembly
Abstract Genome sequences play an important role in both basic and applied studies. Gossypium raimondii is the putative contributor of the D subgenome of Upland cotton (G. hirsutum), which highlights the need to improve its genome quality rapidly and efficiently. Here, we performed Hi-C sequencing of G. raimondii and reassembled its genome using the new Hi-C data together with previously published scaffolds. We identified and corrected errors in the initial scaffolds before reassembling them into chromosomes. In total, 98.42% of the sequence was clustered successfully; 99.72% of the clustered sequence was ordered, and 99.92% of the ordered sequence was oriented with high quality. Further evaluation by heat-map and collinearity analysis showed that the reassembled genome is significantly improved over the previous one. This improved G. raimondii genome not only provides a better reference genome to increase study efficiency but also offers a new way to assemble cotton genomes. Furthermore, the Hi-C data of G. raimondii may be used for 3D structure research or regulatory analysis.
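The three percentages above are nested: each is a fraction of the previous subset, not of the whole assembly. As a rough illustration, and assuming the nesting is exactly as the wording suggests, the share of the total sequence that was ordered and oriented can be estimated by multiplying the fractions. The variable names below are illustrative only and are not taken from the study.

```python
# Fractions reported in the abstract. Treating them as strictly nested
# (each a share of the preceding subset) is an assumption of this sketch.
clustered = 0.9842             # share of total sequence that was clustered
ordered_of_clustered = 0.9972  # share of clustered sequence that was ordered
oriented_of_ordered = 0.9992   # share of ordered sequence that was oriented

# Convert the nested shares into shares of the total sequence.
ordered_total = clustered * ordered_of_clustered
oriented_total = ordered_total * oriented_of_ordered

print(f"ordered, share of total:  {ordered_total:.2%}")   # 98.14%
print(f"oriented, share of total: {oriented_total:.2%}")  # 98.07%
```

Under this reading, roughly 98% of the total sequence ends up both ordered and oriented.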