Improved Gossypium raimondii Genome Using a Hi-C-based Proximity-Guided Assembly
Abstract Introduction: Genome sequences play an important role in both basic and applied studies. Gossypium raimondii is the putative contributor of the D subgenome of Upland cotton (G. hirsutum), so the quality of its genome assembly needs to be improved rapidly and efficiently. Methods: We performed Hi-C sequencing of G. raimondii and reassembled its genome from the new Hi-C data and previously published scaffolds. We also compared the reassembled genome sequence with the previously published G. raimondii genomes for gene and genome sequence collinearity. Results: A total of 98.42% of the scaffold sequence was clustered successfully; 99.72% of the clustered sequence was ordered, and 99.92% of the ordered sequence was oriented with high quality. Further evaluation by heat map and collinearity analysis showed that the reassembled genome is significantly improved compared with the previous one (Wang et al. 2012). Conclusion: The improved G. raimondii genome not only provides a better reference genome to increase study efficiency but also offers a new way to assemble cotton genomes. Furthermore, the Hi-C data of G. raimondii may be used for 3D structure research or regulatory analysis.
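The three percentages in the Results are nested: each is a share of the sequence retained by the previous step, not of the full scaffold set. As a minimal sketch of the arithmetic, assuming the denominators nest exactly as the abstract words them, the step-wise shares can be multiplied to express ordering and orientation as fractions of the total scaffold sequence. The function and variable names are illustrative and are not part of the assembly pipeline.

```python
# Nested percentages reported in the abstract, expressed as fractions.
clustered = 0.9842  # share of the scaffold sequence that was clustered
ordered = 0.9972    # share of the clustered sequence that was ordered
oriented = 0.9992   # share of the ordered sequence that was oriented


def cumulative_fraction(*steps):
    """Multiply nested step-wise fractions into a fraction of the starting total."""
    result = 1.0
    for step in steps:
        result *= step
    return result


# Fractions relative to the full scaffold sequence (illustrative names).
ordered_of_total = cumulative_fraction(clustered, ordered)
oriented_of_total = cumulative_fraction(clustered, ordered, oriented)

print(f"ordered:  {ordered_of_total:.2%} of scaffold sequence")
print(f"oriented: {oriented_of_total:.2%} of scaffold sequence")
```

Under this reading, about 98.14% of the scaffold sequence was ordered and about 98.07% was oriented.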