Genome Size Estimation and Full-length Transcriptome of Sphingonotus tsinlingensis: Genetic Background for the Drought-Adapted Grasshopper
Abstract
Sphingonotus Fieber, 1852 (Orthoptera: Acrididae) is a species-rich grasshopper genus with ~146 species. All species of this genus prefer dry environments such as deserts, steppes, sandy areas, and stony benchlands. This genomic study aims to support understanding of the evolution and ecology of these grasshoppers. Here, the genome size of Sphingonotus tsinlingensis was estimated using flow cytometry, and the first high-quality full-length transcriptome of this species is presented, which may serve as a reference genetic resource for the drought-adapted grasshopper species of Sphingonotus Fieber. The estimated genome size of S. tsinlingensis was ~12.8 Gb. From 146.98 Gb of PacBio isoform sequencing data, 221.47 Mb of full-length transcripts were assembled. Among these transcripts, 88,693 non-redundant isoforms were identified, with an average length of 2,497 bp and an N50 of 2,726 bp, both much longer than in previous grasshopper transcriptome assemblies. A total of 48,502 protein-coding sequences were determined, of which 37,569 were annotated in public gene function databases. In addition, 36,488 simple tandem repeats, 12,765 long non-coding RNAs, and 414 transcription factors were identified. Based on gene function, 70 heat shock protein genes and 61 P450 genes that may be related to drought adaptation in S. tsinlingensis were identified. The genome of S. tsinlingensis is ultra-large and complex, and full-length transcriptome sequencing is a well-suited strategy for genomic research on such a species. This is the first full-length transcriptome for the genus Sphingonotus, and its assembly metrics are better than those of all known grasshopper transcriptomes. This full-length transcriptome may be used to understand the genetic background of S. tsinlingensis and the evolution and ecology of grasshoppers.
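The average length and N50 reported above are standard summary statistics for a set of sequence lengths. As a minimal sketch of how such values are commonly computed (not the pipeline used in this study), the following assumes the isoform lengths are already available as a list of integers; the function names `mean_length` and `n50` and the toy input are hypothetical.

```python
from typing import Sequence


def mean_length(lengths: Sequence[int]) -> float:
    """Arithmetic mean of the sequence lengths, in bp."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    return sum(lengths) / len(lengths)


def n50(lengths: Sequence[int]) -> int:
    """N50: the length L such that sequences of length >= L
    together contain at least half of all bases."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest sequence downwards, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Toy lengths for illustration only (not data from this study).
toy_lengths = [100, 200, 300, 400, 500]
print(mean_length(toy_lengths))  # 300.0
print(n50(toy_lengths))          # 400
```

Because N50 weights each sequence by its length, it is usually larger than the mean when the length distribution is skewed toward long sequences, which is consistent with the reported values (N50 of 2,726 bp versus a mean of 2,497 bp).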