Full-length annotation with multi-strategy RNA-seq uncovers transcriptional regulation of lncRNAs in diploid cotton G. arboreum
Abstract
Long noncoding RNAs (lncRNAs) are crucial factors in plant development and environmental responses, yet high-throughput, accurate methods for identifying lncRNAs are still lacking in plants. To build an accurate atlas of lncRNAs in cotton, we combined isoform sequencing (Iso-seq), strand-specific RNA-seq (ssRNA-seq), and cap analysis of gene expression (CAGE-seq) with PolyA-seq, and compiled a pipeline named plant full-length lncRNA (PULL) to integrate the multi-omics data. A total of 9240 lncRNAs were identified from 21 tissue samples of the diploid cotton Gossypium arboreum. We revealed that alternative usage of transcription start sites (TSSs) and transcription end sites (TESs) of lncRNAs occurs pervasively during plant growth and responses to stress. We identified lncRNAs that are co-expressed with or linked to protein-coding genes (PCGs) or SNPs from genome-wide association studies (GWAS) associated with ovule and fiber development. We also mapped the genome-wide binding sites of two lncRNAs using chromatin isolation by RNA purification sequencing (ChIRP-seq) and validated the trans-acting transcriptional regulation of lnc-Ga13g0352 via a virus-induced gene suppression (VIGS) assay. These findings provide valuable research resources for the plant community and broaden our understanding of the biogenesis and regulatory functions of plant lncRNAs.

One-sentence summary
The full-length annotation and transcriptional regulation of long noncoding RNAs in cotton.